metadata automation meets: Topics by WorldWideScience.org

Sample records for metadata automation meets

Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy

Directory of Open Access Journals (Sweden)

Hakenberg Jörg

2009-01-01

Full Text Available. Abstract. Background: Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge to automated methods, which identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Another useful source of information for disambiguation is metadata. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively. Results: The 'Closest Sense' method assumes that the ontology defines multiple senses of the term. It computes the shortest path of co-occurring terms in the document to one of these senses. The 'Term Cooc' method defines a log-odds ratio for co-occurring terms including co-occurrences inferred from the ontology structure. The 'MetaData' approach trains a classifier on metadata. It does not require any ontology, but requires training data, which the other methods do not. To evaluate these approaches we defined a manually curated training corpus of 2600 documents for seven ambiguous terms from the Gene Ontology and MeSH. All approaches over all conditions achieve an 80% success rate on average. The 'MetaData' approach performed best with 96%, when trained on high-quality data. Its performance deteriorates as quality of the training data decreases. The 'Term Cooc' approach performs better on Gene Ontology (92% success) than on MeSH (73% success), as MeSH is not a strict is-a/part-of, but rather a loose is-related-to hierarchy. The 'Closest Sense' approach achieves an 80% success rate on average. Conclusion: Metadata is valuable for disambiguation, but requires high quality training data. Closest Sense requires no training, but a large, consistently modelled ontology, which are two opposing conditions. Term Cooc achieves greater 90
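The 'Closest Sense' idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the term names, and the choice to score each sense by the mean shortest-path length to the co-occurring terms are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy ontology as an undirected adjacency map; the term names
# are illustrative and not taken from the Gene Ontology or MeSH.
ONTOLOGY = {
    "development": ["biological_process", "embryo_development"],
    "biological_process": ["development", "root"],
    "embryo_development": ["development", "gastrulation"],
    "gastrulation": ["embryo_development"],
    "software_development": ["engineering"],
    "engineering": ["software_development", "root"],
    "root": ["biological_process", "engineering"],
}


def path_length(graph, start, goal):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def closest_sense(graph, senses, context_terms):
    """Pick the sense with the smallest mean distance to the context terms.

    Averaging the distances is an assumption of this sketch.
    """
    best_sense, best_score = None, float("inf")
    for sense in senses:
        dists = [path_length(graph, term, sense) for term in context_terms]
        dists = [d for d in dists if d is not None]
        if not dists:
            continue  # no context term is connected to this sense
        score = sum(dists) / len(dists)
        if score < best_score:
            best_sense, best_score = sense, score
    return best_sense


# Two candidate senses of an ambiguous label, and two terms found in the text.
chosen = closest_sense(
    ONTOLOGY,
    senses=["development", "software_development"],
    context_terms=["gastrulation", "embryo_development"],
)
print(chosen)
```

Because the method only walks the graph, it needs no training data, but its result depends entirely on how consistently the ontology is modelled, which matches the trade-off stated in the abstract.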
Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.

Science.gov (United States)

Dave, Jaydev K; Gingold, Eric L

2013-01-01

The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract from these files information from the structured elements in the DICOM metadata relevant to exposure. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.
Automating the Extraction of Metadata from Archaeological Data Using iRods Rules

Directory of Open Access Journals (Sweden)

David Walling

2011-10-01

Full Text Available The Texas Advanced Computing Center and the Institute for Classical Archaeology at the University of Texas at Austin developed a method that uses iRods rules and a Jython script to automate the extraction of metadata from digital archaeological data. The first step was to create a record-keeping system to classify the data. The record-keeping system employs file and directory hierarchy naming conventions designed specifically to maintain the relationship between the data objects and map the archaeological documentation process. The metadata implicit in the record-keeping system is automatically extracted upon ingest, combined with additional sources of metadata, and stored alongside the data in the iRods preservation environment. This method enables a more organized workflow for the researchers, helps them archive their data close to the moment of data creation, and avoids error-prone manual metadata input. We describe the types of metadata extracted and provide technical details of the extraction process and storage of the data and metadata.
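As a rough illustration of extracting metadata that is implicit in file and directory naming conventions, the sketch below parses a path with a regular expression. The convention shown (site/year/trench directories and a matching underscore-separated file name) is hypothetical; the abstract does not state the project's actual convention, and this plain Python code stands in for the iRods rules and Jython script it mentions.

```python
import re
from pathlib import PurePosixPath

# Hypothetical convention (an assumption of this sketch):
#   <site>/<year>/<trench>/<site>_<year>_<trench>_<type>_<seq>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<site>[A-Za-z]+)_(?P<year>\d{4})_(?P<trench>T\d+)_"
    r"(?P<doc_type>[a-z]+)_(?P<sequence>\d+)$"
)


def extract_metadata(path_str):
    """Derive metadata from a path that follows the assumed convention."""
    path = PurePosixPath(path_str)
    match = FILENAME_PATTERN.match(path.stem)
    if match is None:
        return None  # the file name does not follow the convention
    metadata = match.groupdict()
    metadata["sequence"] = int(metadata["sequence"])
    metadata["format"] = path.suffix.lstrip(".").lower()
    # Cross-check the file name against its three parent directories.
    parents = path.parts[:-1]
    metadata["path_consistent"] = parents[-3:] == (
        metadata["site"], metadata["year"], metadata["trench"]
    )
    return metadata


record = extract_metadata("SiteA/2008/T12/SiteA_2008_T12_photo_0042.JPG")
print(record)
```

Running such a check at ingest time means a misnamed or misfiled object can be caught before it is stored, instead of relying on manual metadata entry.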
Automated Creation of Datamarts from a Clinical Data Warehouse, Driven by an Active Metadata Repository

Science.gov (United States)

Rogerson, Charles L.; Kohlmiller, Paul H.; Stutman, Harris

1998-01-01

A methodology and toolkit are described which enable the automated metadata-driven creation of datamarts from clinical data warehouses. The software uses schema-to-schema transformation driven by an active metadata repository. Tools for assessing datamart data quality are described, as well as methods for assessing the feasibility of implementing specific datamarts. A methodology for data remediation and the re-engineering of operational data capture is described.
Automated metadata--final project report

Energy Technology Data Exchange (ETDEWEB)

Schissel, David [General Atomics, San Diego, CA (United States)]

2016-04-01

This report summarizes the work of the Automated Metadata, Provenance Cataloging, and Navigable Interfaces: Ensuring the Usefulness of Extreme-Scale Data Project (MPO Project) funded by the United States Department of Energy (DOE), Offices of Advanced Scientific Computing Research and Fusion Energy Sciences. Initially funded for three years starting in 2012, it was extended for 6 months with additional funding. The project was a collaboration between scientists at General Atomics, Lawrence Berkeley National Laboratory (LBNL), and Massachusetts Institute of Technology (MIT). The group leveraged existing computer science technology where possible, and extended or created new capabilities where required. The MPO project was able to successfully create a suite of software tools that can be used by a scientific community to automatically document their scientific workflows. These tools were integrated into workflows for fusion energy and climate research illustrating the general applicability of the project’s toolkit. Feedback was very positive on the project’s toolkit and the value of such automatic workflow documentation to the scientific endeavor.
Automated metadata--final project report

International Nuclear Information System (INIS)

Schissel, David

2016-01-01

This report summarizes the work of the Automated Metadata, Provenance Cataloging, and Navigable Interfaces: Ensuring the Usefulness of Extreme-Scale Data Project (MPO Project) funded by the United States Department of Energy (DOE), Offices of Advanced Scientific Computing Research and Fusion Energy Sciences. Initially funded for three years starting in 2012, it was extended for 6 months with additional funding. The project was a collaboration between scientists at General Atomics, Lawrence Berkeley National Laboratory (LBNL), and Massachusetts Institute of Technology (MIT). The group leveraged existing computer science technology where possible, and extended or created new capabilities where required. The MPO project was able to successfully create a suite of software tools that can be used by a scientific community to automatically document their scientific workflows. These tools were integrated into workflows for fusion energy and climate research illustrating the general applicability of the project's toolkit. Feedback was very positive on the project's toolkit and the value of such automatic workflow documentation to the scientific endeavor.
Active Marine Station Metadata

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — The Active Marine Station Metadata is a daily metadata report for active marine buoy and C-MAN (Coastal Marine Automated Network) platforms from the National Data...
Automated Metadata Extraction

Science.gov (United States)

2008-06-01

Store [4]. The files purchased from the iTunes Music Store include the following metadata: Name; Email address of purchaser; Year; Album ... Music: MP3 and AAC ... Tagged Image File Format ... Expert Group (MPEG) set of standards for music encoding. Open Document Format (ODF) – an open, license-free, and clearly documented file format
Summary Record of the First Meeting of the Radioactive Waste Repository Metadata Management (RepMet) Initiative

International Nuclear Information System (INIS)

2014-01-01

National radioactive waste repository programmes are collecting large amounts of data to support the long-term management of their nations' radioactive wastes. The data and related records increase in number, type and quality as programmes proceed through the successive stages of repository development: pre-siting, siting, characterisation, construction, operation and finally closure. Regulatory and societal approvals are included in this sequence. Some programmes are also documenting past repository projects and facing a challenge in allowing both current and future generations to understand actions carried out in the past. Metadata allows context to be stored with data and information so that it can be located, used, updated and maintained. Metadata helps waste management organisations better utilise their data in carrying out their statutory tasks and can also help verify and demonstrate that their programmes are appropriately driven. The NEA Radioactive Waste Repository Metadata Management (RepMet) initiative aims to bring about a better understanding of the identification and administration of metadata - a key aspect of data management - to support national programmes in managing their radioactive waste repository data, information and records in a way that is both harmonised internationally and suitable for long-term management and use. This is a summary record of the first meeting of the RepMet initiative. The actions and decisions from this meeting were sent separately to the group after the meeting, but are also included in this document (Annex A). The list of participants is attached as well (Annex B).
Handbook of metadata, semantics and ontologies

CERN Document Server

Sicilia, Miguel-Angel

2013-01-01

Metadata research has emerged as a discipline cross-cutting many domains, focused on the provision of distributed descriptions (often called annotations) to Web resources or applications. Such associated descriptions are supposed to serve as a foundation for advanced services in many application areas, including search and location, personalization, federation of repositories and automated delivery of information. Indeed, the Semantic Web is in itself a concrete technological framework for ontology-based metadata. For example, Web-based social networking requires metadata describing people and
Creating preservation metadata from XML-metadata profiles

Science.gov (United States)

Ulbricht, Damian; Bertelmann, Roland; Gebauer, Petra; Hasler, Tim; Klump, Jens; Kirchner, Ingo; Peters-Kottig, Wolfgang; Mettig, Nora; Rusch, Beate

2014-05-01

Metadata Encoding and Transmission Standard (METS). To find datasets in future portals and to make use of this data in one's own scientific work, proper selection of discovery metadata and application metadata is very important. Some XML-metadata profiles are not suitable for preservation, because version changes are very fast and make it nearly impossible to automate the migration. For other XML-metadata profiles, schema definitions are changed after publication of the profile or the schema definitions become inaccessible, which might cause problems during validation of the metadata inside the preservation system [2]. Some metadata profiles are not used widely enough and might not even exist in the future. Eventually, discovery and application metadata have to be embedded into the mdWrap-subtree of the METS-XML. [1] http://www.archivematica.org [2] http://dx.doi.org/10.2218/ijdc.v7i1.215
U.S. EPA Metadata Editor (EME)

Data.gov (United States)

U.S. Environmental Protection Agency — The EPA Metadata Editor (EME) allows users to create geospatial metadata that meets EPA's requirements. The tool has been developed as a desktop application that...
Stop the Bleeding: the Development of a Tool to Streamline NASA Earth Science Metadata Curation Efforts

Science.gov (United States)

le Roux, J.; Baker, A.; Caltagirone, S.; Bugbee, K.

2017-12-01

The Common Metadata Repository (CMR) is a high-performance, high-quality repository for Earth science metadata records, and serves as the primary way to search NASA's growing 17.5 petabytes of Earth science data holdings. Released in 2015, CMR has the capability to support several different metadata standards already being utilized by NASA's combined network of Earth science data providers, or Distributed Active Archive Centers (DAACs). The Analysis and Review of CMR (ARC) Team located at Marshall Space Flight Center is working to improve the quality of records already in CMR with the goal of making records optimal for search and discovery. This effort entails a combination of automated and manual review, where each NASA record in CMR is checked for completeness, accuracy, and consistency. This effort is highly collaborative in nature, requiring communication and transparency of findings amongst NASA personnel, DAACs, the CMR team and other metadata curation teams. Through the evolution of this project it has become apparent that there is a need to document and report findings, as well as track metadata improvements in a more efficient manner. The ARC team has collaborated with Element 84 in order to develop a metadata curation tool to meet these needs. In this presentation, we will provide an overview of this metadata curation tool and its current capabilities. Challenges and future plans for the tool will also be discussed.
Viewing and Editing Earth Science Metadata MOBE: Metadata Object Browser and Editor in Java

Science.gov (United States)

Chase, A.; Helly, J.

2002-12-01

Metadata is an important, yet often neglected aspect of successful archival efforts. However, to generate robust, useful metadata is often a time-consuming and tedious task. We have been approaching this problem from two directions: first by automating metadata creation, pulling from known sources of data, and in addition, what this (paper/poster?) details, developing friendly software for human interaction with the metadata. MOBE and COBE (Metadata Object Browser and Editor, and Canonical Object Browser and Editor, respectively) are Java applications for editing and viewing metadata and digital objects. MOBE has already been designed and deployed, and is currently being integrated into other areas of the SIOExplorer project. COBE is in the design and development stage, being created with the same considerations in mind as those for MOBE. Metadata creation, viewing, data object creation, and data object viewing, when taken on a small scale, are all relatively simple tasks. Computer science, however, has an infamous reputation for transforming the simple into the complex. As a system scales upwards to become more robust, new features arise and additional functionality is added to the software being written to manage the system. The software that emerges from such an evolution, though powerful, is often complex and difficult to use. With MOBE the focus is on a tool that does a small number of tasks very well. The result has been an application that enables users to manipulate metadata in an intuitive and effective way. This allows for a tool that serves its purpose without introducing additional cognitive load onto the user, an end goal we continue to pursue.
Making Interoperability Easier with the NASA Metadata Management Tool

Science.gov (United States)

Shum, D.; Reese, M.; Pilone, D.; Mitchell, A. E.

2016-12-01

ISO 19115 has enabled interoperability amongst tools, yet many users find it hard to build ISO metadata for their collections because it can be large and overly flexible for their needs. The Metadata Management Tool (MMT), part of NASA's Earth Observing System Data and Information System (EOSDIS), offers users a modern, easy-to-use, browser-based tool to develop ISO compliant metadata. Through a simplified UI experience, metadata curators can create and edit collections without any understanding of the complex ISO-19115 format, while still generating compliant metadata. The MMT is also able to assess the completeness of collection level metadata by evaluating it against a variety of metadata standards. The tool provides users with clear guidance as to how to change their metadata in order to improve its quality and compliance. It is based on NASA's Unified Metadata Model for Collections (UMM-C), which is a simpler metadata model that can be cleanly mapped to ISO 19115. This allows metadata authors and curators to meet ISO compliance requirements faster and more accurately. The MMT and UMM-C have been developed in an agile fashion, with recurring end user tests and reviews to continually refine the tool, the model and the ISO mappings. This process is allowing for continual improvement and evolution to meet the community's needs.
How libraries use publisher metadata

Directory of Open Access Journals (Sweden)

Steve Shadle

2013-11-01

Full Text Available With the proliferation of electronic publishing, libraries are increasingly relying on publisher-supplied metadata to meet user needs for discovery in library systems. However, many publisher/content provider staff creating metadata are unaware of the end-user environment and how libraries use their metadata. This article provides an overview of the three primary discovery systems that are used by academic libraries, with examples illustrating how publisher-supplied metadata directly feeds into these systems and is used to support end-user discovery and access. Commonly seen metadata problems are discussed, with recommendations suggested. Based on a series of presentations given in Autumn 2012 to the staff of a large publisher, this article uses the University of Washington Libraries systems and services as illustrative examples. Judging by the feedback received from these presentations, publishers (specifically staff not familiar with the big picture of metadata standards work) would benefit from a better understanding of the systems and services libraries provide using the data that is created and managed by publishers.
Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

Science.gov (United States)

Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

2015-01-01

Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease, in less time, and at a substantially reduced cost per nucleotide, hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
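The flagging step described above, comparing downloaded supporting data against a genome report field by field, might look roughly like the following. This is only a sketch: AutoCurE itself is an Excel tool, and the field names, the "missing"/"mismatch" labels, and the sample values (including the placeholder accession numbers) are hypothetical.

```python
# Hypothetical field names; the nine fields that AutoCurE actually flags
# are not listed in the abstract and are not reproduced here.
FIELDS = ("genome_name", "refseq_accession", "bioproject_uid", "file_description")


def flag_record(downloaded, report):
    """Return {field: flag} for fields that are missing or do not match."""
    flags = {}
    for field in FIELDS:
        local = (downloaded.get(field) or "").strip()
        reference = (report.get(field) or "").strip()
        if not local or not reference:
            flags[field] = "missing"
        elif local.lower() != reference.lower():
            flags[field] = "mismatch"
    return flags


# Placeholder values for illustration only; these are not real accessions.
downloaded = {
    "genome_name": "Example bacterium strain X",
    "refseq_accession": "NC_000000.1",
    "bioproject_uid": "",
    "file_description": "complete genome",
}
report = {
    "genome_name": "Example bacterium strain X",
    "refseq_accession": "NC_000000.2",
    "bioproject_uid": "12345",
    "file_description": "Complete genome",
}

flags = flag_record(downloaded, report)
print(flags)
```

The comparison here ignores case and surrounding whitespace, which is a design choice of the sketch; a stricter curation pass could flag those differences as well.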
A programmatic view of metadata, metadata services, and metadata flow in ATLAS

International Nuclear Information System (INIS)

Malon, D; Albrand, S; Gallas, E; Stewart, G

2012-01-01

The volume and diversity of metadata in an experiment of the size and scope of ATLAS are considerable. Even the definition of metadata may seem context-dependent: data that are primary for one purpose may be metadata for another. ATLAS metadata services must integrate and federate information from inhomogeneous sources and repositories, map metadata about logical or physics constructs to deployment and production constructs, provide a means to associate metadata at one level of granularity with processing or decision-making at another, offer a coherent and integrated view to physicists, and support both human use and programmatic access. In this paper we consider ATLAS metadata, metadata services, and metadata flow principally from the illustrative perspective of how disparate metadata are made available to executing jobs and, conversely, how metadata generated by such jobs are returned. We describe how metadata are read, how metadata are cached, and how metadata generated by jobs and the tasks of which they are a part are communicated, associated with data products, and preserved. We also discuss the principles that guide decision-making about metadata storage, replication, and access.
Document Classification in Support of Automated Metadata Extraction from Heterogeneous Collections

Science.gov (United States)

Flynn, Paul K.

2014-01-01

A number of federal agencies, universities, laboratories, and companies are placing their documents online and making them searchable via metadata fields such as author, title, and publishing organization. To enable this, every document in the collection must be catalogued using the metadata fields. Though time consuming, the task of identifying…
Building a Disciplinary Metadata Standards Directory

Directory of Open Access Journals (Sweden)

Alexander Ball

2014-07-01

Full Text Available The Research Data Alliance (RDA) Metadata Standards Directory Working Group (MSDWG) is building a directory of descriptive, discipline-specific metadata standards. The purpose of the directory is to promote the discovery, access and use of such standards, thereby improving the state of research data interoperability and reducing duplicative standards development work. This work builds upon the UK Digital Curation Centre's Disciplinary Metadata Catalogue, a resource created with much the same aim in mind. The first stage of the MSDWG's work was to update and extend the information contained in the catalogue. In the current, second stage, a new platform is being developed in order to extend the functionality of the directory beyond that of the catalogue, and to make it easier to maintain and sustain. Future work will include making the directory more amenable to use by automated tools.

Mercury Toolset for Spatiotemporal Metadata

Science.gov (United States)

Devarakonda, Ranjeet; Palanisamy, Giri; Green, James; Wilson, Bruce; Rhyne, B. Timothy; Lindsley, Chris

2010-06-01

Mercury (http://mercury.ornl.gov) is a set of tools for federated harvesting, searching, and retrieving metadata, particularly spatiotemporal metadata. Version 3.0 of the Mercury toolset provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, facetted type search, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. It provides a single portal to very quickly search for data and information contained in disparate data management systems, each of which may use different metadata formats. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury periodically (typically daily) harvests metadata sources through a collection of interfaces and re-indexes these metadata to provide extremely rapid search capabilities, even over collections with tens of millions of metadata records. A number of both graphical and application interfaces have been constructed within Mercury, to enable both human users and other computer programs to perform queries. Mercury was also designed to support multiple different projects, so that the particular fields that can be queried and used with search filters are easy to configure for each different project.
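The harvest-then-index pattern described in this abstract, with fielded, spatial, and temporal filters applied to a central index, can be sketched in a few lines. The class and field names below are hypothetical and do not reflect Mercury's actual interfaces; the in-memory dictionary stands in for a real search index, and the bounding-box test ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    source: str
    title: str
    bbox: tuple  # (west, south, east, north) in degrees
    start: date
    end: date


def _boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class CentralIndex:
    """Toy central index; data stays with the providers, only metadata is held."""

    def __init__(self):
        self._records = {}

    def harvest(self, source, records):
        """Replace everything previously harvested from one source."""
        self._records[source] = list(records)

    def search(self, keyword=None, bbox=None, period=None):
        """Combine an optional title keyword, bounding box, and date range."""
        hits = []
        for records in self._records.values():
            for rec in records:
                if keyword and keyword.lower() not in rec.title.lower():
                    continue
                if bbox and not _boxes_overlap(rec.bbox, bbox):
                    continue
                if period and (rec.end < period[0] or rec.start > period[1]):
                    continue
                hits.append(rec)
        return hits


index = CentralIndex()
index.harvest("project_a", [
    Record("project_a", "Soil moisture grid", (-100, 30, -90, 40),
           date(2005, 1, 1), date(2006, 12, 31)),
])
index.harvest("project_b", [
    Record("project_b", "Soil carbon survey", (10, 45, 20, 55),
           date(2001, 1, 1), date(2002, 12, 31)),
])

hits = index.search(
    keyword="soil",
    bbox=(-95, 35, -85, 45),
    period=(date(2006, 1, 1), date(2007, 1, 1)),
)
print([rec.title for rec in hits])
```

Re-running `harvest` for a source overwrites its earlier records, which mirrors the periodic re-harvest and re-index cycle mentioned in the abstract.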
Mercury Toolset for Spatiotemporal Metadata

Science.gov (United States)

Wilson, Bruce E.; Palanisamy, Giri; Devarakonda, Ranjeet; Rhyne, B. Timothy; Lindsley, Chris; Green, James

2010-01-01

Mercury (http://mercury.ornl.gov) is a set of tools for federated harvesting, searching, and retrieving metadata, particularly spatiotemporal metadata. Version 3.0 of the Mercury toolset provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, facetted type search, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. It provides a single portal to very quickly search for data and information contained in disparate data management systems, each of which may use different metadata formats. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury periodically (typically daily) harvests metadata sources through a collection of interfaces and re-indexes these metadata to provide extremely rapid search capabilities, even over collections with tens of millions of metadata records. A number of both graphical and application interfaces have been constructed within Mercury, to enable both human users and other computer programs to perform queries. Mercury was also designed to support multiple different projects, so that the particular fields that can be queried and used with search filters are easy to configure for each different project.
Metadata Wizard: an easy-to-use tool for creating FGDC-CSDGM metadata for geospatial datasets in ESRI ArcGIS Desktop

Science.gov (United States)

Ignizio, Drew A.; O'Donnell, Michael S.; Talbert, Colin B.

2014-01-01

Creating compliant metadata for scientific data products is mandated for all federal Geographic Information Systems professionals and is a best practice for members of the geospatial data community. However, the complexity of the Federal Geographic Data Committee’s Content Standards for Digital Geospatial Metadata, the limited availability of easy-to-use tools, and recent changes in the ESRI software environment continue to make metadata creation a challenge. Staff at the U.S. Geological Survey Fort Collins Science Center have developed a Python toolbox for ESRI ArcDesktop to facilitate a semi-automated workflow to create and update metadata records in ESRI’s 10.x software. The U.S. Geological Survey Metadata Wizard tool automatically populates several metadata elements: the spatial reference, spatial extent, geospatial presentation format, vector feature count or raster column/row count, native system/processing environment, and the metadata creation date. Once the software auto-populates these elements, users can easily add attribute definitions and other relevant information in a simple Graphical User Interface. The tool, which offers a simple design free of esoteric metadata language, has the potential to save many government and non-government organizations a significant amount of time and costs by facilitating the development of metadata compliant with the Federal Geographic Data Committee’s Content Standards for Digital Geospatial Metadata for ESRI software users. A working version of the tool is now available for ESRI ArcDesktop, version 10.0, 10.1, and 10.2 (downloadable at http://www.sciencebase.gov/metadatawizard).
Metadata

CERN Document Server

Zeng, Marcia Lei

2016-01-01

Metadata remains the solution for describing the explosively growing, complex world of digital information, and continues to be of paramount importance for information professionals. Providing a solid grounding in the variety and interrelationships among different metadata types, Zeng and Qin's thorough revision of their benchmark text offers a comprehensive look at the metadata schemas that exist in the world of library and information science and beyond, as well as the contexts in which they operate. Cementing its value as both an LIS text and a handy reference for professionals already in the field, this book: * Lays out the fundamentals of metadata, including principles of metadata, structures of metadata vocabularies, and metadata descriptions * Surveys metadata standards and their applications in distinct domains and for various communities of metadata practice * Examines metadata building blocks, from modelling to defining properties, and from designing application profiles to implementing value vocabu...
Standardizing metadata and taxonomic identification in metabarcoding studies

NARCIS (Netherlands)

Tedersoo, Leho; Ramirez, Kelly; Nilsson, R; Kaljuvee, Aivi; Koljalg, Urmas; Abarenkov, Kessy

2015-01-01

High-throughput sequencing-based metabarcoding studies produce vast amounts of ecological data, but a lack of consensus on standardization of metadata and how to refer to the species recovered severely hampers reanalysis and comparisons among studies. Here we propose an automated workflow covering
A Programmatic View of Metadata, Metadata Services, and Metadata Flow in ATLAS

CERN Multimedia

CERN. Geneva

2012-01-01

The volume and diversity of metadata in an experiment of the size and scope of ATLAS is considerable. Even the definition of metadata may seem context-dependent: data that are primary for one purpose may be metadata for another. Trigger information and data from the Large Hadron Collider itself provide cases in point, but examples abound. Metadata about logical or physics constructs, such as data-taking periods and runs and luminosity blocks and events and algorithms, often need to be mapped to deployment and production constructs, such as datasets and jobs and files and software versions, and vice versa. Metadata at one level of granularity may have implications at another. ATLAS metadata services must integrate and federate information from inhomogeneous sources and repositories, map metadata about logical or physics constructs to deployment and production constructs, provide a means to associate metadata at one level of granularity with processing or decision-making at another, offer a coherent and ...
Enhancing Seismic Calibration Research Through Software Automation and Scientific Information Management

Energy Technology Data Exchange (ETDEWEB)

Ruppert, S D; Dodge, D A; Ganzberger, M D; Harris, D B; Hauk, T F

2009-07-07

The National Nuclear Security Administration (NNSA) Ground-Based Nuclear Explosion Monitoring Research and Development (GNEMRD) Program at LLNL continues to make significant progress enhancing the process of deriving seismic calibrations and performing scientific integration, analysis, and information management with software automation tools. Our tool efforts address the problematic issues of very large datasets and varied formats encountered during seismic calibration research. New information management and analysis tools have resulted in demonstrated gains in efficiency of producing scientific data products and improved accuracy of derived seismic calibrations. In contrast to previous years, software development work this past year has emphasized development of automation at the data ingestion level. This change reflects a gradually changing emphasis in our program from processing a few large data sets that result in a single integrated delivery, to processing many different data sets from a variety of sources. The increase in the number of sources has resulted in a large increase in the amount of metadata relative to the final volume of research products. Software developed this year addresses the problems of: (1) Efficient metadata ingestion and conflict resolution; (2) Automated ingestion of bulletin information; (3) Automated ingestion of waveform information from global data centers; and (4) Site Metadata and Response transformation required for certain products. This year, we also made a significant step forward in meeting a long-standing goal of developing and using a waveform correlation framework. Our objective for such a framework is to extract additional calibration data (e.g. mining blasts) and to study the extent to which correlated seismicity can be found in global and regional scale environments.
Leveraging Python to improve ebook metadata selection, ingest, and management

Directory of Open Access Journals (Sweden)

Kelly Thompson

2017-10-01

Full Text Available Libraries face many challenges in managing descriptive metadata for ebooks, including quality control, completeness of coverage, and ongoing management. The recent emergence of library management systems that automatically provide descriptive metadata for e-resources activated in system knowledge bases means that ebook management models are moving toward both greater efficiency and more complex implementation and maintenance choices. Automated and data-driven processes for ebook management have always been desirable, but in the current environment, they become necessary. In addition to initial selection of a record source, automation can be applied to quality control processes and ongoing maintenance in order to keep manual, eyes-on work to a minimum while providing the best possible discovery and access. In this article, we describe how we are using Python scripts to address these challenges.
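As a rough illustration of the kind of quality-control automation this abstract describes, the following Python sketch flags ebook records that lack required descriptive fields. It is not the author's code: the field names, the REQUIRED_FIELDS tuple, and the sample records are all hypothetical, since the abstract does not say which checks the scripts perform.

```python
from typing import Dict, List, Optional

# Hypothetical required fields; the article does not specify which it checks.
REQUIRED_FIELDS = ("title", "isbn", "url")


def missing_fields(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required fields that are absent or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not (record.get(f) or "").strip()]


def quality_report(records: List[Dict[str, Optional[str]]]) -> Dict[int, List[str]]:
    """Map the index of each incomplete record to its missing fields."""
    report: Dict[int, List[str]] = {}
    for index, record in enumerate(records):
        gaps = missing_fields(record)
        if gaps:
            report[index] = gaps
    return report


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [
        {"title": "Ebook A", "isbn": "9780000000001", "url": "https://example.org/a"},
        {"title": "Ebook B", "isbn": "", "url": None},
    ]
    for index, gaps in quality_report(sample).items():
        print(f"record {index} is missing: {', '.join(gaps)}")
```

A report like this could feed a review queue so that only incomplete records need manual, eyes-on attention.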
Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)

Science.gov (United States)

Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.

2016-12-01

The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES data discovery system with an intelligent tool to enable data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, an automated reasoning engine that performs first- and second-order deductive inferencing, and uses a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. OlyMPUS will support scientists' services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena using the full spectrum of NASA Earth Science data available. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.
Metadata

CERN Document Server

Pomerantz, Jeffrey

2015-01-01

When "metadata" became breaking news, appearing in stories about surveillance by the National Security Agency, many members of the public encountered this once-obscure term from information science for the first time. Should people be reassured that the NSA was "only" collecting metadata about phone calls -- information about the caller, the recipient, the time, the duration, the location -- and not recordings of the conversations themselves? Or does phone call metadata reveal more than it seems? In this book, Jeffrey Pomerantz offers an accessible and concise introduction to metadata. In the era of ubiquitous computing, metadata has become infrastructural, like the electrical grid or the highway system. We interact with it or generate it every day. It is not, Pomerantz tells us, just "data about data." It is a means by which the complexity of an object is represented in a simpler form. For example, the title, the author, and the cover art are metadata about a book. When metadata does its job well, it fades i...
Automated Atmospheric Composition Dataset Level Metadata Discovery. Difficulties and Surprises

Science.gov (United States)

Strub, R. F.; Falke, S. R.; Kempler, S.; Fialkowski, E.; Goussev, O.; Lynnes, C.

2015-12-01

The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on Essential Climate Variables identified by the Global Climate Observing System - CH4, CO, CO2, NO2, O3, SO2 and aerosols. This poster addresses our efforts in building the ACP Data Table, an interface to help discover and understand remotely sensed data that are related to atmospheric composition science and applications. We harvested GCMD, CWIC, GEOSS metadata catalogs using machine to machine technologies - OpenSearch, Web Services. We also manually investigated the plethora of CEOS data providers portals and other catalogs where that data might be aggregated. This poster is our experience of the excellence, variety, and challenges we encountered.
Conclusions:
1. The significant benefits that the major catalogs provide are their machine to machine tools like OpenSearch and Web Services rather than any GUI usability improvements due to the large amount of data in their catalog.
2. There is a trend at the large catalogs towards simulating small data provider portals through advanced services.
3. Populating metadata catalogs using ISO19115 is too complex for users to do in a consistent way, difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR.
4. The ability to search for Ids first and then for data (GCMD and ECHO) is better for machine to machine operations rather than the timeouts experienced when returning the entire metadata entry at once.
5. Metadata harvest and export activities between the major catalogs has led to a significant amount of duplication. (This is currently being addressed)
6. Most (if not
Improving Access to NASA Earth Science Data through Collaborative Metadata Curation

Science.gov (United States)

Sisco, A. W.; Bugbee, K.; Shum, D.; Baynes, K.; Dixon, V.; Ramachandran, R.

2017-12-01

The NASA-developed Common Metadata Repository (CMR) is a high-performance metadata system that currently catalogs over 375 million Earth science metadata records. It serves as the authoritative metadata management system of NASA's Earth Observing System Data and Information System (EOSDIS), enabling NASA Earth science data to be discovered and accessed by a worldwide user community. The size of the EOSDIS data archive is steadily increasing, and the ability to manage and query this archive depends on the input of high quality metadata to the CMR. Metadata that does not provide adequate descriptive information diminishes the CMR's ability to effectively find and serve data to users. To address this issue, an innovative and collaborative review process is underway to systematically improve the completeness, consistency, and accuracy of metadata for approximately 7,000 data sets archived by NASA's twelve EOSDIS data centers, or Distributed Active Archive Centers (DAACs). The process involves automated and manual metadata assessment of both collection and granule records by a team of Earth science data specialists at NASA Marshall Space Flight Center. The team communicates results to DAAC personnel, who then make revisions and reingest improved metadata into the CMR. Implementation of this process relies on a network of interdisciplinary collaborators leveraging a variety of communication platforms and long-range planning strategies. Curating metadata at this scale and resolving metadata issues through community consensus improves the CMR's ability to serve current and future users and also introduces best practices for stewarding the next generation of Earth Observing System data. This presentation will detail the metadata curation process, its outcomes thus far, and also share the status of ongoing curation activities.
Metadata capture in an electronic notebook: How to make it as simple as possible?

Directory of Open Access Journals (Sweden)

Menzel, Julia

2015-09-01

Full Text Available In the last few years electronic laboratory notebooks (ELNs) have become popular. ELNs offer the great possibility to capture metadata automatically. Due to the high documentation effort, metadata documentation is neglected in science. To close the gap between good data documentation and high documentation effort for the scientists, a first user-friendly solution to capture metadata in an easy way was developed. At first, different protocols for the Western Blot were collected within the Collaborative Research Center 1002 and analyzed. Together with existing metadata standards identified in a literature search, a first version of the metadata scheme was developed. Secondly, the metadata scheme was customized for future users, including the implementation of default values for automated metadata documentation. Twelve protocols for the Western Blot were used to construct one standard protocol with ten different experimental steps. Three already existing metadata standards were used as models to construct the first version of the metadata scheme consisting of 133 data fields in ten experimental steps. Through a revision with future users the final metadata scheme was shortened to 90 items in three experimental steps. Using individualized default values, 51.1% of the metadata can be captured with present values in the ELN. This lowers the data documentation effort. At the same time, researchers could benefit by providing standardized metadata for data sharing and re-use.
Log-Less Metadata Management on Metadata Server for Parallel File Systems

Directory of Open Access Journals (Sweden)

Jianwei Liao

2014-01-01

Full Text Available This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the sent metadata requests, which have been handled by the metadata server, so that the MDS does not need to log metadata changes to nonvolatile storage to achieve a highly available metadata service, and metadata processing performance improves as well. Because the client file system backs up certain sent metadata requests in its memory, the overhead for handling these backup requests is much smaller than the overhead incurred by the metadata server when it adopts logging or journaling to yield a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and render a better I/O data throughput, in contrast to conventional metadata management schemes, that is, logging or journaling on the MDS. Besides, a complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or gone into a nonoperational state unexpectedly.
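The recovery idea in this abstract (clients keep in-memory copies of acknowledged requests, and a restarted server replays them) can be sketched in a few lines of Python. This is a toy model, not the paper's implementation: the class names, the "set"/"delete" operations, and especially the server-assigned sequence number used to order the replay are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (operation, path, value); the operation names are hypothetical.
Request = Tuple[str, str, Optional[str]]


class MetadataServer:
    """Keeps metadata in memory only; it writes no log or journal."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}
        self.next_seq = 0

    def handle(self, op: str, path: str, value: Optional[str] = None) -> int:
        """Apply a request and return its sequence number (an assumed detail)."""
        if op == "set":
            self.table[path] = value or ""
        elif op == "delete":
            self.table.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        seq = self.next_seq
        self.next_seq += 1
        return seq


class Client:
    """Backs up every request the server has acknowledged."""

    def __init__(self, mds: MetadataServer) -> None:
        self.mds = mds
        self.backup: List[Tuple[int, Request]] = []

    def send(self, op: str, path: str, value: Optional[str] = None) -> None:
        seq = self.mds.handle(op, path, value)
        self.backup.append((seq, (op, path, value)))


def recover(clients: List[Client]) -> MetadataServer:
    """Rebuild server state by replaying all client backups in sequence order."""
    fresh = MetadataServer()
    entries = sorted(
        (entry for client in clients for entry in client.backup),
        key=lambda entry: entry[0],
    )
    for _, (op, path, value) in entries:
        fresh.handle(op, path, value)
    for client in clients:
        client.mds = fresh
    return fresh
```

In this sketch the server never touches nonvolatile storage; durability rests on the assumption that the clients' cached requests survive the server crash.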
Metadata Dictionary Database: A Proposed Tool for Academic Library Metadata Management

Science.gov (United States)

Southwick, Silvia B.; Lampert, Cory

2011-01-01

This article proposes a metadata dictionary (MDD) be used as a tool for metadata management. The MDD is a repository of critical data necessary for managing metadata to create "shareable" digital collections. An operational definition of metadata management is provided. The authors explore activities involved in metadata management in…
SM4AM: A Semantic Metamodel for Analytical Metadata

DEFF Research Database (Denmark)

Varga, Jovan; Romero, Oscar; Pedersen, Torben Bach

2014-01-01

Next generation BI systems emerge as platforms where traditional BI tools meet semi-structured and unstructured data coming from the Web. In these settings, the user-centric orientation represents a key characteristic for the acceptance and wide usage by numerous and diverse end users in their data....... We present SM4AM, a Semantic Metamodel for Analytical Metadata created as an RDF formalization of the Analytical Metadata artifacts needed for user assistance exploitation purposes in next generation BI systems. We consider the Linked Data initiative and its relevance for user assistance...
Standardizing metadata and taxonomic identification in metabarcoding studies.

Science.gov (United States)

Tedersoo, Leho; Ramirez, Kelly S; Nilsson, R Henrik; Kaljuvee, Aivi; Kõljalg, Urmas; Abarenkov, Kessy

2015-01-01

High-throughput sequencing-based metabarcoding studies produce vast amounts of ecological data, but a lack of consensus on standardization of metadata and how to refer to the species recovered severely hampers reanalysis and comparisons among studies. Here we propose an automated workflow covering data submission, compression, storage and public access to allow easy data retrieval and inter-study communication. Such standardized and readily accessible datasets facilitate data management, taxonomic comparisons and compilation of global metastudies.
Towards an Interoperable Field Spectroscopy Metadata Standard with Extended Support for Marine Specific Applications

Directory of Open Access Journals (Sweden)

Barbara A. Rasaiah

2015-11-01

Full Text Available This paper presents an approach to developing robust metadata standards for specific applications that serves to ensure a high level of reliability and interoperability for a spectroscopy dataset. The challenges of designing a metadata standard that meets the unique requirements of specific user communities are examined, including in situ measurement of reflectance underwater, using coral as a case in point. Metadata schema mappings from seven existing metadata standards demonstrate that they consistently fail to meet the needs of field spectroscopy scientists for general and specific applications (μ = 22%, σ = 32% conformance with the core metadata requirements and μ = 19%, σ = 18% for the special case of a benthic (e.g., coral) reflectance metadataset). Issues such as field measurement methods, instrument calibration, and data representativeness for marine field spectroscopy campaigns are investigated within the context of submerged benthic measurements. The implications of semantics and syntax for a robust and flexible metadata standard are also considered. A hybrid standard that serves as a “best of breed” incorporating useful modules and parameters within the standards is proposed. This paper is Part 3 in a series of papers in this journal, examining the issues central to a metadata standard for field spectroscopy datasets. The results presented in this paper are an important step towards field spectroscopy metadata standards that address the specific needs of field spectroscopy data stakeholders while facilitating dataset documentation, quality assurance, discoverability and data exchange within large-scale information sharing platforms.
THE NEW ONLINE METADATA EDITOR FOR GENERATING STRUCTURED METADATA

Energy Technology Data Exchange (ETDEWEB)

Devarakonda, Ranjeet [ORNL; Shrestha, Biva [ORNL; Palanisamy, Giri [ORNL; Hook, Leslie A [ORNL; Killeffer, Terri S [ORNL; Boden, Thomas A [ORNL; Cook, Robert B [ORNL; Zolly, Lisa [United States Geological Service (USGS); Hutchison, Viv [United States Geological Service (USGS); Frame, Mike [United States Geological Service (USGS); Cialella, Alice [Brookhaven National Laboratory (BNL); Lazer, Kathy [Brookhaven National Laboratory (BNL)

2014-01-01

Nobody is better suited to describe data than the scientist who created it. This description of the data is called metadata. In general terms, metadata represents the who, what, when, where, why and how of the dataset [1]. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes it portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about the research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [2][4]. OME is part of ORNL's Mercury software fleet [2][3]. It was jointly developed to support projects funded by the United States Geological Survey (USGS), U.S. Department of Energy (DOE), National Aeronautics and Space Administration (NASA) and National Oceanic and Atmospheric Administration (NOAA). OME's architecture provides a customizable interface to support project-specific requirements. Using this new architecture, the ORNL team developed OME instances for USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L), DOE's Next Generation Ecosystem Experiments (NGEE) and Atmospheric Radiation Measurement (ARM) Program, and the international Surface Ocean Carbon Dioxide ATlas (SOCAT). Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. From the information on the form, the Metadata Editor can create an XML file on the server where the editor is installed or on the user's personal computer. Researchers can also use the ORNL Metadata Editor to modify existing XML metadata files. As an example, an NGEE Arctic scientist use OME to register
OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

Science.gov (United States)

Huffer, E.; Gleason, J. L.

2015-12-01

The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge

Evolution in Metadata Quality: Common Metadata Repository's Role in NASA Curation Efforts

Science.gov (United States)

Gilman, Jason; Shum, Dana; Baynes, Katie

2016-01-01

Metadata Quality is one of the chief drivers of discovery and use of NASA EOSDIS (Earth Observing System Data and Information System) data. Issues with metadata such as lack of completeness, inconsistency, and use of legacy terms directly hinder data use. As the central metadata repository for NASA Earth Science data, the Common Metadata Repository (CMR) has a responsibility to its users to ensure the quality of CMR search results. This poster covers how we use humanizers, a technique for dealing with the symptoms of metadata issues, as well as our plans for future metadata validation enhancements. The CMR currently indexes 35K collections and 300M granules.
Automation Problems of 1968; Papers Presented at the Meeting...October 4-5, 1968.

Science.gov (United States)

Andrews, Theodora, Ed.

Librarians and their concerned colleagues met to give, hear and discuss papers on library automation, primarily by computers. Noted at this second meeting on library automation were: (1) considerably more sophistication and casualness about the techniques involved, (2) considerably more assurance of what and where things can be applied and (3)…
Metadata Creation, Management and Search System for your Scientific Data

Science.gov (United States)

Devarakonda, R.; Palanisamy, G.

2012-12-01

Mercury Search Systems is a set of tools for creating, searching, and retrieving biogeochemical metadata. The Mercury toolset provides orders of magnitude improvements in search speed, support for any metadata format, integration with Google Maps for spatial queries, multi-faceted type search, search suggestions, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. Mercury's metadata editor provides an easy way to create metadata, and Mercury's search interface provides a single portal to search for data and information contained in disparate data management systems, each of which may use any metadata format including FGDC, ISO-19115, Dublin-Core, Darwin-Core, DIF, ECHO, and EML. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury is being used by more than 14 different projects across 4 federal agencies. It was originally developed for NASA, with continuing development funded by NASA, USGS, and DOE for a consortium of projects. Mercury search won NASA's Earth Science Data Systems Software Reuse Award in 2008. References: R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics DOI: 10.1007/s12145-010-0073-0, (2010);
ATLAS Metadata Interface (AMI), a generic metadata framework

CERN Document Server

Fulachier, Jerome; The ATLAS collaboration

2016-01-01

The ATLAS Metadata Interface (AMI) is a mature application of more than 15 years of existence. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. We briefly describe the architecture, the main services and the benefits of using AMI in big collaborations, especially for high energy physics. We focus on the recent improvements, for instance: the lightweight clients (Python, Javascript, C++), the new smart task server system and the Web 2.0 AMI framework for simplifying the development of metadata-oriented web interfaces.
Distributed metadata servers for cluster file systems using shared low latency persistent key-value metadata store

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Pedone, Jr., James M.; Tzelnic, Percy; Ting, Dennis P. J.; Ionkov, Latchesar A.; Grider, Gary

2017-12-26

A cluster file system is provided having a plurality of distributed metadata servers with shared access to one or more shared low latency persistent key-value metadata stores. A metadata server comprises an abstract storage interface comprising a software interface module that communicates with at least one shared persistent key-value metadata store providing a key-value interface for persistent storage of key-value metadata. The software interface module provides the key-value metadata to the at least one shared persistent key-value metadata store in a key-value format. The shared persistent key-value metadata store is accessed by a plurality of metadata servers. A metadata request can be processed by a given metadata server independently of other metadata servers in the cluster file system. A distributed metadata storage environment is also disclosed that comprises a plurality of metadata servers having an abstract storage interface to at least one shared persistent key-value metadata store.
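To make the arrangement in this abstract more concrete, the following Python sketch models several metadata servers that reach one shared key-value store through an abstract storage interface. It is only an illustration under stated assumptions: the class names, the "path#attribute" key layout, and the in-memory dictionary standing in for a low latency persistent store are all hypothetical and are not taken from the patent.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KeyValueStore(ABC):
    """Abstract storage interface that metadata servers program against."""

    @abstractmethod
    def put(self, key: str, value: str) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]:
        ...


class InMemoryStore(KeyValueStore):
    """Stand-in for a shared persistent key-value store (assumption)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class MetadataServer:
    """Handles requests on its own, holding no state outside the shared store."""

    def __init__(self, name: str, store: KeyValueStore) -> None:
        self.name = name
        self.store = store

    def set_attr(self, path: str, attr: str, value: str) -> None:
        # Hypothetical key layout: "<path>#<attribute>".
        self.store.put(f"{path}#{attr}", value)

    def get_attr(self, path: str, attr: str) -> Optional[str]:
        return self.store.get(f"{path}#{attr}")


if __name__ == "__main__":
    shared = InMemoryStore()
    mds1 = MetadataServer("mds-1", shared)
    mds2 = MetadataServer("mds-2", shared)
    mds1.set_attr("/data/run1", "owner", "alice")
    # A different server answers the read without contacting mds-1.
    print(mds2.get_attr("/data/run1", "owner"))
```

Because every server reads and writes the same store, any server can answer a request for any file, which is one way to read the abstract's claim that requests are processed independently of the other servers.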
Automating standards based metadata creation using free and open source GIS tools

NARCIS (Netherlands)

Ellull, C.D.; Tamash, N.; Xian, F.; Stuiver, H.J.; Rickles, P.

2013-01-01

The importance of understanding the quality of data used in any GIS operation should not be underestimated. Metadata (data about data) traditionally provides a description of this quality information, but it is frequently deemed as complex to create and maintain. Additionally, it is generally stored
Improving Scientific Metadata Interoperability And Data Discoverability using OAI-PMH

Science.gov (United States)

Devarakonda, Ranjeet; Palanisamy, Giri; Green, James M.; Wilson, Bruce E.

2010-12-01

the lessons learned. References: [1] R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. [2] R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics DOI: 10.1007/s12145-010-0073-0, (2010). [3] Devarakonda, R.; Palanisamy, G.; Green, J.; Wilson, B. E. "Mercury: An Example of Effective Software Reuse for Metadata Management Data Discovery and Access", Eos Trans. AGU, 89(53), Fall Meet. Suppl., IN11A-1019 (2008).
Harvesting NASA's Common Metadata Repository

Science.gov (United States)

Shum, D.; Mitchell, A. E.; Durbin, C.; Norton, J.

2017-12-01

As part of NASA's Earth Observing System Data and Information System (EOSDIS), the Common Metadata Repository (CMR) stores metadata for over 30,000 datasets from both NASA and international providers along with over 300M granules. This metadata enables sub-second discovery and facilitates data access. While the CMR offers a robust temporal, spatial and keyword search functionality to the general public and international community, it is sometimes more desirable for international partners to harvest the CMR metadata and merge the CMR metadata into a partner's existing metadata repository. This poster will focus on best practices to follow when harvesting CMR metadata to ensure that any changes made to the CMR can also be updated in a partner's own repository. Additionally, since each partner has distinct metadata formats they are able to consume, the best practices will also include guidance on retrieving the metadata in the desired metadata format using CMR's Unified Metadata Model translation software.
Playing the Metadata Game: Technologies and Strategies Used by Climate Diagnostics Center for Cataloging and Distributing Climate Data.

Science.gov (United States)

Schweitzer, R. H.

2001-05-01

The Climate Diagnostics Center maintains a collection of gridded climate data primarily for use by local researchers. Because this data is available on fast digital storage and because it has been converted to netCDF using a standard metadata convention (called COARDS), we recognize that this data collection is also useful to the community at large. At CDC we try to use technology and metadata standards to reduce our costs associated with making these data available to the public. The World Wide Web has been an excellent technology platform for meeting that goal. Specifically we have developed Web-based user interfaces that allow users to search, plot and download subsets from the data collection. We have also been exploring use of the Pacific Marine Environmental Laboratory's Live Access Server (LAS) as an engine for this task. This would result in further savings by allowing us to concentrate on customizing the LAS where needed, rather than developing and maintaining our own system. One such customization currently under development is the use of Java Servlets and JavaServer Pages in conjunction with a metadata database to produce a hierarchical user interface to LAS. In addition to these Web-based user interfaces all of our data are available via the Distributed Oceanographic Data System (DODS). This allows other sites using LAS and individuals using DODS-enabled clients to use our data as if it were a local file. All of these technology systems are driven by metadata. When we began to create netCDF files, we collaborated with several other agencies to develop a netCDF convention (COARDS) for metadata. At CDC we have extended that convention to incorporate additional metadata elements to make the netCDF files as self-describing as possible. Part of the local metadata is a set of controlled names for the variable, level in the atmosphere and ocean, statistic and data set for each netCDF file. To allow searching and easy reorganization of these metadata, we loaded
Geo-Enrichment and Semantic Enhancement of Metadata Sets to Augment Discovery in Geoportals

Directory of Open Access Journals (Sweden)

Bernhard Vockner

2014-03-01

Full Text Available Geoportals are established to function as main gateways to find, evaluate, and start “using” geographic information. Still, current geoportal implementations face problems in optimizing the discovery process due to semantic heterogeneity issues, which lead to low recall and low precision in performing text-based searches. Therefore, we propose an enhanced semantic discovery approach that supports multilingualism and information domain context. Thus, we present a workflow that enriches existing structured metadata with synonyms, toponyms, and translated terms derived from user-defined keywords based on multilingual thesauri and ontologies. To make the results easier to understand, we also provide automated translation capabilities for the resource metadata to support the user in conceiving the thematic content of the descriptive metadata, even if it has been documented using a language the user is not familiar with. In addition, to text-enable spatial filtering capabilities, we add additional location name keywords to metadata sets. These are based on the existing bounding box and shall tweak discovery scores when performing single text line queries. In order to improve the user’s search experience, we tailor faceted search strategies presenting an enhanced query interface for geo-metadata discovery that transparently leverages the underlying thesauri and ontologies.
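The enrichment workflow described in this abstract can be sketched roughly as follows in Python. This is a simplified illustration, not the authors' system: the tiny THESAURUS and GAZETTEER tables, their entries, and the approximate bounding-box coordinates are invented stand-ins for the multilingual thesauri, ontologies, and place-name sources the paper relies on.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)

# Hypothetical thesaurus: keyword -> synonyms and translated terms.
THESAURUS: Dict[str, List[str]] = {
    "river": ["stream", "Fluss"],
    "forest": ["woodland", "Wald"],
}

# Hypothetical gazetteer with rough, illustrative bounding boxes.
GAZETTEER: Dict[str, BBox] = {
    "Salzburg": (12.98, 47.75, 13.13, 47.85),
    "Vienna": (16.18, 48.12, 16.58, 48.32),
}


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def enrich(keywords: List[str], bbox: BBox) -> List[str]:
    """Add synonyms, translations, and place names to a record's keywords."""
    enriched = list(keywords)
    for keyword in keywords:
        for term in THESAURUS.get(keyword.lower(), []):
            if term not in enriched:
                enriched.append(term)
    # Turn the spatial extent into text so plain text queries can match it.
    for place, place_bbox in GAZETTEER.items():
        if intersects(bbox, place_bbox) and place not in enriched:
            enriched.append(place)
    return enriched


if __name__ == "__main__":
    print(enrich(["river"], (12.9, 47.7, 13.2, 47.9)))
```

With the added terms stored alongside the original keywords, a single-line text query for a synonym, a translation, or a place name can match a record that never contained that word.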
ATLAS Metadata Interface (AMI), a generic metadata framework

Science.gov (United States)

Fulachier, J.; Odier, J.; Lambert, F.; ATLAS Collaboration

2017-10-01

The ATLAS Metadata Interface (AMI) is a mature application that has been in existence for more than 15 years. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. We briefly describe the architecture, the main services and the benefits of using AMI in big collaborations, especially for high energy physics. We focus on the recent improvements, for instance: the lightweight clients (Python, JavaScript, C++), the new smart task server system and the Web 2.0 AMI framework for simplifying the development of metadata-oriented web interfaces.
ATLAS Metadata Interface (AMI), a generic metadata framework

CERN Document Server

AUTHOR|(SzGeCERN)573735; The ATLAS collaboration; Odier, Jerome; Lambert, Fabian

2017-01-01

The ATLAS Metadata Interface (AMI) is a mature application that has been in existence for more than 15 years. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. We briefly describe the architecture, the main services and the benefits of using AMI in big collaborations, especially for high energy physics. We focus on the recent improvements, for instance: the lightweight clients (Python, JavaScript, C++), the new smart task server system and the Web 2.0 AMI framework for simplifying the development of metadata-oriented web interfaces.
USGIN ISO metadata profile

Science.gov (United States)

Richard, S. M.

2011-12-01

The USGIN project has drafted and is using a specification for use of ISO 19115/19/39 metadata, recommendations for simple metadata content, and a proposal for a URI scheme to identify resources using resolvable http URIs (see http://lab.usgin.org/usgin-profiles). The principal target use case is a catalog in which resources can be registered and described by data providers for discovery by users. We are currently using the ESRI Geoportal (Open Source), with configuration files for the USGIN profile. The metadata offered by the catalog must provide sufficient content to guide search engines to locate requested resources, to describe the resource content, provenance, and quality so users can determine if the resource will serve for intended usage, and finally to enable human users and software clients to obtain or access the resource. In order to achieve an operational federated catalog system, provisions in the ISO specification must be restricted and usage clarified to reduce the heterogeneity of 'standard' metadata and service implementations such that a single client can search against different catalogs, and the metadata returned by catalogs can be parsed reliably to locate required information. Usage of the complex ISO 19139 XML schema allows for a great deal of structured metadata content, but the heterogeneity in approaches to content encoding has hampered development of sophisticated client software that can take advantage of the rich metadata; the lack of such clients in turn reduces motivation for metadata producers to produce content-rich metadata. If the only significant use of the detailed, structured metadata is to format into text for people to read, then the detailed information could be put in free text elements and be just as useful. In order for complex metadata encoding and content to be useful, there must be clear and unambiguous conventions on the encoding that are utilized by the community that wishes to take advantage of advanced metadata
Using Google Tag Manager and Google Analytics to track DSpace metadata fields as custom dimensions

Directory of Open Access Journals (Sweden)

Suzanna Conrad

2015-01-01

Full Text Available DSpace can be problematic for those interested in tracking download and pageview statistics granularly. Some libraries have implemented code to track events on websites and some have experimented with using Google Tag Manager to automate event tagging in DSpace. While these approaches make it possible to track download statistics, granular details such as authors, content types, titles, advisors, and other fields for which metadata exist are generally not tracked in DSpace or Google Analytics without coding. Moreover, it can be time-consuming to track and assess pageview data and relate that data back to particular metadata fields. This article will detail the learning process of incorporating custom dimensions for tracking these detailed fields, including trial-and-error attempts to use the data import function manually in Google Analytics, to automate the data import using Google APIs, and finally to automate the collection of dimension data in Google Tag Manager by mimicking SEO practices for capturing meta tags. This specific case study refers to using Google Tag Manager and Google Analytics with DSpace; however, this method may also be applied to other types of websites or systems.
Automated Information Enrichment for a Better Search

OpenAIRE

José Luis Preza

2016-01-01

The process of adding metadata when uploading a digital object to a repository is usually manual. This means that the user must already have the keywords and all other information about the asset at hand. This paper addresses the possibility of enriching the “manual metadata” by generating automated metadata using the cognitive services provided by technologies such as the IBM Watson platform. The cognitive computing services offered by IBM Watson automatically generate Semantic Data (in...
MMI's Metadata and Vocabulary Solutions: 10 Years and Growing

Science.gov (United States)

Graybeal, J.; Gayanilo, F.; Rueda-Velasquez, C. A.

2014-12-01

The Marine Metadata Interoperability project (http://marinemetadata.org) held its public opening at AGU's 2004 Fall Meeting. For 10 years since that debut, the MMI guidance and vocabulary sites have served over 100,000 visitors, with 525 community members and continuous Steering Committee leadership. Originally funded by the National Science Foundation, over the years multiple organizations have supported the MMI mission: "Our goal is to support collaborative research in the marine science domain, by simplifying the incredibly complex world of metadata into specific, straightforward guidance. MMI encourages scientists and data managers at all levels to apply good metadata practices from the start of a project, by providing the best guidance and resources for data management, and developing advanced metadata tools and services needed by the community." Now hosted by the Harte Research Institute at Texas A&M University at Corpus Christi, MMI continues to provide guidance and services to the community, and is planning for marine science and technology needs for the next 10 years. In this presentation we will highlight our major accomplishments, describe our recent achievements and imminent goals, and propose a vision for improving marine data interoperability for the next 10 years, including Ontology Registry and Repository (http://mmisw.org/orr) advancements and applications (http://mmisw.org/cfsn).
Metadata Realities for Cyberinfrastructure: Data Authors as Metadata Creators

Science.gov (United States)

Mayernik, Matthew Stephen

2011-01-01

As digital data creation technologies become more prevalent, data and metadata management are necessary to make data available, usable, sharable, and storable. Researchers in many scientific settings, however, have little experience or expertise in data and metadata management. In this dissertation, I explore the everyday data and metadata…
Habitat-Lite: A GSC case study based on free text terms for environmental metadata

Energy Technology Data Exchange (ETDEWEB)

Kyrpides, Nikos; Hirschman, Lynette; Clark, Cheryl; Cohen, K. Bretonnel; Mardis, Scott; Luciano, Joanne; Kottmann, Renzo; Cole, James; Markowitz, Victor; Kyrpides, Nikos; Field, Dawn

2008-04-01

There is an urgent need to capture metadata on the rapidly growing number of genomic, metagenomic and related sequences, such as 16S ribosomal genes. This need is a major focus within the Genomic Standards Consortium (GSC), and Habitat is a key metadata descriptor in the proposed 'Minimum Information about a Genome Sequence' (MIGS) specification. The goal of the work described here is to provide a light-weight, easy-to-use (small) set of terms ('Habitat-Lite') that captures high-level information about habitat while preserving a mapping to the recently launched Environment Ontology (EnvO). Our motivation for building Habitat-Lite is to meet the needs of multiple users, such as annotators curating these data, database providers hosting the data, and biologists and bioinformaticians alike who need to search and employ such data in comparative analyses. Here, we report a case study based on semi-automated identification of terms from GenBank and GOLD. We estimate that the terms in the initial version of Habitat-Lite would provide useful labels for over 60% of the kinds of information found in the GenBank isolation-source field, and around 85% of the terms in the GOLD habitat field. We present a revised version of Habitat-Lite and invite the community's feedback on its further development in order to provide a minimum list of terms to capture high-level habitat information and to provide classification bins needed for future studies.
The metadata manual: a practical workbook

CERN Document Server

Lubas, Rebecca; Schneider, Ingrid

2013-01-01

Cultural heritage professionals have high levels of training in metadata. However, the institutions in which they practice often depend on support staff, volunteers, and students in order to function. With limited time and funding for training in metadata creation for digital collections, there are often many questions about metadata without a reliable, direct source for answers. The Metadata Manual provides such a resource, answering basic metadata questions that may appear, and exploring metadata from a beginner's perspective. This title covers metadata basics, XML basics, Dublin Core, VRA C
A Metadata-Rich File System

Energy Technology Data Exchange (ETDEWEB)

Ames, S; Gokhale, M B; Maltzahn, C

2009-01-07

Despite continual improvements in the performance and reliability of large scale file systems, the management of file system metadata has changed little in the past decade. The mismatch between the size and complexity of large scale data stores and their ability to organize and query their metadata has led to a de facto standard in which raw data is stored in traditional file systems, while related, application-specific metadata is stored in relational databases. This separation of data and metadata requires considerable effort to maintain consistency and can result in complex, slow, and inflexible system operation. To address these problems, we have developed the Quasar File System (QFS), a metadata-rich file system in which files, metadata, and file relationships are all first class objects. In contrast to hierarchical file systems and relational databases, QFS defines a graph data model composed of files and their relationships. QFS includes Quasar, an XPath-extended query language for searching the file system. Results from our QFS prototype show the effectiveness of this approach. Compared to the de facto standard, the QFS prototype shows superior ingest performance and comparable query performance on user metadata-intensive operations and superior performance on normal file metadata operations.
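The idea of a graph data model in which files, their metadata and their relationships are all first class objects can be pictured with a small in-memory sketch. This is not QFS and does not use the Quasar query language; the class names, attribute names and file paths are hypothetical and chosen only to show the shape of such a model.

```python
from dataclasses import dataclass, field


@dataclass
class FileNode:
    """A file together with its user-defined metadata attributes."""
    path: str
    attrs: dict = field(default_factory=dict)


@dataclass
class Link:
    """A directed, labelled relationship between two files."""
    source: str
    target: str
    label: str


class FileGraph:
    """Toy graph of files and relationships (illustrative, not QFS)."""

    def __init__(self):
        self.files = {}
        self.links = []

    def add_file(self, path, **attrs):
        self.files[path] = FileNode(path, attrs)

    def add_link(self, source, target, label):
        self.links.append(Link(source, target, label))

    def find(self, **attrs):
        """Paths of files whose metadata match every given attribute."""
        return sorted(
            path
            for path, node in self.files.items()
            if all(node.attrs.get(k) == v for k, v in attrs.items())
        )

    def neighbors(self, path, label):
        """Paths reachable from `path` over one link with the given label."""
        return sorted(
            link.target
            for link in self.links
            if link.source == path and link.label == label
        )


g = FileGraph()
g.add_file("/data/run1.raw", instrument="camera-a", kind="raw")
g.add_file("/data/run1.cal", instrument="camera-a", kind="calibrated")
g.add_file("/data/run2.raw", instrument="camera-b", kind="raw")
g.add_link("/data/run1.cal", "/data/run1.raw", "derived_from")

print(g.find(kind="raw"))
print(g.neighbors("/data/run1.cal", "derived_from"))
```

Keeping attributes and links next to the files means a query can combine both in one place, without a separate relational database to keep consistent with the file system.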

METADATA, DESCRIPTION AND ITS ACCESS POINTS, AND INDOMARC

Directory of Open Access Journals (Sweden)

Sulistiyo Basuki

2012-07-01

Full Text Available The term metadata began to appear frequently in the literature on database management systems (DBMS) in the 1980s. The term was used to describe the information needed to record the characteristics of the information held in a database. Many sources define the term metadata. Metadata can be understood as describing a source, indicating the location of a document, and providing the summary needed to make use of it. In general, three parts are involved in creating metadata as an information package: encoding, creating the description of the information package, and providing access to that description. This paper discusses the concept of data in relation to libraries. The discussion covers the definition of metadata; the functions of metadata; encoding standards; bibliographic records, surrogates, and metadata; the creation of surrogate record content; approaches to metadata formats; and metadata and metadata standards.
Department of the Interior metadata implementation guide—Framework for developing the metadata component for data resource management

Science.gov (United States)

Obuch, Raymond C.; Carlino, Jennifer; Zhang, Lin; Blythe, Jonathan; Dietrich, Christopher; Hawkinson, Christine

2018-04-12

The Department of the Interior (DOI) is a Federal agency with over 90,000 employees across 10 bureaus and 8 agency offices. Its primary mission is to protect and manage the Nation’s natural resources and cultural heritage; provide scientific and other information about those resources; and honor its trust responsibilities or special commitments to American Indians, Alaska Natives, and affiliated island communities. Data and information are critical in day-to-day operational decision making and scientific research. DOI is committed to creating, documenting, managing, and sharing high-quality data and metadata in and across its various programs that support its mission. Documenting data through metadata is essential in realizing the value of data as an enterprise asset. The completeness, consistency, and timeliness of metadata affect users’ ability to search for and discover the most relevant data for the intended purpose, and facilitate the interoperability and usability of these data among DOI bureaus and offices. Fully documented metadata describe data usability, quality, accuracy, provenance, and meaning. Across DOI, there are different maturity levels and phases of information and metadata management implementations. The Department has organized a committee consisting of bureau-level points of contact to collaborate on the development of more consistent, standardized, and more effective metadata management practices and guidance to support this shared mission and the information needs of the Department. DOI’s metadata implementation plans establish key roles and responsibilities associated with metadata management processes, procedures, and a series of actions defined in three major metadata implementation phases including: (1) Getting started—Planning Phase, (2) Implementing and Maintaining Operational Metadata Management Phase, and (3) the Next Steps towards Improving Metadata Management Phase. DOI’s phased approach for metadata management addresses
The RBV metadata catalog

Science.gov (United States)

Andre, Francois; Fleury, Laurence; Gaillardet, Jerome; Nord, Guillaume

2015-04-01

RBV (Réseau des Bassins Versants) is a French initiative to consolidate the national efforts made by more than 15 elementary observatories funded by various research institutions (CNRS, INRA, IRD, IRSTEA, Universities) that study river and drainage basins. The RBV Metadata Catalogue aims to give a unified view of the work produced by every observatory to both the members of the RBV network and any external person interested in this domain of research. Another goal is to share this information with other existing metadata portals. Metadata management is heterogeneous among observatories, ranging from absence to mature harvestable catalogues. Here, we would like to explain the strategy used to design a state-of-the-art catalogue facing this situation. Main features are as follows:
- Multiple input methods: Metadata records in the catalog can either be entered with the graphical user interface, harvested from an existing catalogue or imported from an information system through simplified web services.
- Hierarchical levels: Metadata records may describe either an observatory, one of its experimental sites or a single dataset produced by one instrument.
- Multilingualism: Metadata can be easily entered in several configurable languages.
- Compliance with standards: the back-office part of the catalogue is based on a CSW metadata server (Geosource), which ensures ISO19115 compatibility and the ability to be harvested (globally or partially). Ongoing tasks focus on the use of SKOS thesauri and SensorML descriptions of the sensors.
- Ergonomics: The user interface is built with the GWT Framework to offer a rich client application with fully Ajax-based navigation.
- Source code sharing: The work has led to the development of reusable components which can be used to quickly create new metadata forms in other GWT applications.
You can visit the catalogue (http://portailrbv.sedoo.fr/) or contact us by email at rbv@sedoo.fr.
Towards an automated TLD system that meets international requirements

International Nuclear Information System (INIS)

Boetter-Jensen, L.; Vanamo, V.

1988-01-01

The recently introduced, fully automated TLD system developed by Alnor OY on the basis of the Riso prototype is intended to meet draft IEC/ISO proposals and ANSI requirements. Part of the system is a personal dosemeter badge and an environmental dosemeter package following ICRU recommendations. The overall system consists of a software-controlled automated reader, a programmable irradiator/calibrator, a computer, and dosemeters for environmental, whole body, extremity and clinical applications. The personal TLD badge, which contains four TLD pellets, is designed to agree with the ICRU Hp(10) and Hs(0.07) quantities for determining dose equivalent. The badge can accommodate a large variety of the most commonly used solid TL dosemeter products. A special effort was put into the evaluation of skin dose by considering the use of graphite-mixed hot-sintered LiF pellets. The TLD system is described, and results are presented from a performance test that comprised measurements of photon energy response, angular dependence, and reproducibility.
Creating context for the experiment record. User-defined metadata: investigations into metadata usage in the LabTrove ELN.

Science.gov (United States)

Willoughby, Cerys; Bird, Colin L; Coles, Simon J; Frey, Jeremy G

2014-12-22

The drive toward more transparency in research, the growing willingness to make data openly available, and the reuse of data to maximize the return on research investment all increase the importance of being able to find information and make links to the underlying data. The use of metadata in Electronic Laboratory Notebooks (ELNs) to curate experiment data is an essential ingredient for facilitating discovery. The University of Southampton has developed a Web browser-based ELN that enables users to add their own metadata to notebook entries. A survey of these notebooks was completed to assess user behavior and patterns of metadata usage within ELNs, while user perceptions and expectations were gathered through interviews and user-testing activities within the community. The findings indicate that while some groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users are making little attempt to use it, thereby endangering their ability to recover data in the future. A survey of patterns of metadata use in these notebooks, together with feedback from the user community, indicated that while a few groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users adopt a "minimum required" approach to metadata. To investigate whether the patterns of metadata use in LabTrove were unusual, a series of surveys was undertaken to investigate metadata usage in a variety of platforms supporting user-defined metadata. These surveys also provided the opportunity to investigate whether interface designs in these other environments might inform strategies for encouraging metadata creation and more effective use of metadata in LabTrove.
ETICS meta-data software editing - from check out to commit operations

International Nuclear Information System (INIS)

Begin, M-E; Sancho, G D-A; Ronco, S D; Gentilini, M; Ronchieri, E; Selmi, M

2008-01-01

People involved in modular projects need to improve the software build process, planning the correct execution order and detecting circular dependencies. The lack of suitable tools may cause delays in the development, deployment and maintenance of the software. Experience in such projects has shown that the use of version control and build systems alone is not able to support the development of the software efficiently, due to a large number of errors, each of which breaks the build process. Common causes of errors are, for example, the adoption of new libraries, library incompatibilities, and the extension of the current project in order to support new software modules. In this paper, we describe a possible solution implemented in ETICS, an integrated infrastructure for the automated configuration, build and test of Grid and distributed software. ETICS has defined meta-data software abstractions, from which it is possible to download, build and test software projects, setting for instance dependencies, environment variables and properties. Furthermore, the meta-data information is managed by ETICS reflecting the version control system philosophy, because of the existence of a meta-data repository and the handling of a list of operations, such as check out and commit. All the information related to a specific piece of software is stored in the repository only when it is considered to be correct. By means of this solution, we introduce a degree of flexibility inside the ETICS system, allowing users to work according to their needs. Moreover, by introducing this functionality, ETICS will behave like a version control system for the management of the meta-data.
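Planning a correct build order and detecting circular dependencies, the two needs named at the start of this abstract, both reduce to a topological sort of the dependency graph. The sketch below uses the Python standard library (graphlib, Python 3.9 or later); it is not how ETICS works internally, and the module names and the function build_order are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return modules in an order where each module follows its dependencies.

    `dependencies` maps a module name to the set of modules it depends on.
    Raises ValueError naming the cycle if the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # graphlib reports the detected cycle as the second exception argument.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"circular dependency: {cycle}") from err


# Hypothetical project: "app" needs "lib" and "utils"; "lib" needs "utils".
modules = {"app": {"lib", "utils"}, "lib": {"utils"}, "utils": set()}
print(build_order(modules))

# Adding a dependency from "utils" back to "app" makes the graph circular.
try:
    build_order({**modules, "utils": {"app"}})
except ValueError as err:
    print(err)
```

Running the check before any compilation starts surfaces a circular dependency as a single clear error instead of a build that breaks part-way through.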
Moving toward the automation of the systematic review process: a summary of discussions at the second meeting of International Collaboration for the Automation of Systematic Reviews (ICASR).

Science.gov (United States)

O'Connor, Annette M; Tsafnat, Guy; Gilbert, Stephen B; Thayer, Kristina A; Wolfe, Mary S

2018-01-09

The second meeting of the International Collaboration for Automation of Systematic Reviews (ICASR) was held 3-4 October 2016 in Philadelphia, Pennsylvania, USA. ICASR is an interdisciplinary group whose aim is to maximize the use of technology for conducting rapid, accurate, and efficient systematic reviews of scientific evidence. Having automated tools for systematic review should enable more transparent and timely review, maximizing the potential for identifying and translating research findings to practical application. The meeting brought together multiple stakeholder groups including users of summarized research, methodologists who explore production processes and systematic review quality, and technologists such as software developers, statisticians, and vendors. This diversity of participants was intended to ensure effective communication with numerous stakeholders about progress toward automation of systematic reviews and stimulate discussion about potential solutions to identified challenges. The meeting highlighted challenges, both simple and complex, and raised awareness among participants about ongoing efforts by various stakeholders. An outcome of this forum was to identify several short-term projects that participants felt would advance the automation of tasks in the systematic review workflow including (1) fostering better understanding about available tools, (2) developing validated datasets for testing new tools, (3) determining a standard method to facilitate interoperability of tools such as through an application programming interface or API, and (4) establishing criteria to evaluate the quality of tools' output. ICASR 2016 provided a beneficial forum to foster focused discussion about tool development and resources and reconfirm ICASR members' commitment toward systematic reviews' automation.
Multi-facetted Metadata - Describing datasets with different metadata schemas at the same time

Science.gov (United States)

Ulbricht, Damian; Klump, Jens; Bertelmann, Roland

2013-04-01

Inspired by the wish to re-use research data, a lot of work is being done to bring the data systems of the earth sciences together. Discovery metadata is disseminated to data portals to allow building of customized indexes of catalogued dataset items. Data that were once acquired in the context of a scientific project are open for reappraisal and can now be used by scientists who were not part of the original research team. To make data re-use easier, measurement methods and measurement parameters must be documented in an application metadata schema and described in a written publication. Linking datasets to publications - as DataCite [1] does - again requires a specific metadata schema, and every new use context of the measured data may require yet another metadata schema sharing only a subset of information with the meta information already present. To cope with the problem of metadata schema diversity in our common data repository at GFZ Potsdam, we established a solution to store file-based research data and describe these with an arbitrary number of metadata schemas. The core component of the data repository is an eSciDoc infrastructure that provides versioned container objects, called eSciDoc [2] "items". The eSciDoc content model allows assigning files to "items" and adding any number of metadata records to these "items". The eSciDoc items can be submitted, revised, and finally published, which makes the data and metadata available through the internet worldwide. GFZ Potsdam uses eSciDoc to support its scientific publishing workflow, including mechanisms for data review in peer review processes by providing temporary web links for external reviewers who do not have credentials to access the data. Based on the eSciDoc API, panMetaDocs [3] provides a web portal for data management in research projects. PanMetaDocs, which is based on panMetaWorks [4], is a PHP-based web application that allows data to be described with any XML-based schema. It uses the eSciDoc infrastructures
Tethys Acoustic Metadata Database

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — The Tethys database houses the metadata associated with the acoustic data collection efforts by the Passive Acoustic Group. These metadata include dates, locations...
An integrated overview of metadata in ATLAS

International Nuclear Information System (INIS)

Gallas, E J; Malon, D; Hawkings, R J; Albrand, S; Torrence, E

2010-01-01

Metadata (data about data) arise in many contexts, from many diverse sources, and at many levels in ATLAS. Familiar examples include run-level, luminosity-block-level, and event-level metadata, and, related to processing and organization, dataset-level and file-level metadata, but these categories are neither exhaustive nor orthogonal. Some metadata are known a priori, in advance of data taking or simulation; other metadata are known only after processing, and occasionally, quite late (e.g., detector status or quality updates that may appear after initial reconstruction is complete). Metadata that may seem relevant only internally to the distributed computing infrastructure under ordinary conditions may become relevant to physics analysis under error conditions ('What can I discover about data I failed to process?'). This talk provides an overview of metadata and metadata handling in ATLAS, and describes ongoing work to deliver integrated metadata services in support of physics analysis.
Metadata Life Cycles, Use Cases and Hierarchies

Directory of Open Access Journals (Sweden)

Ted Habermann

2018-05-01

Full Text Available The historic view of metadata as “data about data” is expanding to include data about other items that must be created, used, and understood throughout the data and project life cycles. In this context, metadata might better be defined as the structured and standard part of documentation, and the metadata life cycle can be described as the metadata content that is required for documentation in each phase of the project and data life cycles. This incremental approach to metadata creation is similar to the spiral model used in software development. Each phase also has distinct users and specific questions to which they need answers. In many cases, the metadata life cycle involves hierarchies where later phases have increased numbers of items. The relationships between metadata in different phases can be captured through structure in the metadata standard, or through conventions for identifiers. Metadata creation and management can be streamlined and simplified by re-using metadata across many records. Many of these ideas have been developed to various degrees in several geoscience disciplines and are being used in metadata for documenting the integrated life cycle of environmental research in the Arctic, including projects, collection sites, and datasets.
Critical Metadata for Spectroscopy Field Campaigns

Directory of Open Access Journals (Sweden)

Barbara A. Rasaiah

2014-04-01

Full Text Available A field spectroscopy metadata standard is defined as those data elements that explicitly document the spectroscopy dataset and field protocols, sampling strategies, instrument properties and environmental and logistical variables. Standards for field spectroscopy metadata affect the quality, completeness, reliability, and usability of datasets created in situ. Currently there is no standardized methodology for documentation of in situ spectroscopy data or metadata. This paper presents results of an international experiment comprising a web-based survey and expert panel evaluation that investigated critical metadata in field spectroscopy. The survey participants were a diverse group of scientists experienced in gathering spectroscopy data across a wide range of disciplines. Overall, respondents were in agreement about a core metadata set for generic campaign metadata, allowing for a prioritization of critical metadata elements to be proposed, including those relating to viewing geometry, location, general target and sampling properties, illumination, instrument properties, reference standards, calibration, hyperspectral signal properties, atmospheric conditions, and general project details. Consensus was greatest among individual expert groups in specific application domains. The results allow the identification of a core set of metadata fields that enforce long-term data storage and serve as a foundation for a metadata standard. This paper is part one in a series about the core elements of a robust and flexible field spectroscopy metadata standard.
Enriching the Metadata on CERN Document Server And Implementation of a Book Acquisition System To Predict the potential bottlenecks in availability of records in Library

CERN Document Server

Ahuja, Chakshu

2015-01-01

An automated script for the data acquisition project (bibtasklet), which aims to enhance existing metadata in our CERN Document Server (CDS) with data from Springer, was implemented. An implicit part of this task is to manage disambiguation (within incoming data), remove multiple entries, handle replications between new and existing records, and further automate the latest file upload task to CDS. All such elements and their corresponding changes are integrated within Invenio to make the upgraded metadata available on CDS. Another implementation was a web interface for the Invenio software that calculates the normalized loan period, to help librarians predict bottlenecks for books on loan over a certain period of time and to inform their decisions to buy new records accordingly.
A Framework for Fully Automated Performance Testing for Smart Buildings

DEFF Research Database (Denmark)

Markoska, Elena; Johansen, Aslak; Lazarova-Molnar, Sanja

2018-01-01

A significant proportion of energy consumption by buildings worldwide, estimated at ca. 40%, has yielded a high importance to studying buildings’ performance. Performance testing is a means by which buildings can be continuously commissioned to ensure that they operate as designed. Historically, setup of performance tests has been manual and labor-intensive and has required intimate knowledge of buildings’ complexity and systems. The emergence of the concept of smart buildings has provided an opportunity to overcome this restriction. In this paper, we propose a framework for automated performance testing of smart buildings that utilizes metadata models. The approach features automatic detection of applicable performance tests using metadata queries and their corresponding instantiation, as well as continuous commissioning based on metadata. The presented approach has been implemented...
XML for catalogers and metadata librarians

CERN Document Server

Cole, Timothy W

2013-01-01

How are today's librarians to manage and describe the ever-expanding volumes of resources, in both digital and print formats? The use of XML in cataloging and metadata workflows can improve metadata quality, the consistency of cataloging workflows, and adherence to standards. This book is intended to enable current and future catalogers and metadata librarians to progress beyond a bare surface-level acquaintance with XML, thereby enabling them to integrate XML technologies more fully into their cataloging workflows. Building on the wealth of work on library descriptive practices, cataloging, and metadata, XML for Catalogers and Metadata Librarians explores the use of XML to serialize, process, share, and manage library catalog and metadata records. The authors' expert treatment of the topic is written to be accessible to those with little or no prior practical knowledge of or experience with how XML is used. Readers will gain an educated appreciation of the nuances of XML and grasp the benefit of more advanced ...
Security in a Replicated Metadata Catalogue

CERN Document Server

Koblitz, B

2007-01-01

The gLite-AMGA metadata catalogue has been developed by NA4 to provide simple relational metadata access for the EGEE user community. As advanced features, which will be the focus of this presentation, AMGA provides very fine-grained security, also in connection with the built-in support for replication and federation of metadata. AMGA is extensively used by the biomedical community to store medical image metadata, by digital libraries, in HEP for logging and bookkeeping data, and in the climate community. The biomedical community intends to deploy a distributed metadata system for medical images consisting of various sites, which range from hospitals to computing centres. Only safe sharing of the highly sensitive metadata as provided in AMGA makes such a scenario possible. Other scenarios are digital libraries, which federate copyright protected (meta-) data into a common catalogue. The biomedical and digital library applications have been deployed using a centralized structure already for some time. They now intend to decentralize ...
Mdmap: A Tool for Metadata Collection and Matching

Directory of Open Access Journals (Sweden)

Rico Simke

2014-10-01

Full Text Available This paper describes a front-end for the semi-automatic collection, matching, and generation of bibliographic metadata obtained from different sources for use within a digitization architecture. The Library of a Billion Words project is building an infrastructure for digitizing text that requires high-quality bibliographic metadata, but currently only sparse metadata from digitized editions is available. The project’s approach is to collect metadata for each digitized item from as many sources as possible. An expert user can then use an intuitive front-end tool to choose matching metadata. The collected metadata are centrally displayed in an interactive grid view. The user can choose which metadata they want to assign to a certain edition, and export these data as MARCXML. This paper presents a new approach to bibliographic work and metadata correction. We try to achieve a high quality of the metadata by generating a large amount of metadata to choose from, as well as by giving librarians an intuitive tool to manage their data.
Automated Metadata Formatting for Cornell’s Print-on-Demand Books

Directory of Open Access Journals (Sweden)

Dianne Dietrich

2009-11-01

Full Text Available Cornell University Library has made Print-On-Demand (POD) books available for many of its digitized out-of-copyright books. The printer must be supplied with metadata from the MARC bibliographic record in order to produce book covers. Although the names of authors are present in MARC records, they are given in an inverted order suitable for alphabetical filing rather than the natural order that is desirable for book covers. This article discusses a process for parsing and manipulating the MARC author strings to identify their various component parts and to create natural order strings. In particular, the article focuses on processing non-name information in author strings, such as titles that were commonly used in older works, e.g., baron or earl, and suffixes appended to names, e.g., "of Bolsena." Relevant patterns are identified and a Python script is used to manipulate the author name strings.
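The kind of author-string manipulation described in this record can be illustrated with a minimal Python sketch. It is not the article's script: the function name natural_order, the TITLES list, and the date pattern are all hypothetical, and the sketch assumes simple comma-separated strings of the form "Surname, Forename, title, dates".

```python
import re

# Hypothetical, deliberately short list of titles; the article's real
# patterns are not reproduced in the abstract.
TITLES = {"baron", "earl", "sir", "lord"}

# Life dates such as "1812-1870", "1809-" or "1812?-1870." (assumed format).
DATES = re.compile(r"\d{4}\??-(\d{4}\??)?\.?")


def natural_order(author: str) -> str:
    """Turn an inverted author string into a natural-order string."""
    parts = [p.strip() for p in author.split(",")]
    parts = [p for p in parts if p]
    if not parts:
        return ""
    surname, rest = parts[0], parts[1:]
    forename, titles, suffixes = "", [], []
    for part in rest:
        if DATES.fullmatch(part):
            continue  # dates are dropped: they are not wanted on a cover
        if part.lower() in TITLES:
            titles.append(part.capitalize())  # e.g. "baron" -> "Baron"
        elif part.lower().startswith("of "):
            suffixes.append(part)  # e.g. "of Bolsena" stays after the name
        elif not forename:
            forename = part
        else:
            suffixes.append(part)
    pieces = titles + [forename, surname] + suffixes
    return " ".join(p for p in pieces if p)


print(natural_order("Dickens, Charles, 1812-1870."))
print(natural_order("Tennyson, Alfred, baron, 1809-1892."))
```

Real MARC headings are considerably messier than this (multi-part surnames, qualifiers in parentheses, several titles), so a production script would need a much larger pattern set than the one assumed here.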
The Halden Reactor Project workshop meeting on human centred automation and function allocation methods

International Nuclear Information System (INIS)

Sebok, Angelia; Green, Marit; Larsen, Marit; Miberg, Ann Britt; Morisseau, Dolores

1998-02-01

A workshop on Human Centred Automation (HCA) and Function Allocation Methods was organised in Halden, September 29-30, 1997. The purpose of the workshop was to discuss and make recommendations on requirements for the Halden Project research agenda. The workshop meeting began with several presentations summarising current issues in HCA, Function Allocation Methods and Functional Modelling. Invited speakers presented their research or modelling efforts. Following the presentations, the workshop was divided into three working groups, all tasked with answering the same four questions: (1) What are the most important issues in Human Centred Automation? (2) Which strengths could be achieved by integrating Functional Modelling Methods into experimental Human Centred Automation research? (3) How should analytical and experimental methods be balanced? (4) What are the most important aspects in automation design methodology? Each group discussed the questions and produced specific recommendations that were summarised by the group's facilitator in a joint session of the workshop. (author)
The essential guide to metadata for books

CERN Document Server

Register, Renee

2013-01-01

In The Essential Guide to Metadata for Books, you will learn exactly what you need to know to effectively generate, handle and disseminate metadata for books and ebooks. This comprehensive but digestible document will explain the life-cycle of book metadata, industry standards, XML, ONIX and the essential elements of metadata. It will also show you how effective, well-organized metadata can improve your efforts to sell a book, especially when it comes to marketing, discoverability and converting at the point of sale. This information-packed document also includes a glossary of terms

Metadata Effectiveness in Internet Discovery: An Analysis of Digital Collection Metadata Elements and Internet Search Engine Keywords

Science.gov (United States)

Yang, Le

2016-01-01

This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…
Science friction: data, metadata, and collaboration.

Science.gov (United States)

Edwards, Paul N; Mayernik, Matthew S; Batcheller, Archer L; Bowker, Geoffrey C; Borgman, Christine L

2011-10-01

When scientists from two or more disciplines work together on related problems, they often face what we call 'science friction'. As science becomes more data-driven, collaborative, and interdisciplinary, demand increases for interoperability among data, tools, and services. Metadata--usually viewed simply as 'data about data', describing objects such as books, journal articles, or datasets--serve key roles in interoperability. Yet we find that metadata may be a source of friction between scientific collaborators, impeding data sharing. We propose an alternative view of metadata, focusing on its role in an ephemeral process of scientific communication, rather than as an enduring outcome or product. We report examples of highly useful, yet ad hoc, incomplete, loosely structured, and mutable, descriptions of data found in our ethnographic studies of several large projects in the environmental sciences. Based on this evidence, we argue that while metadata products can be powerful resources, usually they must be supplemented with metadata processes. Metadata-as-process suggests the very large role of the ad hoc, the incomplete, and the unfinished in everyday scientific work.
A metadata catalog for organization and systemization of fusion simulation data

International Nuclear Information System (INIS)

Greenwald, M.; Fredian, T.; Schissel, D.; Stillerman, J.

2012-01-01

Highlights: ► We find that modeling and simulation data need better systemization. ► Workflow, data provenance and relations among data items need to be captured. ► We have begun a design for a simulation metadata catalog that meets these needs. ► The catalog design also supports creation of science notebooks for simulation. - Abstract: Careful management of data and associated metadata is a critical part of any scientific enterprise. Unfortunately, most current fusion simulation efforts lack systematic, project-wide organization of their data. This paper describes an approach to managing simulation data through creation of a comprehensive metadata catalog, currently under development. The catalog is intended to document all past and current simulation activities (including data provenance); to enable global data location and to facilitate data access, analysis and visualization through uniform provision of metadata. The catalog will capture workflow, holding entries for each simulation activity including, at least, data importing and staging, data pre-processing and input preparation, code execution, data storage, post-processing and exporting. The overall aim is that between the catalog and the main data archive, the system would hold a complete and accessible description of the data, all of its attributes and the processes used to generate the data. The catalog will describe data collections, including those representing simulation workflows as well as any other useful groupings. Finally it would be populated with user supplied comments to explain the motivation and results of any activity documented by the catalog.
The role of automated categorization in e-government information retrieval

DEFF Research Database (Denmark)

Jonasen, Tanja Svarre; Lykke, Marianne

2013-01-01

High-precision search results are essential for helping e-government employees complete work-based tasks. Prior studies have shown that existing features of e-government systems need improvement in terms of search facilities, navigation, and metadata adoption. This paper investigates how automated...
CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises

Science.gov (United States)

Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.

2011-12-01

JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify how data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel, and research theme, and diving information such as dive number, name of submersible, and position of diving point. They are submitted by chief scientists of research cruises in the Microsoft Excel® spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via the "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is duplicated effort and asynchronous metadata across multiple distribution websites, caused by administrators entering metadata manually into individual websites. The other is differing data types or representations of metadata on each website. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web
Optimising metadata workflows in a distributed information environment

OpenAIRE

Robertson, R. John; Barton, Jane

2005-01-01

The different purposes present within a distributed information environment create the potential for repositories to enhance their metadata by capitalising on the diversity of metadata available for any given object. This paper presents three conceptual reference models required to achieve this optimisation of metadata workflow: the ecology of repositories, the object lifecycle model, and the metadata lifecycle model. It suggests a methodology for developing the metadata lifecycle model, and ...
Metadata in Scientific Dialects

Science.gov (United States)

Habermann, T.

2011-12-01

Discussions of standards in the scientific community have been compared to religious wars for many years. The only things scientists agree on in these battles are either "standards are not useful" or "everyone can benefit from using my standard". Instead of achieving the goal of facilitating interoperable communities, in many cases the standards have served to build yet another barrier between communities. Some important progress towards diminishing these obstacles has been made in the data layer with the merger of the NetCDF and HDF scientific data formats. The universal adoption of XML as the standard for representing metadata and the recent adoption of ISO metadata standards by many groups around the world suggest that similar convergence is underway in the metadata layer. At the same time, scientists and tools will likely need support for native tongues for some time. I will describe an approach that combines re-usable metadata "components" and RESTful web services that provide those components in many dialects. This approach uses advanced XML concepts of referencing and linking to construct complete records that include reusable components and builds on the ISO Standards as the "unabridged dictionary" that encompasses the content of many other dialects.
A Framework for Semi-Automated Implementation of Multidimensional Data Models

Directory of Open Access Journals (Sweden)

Ilona Mariana NAGY

2012-08-01

Full Text Available Data warehousing solution development represents a challenging task which requires the employment of considerable resources on the part of enterprises and sustained commitment from the stakeholders. Costs derive mostly from the amount of time invested in the design and physical implementation of these large projects, time that, in our view, may be decreased through the automation of several processes. Thus, we present a framework for semi-automated implementation of multidimensional data models and introduce an automation prototype intended to reduce the time of data structures generation in the warehousing environment. Our research is focused on the design of an automation component and the development of a corresponding prototype from technical metadata.
Inheritance rules for Hierarchical Metadata Based on ISO 19115

Science.gov (United States)

Zabala, A.; Masó, J.; Pons, X.

2012-04-01

ISO 19115 has mainly been used to describe metadata for datasets and services. Furthermore, the ISO 19115 standard (as well as the new draft ISO 19115-1) includes a conceptual model that allows metadata to be described at different levels of granularity structured in hierarchical levels, both in aggregated resources, such as series and datasets, and in more disaggregated resources such as types of entities (feature type), types of attributes (attribute type), entities (feature instances) and attributes (attribute instances). In theory, it is possible to apply a complete metadata structure to all hierarchical levels of metadata, from the whole series down to individual feature attributes, but storing all metadata at all levels is completely impractical. An inheritance mechanism is needed to store each metadata and quality information item at the optimum hierarchical level and to allow easy and efficient documentation of metadata, both in an Earth observation scenario such as multiband imagery from a multi-satellite mission and in a complex vector topographical map that includes several feature types separated in layers (e.g. administrative limits, contour lines, edification polygons, road lines, etc). Moreover, because maps are traditionally split into tiles, either for map handling at detailed scales or because of the satellite characteristics, each of the previous thematic layers (e.g. 1:5000 roads for a country) or bands (Landsat-5 TM cover of the Earth) is tiled into several parts (sheets or scenes, respectively). According to the hierarchy in ISO 19115, the definition of general metadata can be supplemented by spatially specific metadata that, when required, either inherits or overrides the general case (G.1.3). Annex H of this standard states that only metadata exceptions are defined at lower levels, so it is not necessary to generate the full registry of metadata for each level but only to link particular values to the general value that they inherit. Conceptually the metadata
The Machinic Temporality of Metadata

Directory of Open Access Journals (Sweden)

Claudio Celis

2015-03-01

Full Text Available In 1990 Deleuze introduced the hypothesis that disciplinary societies are gradually being replaced by a new logic of power: control. Accordingly, Matteo Pasquinelli has recently argued that we are moving towards societies of metadata, which correspond to a new stage of what Deleuze called control societies. Societies of metadata are characterised by the central role that meta-information acquires both as a source of surplus value and as an apparatus of social control. The aim of this article is to develop Pasquinelli’s thesis by examining the temporal scope of these emerging societies of metadata. In particular, this article employs Guattari’s distinction between human and machinic times. Through these two concepts, this article attempts to show how societies of metadata combine the two poles of capitalist power formations as identified by Deleuze and Guattari, i.e. social subjection and machinic enslavement. It begins by presenting the notion of metadata in order to identify some of the defining traits of contemporary capitalism. It then examines Berardi’s account of the temporality of the attention economy from the perspective of the asymmetric relation between cyber-time and human time. The third section challenges Berardi’s definition of the temporality of the attention economy by using Guattari’s notions of human and machinic times. Parts four and five draw on Deleuze and Guattari’s notions of machinic surplus labour and machinic enslavement, respectively. The concluding section tries to show that machinic and human times constitute two poles of contemporary power formations that articulate the temporal dimension of societies of metadata.
Incorporating ISO Metadata Using HDF Product Designer

Science.gov (United States)

Jelenak, Aleksandar; Kozimor, John; Habermann, Ted

2016-01-01

The need to store increasing amounts of metadata of varying complexity in HDF5 files is rapidly outgrowing the capabilities of the Earth science metadata conventions currently in use. Until now, data producers have had little choice but to come up with ad hoc solutions to this challenge. Such solutions, in turn, pose a wide range of issues for data managers, distributors, and, ultimately, data users. The HDF Group is experimenting with a novel approach of using ISO 19115 metadata objects as a catch-all container for all the metadata that cannot be fitted into the current Earth science data conventions. This presentation will showcase how the HDF Product Designer software can be utilized to help data producers include various ISO metadata objects in their products.
Evaluating the privacy properties of telephone metadata

Science.gov (United States)

Mayer, Jonathan; Mutchler, Patrick; Mitchell, John C.

2016-01-01

Since 2013, a stream of disclosures has prompted reconsideration of surveillance law and policy. One of the most controversial principles, both in the United States and abroad, is that communications metadata receives substantially less protection than communications content. Several nations currently collect telephone metadata in bulk, including on their own citizens. In this paper, we attempt to shed light on the privacy properties of telephone metadata. Using a crowdsourcing methodology, we demonstrate that telephone metadata is densely interconnected, can trivially be reidentified, and can be used to draw sensitive inferences. PMID:27185922
Geospatial metadata retrieval from web services

Directory of Open Access Journals (Sweden)

Ivanildo Barbosa

Full Text Available Nowadays, producers of geospatial data in either raster or vector formats are able to make them available on the World Wide Web by deploying web services that enable users to access and query those contents even without specific software for geoprocessing. Several providers around the world have deployed instances of WMS (Web Map Service), WFS (Web Feature Service) and WCS (Web Coverage Service), all of them specified by the Open Geospatial Consortium (OGC). In consequence, metadata about the available contents can be retrieved to be compared with similar offline datasets from other sources. This paper presents a brief summary and describes the matching process between the specifications for OGC web services (WMS, WFS and WCS) and the specifications for metadata required by ISO 19115, which has been adopted as the reference for several national metadata profiles, including the Brazilian one. This process focuses on retrieving metadata about the identification and data quality packages and indicates the directions to retrieve metadata related to other packages. Therefore, users are able to assess whether the provided contents fit their purposes.
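As a rough illustration of retrieving identification metadata from an OGC service, the Python sketch below builds a GetCapabilities request URL and reads the service title and abstract from a WMS 1.3.0 capabilities document. The function names, the example.org endpoint, the SAMPLE document, and the choice of which fields to extract are assumptions made for this sketch; the paper's actual matching tables between OGC elements and ISO 19115 are not reproduced here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace of WMS 1.3.0 capabilities documents.
WMS_NS = {"wms": "http://www.opengis.net/wms"}


def capabilities_url(base_url: str, service: str = "WMS") -> str:
    """Build a GetCapabilities request for an OGC service endpoint."""
    query = urlencode({"SERVICE": service, "REQUEST": "GetCapabilities"})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


def identification_from_wms(xml_text: str) -> dict:
    """Extract fields that could feed an identification section.

    Treating Service/Title and Service/Abstract as the title and abstract
    of a metadata record is an illustrative assumption.
    """
    root = ET.fromstring(xml_text)
    service = root.find("wms:Service", WMS_NS)

    def text(tag: str):
        node = service.find(f"wms:{tag}", WMS_NS) if service is not None else None
        return node.text.strip() if node is not None and node.text else None

    return {"title": text("Title"), "abstract": text("Abstract")}


# Made-up capabilities fragment, used so the sketch runs without a network.
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Service>
    <Name>WMS</Name>
    <Title>Example map server</Title>
    <Abstract>Made-up service used only for this sketch.</Abstract>
  </Service>
</WMS_Capabilities>"""

print(capabilities_url("https://example.org/ows"))
print(identification_from_wms(SAMPLE))
```

WFS and WCS capabilities documents use different namespaces and element layouts, so each service type would need its own extraction function in a fuller implementation.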
Metadata and Service at the GFZ ISDC Portal

Science.gov (United States)

Ritschel, B.

2008-05-01

The online service portal of the GFZ Potsdam Information System and Data Center (ISDC) is an access point for all manner of geoscientific geodata, its corresponding metadata, scientific documentation and software tools. At present almost 2000 national and international users and user groups have the opportunity to request Earth science data from a portfolio of 275 different product types and more than 20 million single data files with a total volume of approximately 12 TByte. The majority of the data and information the portal currently offers to the public are global geomonitoring products such as satellite orbit and Earth gravity field data as well as geomagnetic and atmospheric data for the exploration. These products for Earth's changing system are provided via state-of-the-art retrieval techniques. The data product catalog system behind these techniques is based on the extensive usage of standardized metadata, which describe the different geoscientific product types and data products in a uniform way. Whereas all ISDC product types are specified by NASA's Directory Interchange Format (DIF), Version 9.0 Parent XML DIF metadata files, the individual data files are described by extended DIF metadata documents. Depending on the beginning of the scientific project, one part of the data files is described by extended DIF, Version 6 metadata documents and the other part is specified by data Child XML DIF metadata documents. Both the product type dependent parent DIF metadata documents and the data file dependent child DIF metadata documents are derived from a base-DIF.xsd xml schema file. The ISDC metadata philosophy defines a geoscientific product as a package consisting of mostly one or sometimes more than one data file plus one extended DIF metadata file. Because NASA's DIF metadata standard has been developed in order to specify a collection of data only, the extension of the DIF standard consists of new and specific attributes, which are necessary for
The PDS4 Metadata Management System

Science.gov (United States)

Raugh, A. C.; Hughes, J. S.

2018-04-01

We present the key features of the Planetary Data System (PDS) PDS4 Information Model as an extendable metadata management system for planetary metadata related to data structure, analysis/interpretation, and provenance.
Data, Metadata, and Ted

OpenAIRE

Borgman, Christine L.

2014-01-01

Ted Nelson coined the term “hypertext” and developed Xanadu in a universe parallel to the one in which librarians, archivists, and documentalists were creating metadata to establish cross-connections among the myriad topics of this world. When these universes collided, comets exploded as ontologies proliferated. Black holes were formed as data disappeared through lack of description. Today these universes coexist, each informing the other, if not always happily: the formal rules of metadata, ...
Development of a Metadata Generator Application for Cultural Heritage Collections (Pembuatan Aplikasi Metadata Generator untuk Koleksi Peninggalan Warisan Budaya)

Directory of Open Access Journals (Sweden)

Wimba Agra Wicesa

2017-03-01

Full Text Available Cultural heritage is an important asset that serves as a source of information in the study of history. Managing cultural heritage data deserves attention in order to preserve the integrity of that data in the future. Creating cultural heritage metadata is one step that can be taken to preserve the value of an artifact. With metadata, the information about each cultural heritage object becomes easy to read, manage, and retrieve even after it has been stored for a long time. In addition, metadata allows information about cultural heritage to be used by many systems. Cultural heritage metadata is fairly large, so building it takes considerable time. Human error can also hamper the process of building cultural heritage metadata. Generating cultural heritage metadata through the Metadata Generator Application is faster and easier because it is done automatically by the system. The application can also reduce human error, making the generation process more efficient.
CCD characterization and measurements automation

International Nuclear Information System (INIS)

Kotov, I.V.; Frank, J.; Kotov, A.I.; Kubanek, P.; O'Connor, P.; Prouza, M.; Radeka, V.; Takacs, P.

2012-01-01

Modern mosaic cameras have grown both in size and in number of sensors. The required volume of sensor testing and characterization has grown accordingly. For camera projects as large as the LSST, test automation becomes a necessity. A CCD testing and characterization laboratory was built and is in operation for the LSST project. Characterization of LSST study contract sensors has been performed. The characterization process and its automation are discussed, and results are presented. Our system automatically acquires images, populates a database with metadata information, and runs express analysis. This approach is illustrated on ⁵⁵Fe data analysis. ⁵⁵Fe data are used to measure gain, charge transfer efficiency and charge diffusion. Examples of express analysis results are presented and discussed.
Evolving Metadata in NASA Earth Science Data Systems

Science.gov (United States)

Mitchell, A.; Cechini, M. F.; Walter, J.

2011-12-01

NASA's Earth Observing System (EOS) is a coordinated series of satellites for long term global observations. NASA's Earth Observing System Data and Information System (EOSDIS) is a petabyte-scale archive of environmental data that supports global climate change research by providing end-to-end services from EOS instrument data collection to science data processing to full access to EOS and other earth science data. On a daily basis, the EOSDIS ingests, processes, archives and distributes over 3 terabytes of data from NASA's Earth Science missions, representing over 3500 data products across a range of science disciplines. EOSDIS is currently comprised of 12 discipline specific data centers that are collocated with centers of science discipline expertise. Metadata is used in all aspects of NASA's Earth Science data lifecycle from the initial measurement gathering to the accessing of data products. Missions use metadata in their science data products when describing information such as the instrument/sensor, operational plan, and geographic region. Acting as the curator of the data products, data centers employ metadata for preservation, access and manipulation of data. EOSDIS provides a centralized metadata repository called the Earth Observing System (EOS) ClearingHouse (ECHO) for data discovery and access via a service-oriented-architecture (SOA) between data centers and science data users. ECHO receives inventory metadata from data centers, which generate metadata files that comply with the ECHO Metadata Model. NASA's Earth Science Data and Information System (ESDIS) Project established a Tiger Team to study and make recommendations regarding the adoption of the international metadata standard ISO 19115 in EOSDIS. The result was a technical report recommending an evolution of NASA data systems towards a consistent application of ISO 19115 and related standards including the creation of a NASA-specific convention for core ISO 19115 elements. Part of
The XML Metadata Editor of GFZ Data Services

Science.gov (United States)

Ulbricht, Damian; Elger, Kirsten; Tesei, Telemaco; Trippanera, Daniele

2017-04-01

Following the FAIR data principles, research data should be Findable, Accessible, Interoperable and Reusable. Publishing data under these principles requires assigning persistent identifiers to the data and generating rich machine-actionable metadata. To increase interoperability, metadata should include shared vocabularies and crosslink the newly published (meta)data and related material. However, structured metadata formats tend to be complex and are not intended to be generated by individual scientists. Software solutions are needed that support scientists in providing metadata describing their data. To facilitate data publication activities of 'GFZ Data Services', we programmed an XML metadata editor that assists scientists in creating metadata in different schemata popular in the earth sciences (ISO19115, DIF, DataCite), while being at the same time usable by and understandable for scientists. Emphasis is placed on removing barriers: in particular, the editor is publicly available on the internet without registration [1], and scientists are not asked to provide information that can be generated automatically (e.g. the URL of a specific licence or the contact information of the metadata distributor). Metadata are stored in browser cookies and a copy can be saved to the local hard disk. To improve usability, form fields are translated into the scientific language, e.g. 'creators' of the DataCite schema are called 'authors'. To assist in filling in the form, we make use of drop down menus for small vocabulary lists and offer a search facility for large thesauri. Explanations of form fields and definitions of vocabulary terms are provided in pop-up windows, and full documentation is available for download via the help menu. In addition, multiple geospatial references can be entered via an interactive mapping tool, which helps to minimize problems with different conventions for providing latitudes and longitudes. Currently, we are extending the metadata editor

Improving Metadata Compliance for Earth Science Data Records

Science.gov (United States)

Armstrong, E. M.; Chang, O.; Foster, D.

2014-12-01

One of the recurring challenges of creating earth science data records is to ensure a consistent level of metadata compliance at the granule level where important details of contents, provenance, producer, and data references are necessary to obtain a sufficient level of understanding. These details are important not just for individual data consumers but also for autonomous software systems. Two of the most popular metadata standards at the granule level are the Climate and Forecast (CF) Metadata Conventions and the Attribute Conventions for Dataset Discovery (ACDD). Many data producers have implemented one or both of these models including the Group for High Resolution Sea Surface Temperature (GHRSST) for their global SST products and the Ocean Biology Processing Group for NASA ocean color and SST products. While both the CF and ACDD models contain various levels of metadata richness, the actual "required" attributes are quite small in number. Metadata at the granule level becomes much more useful when recommended or optional attributes are implemented that document spatial and temporal ranges, lineage and provenance, sources, keywords, and references etc. In this presentation we report on a new open source tool to check the compliance of netCDF and HDF5 granules to the CF and ACDD metadata models. The tool, written in Python, was originally implemented to support metadata compliance for netCDF records as part of the NOAA's Integrated Ocean Observing System. It outputs standardized scoring for metadata compliance for both CF and ACDD, produces an objective summary weight, and can be implemented for remote records via OPeNDAP calls. Originally a command-line tool, we have extended it to provide a user-friendly web interface. Reports on metadata testing are grouped in hierarchies that make it easier to track flaws and inconsistencies in the record. We have also extended it to support explicit metadata structures and semantic syntax for the GHRSST project that can be
Handling multiple metadata streams regarding digital learning material

NARCIS (Netherlands)

Roes, J.B.M.; Vuuren, J. van; Verbeij, N.; Nijstad, H.

2010-01-01

This paper presents the outcome of a study performed in the Netherlands on handling multiple metadata streams regarding digital learning material. The paper describes the present metadata architecture in the Netherlands, the present suppliers and users of metadata and digital learning materials. It
On the Origin of Metadata

Directory of Open Access Journals (Sweden)

Sam Coppens

2012-12-01

Full Text Available Metadata has been around and has evolved for centuries, albeit not recognized as such. Medieval manuscripts typically had illuminations at the start of each chapter, being both a kind of signature for the author writing the script and a pictorial chapter anchor for the illiterates at the time. Nowadays, there is so much fragmented information on the Internet that users sometimes fail to distinguish the real facts from some bent truth, let alone being able to interconnect different facts. Here, metadata can act both as a noise reducer for detailed recommendations to end users and as a catalyst for interconnecting related information. Over time, metadata thus not only has had different modes of information, but furthermore, metadata’s relation of information to meaning, i.e., “semantics”, evolved. Darwin’s evolutionary propositions, from “species have an unlimited reproductive capacity”, over “natural selection”, to “the cooperation of mutations leads to adaptation to the environment” show remarkable parallels to both metadata’s different modes of information and to its relation of information to meaning over time. In this paper, we will show that the evolution of the use of (meta)data can be mapped to Darwin’s nine evolutionary propositions. As mankind and its behavior are products of an evolutionary process, the evolutionary process of metadata with its different modes of information is on the verge of a new-semantic-era.
Developing Cyberinfrastructure Tools and Services for Metadata Quality Evaluation

Science.gov (United States)

Mecum, B.; Gordon, S.; Habermann, T.; Jones, M. B.; Leinfelder, B.; Powers, L. A.; Slaughter, P.

2016-12-01

Metadata and data quality are at the core of reusable and reproducible science. While great progress has been made over the years, much of the metadata collected only addresses data discovery, covering concepts such as titles and keywords. Improving metadata beyond the discoverability plateau means documenting detailed concepts within the data such as sampling protocols, instrumentation used, and variables measured. Given that metadata commonly do not describe their data at this level, how might we improve the state of things? Giving scientists and data managers easy to use tools to evaluate metadata quality that utilize community-driven recommendations is the key to producing high-quality metadata. To achieve this goal, we created a set of cyberinfrastructure tools and services that integrate with existing metadata and data curation workflows which can be used to improve metadata and data quality across the sciences. These tools work across metadata dialects (e.g., ISO19115, FGDC, EML, etc.) and can be used to assess aspects of quality beyond what is internal to the metadata such as the congruence between the metadata and the data it describes. The system makes use of a user-friendly mechanism for expressing a suite of checks as code in popular data science programming languages such as Python and R. This reduces the burden on scientists and data managers to learn yet another language. We demonstrated these services and tools in three ways. First, we evaluated a large corpus of datasets in the DataONE federation of data repositories against a metadata recommendation modeled after existing recommendations such as the LTER best practices and the Attribute Convention for Dataset Discovery (ACDD). Second, we showed how this service can be used to display metadata and data quality information to data producers during the data submission and metadata creation process, and to data consumers through data catalog search and access tools. Third, we showed how the centrally
From CLARIN Component Metadata to Linked Open Data

NARCIS (Netherlands)

Durco, M.; Windhouwer, Menzo

2014-01-01

In the European CLARIN infrastructure a growing number of resources are described with Component Metadata. In this paper we describe a transformation to make this metadata available as linked data. After this first step it becomes possible to connect the CLARIN Component Metadata with other valuable
Collection Metadata Solutions for Digital Library Applications

Science.gov (United States)

Hill, Linda L.; Janee, Greg; Dolin, Ron; Frew, James; Larsgaard, Mary

1999-01-01

Within a digital library, collections may range from an ad hoc set of objects that serve a temporary purpose to established library collections intended to persist through time. The objects in these collections vary widely, from library and data center holdings to pointers to real-world objects, such as geographic places, and the various metadata schemas that describe them. The key to integrated use of such a variety of collections in a digital library is collection metadata that represents the inherent and contextual characteristics of a collection. The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes: in XML form, the collection metadata "registers" the collection with the user interface client; in HTML form, it is used for user documentation; eventually, it will be used to describe the collection to network search agents; and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.
Metadata In, Library Out. A Simple, Robust Digital Library System

Directory of Open Access Journals (Sweden)

Tonio Loewald

2010-06-01

Full Text Available Tired of being held hostage to expensive systems that did not meet our needs, the University of Alabama Libraries developed an XML schema-agnostic, light-weight digital library delivery system based on the principles of "Keep It Simple, Stupid!" Metadata and derivatives reside in openly accessible web directories, which support the development of web agents and new usability software, as well as modification and complete retrieval at any time. The file name structure is echoed in the file system structure, enabling the delivery software to make inferences about relationships, sequencing, and complex object structure without having to encapsulate files in complex metadata schemas. The web delivery system, Acumen, is built of PHP, JSON, JavaScript and HTML5, using MySQL to support fielded searching. Recognizing that spreadsheets are more user-friendly than XML, an accompanying widget, Archivists Utility, transforms spreadsheets into MODS based on rules selected by the user. Acumen, Archivists Utility, and all supporting software scripts will be made available as open source.
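The idea that a file name can carry relationships and sequencing, so that no wrapper metadata is needed, can be illustrated with a minimal Python sketch. The naming scheme used here (underscore-separated segments, with the last segment a sequence number) and the function names `describe` and `group_by_parent` are assumptions made for illustration; the abstract does not specify Acumen's actual conventions.

```python
from pathlib import PurePosixPath

def describe(filename: str) -> dict:
    """Infer identifier, parent, directory and sequence from a file name alone.

    Assumed (hypothetical) scheme: segments separated by '_', where every
    segment but the last names an enclosing level and the last is a sequence.
    """
    stem = PurePosixPath(filename).stem
    parts = stem.split("_")
    parent = "_".join(parts[:-1]) or None
    # The directory path echoes the name segments above the file itself.
    directory = "/".join(parts[:-1])
    sequence = int(parts[-1]) if parts[-1].isdigit() else None
    return {"id": stem, "parent": parent, "directory": directory, "sequence": sequence}

def group_by_parent(filenames):
    """Group files under their inferred parent object, ordered by sequence."""
    groups = {}
    for name in filenames:
        info = describe(name)
        groups.setdefault(info["parent"], []).append(info)
    for members in groups.values():
        # Files without a numeric sequence sort after the numbered ones.
        members.sort(key=lambda m: (m["sequence"] is None, m["sequence"] or 0))
    return groups

pages = group_by_parent(["coll1_item7_0002.tif", "coll1_item7_0001.tif"])
```

Under this assumed scheme, a multi-page object is reassembled purely from names: both files above are grouped under `coll1_item7` and ordered by their trailing number.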
Study on high-level waste geological disposal metadata model

International Nuclear Information System (INIS)

Ding Xiaobin; Wang Changhong; Zhu Hehua; Li Xiaojun

2008-01-01

This paper expounds the concept of metadata and reviews related research in China and abroad, then explains the motivation for studying the metadata model of the high-level nuclear waste deep geological disposal project. With reference to GML, the authors first set up DML under the framework of digital underground space engineering. Based on DML, a standardized metadata scheme for the high-level nuclear waste deep geological disposal project is presented. Then, an Internet-based metadata model is put forward. With standardized data and CSW services, this model may solve the problem of sharing and exchanging data in different forms. A metadata editor is built to search and maintain metadata based on this model. (authors)
An Automation Planning Primer.

Science.gov (United States)

Paynter, Marion

1988-01-01

This brief planning guide for library automation incorporates needs assessment and evaluation of options to meet those needs. A bibliography of materials on automation planning and software reviews, library software directories, and library automation journals is included. (CLB)
A conceptual model of the automated credibility assessment of the volunteered geographic information

International Nuclear Information System (INIS)

Idris, N H; Jackson, M J; Ishak, M H I

2014-01-01

The use of Volunteered Geographic Information (VGI) in collecting, sharing and disseminating geospatially referenced information on the Web is increasingly common. The potential of this localized and collective information has been seen to complement the maintenance process of authoritative mapping data sources and to help realize the development of Digital Earth. The main barrier to the use of this data in supporting this bottom-up approach is the credibility (trust), completeness, accuracy, and quality of both the data input and outputs generated. The only feasible approach to assess these data is by relying on an automated process. This paper describes a conceptual model of indicators (parameters) and practical approaches to automatically assess the credibility of information contributed through VGI, including map mashups, Geo Web and crowd-sourced applications. There are two main components proposed to be assessed in the conceptual model: metadata and data. The metadata component comprises the indicators of the hosting (websites) and the sources of data / information. The data component comprises the indicators to assess absolute and relative data positioning, attribute, thematic, temporal and geometric correctness and consistency. This paper suggests approaches to assess the components. To assess the metadata component, automated text categorization using supervised machine learning is proposed. To assess the correctness and consistency in the data component, we suggest a matching validation approach using the current emerging technologies from Linked Data infrastructures and using third party reviews validation. This study contributes to the research domain that focuses on the credibility, trust and quality issues of data contributed by web citizen providers.
Metadata to Support Data Warehouse Evolution

Science.gov (United States)

Solodovnikova, Darja

The focus of this chapter is the metadata necessary to support data warehouse evolution. We present a data warehouse framework that is able to track the evolution process and adapt data warehouse schemata and data extraction, transformation, and loading (ETL) processes. We discuss a significant part of the framework, the metadata repository, which stores information about the data warehouse, its logical and physical schemata and their versions. We propose a physical implementation of a multiversion data warehouse in a relational DBMS. For each modification of a data warehouse schema, we outline the changes that need to be made to the repository metadata and in the database.
Streamlining geospatial metadata in the Semantic Web

Science.gov (United States)

Fugazza, Cristiano; Pepe, Monica; Oggioni, Alessandro; Tagliolato, Paolo; Carrara, Paola

2016-04-01

In the geospatial realm, data annotation and discovery rely on a number of ad-hoc formats and protocols. These have been created to enable domain-specific use cases for which generalized search is not feasible. Metadata are at the heart of the discovery process and nevertheless they are often neglected or encoded in formats that either are not aimed at efficient retrieval of resources or are plainly outdated. Particularly, the quantum leap represented by the Linked Open Data (LOD) movement has so far not induced a consistent, interlinked baseline in the geospatial domain. In a nutshell, datasets, scientific literature related to them, and ultimately the researchers behind these products are only loosely connected; the corresponding metadata are intelligible only to humans and duplicated on different systems, seldom consistently. Instead, our workflow for metadata management envisages i) editing via customizable web-based forms, ii) encoding of records in any XML application profile, iii) translation into RDF (involving the semantic lift of metadata records), and finally iv) storage of the metadata as RDF and back-translation into the original XML format with added semantics-aware features. Phase iii) hinges on relating resource metadata to RDF data structures that represent keywords from code lists and controlled vocabularies, toponyms, researchers, institutes, and virtually any description one can retrieve (or directly publish) in the LOD Cloud. In the context of a distributed Spatial Data Infrastructure (SDI) built on free and open-source software, we detail phases iii) and iv) of our workflow for the semantics-aware management of geospatial metadata.
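As a rough illustration of phase iii), the translation of an XML record into RDF with a semantic lift of keywords, the following Python sketch may help. It is not the authors' implementation: the flat `<record>` layout, the `KEYWORD_URIS` lookup table, the `example.org` URIs and the function names are all assumptions; only the Dublin Core terms namespace is a real vocabulary.

```python
import xml.etree.ElementTree as ET

DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace

# Hypothetical lookup table standing in for a controlled-vocabulary service.
KEYWORD_URIS = {"hydrology": "http://example.org/vocab/hydrology"}

def lift(xml_text, record_uri):
    """Turn a flat XML record into (subject, predicate, object) triples.

    Keywords found in the vocabulary are "lifted" to URIs; the rest stay
    plain literals, which keeps the original information intact.
    """
    root = ET.fromstring(xml_text)
    triples = []
    title = root.findtext("title")
    if title:
        triples.append((record_uri, DCT + "title", ("literal", title.strip())))
    for kw in root.findall("keyword"):
        text = (kw.text or "").strip()
        if not text:
            continue
        uri = KEYWORD_URIS.get(text.lower())
        obj = ("uri", uri) if uri else ("literal", text)
        triples.append((record_uri, DCT + "subject", obj))
    return triples

def to_ntriples(triples):
    """Serialize the triples as N-Triples lines."""
    lines = []
    for subj, pred, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subj}> <{pred}> {obj} .")
    return "\n".join(lines)
```

Keeping unmatched keywords as literals is a deliberate choice in this sketch: it would allow a back-translation to XML, as in phase iv), without losing terms that the vocabulary does not yet cover.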
The Global Streamflow Indices and Metadata Archive (GSIM) - Part 1: The production of a daily streamflow archive and metadata

Science.gov (United States)

Do, Hong Xuan; Gudmundsson, Lukas; Leonard, Michael; Westra, Seth

2018-04-01

This is the first part of a two-paper series presenting the Global Streamflow Indices and Metadata archive (GSIM), a worldwide collection of metadata and indices derived from more than 35 000 daily streamflow time series. This paper focuses on the compilation of the daily streamflow time series based on 12 free-to-access streamflow databases (seven national databases and five international collections). It also describes the development of three metadata products (freely available at https://doi.pangaea.de/10.1594/PANGAEA.887477): (1) a GSIM catalogue collating basic metadata associated with each time series, (2) catchment boundaries for the contributing area of each gauge, and (3) catchment metadata extracted from 12 gridded global data products representing essential properties such as land cover type, soil type, and climate and topographic characteristics. The quality of the delineated catchment boundary is also made available and should be consulted in GSIM application. The second paper in the series then explores production and analysis of streamflow indices. Having collated an unprecedented number of stations and associated metadata, GSIM can be used to advance large-scale hydrological research and improve understanding of the global water cycle.
Metadata Aided Run Selection at ATLAS

CERN Document Server

Buckingham, RM; The ATLAS collaboration; Tseng, JC-L; Viegas, F; Vinek, E

2010-01-01

Management of the large volume of data collected by any large scale scientific experiment requires the collection of coherent metadata quantities, which can be used by reconstruction or analysis programs and/or user interfaces, to pinpoint collections of data needed for specific purposes. In the ATLAS experiment at the LHC, we have collected metadata from systems storing non-event-wise data (Conditions) into a relational database. The Conditions metadata (COMA) database tables not only contain conditions known at the time of event recording, but also allow for the addition of conditions data collected as a result of later analysis of the data (such as improved measurements of beam conditions or assessments of data quality). A new web based interface called “runBrowser” makes these Conditions Metadata available as a Run based selection service. runBrowser, based on php and javascript, uses jQuery to present selection criteria and report results. It not only facilitates data selection by conditions at...
Metadata aided run selection at ATLAS

CERN Document Server

Buckingham, RM; The ATLAS collaboration; Tseng, JC-L; Viegas, F; Vinek, E

2011-01-01

Management of the large volume of data collected by any large scale scientific experiment requires the collection of coherent metadata quantities, which can be used by reconstruction or analysis programs and/or user interfaces, to pinpoint collections of data needed for specific purposes. In the ATLAS experiment at the LHC, we have collected metadata from systems storing non-event-wise data (Conditions) into a relational database. The Conditions metadata (COMA) database tables not only contain conditions known at the time of event recording, but also allow for the addition of conditions data collected as a result of later analysis of the data (such as improved measurements of beam conditions or assessments of data quality). A new web based interface called “runBrowser” makes these Conditions Metadata available as a Run based selection service. runBrowser, based on php and javascript, uses jQuery to present selection criteria and report results. It not only facilitates data selection by conditions attrib...
The Global Streamflow Indices and Metadata Archive (GSIM) – Part 1: The production of a daily streamflow archive and metadata

Directory of Open Access Journals (Sweden)

H. X. Do

2018-04-01

Full Text Available This is the first part of a two-paper series presenting the Global Streamflow Indices and Metadata archive (GSIM), a worldwide collection of metadata and indices derived from more than 35 000 daily streamflow time series. This paper focuses on the compilation of the daily streamflow time series based on 12 free-to-access streamflow databases (seven national databases and five international collections). It also describes the development of three metadata products (freely available at https://doi.pangaea.de/10.1594/PANGAEA.887477): (1) a GSIM catalogue collating basic metadata associated with each time series, (2) catchment boundaries for the contributing area of each gauge, and (3) catchment metadata extracted from 12 gridded global data products representing essential properties such as land cover type, soil type, and climate and topographic characteristics. The quality of the delineated catchment boundary is also made available and should be consulted in GSIM application. The second paper in the series then explores production and analysis of streamflow indices. Having collated an unprecedented number of stations and associated metadata, GSIM can be used to advance large-scale hydrological research and improve understanding of the global water cycle.
Prediction of Solar Eruptions Using Filament Metadata

Science.gov (United States)

Aggarwal, Ashna; Schanche, Nicole; Reeves, Katharine K.; Kempton, Dustin; Angryk, Rafal

2018-05-01

We perform a statistical analysis of erupting and non-erupting solar filaments to determine the properties related to the eruption potential. In order to perform this study, we correlate filament eruptions documented in the Heliophysics Event Knowledgebase (HEK) with HEK filaments that have been grouped together using a spatiotemporal tracking algorithm. The HEK provides metadata about each filament instance, including values for length, area, tilt, and chirality. We add additional metadata properties such as the distance from the nearest active region and the magnetic field decay index. We compare trends in the metadata from erupting and non-erupting filament tracks to discover which properties present signs of an eruption. We find that a change in filament length over time is the most important factor in discriminating between erupting and non-erupting filament tracks, with erupting tracks being more likely to have decreasing length. We attempt to find an ensemble of predictive filament metadata using a Random Forest Classifier approach, but find the probability of correctly predicting an eruption with the current metadata is only slightly better than chance.
Semantic Metadata for Heterogeneous Spatial Planning Documents

Science.gov (United States)

Iwaniak, A.; Kaczmarek, I.; Łukowicz, J.; Strzelecki, M.; Coetzee, S.; Paluszyński, W.

2016-09-01

Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.
A web-based, dynamic metadata interface to MDSplus

International Nuclear Information System (INIS)

Gardner, Henry J.; Karia, Raju; Manduchi, Gabriele

2008-01-01

We introduce the concept of a Fusion Data Grid and discuss the management of metadata within such a Grid. We describe a prototype application which serves fusion data over the internet together with metadata information which can be flexibly created and modified over time. The application interfaces with the MDSplus data acquisition system and it has been designed to capture metadata which is generated by scientists from the post-processing of experimental data. The implementation of dynamic metadata tables using the Java programming language together with an object-relational mapping system, Hibernate, is described in the Appendix
Creating metadata that work for digital libraries and Google

OpenAIRE

Dawson, Alan

2004-01-01

For many years metadata has been recognised as a significant component of the digital information environment. Substantial work has gone into creating complex metadata schemes for describing digital content. Yet increasingly Web search engines, and Google in particular, are the primary means of discovering and selecting digital resources, although they make little use of metadata. This article considers how digital libraries can gain more value from their metadata by adapting it for Google us...

Technologies for metadata management in scientific a

OpenAIRE

Castro-Romero, Alexander; González-Sanabria, Juan S.; Ballesteros-Ricaurte, Javier A.

2015-01-01

The use of Semantic Web technologies has been increasing, and they are now applied in many different ways. This article evaluates how these technologies can help improve the indexing of articles in scientific journals. It begins with a conceptual review of metadata, then studies the most important technologies for using metadata on the Web and selects one of them to apply to a case study of scientific article indexing, in order to determine the metadata ...
The role of metadata in managing large environmental science datasets. Proceedings

Energy Technology Data Exchange (ETDEWEB)

Melton, R.B.; DeVaney, D.M. [eds.] [Pacific Northwest Lab., Richland, WA (United States); French, J. C. [Univ. of Virginia, (United States)

1995-06-01

The purpose of this workshop was to bring together computer science researchers and environmental sciences data management practitioners to consider the role of metadata in managing large environmental sciences datasets. The objectives included: establishing a common definition of metadata; identifying categories of metadata; defining problems in managing metadata; and defining problems related to linking metadata with primary data.
Automated software system for checking the structure and format of ACM SIG documents

Science.gov (United States)

Mirza, Arsalan Rahman; Sah, Melike

2017-04-01

Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents are automatically created using Office Open XML (OOXML) syntax. We develop a new framework, called ADFCS (Automated Document Format Checking System), that takes advantage of the OOXML metadata in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interest Group (SIG) documents for representing the structure and format of these documents by using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper introduces the ACM SIG ontology, the metadata extraction process, the inference engine, the ADFCS online user interface, and the system and user study evaluations.
Efficient processing of MPEG-21 metadata in the binary domain

Science.gov (United States)

Timmerer, Christian; Frank, Thomas; Hellwagner, Hermann; Heuer, Jörg; Hutter, Andreas

2005-10-01

XML-based metadata is widely adopted across different communities, and plenty of commercial and open source tools for processing and transforming it are available on the market. However, all of these tools have one thing in common: they operate on plain text encoded metadata, which may become a burden in constrained and streaming environments, i.e., when metadata needs to be processed together with multimedia content on the fly. In this paper we present an efficient approach for transforming metadata that is encoded using MPEG's Binary Format for Metadata (BiM) without additional en-/decoding overheads, i.e., within the binary domain. To this end, we have developed an event-based push parser for BiM encoded metadata which transforms the metadata by a limited set of processing instructions - based on traditional XML transformation techniques - operating on bit patterns instead of cost-intensive string comparisons.
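The general principle, an event-based push parser that transforms a stream by comparing bit patterns rather than element-name strings, can be sketched in Python as follows. The one-byte-per-event encoding used here is a toy assumption and is not the BiM format; `END_FLAG`, `CODE_MASK`, `push_parse` and `DropSubtree` are hypothetical names introduced only for this illustration.

```python
END_FLAG = 0x80   # toy encoding: high bit set marks an end-element event
CODE_MASK = 0x7F  # low 7 bits carry the numeric element code

def push_parse(stream: bytes, handler):
    """Push one (is_end, code) event per byte to the handler."""
    for byte in stream:
        handler(bool(byte & END_FLAG), byte & CODE_MASK)

class DropSubtree:
    """Copy the binary stream while removing every element with drop_code.

    Elements are matched by integer comparison on their codes, so names are
    never decoded to text and the output stays in the binary domain.
    """

    def __init__(self, drop_code: int):
        self.drop_code = drop_code
        self.depth = 0            # > 0 while inside a dropped subtree
        self.out = bytearray()

    def __call__(self, is_end: bool, code: int):
        if self.depth:
            # Track nesting so the matching end event closes the dropped subtree.
            self.depth += -1 if is_end else 1
            return
        if not is_end and code == self.drop_code:
            self.depth = 1
            return
        self.out.append((END_FLAG if is_end else 0) | code)

# Element 1 contains element 2 (with child 3) and element 4.
stream = bytes([0x01, 0x02, 0x03, 0x83, 0x82, 0x04, 0x84, 0x81])
transform = DropSubtree(drop_code=2)
push_parse(stream, transform)
```

After the call, `transform.out` holds the stream without element 2 and its child. The handler keeps only a depth counter as state, which is what makes this style suitable for streaming input.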
Mining Building Metadata by Data Stream Comparison

DEFF Research Database (Denmark)

Holmegaard, Emil; Kjærgaard, Mikkel Baun

2016-01-01

... ways to annotate sensor and actuation points. This makes it difficult to create intuitive queries for retrieving data streams from points. Another problem is the amount of insufficient or missing metadata. We introduce Metafier, a tool for extracting metadata from comparing data streams. Metafier ... enables a semi-automatic labeling of metadata to building instrumentation. Metafier annotates points with metadata by comparing the data from a set of validated points with unvalidated points. Metafier has three different algorithms to compare points with based on their data. The three algorithms ... to handle data streams with only slightly similar patterns. We have evaluated Metafier with points and data from one building located in Denmark. We have evaluated Metafier with 903 points, and the overall accuracy, with only 3 known examples, was 94.71%. Furthermore we found that using DTW for mining ...
openPDS: protecting the privacy of metadata through SafeAnswers.

Directory of Open Access Journals (Sweden)

Yves-Alexandre de Montjoye

Full Text Available The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web-searches, is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be directly shared individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
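The question-and-answer pattern described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the openPDS or SafeAnswers API: the class `PersonalDataStore`, its methods, the record fields and the sample question are all hypothetical.

```python
from collections import Counter

class PersonalDataStore:
    """Holds raw metadata and only ever releases answers to approved questions."""

    def __init__(self, records):
        self._records = list(records)  # raw, high-dimensional metadata stays here
        self._questions = {}

    def register(self, name, func):
        """Approve a question; in a real system the data owner would decide this."""
        self._questions[name] = func

    def ask(self, name):
        """Run an approved question inside the store and return only its answer."""
        if name not in self._questions:
            raise PermissionError(f"question not approved: {name}")
        return self._questions[name](self._records)

def most_visited_area(records):
    """Reduce a location log to a single coarse, low-dimensional answer."""
    counts = Counter(r["area"] for r in records)
    return counts.most_common(1)[0][0] if counts else None

store = PersonalDataStore([
    {"area": "north", "hour": 9},
    {"area": "north", "hour": 18},
    {"area": "south", "hour": 12},
])
store.register("most_visited_area", most_visited_area)
answer = store.ask("most_visited_area")
```

The service receives one short answer rather than the location log itself, which is the dimensionality reduction the abstract refers to. Whether a given answer is safe to release still depends on which questions are approved, so this sketch shifts the problem to access control rather than solving it.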
Automated metadata, provenance cataloging and navigable interfaces: Ensuring the usefulness of extreme-scale data

Energy Technology Data Exchange (ETDEWEB)

Schissel, D.P., E-mail: schissel@fusion.gat.com [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Abla, G.; Flanagan, S.M. [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Greenwald, M. [Massachusetts Institute of Technology, Cambridge, MA 02139 (United States); Lee, X. [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Romosan, A.; Shoshani, A. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Stillerman, J.; Wright, J. [Massachusetts Institute of Technology, Cambridge, MA 02139 (United States)

2014-05-15

For scientific research, it is not the mere existence of experimental or simulation data that is important, but the ability to make use of it. This paper presents the results of research to create a data model, infrastructure, and a set of tools that support data tracking, cataloging, and integration across a broad scientific domain. The system is intended to document workflow and data provenance in the widest sense. Combining research on integrated metadata, provenance, and ontology information with research on user interfaces has allowed the construction of an early prototype. While using fusion science as a test bed, the system's framework and data model are quite general.
Automated metadata, provenance cataloging and navigable interfaces: Ensuring the usefulness of extreme-scale data

International Nuclear Information System (INIS)

Schissel, D.P.; Abla, G.; Flanagan, S.M.; Greenwald, M.; Lee, X.; Romosan, A.; Shoshani, A.; Stillerman, J.; Wright, J.

2014-01-01

For scientific research, it is not the mere existence of experimental or simulation data that is important, but the ability to make use of it. This paper presents the results of research to create a data model, infrastructure, and a set of tools that support data tracking, cataloging, and integration across a broad scientific domain. The system is intended to document workflow and data provenance in the widest sense. Combining research on integrated metadata, provenance, and ontology information with research on user interfaces has allowed the construction of an early prototype. While using fusion science as a test bed, the system's framework and data model are quite general.
From Digital Commons to OCLC: A Tailored Approach for Harvesting and Transforming ETD Metadata into High-Quality Records

Directory of Open Access Journals (Sweden)

Marielle Veve

2016-07-01

Full Text Available The library literature contains many examples of automated and semi-automated approaches to harvest electronic theses and dissertations (ETD) metadata from institutional repositories (IR) to the Online Computer Library Center (OCLC). However, most of these approaches could not be implemented with the institutional repository software Digital Commons for various reasons, including proprietary schema incompatibilities and high-level programming expertise requirements our institution did not want to pursue. Only one semi-automated approach was found in the library literature which met our requirements for implementation, and even though it catered to the particular needs of the DSpace IR, it could be implemented in other IR software if further customizations were applied. The following paper presents an extension of this semi-automated approach originally created by Deng and Reese, but customized and adapted to address the particular needs of the Digital Commons community and updated to integrate the latest Resource Description & Access (RDA) content standards for ETDs. Advantages and disadvantages of this workflow are discussed and presented as well.
Handling Metadata in a Neurophysiology Laboratory

Directory of Open Access Journals (Sweden)

Lyuba Zehl

2016-07-01

Full Text Available To date, non-reproducibility of neurophysiological research is a matter of intense discussion in the scientific community. A crucial component to enhance reproducibility is to comprehensively collect and store metadata, that is all information about the experiment, the data, and the applied preprocessing steps on the data, such that they can be accessed and shared in a consistent and simple manner. However, the complexity of experiments, the highly specialized analysis workflows and a lack of knowledge on how to make use of supporting software tools often overburden researchers to perform such a detailed documentation. For this reason, the collected metadata are often incomplete, incomprehensible for outsiders or ambiguous. Based on our research experience in dealing with diverse datasets, we here provide conceptual and technical guidance to overcome the challenges associated with the collection, organization, and storage of metadata in a neurophysiology laboratory. Through the concrete example of managing the metadata of a complex experiment that yields multi-channel recordings from monkeys performing a behavioral motor task, we practically demonstrate the implementation of these approaches and solutions with the intention that they may be generalized to a specific project at hand. Moreover, we detail five use cases that demonstrate the resulting benefits of constructing a well-organized metadata collection when processing or analyzing the recorded data, in particular when these are shared between laboratories in a modern scientific collaboration. Finally, we suggest an adaptable workflow to accumulate, structure and store metadata from different sources using, by way of example, the odML metadata framework.
Metadata aided run selection at ATLAS

International Nuclear Information System (INIS)

Buckingham, R M; Gallas, E J; Tseng, J C-L; Viegas, F; Vinek, E

2011-01-01

Management of the large volume of data collected by any large scale scientific experiment requires the collection of coherent metadata quantities, which can be used by reconstruction or analysis programs and/or user interfaces, to pinpoint collections of data needed for specific purposes. In the ATLAS experiment at the LHC, we have collected metadata from systems storing non-event-wise data (Conditions) into a relational database. The Conditions metadata (COMA) database tables not only contain conditions known at the time of event recording, but also allow for the addition of conditions data collected as a result of later analysis of the data (such as improved measurements of beam conditions or assessments of data quality). A new web based interface called 'runBrowser' makes these Conditions Metadata available as a Run based selection service. runBrowser, based on PHP and JavaScript, uses jQuery to present selection criteria and report results. It not only facilitates data selection by conditions attributes, but also gives the user information at each stage about the relationship between the conditions chosen and the remaining conditions criteria available. When a set of COMA selections is complete, runBrowser produces a human readable report as well as an XML file in a standardized ATLAS format. This XML can be saved for later use or refinement in a future runBrowser session, shared with physics/detector groups, or used as input to ELSSI (event level Metadata browser) or other ATLAS run or event processing services.
NAIP National Metadata

Data.gov (United States)

Farm Service Agency, Department of Agriculture — The NAIP National Metadata Map contains USGS Quarter Quad and NAIP Seamline boundaries for every year NAIP imagery has been collected. Clicking on the map also makes...
Filament Chirality over an Entire Cycle Determined with an Automated Detection Module -- a Neat Surprise!

Science.gov (United States)

Martens, Petrus C.; Yeates, A. R.; Mackay, D.; Pillai, K. G.

2013-07-01

Using metadata produced by automated solar feature detection modules developed for SDO (Martens et al. 2012) we have discovered some trends in filament chirality and filament-sigmoid relations that are new and in part contradict the current consensus. Automated detection of solar features has the advantage over manual detection of having the detection criteria applied consistently, and in being able to deal with enormous amounts of data, like the 1 Terabyte per day that SDO produces. Here we use the filament detection module developed by Bernasconi, which has metadata from 2000 on, and the sigmoid sniffer, which has been producing metadata from AIA 94 Å images since October 2011. The most interesting result we find is that the hemispheric chirality preference for filaments (dextral in the north, and v.v.), studied in detail for a three year period by Pevtsov et al. (2003) seems to disappear during parts of the decline of cycle 23 and during the extended solar minimum that followed. Moreover the hemispheric chirality rule seems to be much less pronounced during the onset of cycle 24. For sigmoids we find the expected correlation between chirality and handedness (S or Z) shape but not as strong as expected.
Sustained Assessment Metadata as a Pathway to Trustworthiness of Climate Science Information

Science.gov (United States)

Champion, S. M.; Kunkel, K.

2017-12-01

The Sustained Assessment process has produced a suite of climate change reports: The Third National Climate Assessment (NCA3), Regional Surface Climate Conditions in CMIP3 and CMIP5 for the United States: Differences, Similarities, and Implications for the U.S. National Climate Assessment, Impacts of Climate Change on Human Health in the United States: A Scientific Assessment, The State Climate Summaries, as well as the anticipated Climate Science Special Report and Fourth National Climate Assessment. Not only are these groundbreaking reports of climate change science, they are also the first suite of climate science reports to provide access to complex metadata directly connected to the report figures and graphics products. While the basic metadata documentation requirement is federally mandated through a series of federal guidelines as a part of the Information Quality Act, Sustained Assessment products are also deemed Highly Influential Scientific Assessments, which further requires demonstration of the transparency and reproducibility of the content. To meet these requirements, the Technical Support Unit (TSU) for the Sustained Assessment embarked on building a system for not only collecting and documenting metadata to the required standards, but one that also provides consumers unprecedented access to the underlying data and methods. As our process and documentation have evolved, the value of both continues to grow in parallel with the consumer expectation of quality, accessible climate science information. This presentation will detail how the TSU accomplishes the mandated requirements with their metadata collection and documentation process, as well as the technical solution designed to demonstrate compliance while also providing access to the content for the general public. We will also illustrate how our accessibility platforms guide consumers through the Assessment science at a level of transparency that builds trust and confidence in the report
An emergent theory of digital library metadata enrich then filter

CERN Document Server

Stevens, Brett

2015-01-01

An Emergent Theory of Digital Library Metadata is a reaction to the current digital library landscape that is being challenged with growing online collections and changing user expectations. The theory provides the conceptual underpinnings for a new approach which moves away from expert defined standardised metadata to a user driven approach with users as metadata co-creators. Moving away from definitive, authoritative, metadata to a system that reflects the diversity of users’ terminologies, it changes the current focus on metadata simplicity and efficiency to one of metadata enriching, which is a continuous and evolving process of data linking. From predefined description to information conceptualised, contextualised and filtered at the point of delivery. By presenting this shift, this book provides a coherent structure in which future technological developments can be considered.
Design and Implementation of a Metadata-rich File System

Energy Technology Data Exchange (ETDEWEB)

Ames, S; Gokhale, M B; Maltzahn, C

2010-01-19

Despite continual improvements in the performance and reliability of large scale file systems, the management of user-defined file system metadata has changed little in the past decade. The mismatch between the size and complexity of large scale data stores and their ability to organize and query their metadata has led to a de facto standard in which raw data is stored in traditional file systems, while related, application-specific metadata is stored in relational databases. This separation of data and semantic metadata requires considerable effort to maintain consistency and can result in complex, slow, and inflexible system operation. To address these problems, we have developed the Quasar File System (QFS), a metadata-rich file system in which files, user-defined attributes, and file relationships are all first class objects. In contrast to hierarchical file systems and relational databases, QFS defines a graph data model composed of files and their relationships. QFS incorporates Quasar, an XPATH-extended query language for searching the file system. Results from our QFS prototype show the effectiveness of this approach. Compared to the de facto standard, the QFS prototype shows superior ingest performance and comparable query performance on user metadata-intensive operations and superior performance on normal file metadata operations.
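To make the graph data model more concrete, here is a minimal in-memory sketch in which files, their user-defined attributes, and the relationships between files are all explicit objects. It is an illustration only: the class names, the `neighbors` helper, and the sample paths are assumptions, and the sketch does not reproduce QFS or the Quasar query language.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileNode:
    path: str
    attrs: Dict[str, str] = field(default_factory=dict)  # user-defined attributes


@dataclass
class Link:
    source: str
    target: str
    kind: str  # relationship type, e.g. "derived"
    attrs: Dict[str, str] = field(default_factory=dict)


class MetadataGraph:
    """Files are nodes, relationships are labelled edges (hypothetical model)."""

    def __init__(self) -> None:
        self.files: Dict[str, FileNode] = {}
        self.links: List[Link] = []

    def add_file(self, path: str, **attrs: str) -> None:
        self.files[path] = FileNode(path, dict(attrs))

    def add_link(self, source: str, target: str, kind: str, **attrs: str) -> None:
        if source not in self.files or target not in self.files:
            raise KeyError("both endpoints must be known files")
        self.links.append(Link(source, target, kind, dict(attrs)))

    def neighbors(self, path: str, kind: str, **file_filter: str) -> List[str]:
        """Files reached from `path` over `kind` links whose attributes match."""
        found = []
        for link in self.links:
            if link.source == path and link.kind == kind:
                node = self.files[link.target]
                if all(node.attrs.get(k) == v for k, v in file_filter.items()):
                    found.append(node.path)
        return found


g = MetadataGraph()
g.add_file("/data/run1.raw", type="raw")
g.add_file("/data/run1.h5", type="processed")
g.add_file("/data/run1.png", type="plot")
g.add_link("/data/run1.raw", "/data/run1.h5", "derived")
g.add_link("/data/run1.raw", "/data/run1.png", "derived")
plots = g.neighbors("/data/run1.raw", "derived", type="plot")
```

A query such as "plots derived from this raw file" becomes a link traversal with an attribute filter, without a separate relational database holding the metadata.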
Using a linked data approach to aid development of a metadata portal to support Marine Strategy Framework Directive (MSFD) implementation

Science.gov (United States)

Wood, Chris

2016-04-01

-compliant services relating to the dataset. The web front-end therefore enables users to effectively filter, sort, or search the metadata. As the MSFD timeline requires Member States to review their progress on achieving or maintaining GES every six years, the timely development of this metadata portal will not only aid interested stakeholders in understanding how member states are meeting their targets, but also shows how linked data can be used effectively to support policy makers and associated legislative bodies.
SEMANTIC METADATA FOR HETEROGENEOUS SPATIAL PLANNING DOCUMENTS

Directory of Open Access Journals (Sweden)

A. Iwaniak

2016-09-01

Full Text Available Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.
Who tweets? Deriving the demographic characteristics of age, occupation and social class from twitter user meta-data.

Directory of Open Access Journals (Sweden)

Luke Sloan

Full Text Available This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, for which job titles cannot be confused with hobbies or are used in common parlance within alternative contexts. An alternative explanation on the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect "signatures" of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed.
Who Tweets? Deriving the Demographic Characteristics of Age, Occupation and Social Class from Twitter User Meta-Data

Science.gov (United States)

Sloan, Luke; Morgan, Jeffrey; Burnap, Pete; Williams, Matthew

2015-01-01

This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, for which job titles cannot be confused with hobbies or are used in common parlance within alternative contexts. An alternative explanation on the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect “signatures” of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed. PMID:25729900
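Pattern matching for age in profile descriptions could look roughly like the sketch below. The two regular expressions and the plausible-age range are hypothetical choices made for illustration; they are not the rules used in the paper. The range check and the requirement of an explicit cue such as "years old" are simple ways to reduce false positives from phrases like "20 years in education".

```python
import re
from typing import Optional

# Hypothetical rules, invented for this example.
AGE_PATTERNS = [
    # "34 years old", "34 yrs old", "34yo", "34 y/o"
    re.compile(r"\b(\d{1,2})\s*(?:years?\s*old|yrs?\s*old|y/?o)\b", re.IGNORECASE),
    # "I'm 19", "I am 19", "aged 19", "age 19"
    re.compile(r"\b(?:i\s+am|i'm|aged?)\s+(\d{1,2})\b", re.IGNORECASE),
]


def extract_age(description: str, min_age: int = 13, max_age: int = 99) -> Optional[int]:
    """Return the first plausible age stated in a profile description, else None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(description)
        if match:
            age = int(match.group(1))
            # Discard implausible values instead of guessing.
            if min_age <= age <= max_age:
                return age
    return None


samples = [
    "Dad, runner, 34 years old",
    "I'm 19, student in Cardiff",
    "Teacher with 20 years in education",
    "Coffee lover, 100% Welsh",
]
ages = [extract_age(s) for s in samples]
```

Only the first two sample descriptions yield an age; the last two contain numbers but no explicit age cue, so the function returns `None` for them.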

ASDC Collaborations and Processes to Ensure Quality Metadata and Consistent Data Availability

Science.gov (United States)

Trapasso, T. J.

2017-12-01

With the introduction of new tools, faster computing, and less expensive storage, increased volumes of data are expected to be managed with existing or fewer resources. Metadata management is becoming a heightened challenge from the increase in data volume, resulting in more metadata records needed to be curated for each product. To address metadata availability and completeness, NASA ESDIS has taken significant strides with the creation of the Unified Metadata Model (UMM) and Common Metadata Repository (CMR). The UMM helps address hurdles experienced by the increasing number of metadata dialects and the CMR provides a primary repository for metadata so that required metadata fields can be served through a growing number of tools and services. However, metadata quality remains an issue as metadata is not always inherent to the end-user. In response to these challenges, the NASA Atmospheric Science Data Center (ASDC) created the Collaboratory for quAlity Metadata Preservation (CAMP) and defined the Product Lifecycle Process (PLP) to work congruently. CAMP is unique in that it provides science team members a UI to directly supply metadata that is complete, compliant, and accurate for their data products. This replaces back-and-forth communication that often results in misinterpreted metadata. Upon review by ASDC staff, metadata is submitted to CMR for broader distribution through Earthdata. Further, approval of science team metadata in CAMP automatically triggers the ASDC PLP workflow to ensure appropriate services are applied throughout the product lifecycle. This presentation will review the design elements of CAMP and PLP as well as demonstrate interfaces to each. It will show the benefits that CAMP and PLP provide to the ASDC that could potentially benefit additional NASA Earth Science Data and Information System (ESDIS) Distributed Active Archive Centers (DAACs).
Metadata Authoring with Versatility and Extensibility

Science.gov (United States)

Pollack, Janine; Olsen, Lola

2004-01-01

NASA's Global Change Master Directory (GCMD) assists the scientific community in the discovery of and linkage to Earth science data sets and related services. The GCMD holds over 13,800 data set descriptions in Directory Interchange Format (DIF) and 700 data service descriptions in Service Entry Resource Format (SERF), encompassing the disciplines of geology, hydrology, oceanography, meteorology, and ecology. Data descriptions also contain geographic coverage information and direct links to the data, thus allowing researchers to discover data pertaining to a geographic location of interest, then quickly acquire those data. The GCMD strives to be the preferred data locator for world-wide directory-level metadata. In this vein, scientists and data providers must have access to intuitive and efficient metadata authoring tools. Existing GCMD tools are attracting widespread usage; however, a need for tools that are portable, customizable and versatile still exists. With tool usage directly influencing metadata population, it has become apparent that new tools are needed to fill these voids. As a result, the GCMD has released a new authoring tool allowing for both web-based and stand-alone authoring of descriptions. Furthermore, this tool incorporates the ability to plug-and-play the metadata format of choice, offering users options of DIF, SERF, FGDC, ISO or any other defined standard. Allowing data holders to work with their preferred format, as well as an option of a stand-alone application or web-based environment, docBUILDER will assist the scientific community in efficiently creating quality data and services metadata.
Making the Case for Embedded Metadata in Digital Images

DEFF Research Database (Denmark)

Smith, Kari R.; Saunders, Sarah; Kejser, U.B.

2014-01-01

This paper discusses the standards, methods, use cases, and opportunities for using embedded metadata in digital images. In this paper we explain the past and current work engaged with developing specifications, standards for embedding metadata of different types, and the practicalities of data...... exchange in heritage institutions and the culture sector. Our examples and findings support the case for embedded metadata in digital images and the opportunities for such use more broadly in non-heritage sectors as well. We encourage the adoption of embedded metadata by digital image content creators...... and curators as well as those developing software and hardware that support the creation or re-use of digital images. We conclude that the usability of born digital images as well as physical objects that are digitized can be extended and the files preserved more readily with embedded metadata....
Interpreting the ASTM 'content standard for digital geospatial metadata'

Science.gov (United States)

Nebert, Douglas D.

1996-01-01

ASTM and the Federal Geographic Data Committee have developed a content standard for spatial metadata to facilitate documentation, discovery, and retrieval of digital spatial data using vendor-independent terminology. Spatial metadata elements are identifiable quality and content characteristics of a data set that can be tied to a geographic location or area. Several Office of Management and Budget Circulars and initiatives have been issued that specify improved cataloguing of and accessibility to federal data holdings. An Executive Order further requires the use of the metadata content standard to document digital spatial data sets. Collection and reporting of spatial metadata for field investigations performed for the federal government is an anticipated requirement. This paper provides an overview of the draft spatial metadata content standard and a description of how the standard could be applied to investigations collecting spatially-referenced field data.
A Novel Architecture of Metadata Management System Based on Intelligent Cache

Institute of Scientific and Technical Information of China (English)

SONG Baoyan; ZHAO Hongwei; WANG Yan; GAO Nan; XU Jin

2006-01-01

This paper introduces a novel architecture of metadata management system based on intelligent cache called Metadata Intelligent Cache Controller (MICC). By using an intelligent cache to control the metadata system, MICC can deal with different scenarios such as splitting and merging of queries into sub-queries for metadata sets available locally, in order to reduce access time of remote queries. An application can obtain part of its results from the local cache, while the remaining portion of the metadata is fetched from remote locations. Using the existing metadata, it can not only enhance the fault tolerance and load balancing of the system effectively, but also improve the efficiency of access while ensuring the access quality.
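The splitting of a query into a locally answerable part and a remote remainder can be sketched as follows. This is a simplified illustration that assumes metadata is addressed by key; the function names, the dictionary cache, and the `fetch_remote` callback are hypothetical and do not describe the MICC implementation.

```python
from typing import Callable, Dict, List, Tuple

Metadata = Dict[str, str]


def split_query(keys: List[str], cache: Dict[str, Metadata]) -> Tuple[Dict[str, Metadata], List[str]]:
    """Split requested keys into local cache hits and keys needing a remote fetch."""
    local_hits = {k: cache[k] for k in keys if k in cache}
    remote_keys = [k for k in keys if k not in cache]
    return local_hits, remote_keys


def fetch_metadata(
    keys: List[str],
    cache: Dict[str, Metadata],
    fetch_remote: Callable[[List[str]], Dict[str, Metadata]],
) -> Dict[str, Metadata]:
    """Answer from the cache where possible, fetch the rest, then merge."""
    local_hits, remote_keys = split_query(keys, cache)
    remote_hits = fetch_remote(remote_keys) if remote_keys else {}
    cache.update(remote_hits)  # later queries for these keys stay local
    return {**local_hits, **remote_hits}


# Hypothetical remote catalogue plus a log of remote round trips.
remote_store = {
    "ds1": {"title": "Sea surface temperature"},
    "ds2": {"title": "Wind speed"},
    "ds3": {"title": "Salinity"},
}
remote_calls: List[List[str]] = []


def fetch_remote(keys: List[str]) -> Dict[str, Metadata]:
    remote_calls.append(list(keys))
    return {k: remote_store[k] for k in keys}


cache = {"ds1": remote_store["ds1"], "ds2": remote_store["ds2"]}
first = fetch_metadata(["ds1", "ds2", "ds3"], cache, fetch_remote)
second = fetch_metadata(["ds1", "ds3"], cache, fetch_remote)
```

In this run only `ds3` is requested remotely, and only once: the second query is served entirely from the cache.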
Leveraging Metadata to Create Better Web Services

Science.gov (United States)

Mitchell, Erik

2012-01-01

Libraries have been increasingly concerned with data creation, management, and publication. This increase is partly driven by shifting metadata standards in libraries and partly by the growth of data and metadata repositories being managed by libraries. In order to manage these data sets, libraries are looking for new preservation and discovery…
A Metadata Schema for Geospatial Resource Discovery Use Cases

Directory of Open Access Journals (Sweden)

Darren Hardy

2014-07-01

Full Text Available We introduce a metadata schema that focuses on GIS discovery use cases for patrons in a research library setting. Text search, faceted refinement, and spatial search and relevancy are among GeoBlacklight's primary use cases for federated geospatial holdings. The schema supports a variety of GIS data types and enables contextual, collection-oriented discovery applications as well as traditional portal applications. One key limitation of GIS resource discovery is the general lack of normative metadata practices, which has led to a proliferation of metadata schemas and duplicate records. The ISO 19115/19139 and FGDC standards specify metadata formats, but are intricate, lengthy, and not focused on discovery. Moreover, they require sophisticated authoring environments and cataloging expertise. Geographic metadata standards target preservation and quality measure use cases, but they do not provide for simple inter-institutional sharing of metadata for discovery use cases. To this end, our schema reuses elements from Dublin Core and GeoRSS to leverage their normative semantics, community best practices, open-source software implementations, and extensive examples already deployed in discovery contexts such as web search and mapping. Finally, we discuss a Solr implementation of the schema using a "geo" extension to MODS.
Managing ebook metadata in academic libraries taming the tiger

CERN Document Server

Frederick, Donna E

2016-01-01

Managing ebook Metadata in Academic Libraries: Taming the Tiger tackles the topic of ebooks in academic libraries, a trend that has been welcomed by students, faculty, researchers, and library staff. However, at the same time, the reality of acquiring ebooks, making them discoverable, and managing them presents library staff with many new challenges. Traditional methods of cataloging and managing library resources are no longer relevant where the purchasing of ebooks in packages and demand driven acquisitions are the predominant models for acquiring new content. Most academic libraries have a complex metadata environment wherein multiple systems draw upon the same metadata for different purposes. This complexity makes the need for standards-based interoperable metadata more important than ever. In addition to complexity, the nature of the metadata environment itself typically varies slightly from library to library making it difficult to recommend a single set of practices and procedures which would be releva...
International Metadata Initiatives: Lessons in Bibliographic Control.

Science.gov (United States)

Caplan, Priscilla

This paper looks at a subset of metadata schemes, including the Text Encoding Initiative (TEI) header, the Encoded Archival Description (EAD), the Dublin Core Metadata Element Set (DCMES), and the Visual Resources Association (VRA) Core Categories for visual resources. It examines why they developed as they did, major point of difference from…
Treating metadata as annotations: separating the content markup from the content

Directory of Open Access Journals (Sweden)

Fredrik Paulsson

2007-11-01

Full Text Available The use of digital learning resources creates an increasing need for semantic metadata, describing the whole resource, as well as parts of resources. Traditionally, schemas such as Text Encoding Initiative (TEI) have been used to add semantic markup for parts of resources. This is not sufficient for use in a "metadata ecology", where metadata is distributed, coherent to different Application Profiles, and added by different actors. A new methodology, where metadata is "pointed in" as annotations, using XPointers and RDF, is proposed. A suggestion for how such infrastructure can be implemented, using existing open standards for metadata and for the web, is presented. We argue that such methodology and infrastructure is necessary to realize the decentralized metadata infrastructure needed for a "metadata ecology".
A Generic Metadata Editor Supporting System Using Drupal CMS

Science.gov (United States)

Pan, J.; Banks, N. G.; Leggott, M.

2011-12-01

Metadata handling is a key factor in preserving and reusing scientific data. In recent years, standardized structural metadata has become widely used in Geoscience communities. However, there exist many different standards in Geosciences, such as the current version of the Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata (FGDC CSDGM), the Ecological Metadata Language (EML), the Geography Markup Language (GML), and the emerging ISO 19115 and related standards. In addition, there are many different subsets within the Geoscience subdomain such as the Biological Profile of the FGDC (CSDGM), or for geopolitical regions, such as the European Profile or the North American Profile in the ISO standards. It is therefore desirable to have a software foundation to support metadata creation and editing for multiple standards and profiles, without re-inventing the wheel. We have developed a software module as a generic, flexible software system to do just that: to facilitate the support for multiple metadata standards and profiles. The software consists of a set of modules for the Drupal Content Management System (CMS), with minimal inter-dependencies on other Drupal modules. There are two steps in using the system's metadata functions. First, an administrator can use the system to design a user form, based on an XML schema and its instances. The form definition is named and stored in the Drupal database as an XML blob. Second, users in an editor role can then use the persisted XML definition to render an actual metadata entry form, for creating or editing a metadata record. Behind the scenes, the form definition XML is transformed into a PHP array, which is then rendered via Drupal Form API. When the form is submitted the posted values are used to modify a metadata record. Drupal hooks can be used to perform custom processing on the metadata record before and after submission. It is trivial to store the metadata record as an actual XML file
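The transformation step described in this abstract (a stored XML form definition turned into a nested array before rendering) can be illustrated with a minimal sketch. It is written in Python rather than PHP, and the XML element names, attributes, and the `element_to_form` helper are all hypothetical; the abstract does not give the actual form-definition format. The `#`-prefixed keys only loosely mirror the property-key convention of Drupal form arrays.

```python
import xml.etree.ElementTree as ET

# Hypothetical form definition: element and attribute names are illustrative only.
FORM_XML = """
<form name="dataset">
  <field name="title" type="textfield" required="true"/>
  <group name="contact">
    <field name="email" type="textfield" required="false"/>
  </group>
</form>
"""

def element_to_form(elem):
    """Recursively convert a form-definition element into a nested dict."""
    # Fall back to the tag name when no explicit type attribute is present.
    node = {"#type": elem.get("type", elem.tag)}
    if elem.get("required") is not None:
        node["#required"] = elem.get("required") == "true"
    # Child elements become nested entries keyed by their name attribute.
    for child in elem:
        node[child.get("name")] = element_to_form(child)
    return node

form = element_to_form(ET.fromstring(FORM_XML))
print(form)
```

A renderer would then walk the nested structure to emit form widgets; that part is left out here because it depends entirely on the target framework.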
Distributed metadata in a high performance computing environment

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Zhang, Zhenhua; Liu, Xuezhao; Tang, Haiying

2017-07-11

A computer-executable method, system, and computer program product for managing meta-data in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for meta-data associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the meta-data is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
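As a rough illustration of the lookup flow in this abstract (determine which burst buffer holds the metadata, then read the key-value from that buffer's portion of the store), here is a minimal in-memory sketch. The hash-based placement policy, the `BurstBuffer` class, and all function names are assumptions made for the example; the abstract does not say how the owning buffer is determined.

```python
import hashlib

class BurstBuffer:
    """Hypothetical burst buffer holding one partition of a key-value store."""
    def __init__(self, name):
        self.name = name
        self.kv = {}

def owner_of(key, buffers):
    """Pick the buffer responsible for a key by hashing (an assumed policy)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return buffers[int.from_bytes(digest[:8], "big") % len(buffers)]

def put_metadata(key, value, buffers):
    """Store metadata in the partition of the buffer that owns the key."""
    owner_of(key, buffers).kv[key] = value

def get_metadata(key, buffers):
    """Determine the owning buffer, then look the key up in its partition."""
    buf = owner_of(key, buffers)
    return buf.name, buf.kv.get(key)

buffers = [BurstBuffer(f"bb{i}") for i in range(3)]
put_metadata("block-42", {"offset": 0, "length": 4096}, buffers)
name, meta = get_metadata("block-42", buffers)
print(name, meta)
```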
Metadata Design in the New PDS4 Standards - Something for Everybody

Science.gov (United States)

Raugh, Anne C.; Hughes, John S.

2015-11-01

The Planetary Data System (PDS) archives, supports, and distributes data of diverse targets, from diverse sources, to diverse users. One of the core problems addressed by the PDS4 data standard redesign was that of metadata - how to accommodate the increasingly sophisticated demands of search interfaces, analytical software, and observational documentation into label standards without imposing limits and constraints that would impinge on the quality or quantity of metadata that any particular observer or team could supply. And yet, as an archive, PDS must have detailed documentation for the metadata in the labels it supports, or the institutional knowledge encoded into those attributes will be lost - putting the data at risk. The PDS4 metadata solution is based on a three-step approach. First, it is built on two key ISO standards: ISO 11179 "Information Technology - Metadata Registries", which provides a common framework and vocabulary for defining metadata attributes; and ISO 14721 "Space Data and Information Transfer Systems - Open Archival Information System (OAIS) Reference Model", which provides the framework for the information architecture that enforces the object-oriented paradigm for metadata modeling. Second, PDS has defined a hierarchical system that allows it to divide its metadata universe into namespaces ("data dictionaries", conceptually), and more importantly to delegate stewardship for a single namespace to a local authority. This means that a mission can develop its own data model with a high degree of autonomy and effectively extend the PDS model to accommodate its own metadata needs within the common ISO 11179 framework. Finally, within a single namespace - even the core PDS namespace - existing metadata structures can be extended and new structures added to the model as new needs are identified. This poster illustrates the PDS4 approach to metadata management and highlights the expected return on the development investment for PDS, users and data
Metabolonote: A wiki-based database for managing hierarchical metadata of metabolome analyses

Directory of Open Access Journals (Sweden)

Takeshi Ara

2015-04-01

Full Text Available Metabolomics—technology for comprehensive detection of small molecules in an organism—lags behind the other omics in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called TogoMD, with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data, but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitates the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.
GCE Data Toolbox for MATLAB - a software framework for automating environmental data processing, quality control and documentation

Science.gov (United States)

Sheldon, W.; Chamblee, J.; Cary, R. H.

2013-12-01

Environmental scientists are under increasing pressure from funding agencies and journal publishers to release quality-controlled data in a timely manner, as well as to produce comprehensive metadata for submitting data to long-term archives (e.g. DataONE, Dryad and BCO-DMO). At the same time, the volume of digital data that researchers collect and manage is increasing rapidly due to advances in high frequency electronic data collection from flux towers, instrumented moorings and sensor networks. However, few pre-built software tools are available to meet these data management needs, and those tools that do exist typically focus on part of the data management lifecycle or one class of data. The GCE Data Toolbox has proven to be both a generalized and effective software solution for environmental data management in the Long Term Ecological Research Network (LTER). This open source MATLAB software library, developed by the Georgia Coastal Ecosystems LTER program, integrates metadata capture, creation and management with data processing, quality control and analysis to support the entire data lifecycle. Raw data can be imported directly from common data logger formats (e.g. SeaBird, Campbell Scientific, YSI, Hobo), as well as delimited text files, MATLAB files and relational database queries. Basic metadata are derived from the data source itself (e.g. parsed from file headers) and by value inspection, and then augmented using editable metadata templates containing boilerplate documentation, attribute descriptors, code definitions and quality control rules. Data and metadata content, quality control rules and qualifier flags are then managed together in a robust data structure that supports database functionality and ensures data validity throughout processing. A growing suite of metadata-aware editing, quality control, analysis and synthesis tools are provided with the software to support managing data using graphical forms and command-line functions, as well as
Forensic devices for activism: Metadata tracking and public proof

Directory of Open Access Journals (Sweden)

Lonneke van der Velden

2015-10-01

Full Text Available The central topic of this paper is a mobile phone application, ‘InformaCam’, which turns metadata from a surveillance risk into a method for the production of public proof. InformaCam allows one to manage and delete metadata from images and videos in order to diminish surveillance risks related to online tracking. Furthermore, it structures and stores the metadata in such a way that the documentary material becomes better accommodated to evidentiary settings, if needed. In this paper I propose InformaCam should be interpreted as a ‘forensic device’. By using the conceptualization of forensics and work on socio-technical devices the paper discusses how InformaCam, through a range of interventions, rearranges metadata into a technology of evidence. InformaCam explicitly recognizes mobile phones as context aware, uses their sensors, and structures metadata in order to facilitate data analysis after images are captured. Through these modifications it invents a form of ‘sensory data forensics’. By treating data in this particular way, surveillance resistance does more than seeking awareness. It becomes engaged with investigatory practices. Considering the extent to which states conduct metadata surveillance, the project can be seen as a timely response to the unequal distribution of power over data.
Survey data and metadata modelling using document-oriented NoSQL

Science.gov (United States)

Rahmatuti Maghfiroh, Lutfi; Gusti Bagus Baskara Nugraha, I.

2018-03-01

Survey data collected from year to year are subject to metadata changes, yet they need to be stored in an integrated way so that statistical data can be obtained faster and more easily. A data warehouse (DW) can be used to address this need. However, variables change in every period, and a traditional DW cannot accommodate such changes: it cannot handle variable change via Slowly Changing Dimensions (SCD). Previous research handles variable change in a DW, and manages the metadata, by using a multiversion DW (MVDW), which is designed using a relational model. Some studies have also found that a nonrelational model in a NoSQL database has faster read times than the relational model. Therefore, we propose managing metadata changes using NoSQL. This study proposes a DW model to manage change and algorithms to retrieve data with metadata changes. Evaluation of the proposed model and algorithms shows that a database with the proposed design can properly retrieve data with metadata changes. This paper contributes to comprehensive analysis of data with metadata changes (especially survey data) in integrated storage.
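To make the idea of retrieving data across metadata changes concrete, here is a minimal sketch using plain Python dicts as stand-ins for documents in a document-oriented store. The document layout, the variable codes, and the `retrieve` function are assumptions made for illustration; the abstract does not describe the authors' actual model or algorithms.

```python
# Hypothetical metadata documents: one per survey period, mapping variable codes to labels.
metadata_docs = [
    {"period": 2016, "variables": {"q1": "age", "q2": "income"}},
    {"period": 2017, "variables": {"q1": "age", "q2": "household_income", "q3": "region"}},
]

# Hypothetical response documents, each tagged with the period it was collected in.
responses = [
    {"period": 2016, "answers": {"q1": 34, "q2": 1200}},
    {"period": 2017, "answers": {"q1": 41, "q2": 1500, "q3": "West"}},
]

def retrieve(concept_aliases, metadata_docs, responses):
    """Return (period, value) pairs for a concept whose label changed between periods."""
    results = []
    for meta in metadata_docs:
        # Resolve which variable codes carry the concept in this period's metadata.
        codes = [c for c, label in meta["variables"].items() if label in concept_aliases]
        for resp in responses:
            if resp["period"] != meta["period"]:
                continue
            for code in codes:
                if code in resp["answers"]:
                    results.append((resp["period"], resp["answers"][code]))
    return results

income = retrieve({"income", "household_income"}, metadata_docs, responses)
region = retrieve({"region"}, metadata_docs, responses)
print(income, region)
```

Because each period carries its own metadata document, a variable added or renamed in a later period needs no schema migration; the lookup resolves codes per period at read time.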
Using Metadata to Build Geographic Information Sharing Environment on Internet

Directory of Open Access Journals (Sweden)

Chih-hong Sun

1999-12-01

Full Text Available Internet provides a convenient environment to share geographic information. Web GIS (Geographic Information System) even provides users a direct access environment to geographic databases through Internet. However, the complexity of geographic data makes it difficult for users to understand the real content and the limitation of geographic information. In some cases, users may misuse the geographic data and make wrong decisions. Meanwhile, geographic data are distributed across various government agencies, academic institutes, and private organizations, which make it even more difficult for users to fully understand the content of these complex data. To overcome these difficulties, this research uses metadata as a guiding mechanism for users to fully understand the content and the limitation of geographic data. We introduce three metadata standards commonly used for geographic data and metadata authoring tools available in the US. We also review the current development of geographic metadata standard in Taiwan. Two metadata authoring tools are developed in this research, which will enable users to build their own geographic metadata easily. [Article content in Chinese]
Development of health information search engine based on metadata and ontology.

Science.gov (United States)

Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

2014-04-01

The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. A health information search engine based on metadata and ontology will provide reliable health information to both information producers and information consumers.

An Interactive, Web-Based Approach to Metadata Authoring

Science.gov (United States)

Pollack, Janine; Wharton, Stephen W. (Technical Monitor)

2001-01-01

NASA's Global Change Master Directory (GCMD) serves a growing number of users by assisting the scientific community in the discovery of and linkage to Earth science data sets and related services. The GCMD holds over 8000 data set descriptions in Directory Interchange Format (DIF) and 200 data service descriptions in Service Entry Resource Format (SERF), encompassing the disciplines of geology, hydrology, oceanography, meteorology, and ecology. Data descriptions also contain geographic coverage information, thus allowing researchers to discover data pertaining to a particular geographic location, as well as subject of interest. The GCMD strives to be the preeminent data locator for world-wide directory level metadata. In this vein, scientists and data providers must have access to intuitive and efficient metadata authoring tools. Existing GCMD tools are not currently attracting widespread usage. With usage being the prime indicator of utility, it has become apparent that current tools must be improved. As a result, the GCMD has released a new suite of web-based authoring tools that enable a user to create new data and service entries, as well as modify existing data entries. With these tools, a more interactive approach to metadata authoring is taken, as they feature a visual "checklist" of data/service fields that automatically update when a field is completed. In this way, the user can quickly gauge which of the required and optional fields have not been populated. With the release of these tools, the Earth science community will be further assisted in efficiently creating quality data and services metadata. Keywords: metadata, Earth science, metadata authoring tools
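The "checklist" behaviour described here (showing which required and optional fields are still unpopulated) can be sketched in a few lines. The field names below are hypothetical placeholders, not the actual DIF or SERF field lists, and the functions are an illustration rather than the GCMD tools' implementation.

```python
# Hypothetical field lists: the abstract does not enumerate the real required/optional fields.
REQUIRED = ["entry_id", "title", "summary"]
OPTIONAL = ["spatial_coverage", "keywords"]

def is_filled(record, field):
    """A field counts as completed when it holds a non-empty, non-blank value."""
    value = record.get(field)
    return value is not None and str(value).strip() != ""

def checklist(record, required=REQUIRED, optional=OPTIONAL):
    """Return the completion status of every field, grouped by requirement level."""
    return {
        "required": {f: is_filled(record, f) for f in required},
        "optional": {f: is_filled(record, f) for f in optional},
    }

def unpopulated(record, fields):
    """List the fields that still need attention."""
    return [f for f in fields if not is_filled(record, f)]

draft = {"entry_id": "EX-001", "title": "  ", "keywords": "sea ice"}
status = checklist(draft)
todo_required = unpopulated(draft, REQUIRED)
todo_optional = unpopulated(draft, OPTIONAL)
print(status, todo_required, todo_optional)
```

Recomputing `checklist` after each field edit gives the automatically updating view the abstract mentions.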
Collaborative Metadata Curation in Support of NASA Earth Science Data Stewardship

Science.gov (United States)

Sisco, Adam W.; Bugbee, Kaylin; le Roux, Jeanne; Staton, Patrick; Freitag, Brian; Dixon, Valerie

2018-01-01

A growing collection of NASA Earth science data is archived and distributed by EOSDIS’s 12 Distributed Active Archive Centers (DAACs). Each collection and granule is described by a metadata record housed in the Common Metadata Repository (CMR). Multiple metadata standards are in use, and core elements of each are mapped to and from a common model – the Unified Metadata Model (UMM). Work done by the Analysis and Review of CMR (ARC) Team.
EPA Metadata Style Guide Keywords and EPA Organization Names

Science.gov (United States)

The keywords and EPA organization names listed below, along with EPA’s Metadata Style Guide, are intended to provide suggestions and guidance to assist with the standardization of metadata records.
ATLAS Metadata Task Force

Energy Technology Data Exchange (ETDEWEB)

ATLAS Collaboration; Costanzo, D.; Cranshaw, J.; Gadomski, S.; Jezequel, S.; Klimentov, A.; Lehmann Miotto, G.; Malon, D.; Mornacchi, G.; Nemethy, P.; Pauly, T.; von der Schmitt, H.; Barberis, D.; Gianotti, F.; Hinchliffe, I.; Mapelli, L.; Quarrie, D.; Stapnes, S.

2007-04-04

This document provides an overview of the metadata needed to characterize ATLAS event data at different levels (a complete run, data streams within a run, luminosity blocks within a run, individual events).
Dyniqx: a novel meta-search engine for metadata based cross search

OpenAIRE

Zhu, Jianhan; Song, Dawei; Eisenstadt, Marc; Barladeanu, Cristi; Rüger, Stefan

2008-01-01

The effect of metadata in collection fusion has not been sufficiently studied. In response to this, we present a novel meta-search engine called Dyniqx for metadata based cross search. Dyniqx exploits the availability of metadata in academic search services such as PubMed and Google Scholar etc for fusing search results from heterogeneous search engines. In addition, metadata from these search engines are used for generating dynamic query controls such as sliders and tick boxes etc which are ...
A Shared Infrastructure for Federated Search Across Distributed Scientific Metadata Catalogs

Science.gov (United States)

Reed, S. A.; Truslove, I.; Billingsley, B. W.; Grauch, A.; Harper, D.; Kovarik, J.; Lopez, L.; Liu, M.; Brandt, M.

2013-12-01

The vast amount of science metadata can be overwhelming and highly complex. Comprehensive analysis and sharing of metadata is difficult since institutions often publish to their own repositories. There are many disjoint standards used for publishing scientific data, making it difficult to discover and share information from different sources. Services that publish metadata catalogs often have different protocols, formats, and semantics. The research community is limited by the exclusivity of separate metadata catalogs and thus it is desirable to have federated search interfaces capable of unified search queries across multiple sources. Aggregation of metadata catalogs also enables users to critique metadata more rigorously. With these motivations in mind, the National Snow and Ice Data Center (NSIDC) and Advanced Cooperative Arctic Data and Information Service (ACADIS) implemented two search interfaces for the community. Both the NSIDC Search and ACADIS Arctic Data Explorer (ADE) use a common infrastructure which keeps maintenance costs low. The search clients are designed to make OpenSearch requests against Solr, an Open Source search platform. Solr applies indexes to specific fields of the metadata which in this instance optimizes queries containing keywords, spatial bounds and temporal ranges. NSIDC metadata is reused by both search interfaces but the ADE also brokers additional sources. Users can quickly find relevant metadata with minimal effort, which ultimately lowers costs for research. This presentation will highlight the reuse of data and code between NSIDC and ACADIS, discuss challenges and milestones for each project, and will identify creation and use of Open Source libraries.
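A query that combines keywords, spatial bounds, and a temporal range, as mentioned above, might be assembled as in the sketch below. It builds a plain Solr `select` URL and skips the OpenSearch layer entirely. The endpoint, the field names `spatial` and `start_date`, and the `build_solr_query` helper are assumptions; the real NSIDC/ADE index schema is not given in the abstract. The temporal filter is simplified to a range on a single date field.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def build_solr_query(base_url, keywords, bbox=None, start=None, end=None):
    """Build a Solr select URL with keyword, spatial, and temporal constraints."""
    params = [("q", keywords), ("wt", "json")]
    if bbox is not None:
        min_x, min_y, max_x, max_y = bbox
        # Solr spatial filter on an assumed field; ENVELOPE takes minX, maxX, maxY, minY.
        params.append(
            ("fq", f'spatial:"Intersects(ENVELOPE({min_x}, {max_x}, {max_y}, {min_y}))"')
        )
    if start or end:
        # Range filter on an assumed date field; '*' leaves one side open.
        params.append(("fq", f"start_date:[{start or '*'} TO {end or '*'}]"))
    return f"{base_url}/select?{urlencode(params)}"

url = build_solr_query(
    "http://localhost:8983/solr/metadata",  # hypothetical endpoint
    "sea ice",
    bbox=(-180, 60, 180, 90),
    start="2010-01-01T00:00:00Z",
    end="2012-12-31T23:59:59Z",
)
filters = parse_qs(urlparse(url).query)["fq"]
print(url)
```

Putting the spatial and temporal constraints in `fq` rather than `q` keeps them out of relevance scoring and lets Solr cache each filter independently.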
Enriching The Metadata On CDS

CERN Document Server

Chhibber, Nalin

2014-01-01

The project report revolves around the open source software package called Invenio. It provides the tools for management of digital assets in a repository and drives the CERN Document Server (CDS). The primary objective is to enhance the existing metadata in CDS with data from other libraries. An implicit part of this task is to manage disambiguation (within incoming data), remove multiple entries, and handle replications between new and existing records. All such elements and their corresponding changes are integrated within Invenio to make the upgraded metadata available on the CDS. The latter part of the report discusses some changes related to the Invenio code-base itself.
Managing Data, Provenance and Chaos through Standardization and Automation at the Georgia Coastal Ecosystems LTER Site

Science.gov (United States)

Sheldon, W.

2013-12-01

Managing data for a large, multidisciplinary research program such as a Long Term Ecological Research (LTER) site is a significant challenge, but also presents unique opportunities for data stewardship. LTER research is conducted within multiple organizational frameworks (i.e. a specific LTER site as well as the broader LTER network), and addresses both specific goals defined in an NSF proposal as well as broader goals of the network; therefore, every LTER data set can be linked to rich contextual information to guide interpretation and comparison. The challenge is how to link the data to this wealth of contextual metadata. At the Georgia Coastal Ecosystems LTER we developed an integrated information management system (GCE-IMS) to manage, archive and distribute data, metadata and other research products as well as manage project logistics, administration and governance (figure 1). This system allows us to store all project information in one place, and provide dynamic links through web applications and services to ensure content is always up to date on the web as well as in data set metadata. The database model supports tracking changes over time in personnel roles, projects and governance decisions, allowing these databases to serve as canonical sources of project history. Storing project information in a central database has also allowed us to standardize both the formatting and content of critical project information, including personnel names, roles, keywords, place names, attribute names, units, and instrumentation, providing consistency and improving data and metadata comparability. Lookup services for these standard terms also simplify data entry in web and database interfaces. We have also coupled the GCE-IMS to our MATLAB- and Python-based data processing tools (i.e. through database connections) to automate metadata generation and packaging of tabular and GIS data products for distribution. Data processing history is automatically tracked throughout the data
The Theory and Implementation for Metadata in Digital Library/Museum

Directory of Open Access Journals (Sweden)

Hsueh-hua Chen

1998-12-01

Full Text Available Digital Libraries and Museums (DL/M) have become one of the important research issues of Library and Information Science as well as other related fields. This paper describes the basic concepts of DL/M and briefly introduces the development of Taiwan Digital Museum Project. Based on the features of various collections, we discuss how to maintain, to manage and to exchange metadata, especially from the viewpoint of users. We propose the draft of metadata, MICI (Metadata Interchange for Chinese Information), developed by ROSS (Resources Organization and Searching Specification) team. Finally, current problems and future development of metadata will be touched. [Article content in Chinese]
GeneLab Analysis Working Group Kick-Off Meeting

Science.gov (United States)

Costes, Sylvain V.

2018-01-01

Goals to achieve for GeneLab AWG - GL vision - Review of GeneLab AWG charter. Timeline and milestones for 2018. Logistics - Monthly Meeting - Workshop - Internship - ASGSR. Introduction of team leads and goals of each group. Introduction of all members. Q/A. Three-tier Client Strategy to Democratize Data. Physiological changes, pathway enrichment, differential expression, normalization, processing metadata, reproducibility, data federation/integration with heterogeneous bioinformatics external databases. The GLDS currently serves over 100 omics investigations to the biomedical community via open access. In order to expand the scope of metadata record searches via the GLDS, we designed a metadata warehouse that collects and updates metadata records from external systems housing similar data. To demonstrate the capabilities of federated search and retrieval of these data, we imported metadata records from three open-access data systems into the GLDS metadata warehouse: NCBI's Gene Expression Omnibus (GEO), EBI's PRoteomics IDEntifications (PRIDE) repository, and the Metagenomics Analysis server (MG-RAST). Each of these systems defines metadata for omics data sets differently. One solution to bridge such differences is to employ a common object model (COM) to which each system's representation of metadata can be mapped. Warehoused metadata records are then transformed at ETL to this single, common representation. Queries generated via the GLDS are then executed against the warehouse, and matching records are shown in the COM representation (Fig. 1). While this approach is relatively straightforward to implement, the volume of the data in the omics domain presents challenges in dealing with latency and currency of records. Furthermore, the lack of a coordinated has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta
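The common object model (COM) approach described above, in which each source's metadata is mapped to one shared representation at ETL time and queries run against the warehouse, can be sketched as follows. The per-source field names, the COM fields, and both functions are hypothetical; the abstract does not specify the actual GEO, PRIDE, or MG-RAST mappings used by GeneLab.

```python
# Hypothetical per-source mappings from native field names to COM field names.
FIELD_MAPS = {
    "GEO": {"accession": "id", "title": "title", "organism": "organism"},
    "PRIDE": {"projectAccession": "id", "projectTitle": "title", "species": "organism"},
    "MG-RAST": {"metagenome_id": "id", "name": "title", "taxon": "organism"},
}

def to_com(source, record):
    """Transform one source record into the common representation (the ETL step)."""
    com = {"source": source}
    for src_field, com_field in FIELD_MAPS[source].items():
        if src_field in record:
            com[com_field] = record[src_field]
    return com

def search(warehouse, term):
    """Case-insensitive title search executed against warehoused COM records."""
    term = term.lower()
    return [r for r in warehouse if term in r.get("title", "").lower()]

# Made-up sample records, one per source, each in its own native shape.
incoming = [
    ("GEO", {"accession": "G-1", "title": "Spaceflight mouse liver", "organism": "mouse"}),
    ("PRIDE", {"projectAccession": "P-7", "projectTitle": "Liver proteome", "species": "rat"}),
    ("MG-RAST", {"metagenome_id": "M-3", "name": "Soil sample"}),
]
warehouse = [to_com(source, record) for source, record in incoming]
hits = search(warehouse, "liver")
print(hits)
```

Because results come back in the COM shape, a client needs only one record layout regardless of which external system a record originated from.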
DEVELOPMENT OF A METADATA MANAGEMENT SYSTEM FOR AN INTERDISCIPLINARY RESEARCH PROJECT

Directory of Open Access Journals (Sweden)

C. Curdt

2012-07-01

Full Text Available In every interdisciplinary, long-term research project it is essential to manage and archive all heterogeneous research data, produced by the project participants during the project funding. This has to include sustainable storage, description with metadata, easy and secure provision, back up, and visualisation of all data. To ensure the accurate description of all project data with corresponding metadata, the design and implementation of a metadata management system is a significant duty. Thus, the sustainable use and search of all research results during and after the end of the project is particularly dependent on the implementation of a metadata management system. Therefore, this paper will describe the practical experiences gained during the development of a scientific research data management system (called the TR32DB) including the corresponding metadata management system for the multidisciplinary research project Transregional Collaborative Research Centre 32 (CRC/TR32) 'Patterns in Soil-Vegetation-Atmosphere Systems'. The entire system was developed according to the requirements of the funding agency, the user and project requirements, as well as according to recent standards and principles. The TR32DB is basically a combination of data storage, database, and web-interface. The metadata management system was designed, realized, and implemented to describe and access all project data via accurate metadata. Since the quantity and sort of descriptive metadata depends on the kind of data, a user-friendly multi-level approach was chosen to cover these requirements. Thus, the self-developed CRC/TR32 metadata framework is designed. It is a combination of general, CRC/TR32 specific, as well as data type specific properties.
Metadata Exporter for Scientific Photography Management

Science.gov (United States)

Staudigel, D.; English, B.; Delaney, R.; Staudigel, H.; Koppers, A.; Hart, S.

2005-12-01

Photographs have become an increasingly important medium, especially with the advent of digital cameras. It has become inexpensive to take photographs and quickly post them on a website. However informative photos may be, they still need to be displayed in a convenient way, and be cataloged in such a manner that makes them easily locatable. Managing the great number of photographs that digital cameras allow and creating a format for efficient dissemination of the information related to the photos is a tedious task. Products such as Apple's iPhoto have greatly eased the task of managing photographs. However, they often have limitations. Un-customizable metadata fields and poor metadata extraction tools limit their scientific usefulness. A solution to this persistent problem is a customizable metadata exporter. On the ALIA expedition, we successfully managed the thousands of digital photos we took. We did this with iPhoto and a version of the exporter that is now available to the public under the name "CustomHTMLExport" (http://www.versiontracker.com/dyn/moreinfo/macosx/27777), currently undergoing formal beta testing. This software allows the use of customized metadata fields (including description, time, date, GPS data, etc.), which are exported along with the photo. It can also produce webpages with this data straight from iPhoto, in a much more flexible way than is already allowed. With this tool it becomes very easy to manage and distribute scientific photos.
Atmospheric Radiation Measurement's Data Management Facility captures metadata and uses visualization tools to assist in routine data management.

Science.gov (United States)

Keck, N. N.; Macduff, M.; Martin, T.

2017-12-01

The Atmospheric Radiation Measurement (ARM) Data Management Facility (DMF) plays a critical support role in processing and curating data generated by the Department of Energy's ARM Program. Data are collected in near real time from hundreds of observational instruments spread all over the globe. Data are then ingested hourly to provide time series data in NetCDF (Network Common Data Form), including standardized metadata. Based on automated processes and a variety of user reviews, the data may need to be reprocessed. Final data sets are then stored and accessed by users through the ARM Archive. Over the course of 20 years, a suite of data visualization tools has been developed to facilitate the operational processes to manage and maintain the more than 18,000 real-time events that move 1.3 TB of data each day through the various stages of the DMF's data system. This poster will present the resources and methodology used to capture metadata and the tools that assist in routine data management and discoverability.
Languages for Metadata

NARCIS (Netherlands)

Brussee, R.; Veenstra, M.; Blanken, Henk; de Vries, A.P.; Blok, H.E.; Feng, L.

2007-01-01

The term meta originates from the Greek word μετά, meaning after. The word Metaphysics is the title of Aristotle’s book coming after his book on nature called Physics. This has given meta the modern connotation of a nature of a higher order or of a more fundamental kind [1]. Literally, metadata is
An Assessment of the Evolving Common Metadata Repository Standards for Airborne Field Campaigns

Science.gov (United States)

Northup, E. A.; Chen, G.; Early, A. B.; Beach, A. L., III; Walter, J.; Conover, H.

2016-12-01

The NASA Earth Venture Program has led to a dramatic increase in airborne observations, requiring updated data management practices with clearly defined data standards and protocols for metadata. While the current data management practices demonstrate some success in serving airborne science team data user needs, existing metadata models and standards such as NASA's Unified Metadata Model (UMM) for Collections (UMM-C) present challenges with respect to accommodating certain features of airborne science metadata. UMM is the model implemented in the Common Metadata Repository (CMR), which catalogs all metadata records for NASA's Earth Observing System Data and Information System (EOSDIS). One example of these challenges is with representation of spatial and temporal metadata. In addition, many airborne missions target a particular geophysical event, such as a developing hurricane. In such cases, metadata about the event is also important for understanding the data. While coverage of satellite missions is highly predictable based on orbit characteristics, airborne missions feature complicated flight patterns where measurements can be spatially and temporally discontinuous. Therefore, existing metadata models will need to be expanded for airborne measurements and sampling strategies. An Airborne Metadata Working Group was established under the auspices of NASA's Earth Science Data Systems Working Group (ESDSWG) to identify specific features of airborne metadata that can not be currently represented in the UMM and to develop new recommendations. The group includes representation from airborne data users and providers. This presentation will discuss the challenges and recommendations in an effort to demonstrate how airborne metadata curation/management can be improved to streamline data ingest and discoverability to a broader user community.
Conception and realisation of an automatic bibliographic metadata update handler based on patch extraction and merging for the CERN document repository environment.

CERN Document Server

Vesper, Martin; Ziolek, Wojciech

Scientific literature and its corresponding bibliographic metadata are typically available through online digital repositories: • INSPIRE, the High Energy Physics (HEP) information system, is the source of information about the whole HEP literature; • The CERN Document Server (CDS) is the CERN Institutional Library containing all documents produced at CERN; • arXiv is a pre-print server hosting pre-print versions of papers from several scientific fields; • SCOAP3 is an initiative to convert key journals in the HEP field to open access and comes with its own digital repository. Across these 4 entities, there is a big overlap in terms of content, and maintaining consistency between the corresponding bibliographic metadata is an open challenge. The proposed thesis tries to model and implement a possible solution to automate the propagation of updates in order to reduce the necessary manual data manipulation to a minimum.
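The idea of patch extraction and merging can be pictured as a field-level diff between two versions of a record, applied to another repository's copy. The following is only a minimal sketch in Python, assuming records are flat dictionaries; the function names `extract_patch` and `apply_patch` and the sample fields are hypothetical and are not taken from the thesis.

```python
def extract_patch(old, new):
    """Return the field-level changes that turn `old` into `new`."""
    patch = {}
    for key in old.keys() | new.keys():
        if old.get(key) != new.get(key):
            # A value of None marks a field that was deleted in `new`.
            patch[key] = new.get(key)
    return patch


def apply_patch(record, patch):
    """Apply a patch to a copy of `record`, leaving the input untouched."""
    merged = dict(record)
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


# Hypothetical example: an update seen in one repository is propagated
# to another repository's copy of the same record.
old_version = {"title": "A study", "year": 2010}
new_version = {"title": "A study", "year": 2011, "doi": "10.0000/example"}
other_copy = {"title": "A study (local title)", "year": 2010}

patch = extract_patch(old_version, new_version)
updated_copy = apply_patch(other_copy, patch)
```

Because the patch only carries the changed fields, local differences such as the title in `other_copy` survive the merge. A real system would also need conflict handling, which this sketch leaves out.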
A Semantically Enabled Metadata Repository for Solar Irradiance Data Products

Science.gov (United States)

Wilson, A.; Cox, M.; Lindholm, D. M.; Nadiadi, I.; Traver, T.

2014-12-01

The Laboratory for Atmospheric and Space Physics, LASP, has been conducting research in atmospheric and space science for over 60 years, and providing the associated data products to the public. LASP has a long history, in particular, of making space-based measurements of the solar irradiance, which serves as crucial input to several areas of scientific research, including solar-terrestrial interactions, atmospheric science, and climate. LISIRD, the LASP Interactive Solar Irradiance Data Center, serves these datasets to the public, including solar spectral irradiance (SSI) and total solar irradiance (TSI) data. The LASP extended metadata repository, LEMR, is a database of information about the datasets served by LASP, such as parameters, uncertainties, temporal and spectral ranges, current version, alerts, etc. It serves as the definitive, single source of truth for that information. The database is populated with information garnered via web forms and automated processes. Dataset owners keep the information current and verified for datasets under their purview. This information can be pulled dynamically for many purposes. Web sites such as LISIRD can include this information in web page content as it is rendered, ensuring users get current, accurate information. It can also be pulled to create metadata records in various metadata formats, such as SPASE (for heliophysics) and ISO 19115. Once these records are made available to the appropriate registries, our data will be discoverable by users coming in via those organizations. The database is implemented as an RDF triplestore, a collection of instances of subject-predicate-object data entities identifiable with a URI. This capability coupled with SPARQL over HTTP read access enables semantic queries over the repository contents. To create the repository we leveraged VIVO, an open source semantic web application, to manage and create new ontologies and populate repository content. A variety of ontologies were used in
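To illustrate what querying a triplestore involves, the sketch below matches triple patterns against a small in-memory list, roughly what a single SPARQL triple pattern does. It is a simplified Python illustration only: the `ex:` identifiers and the `match` function are hypothetical and do not reflect LEMR's actual ontologies or interface.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical triples of the form (subject, predicate, object).
triples = [
    ("ex:dataset1", "ex:measures", "ex:TotalSolarIrradiance"),
    ("ex:dataset1", "ex:currentVersion", "17"),
    ("ex:dataset2", "ex:measures", "ex:SolarSpectralIrradiance"),
]

# Which datasets measure total solar irradiance?
tsi_datasets = [s for s, _, _ in match(triples, p="ex:measures",
                                       o="ex:TotalSolarIrradiance")]
```

A production triplestore adds indexing, joins across several patterns, and URI resolution; none of that is modeled here.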
Metadata Schema Used in OCLC Sampled Web Pages

Directory of Open Access Journals (Sweden)

Fei Yu

2005-12-01

The tremendous growth of Web resources has made information organization and retrieval more and more difficult. As one approach to this problem, metadata schemas have been developed to characterize Web resources. However, many questions have been raised about the use of metadata schemas, such as: Which metadata schemas have been used on the Web? How did they describe Web accessible information? What is the distribution of these metadata schemas among Web pages? Do certain schemas dominate the others? To address these issues, this study analyzed 16,383 Web pages with meta tags extracted from 200,000 OCLC sampled Web pages in 2000. It found that only 8.19% of Web pages used meta tags; description tags, keyword tags, and Dublin Core tags were the only three schemas used in the Web pages. This article revealed the use of meta tags in terms of their function distribution, syntax characteristics, granularity of the Web pages, and the length distribution and word number distribution of both description and keywords tags.
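A survey of this kind depends on extracting meta tags from pages and assigning each to a schema. The sketch below shows one way this could be done with Python's standard `html.parser` module. The class and function names are hypothetical, and the classification rule (a `DC.` name prefix for Dublin Core) is an assumption based on common HTML practice, not the study's documented procedure.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (name, content) pairs from <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.tags.append((name, attributes.get("content") or ""))


def classify(name):
    """Assign a meta tag name to a schema label (assumed rule)."""
    lowered = name.lower()
    if lowered.startswith("dc."):
        return "dublin-core"
    if lowered in ("description", "keywords"):
        return lowered
    return "other"


def schema_counts(html):
    """Count the meta tags of one page by schema label."""
    collector = MetaTagCollector()
    collector.feed(html)
    counts = {}
    for name, _content in collector.tags:
        label = classify(name)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical sample page.
sample = (
    '<html><head><meta name="description" content="A page">'
    '<meta name="Keywords" content="a, b">'
    '<meta name="DC.Title" content="Example">'
    '<meta charset="utf-8"></head></html>'
)
counts = schema_counts(sample)
```

Running `schema_counts` over a page sample and summing the results would give a schema distribution of the kind the abstract reports.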
Describing Geospatial Assets in the Web of Data: A Metadata Management Scenario

Directory of Open Access Journals (Sweden)

Cristiano Fugazza

2016-12-01

Metadata management is an essential enabling factor for geospatial assets because discovery, retrieval, and actual usage of the latter are tightly bound to the quality of these descriptions. Unfortunately, the multi-faceted landscape of metadata formats, requirements, and conventions makes it difficult to identify editing tools that can be easily tailored to the specificities of a given project, workgroup, and Community of Practice. Our solution is a template-driven metadata editing tool that can be customised to any XML-based schema. Its output consists of standards-compliant metadata records that also have a semantics-aware counterpart eliciting novel exploitation techniques. Moreover, external data sources can easily be plugged in to provide autocompletion functionalities on the basis of the data structures made available on the Web of Data. Besides presenting the essentials of customising the editor by means of two use cases, we extend the methodology to the whole life cycle of geospatial metadata. We demonstrate the novel capabilities enabled by RDF-based metadata representation with respect to traditional metadata management in the geospatial domain.
Estimating pediatric entrance skin dose from digital radiography examination using DICOM metadata: A quality assurance tool

Energy Technology Data Exchange (ETDEWEB)

Brady, S. L., E-mail: samuel.brady@stjude.org; Kaufman, R. A., E-mail: robert.kaufman@stjude.org [Department of Diagnostic Imaging, St. Jude Children’s Research Hospital, Memphis, Tennessee 38105 (United States)

2015-05-15

Purpose: To develop an automated methodology to estimate patient examination dose in digital radiography (DR) imaging using DICOM metadata as a quality assurance (QA) tool. Methods: Patient examination and demographical information were gathered from metadata analysis of DICOM header data. The x-ray system radiation output (i.e., air KERMA) was characterized for all filter combinations used for patient examinations. Average patient thicknesses were measured for head, chest, abdomen, knees, and hands using volumetric images from CT. Backscatter factors (BSFs) were calculated from examination kVp. Patient entrance skin air KERMA (ESAK) was calculated by (1) looking up examination technique factors taken from DICOM header metadata (i.e., kVp and mAs) to derive an air KERMA (k_air) value based on an x-ray characteristic radiation output curve; (2) scaling k_air with a BSF value; and (3) correcting k_air for patient thickness. Finally, patient entrance skin dose (ESD) was calculated by multiplying a mass–energy attenuation coefficient ratio by ESAK. Patient ESD calculations were computed for common DR examinations at our institution: dual view chest, anteroposterior (AP) abdomen, lateral (LAT) skull, dual view knee, and bone age (left hand only) examinations. Results: ESD was calculated for a total of 3794 patients; mean age was 11 ± 8 yr (range: 2 months to 55 yr). The mean ESD range was 0.19–0.42 mGy for dual view chest, 0.28–1.2 mGy for AP abdomen, 0.18–0.65 mGy for LAT view skull, 0.15–0.63 mGy for dual view knee, and 0.10–0.12 mGy for bone age (left hand) examinations. Conclusions: A methodology combining DICOM header metadata and basic x-ray tube characterization curves was demonstrated. In a regulatory era where patient dose reporting has become increasingly in demand, this methodology will allow a knowledgeable user the means to establish an automatable dose reporting program for DR and perform patient dose related QA testing for
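The calculation chain described in the Methods (air KERMA, backscatter scaling, thickness correction, attenuation coefficient ratio) can be written out as a short function. The Python sketch below is illustrative only. It assumes that the thickness correction is an inverse-square scaling and that the patient rests against the detector; the abstract does not state either. The function name, parameter names, and input numbers are hypothetical.

```python
def entrance_skin_dose(k_air_ref, bsf, ref_distance_cm, source_detector_cm,
                       thickness_cm, attenuation_ratio):
    """Return entrance skin dose in the same unit as k_air_ref (e.g. mGy)."""
    # Assumption: the patient rests against the detector, so the entrance
    # surface sits one patient thickness in front of it.
    skin_distance_cm = source_detector_cm - thickness_cm
    # Assumption: the thickness correction is modeled as inverse-square
    # scaling from the reference distance to the skin entrance.
    k_air_skin = k_air_ref * (ref_distance_cm / skin_distance_cm) ** 2
    # Scale by the backscatter factor to get entrance skin air KERMA.
    esak = k_air_skin * bsf
    # Convert air KERMA to dose with the attenuation coefficient ratio.
    return attenuation_ratio * esak


# Hypothetical inputs, not values from the study.
dose = entrance_skin_dose(
    k_air_ref=0.1,          # air KERMA at the reference distance
    bsf=1.3,                # backscatter factor for the examination kVp
    ref_distance_cm=100.0,
    source_detector_cm=120.0,
    thickness_cm=20.0,
    attenuation_ratio=1.06,
)
```

In an automated program, `k_air_ref` would come from the output curve lookup using the kVp and mAs read from the DICOM header, and the thickness from the per-body-part averages.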

Towards Precise Metadata-set for Discovering 3D Geospatial Models in Geo-portals

Science.gov (United States)

Zamyadi, A.; Pouliot, J.; Bédard, Y.

2013-09-01

Accessing 3D geospatial models, eventually at no cost and for unrestricted use, is certainly an important issue as they become popular among participatory communities, consultants, and officials. Various geo-portals, mainly established for 2D resources, have tried to provide access to existing 3D resources such as digital elevation models, LIDAR or classic topographic data. Describing the content of data, metadata is a key component of data discovery in geo-portals. An inventory of seven online geo-portals and commercial catalogues shows that the metadata referring to 3D information is very different from one geo-portal to another as well as for similar 3D resources in the same geo-portal. The inventory considered 971 data resources affiliated with elevation. 51% of them were from three geo-portals running at Canadian federal and municipal levels whose metadata resources did not consider 3D models by any definition. Regarding the remaining 49% which refer to 3D models, different definitions of terms and metadata were found, resulting in confusion and misinterpretation. The overall assessment of these geo-portals clearly shows that the provided metadata do not integrate specific and common information about 3D geospatial models. Accordingly, the main objective of this research is to improve 3D geospatial model discovery in geo-portals by adding a specific metadata-set. Based on the knowledge and current practices on 3D modeling, and 3D data acquisition and management, a set of metadata is proposed to increase its suitability for 3D geospatial models. This metadata-set enables the definition of genuine classes, fields, and code-lists for a 3D metadata profile. The main structure of the proposal contains 21 metadata classes. These classes are classified in three packages as General and Complementary on contextual and structural information, and Availability on the transition from storage to delivery format. The proposed metadata set is compared with Canadian Geospatial
Progress Report on the Airborne Metadata and Time Series Working Groups of the 2016 ESDSWG

Science.gov (United States)

Evans, K. D.; Northup, E. A.; Chen, G.; Conover, H.; Ames, D. P.; Teng, W. L.; Olding, S. W.; Krotkov, N. A.

2016-12-01

NASA's Earth Science Data Systems Working Groups (ESDSWG) was created over 10 years ago. The role of the ESDSWG is to make recommendations relevant to NASA's Earth science data systems from users' experiences. Each group works independently focusing on a unique topic. Participation in ESDSWG groups comes from a variety of NASA-funded science and technology projects, including MEaSUREs and ROSS. Participants include NASA information technology experts, affiliated contractor staff and other interested community members from academia and industry. Recommendations from the ESDSWG groups will enhance NASA's efforts to develop long term data products. The Airborne Metadata Working Group is evaluating the suitability of the current Common Metadata Repository (CMR) and Unified Metadata Model (UMM) for airborne data sets and developing new recommendations as necessary. The overarching goal is to enhance the usability, interoperability, discovery and distribution of airborne observational data sets. This will be done by assessing the suitability (gaps) of the current UMM model for airborne data using lessons learned from current and past field campaigns, listening to user needs and community recommendations and assessing the suitability of ISO metadata and other standards to fill the gaps. The Time Series Working Group (TSWG) is a continuation of the 2015 Time Series/WaterML2 Working Group. The TSWG is using a case study-driven approach to test the new Open Geospatial Consortium (OGC) TimeseriesML standard to determine any deficiencies with respect to its ability to fully describe and encode NASA earth observation-derived time series data. To do this, the time series working group is engaging with the OGC TimeseriesML Standards Working Group (SWG) regarding unsatisfied needs and possible solutions. The effort will end with the drafting of an OGC Engineering Report based on the use cases and interactions with the OGC TimeseriesML SWG. Progress towards finalizing
A document centric metadata registration tool constructing earth environmental data infrastructure

Science.gov (United States)

Ichino, M.; Kinutani, H.; Ono, M.; Shimizu, T.; Yoshikawa, M.; Masuda, K.; Fukuda, K.; Kawamoto, H.

2009-12-01

DIAS (Data Integration and Analysis System) is one of the GEOSS activities in Japan. It is also a leading part of the GEOSS task with the same name defined in the GEOSS Ten Year Implementation Plan. The main mission of DIAS is to construct a data infrastructure that can effectively integrate earth environmental data such as observation data, numerical model outputs, and socio-economic data provided from the fields of climate, water cycle, ecosystem, ocean, biodiversity and agriculture. Some of DIAS's data products are available at the following web site: http://www.jamstec.go.jp/e/medid/dias. Most earth environmental data have common spatial and temporal attributes such as the geographic coverage or the creation date. The metadata standards including these common attributes are published by the geographic information technical committee (TC211) of ISO (the International Organization for Standardization) as the specifications ISO 19115:2003 and 19139:2007. Accordingly, DIAS metadata is based on the ISO/TC211 metadata standards. From the viewpoint of data users, metadata is useful not only for data retrieval and analysis but also for interoperability and information sharing among experts, beginners and nonprofessionals. On the other hand, from the viewpoint of data providers, two problems were pointed out after discussions. One is that data providers prefer to minimize the additional tasks and time spent creating metadata. The other is that data providers want to manage and publish documents that explain their data sets more comprehensively. To solve these problems, we have been developing a document centric metadata registration tool. The features of our tool are that the generated documents are available instantly and there is no extra cost for data providers to generate metadata. Also, this tool is developed as a Web application, so it does not require data providers to install any software beyond a web browser. The interface of the tool
“The Naming of Cats”: Automated Genre Classification

Directory of Open Access Journals (Sweden)

Yunhyong Kim

2007-07-01

This paper builds on the work presented at the ECDL 2006 in automated genre classification as a step toward automating metadata extraction from digital documents for ingest into digital repositories such as those run by archives, libraries and eprint services (Kim & Ross, 2006b). We have previously proposed dividing features of a document into five types (features for visual layout, language model features, stylometric features, features for semantic structure, and contextual features as an object linked to previously classified objects and other external sources) and have examined visual and language model features. The current paper compares results from testing classifiers based on image and stylometric features in a binary classification to show that certain genres have strong image features which enable effective separation of documents belonging to the genre from a large pool of other documents.
Learning Object Metadata in a Web-Based Learning Environment

NARCIS (Netherlands)

Avgeriou, Paris; Koutoumanos, Anastasios; Retalis, Symeon; Papaspyrou, Nikolaos

2000-01-01

The plethora and variance of learning resources embedded in modern web-based learning environments require a mechanism to enable their structured administration. This goal can be achieved by defining metadata on them and constructing a system that manages the metadata in the context of the learning
Automated Test-Form Generation

Science.gov (United States)

van der Linden, Wim J.; Diao, Qi

2011-01-01

In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
Automated transmission system operation and management : meeting stakeholder information needs

Energy Technology Data Exchange (ETDEWEB)

Peelo, D.F.; Toom, P.O. [British Columbia Hydro, Vancouver, BC (Canada)

1998-12-01

Information monitoring is considered to be the fundamental basis for moving beyond substation automation and into automated transmission system operation and management. Information monitoring was defined as the acquisition of data and the processing of the data into decision making. Advances in digital technology and cheaper, more powerful computing capability have made it possible to capture all transmission stakeholder needs in a shared and automated operation and management system. Recognizing that the key to success in the development of transmission systems is automation, BC Hydro has initiated a long-term research and development project to develop the structure and detail of transmission system automation. The involvement of partners, be they utility or equipment suppliers, is essential in order to deal with protocol and similar issues. 3 refs., 1 tab., 3 figs.
Metadata Quality in Institutional Repositories May be Improved by Addressing Staffing Issues

Directory of Open Access Journals (Sweden)

Elizabeth Stovold

2016-09-01

A Review of: Moulaison, S. H., & Dykas, F. (2016). High-quality metadata and repository staffing: Perceptions of United States–based OpenDOAR participants. Cataloging & Classification Quarterly, 54(2), 101-116. http://dx.doi.org/10.1080/01639374.2015.1116480 Objective – To investigate the quality of institutional repository metadata, metadata practices, and identify barriers to quality. Design – Survey questionnaire. Setting – The OpenDOAR online registry of worldwide repositories. Subjects – A random sample of 50 from 358 administrators of institutional repositories in the United States of America listed in the OpenDOAR registry. Methods – The authors surveyed a random sample of administrators of American institutional repositories included in the OpenDOAR registry. The survey was distributed electronically. Recipients were asked to forward the email if they felt someone else was better suited to respond. There were questions about the demographics of the repository, the metadata creation environment, metadata quality, standards and practices, and obstacles to quality. Results were analyzed in Excel, and qualitative responses were coded by two researchers together. Main results – There was a 42% (n=21) response rate to the section on metadata quality, a 40% (n=20) response rate to the metadata creation section, and 40% (n=20) to the section on obstacles to quality. The majority of respondents rated their metadata quality as average (65%, n=13) or above average (30%, n=5). No one rated the quality as high or poor, while 10% (n=2) rated the quality as below average. The survey found that the majority of descriptive metadata was created by professional (84%, n=16) or paraprofessional (53%, n=10) library staff. Professional staff were commonly involved in creating administrative metadata, reviewing the metadata, and selecting standards and documentation. Department heads and advisory committees were also involved in standards and documentation
Studies of Big Data metadata segmentation between relational and non-relational databases

Science.gov (United States)

Golosova, M. V.; Grigorieva, M. A.; Klimentov, A. A.; Ryabinkin, E. A.; Dimitrov, G.; Potekhin, M.

2015-12-01

In recent years the concepts of Big Data have become well established in IT. Systems managing large data volumes produce metadata that describe data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time the amount of stored metadata can grow dramatically. In this article we present our studies to demonstrate how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.
Studies of Big Data metadata segmentation between relational and non-relational databases

CERN Document Server

Golosova, M V; Klimentov, A A; Ryabinkin, E A; Dimitrov, G; Potekhin, M

2015-01-01

In recent years the concepts of Big Data have become well established in IT. Systems managing large data volumes produce metadata that describe data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time the amount of stored metadata can grow dramatically. In this article we present our studies to demonstrate how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.
Integrated Array/Metadata Analytics

Science.gov (United States)

Misev, Dimitar; Baumann, Peter

2015-04-01

Data comes in various forms and types, and integration usually presents a problem that is often simply ignored or solved with ad-hoc solutions. Multidimensional arrays are a ubiquitous data type that we find at the core of virtually all science and engineering domains, as sensor, model, image, and statistics data. Naturally, arrays are richly described by and intertwined with additional metadata (alphanumeric relational data, XML, JSON, etc). Database systems, however, a fundamental building block of what we call "Big Data", lack adequate support for modelling and expressing these array data/metadata relationships. Array analytics is hence quite primitive or non-existent in modern relational DBMS. Recognizing this, we extended SQL with a new SQL/MDA part seamlessly integrating multidimensional array analytics into the standard database query language. We demonstrate the benefits of SQL/MDA with real-world examples executed in ASQLDB, an open-source mediator system based on HSQLDB and rasdaman, that already implements SQL/MDA.
Metadata and Ontologies in Learning Resources Design

Science.gov (United States)

Vidal C., Christian; Segura Navarrete, Alejandra; Menéndez D., Víctor; Zapata Gonzalez, Alfredo; Prieto M., Manuel

Resource design and development requires knowledge about educational goals, instructional context and information about learners' characteristics, among others. An important source of information about this knowledge is metadata. However, metadata by themselves do not foresee all necessary information related to resource design. Here we argue the need to use different data and knowledge models to improve understanding of the complex processes related to e-learning resources and their management. This paper presents the use of semantic web technologies, such as ontologies, supporting the search and selection of resources used in design. Classification is done based on instructional criteria derived from a knowledge acquisition process, using information provided by the IEEE-LOM metadata standard. The knowledge obtained is represented in an ontology using OWL and SWRL. In this work we give evidence of the implementation of a Learning Object Classifier based on ontology. We demonstrate that the use of ontologies can support the design activities in e-learning.
A case for user-generated sensor metadata

Science.gov (United States)

Nüst, Daniel

2015-04-01

Cheap and easy to use sensing technology and new developments in ICT towards a global network of sensors and actuators promise previously unthought of changes for our understanding of the environment. Large professional as well as amateur sensor networks exist, and they are used for specific yet diverse applications across domains such as hydrology, meteorology or early warning systems. However the impact this "abundance of sensors" had so far is somewhat disappointing. There is a gap between (community-driven) sensor networks that could provide very useful data and the users of the data. In our presentation, we argue this is due to a lack of metadata which allows determining the fitness of use of a dataset. Syntactic or semantic interoperability for sensor webs have made great progress and continue to be an active field of research, yet they often are quite complex, which is of course due to the complexity of the problem at hand. But still, we see the most generic information to determine fitness for use is a dataset's provenance, because it allows users to make up their own minds independently from existing classification schemes for data quality. In this work we will make the case how curated user-contributed metadata has the potential to improve this situation. This especially applies for scenarios in which an observed property is applicable in different domains, and for set-ups where the understanding about metadata concepts and (meta-)data quality differs between data provider and user. On the one hand a citizen does not understand the ISO provenance metadata. On the other hand a researcher might find issues in publicly accessible time series published by citizens, which the latter might not be aware of or care about. Because users will have to determine fitness for use for each application on their own anyway, we suggest an online collaboration platform for user-generated metadata based on an extremely simplified data model. In the most basic fashion
FSA 2002 Digital Orthophoto Metadata

Data.gov (United States)

Minnesota Department of Natural Resources — Metadata for the 2002 FSA Color Orthophotos Layer. Each orthophoto is represented by a Quarter 24k Quad tile polygon. The polygon attributes contain the quarter-quad...
ONEMercury: Towards Automatic Annotation of Earth Science Metadata

Science.gov (United States)

Tuarob, S.; Pouchard, L. C.; Noy, N.; Horsburgh, J. S.; Palanisamy, G.

2012-12-01

Earth sciences have become more data-intensive, requiring access to heterogeneous data collected from multiple places, times, and thematic scales. For example, research on climate change may involve exploring and analyzing observational data such as the migration of animals and temperature shifts across the earth, as well as various model-observation inter-comparison studies. Recently, DataONE, a federated data network built to facilitate access to and preservation of environmental and ecological data, has come to exist. ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for discovering and accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple data repositories and makes it searchable via a common search interface built upon cutting edge search engine technology, allowing users to interact with the system, intelligently filter the search results on the fly, and fetch the data from distributed data sources. Linking data from heterogeneous sources always has a cost. A problem that ONEMercury faces is the different levels of annotation in the harvested metadata records. Poorly annotated records tend to be missed during the search process as they lack meaningful keywords. Furthermore, such records would not be compatible with the advanced search functionality offered by ONEMercury as the interface requires a metadata record be semantically annotated. The explosion of the number of metadata records harvested from an increasing number of data repositories makes it impossible to annotate the harvested records manually, urging the need for a tool capable of automatically annotating poorly curated metadata records. In this paper, we propose a topic-model (TM) based approach for automatic metadata annotation. Our approach mines topics in the set of well annotated records and suggests keywords for poorly annotated records based on topic similarity. We utilize the
Metadata as a means for correspondence on digital media

NARCIS (Netherlands)

Stouffs, R.; Kooistra, J.; Tuncer, B.

2004-01-01

Metadata derive their action from their association to data and from the relationship they maintain with this data. An interpretation of this action is that the metadata lays claim to the data collection to which it is associated, where the claim is successful if the data collection gains quality as
Finding Atmospheric Composition (AC) Metadata

Science.gov (United States)

Strub, Richard F.; Falke, Stefan; Fiakowski, Ed; Kempler, Steve; Lynnes, Chris; Goussev, Oleg

2015-01-01

The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on Essential Climate Variables identified by the Global Climate Observing System: CH4, CO, CO2, NO2, O3, SO2 and aerosols. This poster addresses our efforts in building the ACP Data Table, an interface to help discover and understand remotely sensed data that are related to atmospheric composition science and applications. We harvested the GCMD, CWIC, and GEOSS metadata catalogs using machine-to-machine technologies (OpenSearch, Web Services). We also manually investigated the plethora of CEOS data provider portals and other catalogs where that data might be aggregated. This poster describes our experience of the excellence, variety, and challenges we encountered. Conclusions: 1. The significant benefits that the major catalogs provide are their machine-to-machine tools like OpenSearch and Web Services rather than any GUI usability improvements due to the large amount of data in their catalog. 2. There is a trend at the large catalogs towards simulating small data provider portals through advanced services. 3. Populating metadata catalogs using ISO19115 is too complex for users to do in a consistent way, difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR. 4. The ability to search for Ids first and then for data (GCMD and ECHO) is better for machine-to-machine operations rather than the timeouts experienced when returning the entire metadata entry at once. 5. Metadata harvest and export activities between the major catalogs have led to a significant amount of duplication. (This is currently being addressed.) 6. Most (if not all
Separation of metadata and pixel data to speed DICOM tag morphing.

Science.gov (United States)

Ismail, Mahmoud; Philbin, James

2013-01-01

The DICOM information model combines pixel data and metadata in a single DICOM object. It is not possible to access the metadata separately from the pixel data. There are use cases where only metadata is accessed, and the current DICOM object format increases the running time of those use cases. Tag morphing is one of them. Tag morphing includes deletion, insertion or manipulation of one or more of the metadata attributes. It is typically used for order reconciliation on study acquisition or to localize the issuer of patient ID (IPID) and the patient ID attributes when data from one domain is transferred to a different domain. In this work, we propose using Multi-Series DICOM (MSD) objects, which separate metadata from pixel data and remove duplicate attributes, to reduce the time required for tag morphing. The time required to update a set of study attributes in each format is compared. The results show that the MSD format significantly reduces the time required for tag morphing.
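The idea in the abstract above (keep metadata apart from pixel data, store duplicated attributes once, then morph tags on the small metadata object) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not read or write real DICOM or MSD files, and the names `factor_shared`, `morph_tags`, and `BulkDataRef` are hypothetical, not part of any DICOM toolkit.

```python
from typing import Any, Dict, List, Sequence, Tuple

Attrs = Dict[str, Any]


def factor_shared(instances: List[Attrs]) -> Tuple[Attrs, List[Attrs]]:
    """Split per-instance attributes into one shared dict plus per-instance remainders."""
    if not instances:
        return {}, []
    # An attribute is shared only if every instance has it with the same value.
    shared = {
        key: value
        for key, value in instances[0].items()
        if all(key in inst and inst[key] == value for inst in instances[1:])
    }
    unique = [{k: v for k, v in inst.items() if k not in shared} for inst in instances]
    return shared, unique


def morph_tags(shared: Attrs, updates: Attrs, deletions: Sequence[str] = ()) -> Attrs:
    """Return a new shared-attribute dict with tags deleted, inserted, or changed."""
    morphed = {k: v for k, v in shared.items() if k not in deletions}
    morphed.update(updates)
    return morphed


# Hypothetical study: pixel data is only referenced, never loaded.
INSTANCES = [
    {"PatientID": "A1", "IssuerOfPatientID": "HOSP-A", "InstanceNumber": 1, "BulkDataRef": "blob-1"},
    {"PatientID": "A1", "IssuerOfPatientID": "HOSP-A", "InstanceNumber": 2, "BulkDataRef": "blob-2"},
]
SHARED, UNIQUE = factor_shared(INSTANCES)
LOCALIZED = morph_tags(SHARED, {"PatientID": "B7", "IssuerOfPatientID": "HOSP-B"})
```

In this sketch, localizing the patient ID touches one small dictionary regardless of how many instances the study holds, and the bulk data references are left untouched.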
Mercury- Distributed Metadata Management, Data Discovery and Access System

Science.gov (United States)

Palanisamy, Giri; Wilson, Bruce E.; Devarakonda, Ranjeet; Green, James M.

2007-12-01

Mercury is a federated metadata harvesting, search and retrieval tool based on both open source and ORNL-developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports various metadata standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115 (under development). Mercury provides a single portal to information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury supports various projects including: ORNL DAAC, NBII, DADDI, LBA, NARSTO, CDIAC, OCEAN, I3N, IAI, ESIP and ARM. The new Mercury system is based on a Service Oriented Architecture and supports various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. This system also provides various search services including: RSS, Geo-RSS, OpenSearch, Web Services and Portlets. Other features include: filtering and dynamic sorting of search results, book-markable search results, and the ability to save, retrieve, and modify search criteria.
Contaminant analysis automation, an overview

International Nuclear Information System (INIS)

Hollen, R.; Ramos, O. Jr.

1996-01-01

To meet the environmental restoration and waste minimization goals of government and industry, several government laboratories, universities, and private companies have formed the Contaminant Analysis Automation (CAA) team. The goal of this consortium is to design and fabricate robotics systems that standardize and automate the hardware and software of the most common environmental chemical methods. In essence, the CAA team takes conventional, regulatory-approved (EPA Methods) chemical analysis processes and automates them. The automation consists of standard laboratory modules (SLMs) that perform the work in a much more efficient, accurate, and cost-effective manner

Cytometry metadata in XML

Science.gov (United States)

Leif, Robert C.; Leif, Stephanie H.

2016-04-01

Introduction: The International Society for Advancement of Cytometry (ISAC) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). CytometryML will serve as a common metadata standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data types were created, as is the rest of CytometryML, in the XML Schema Definition Language (XSD 1.1). The data types are primarily based on the Flow Cytometry and the Digital Imaging and Communications in Medicine (DICOM) standards. A small section of the code was formatted with standard HTML formatting elements (p, h1, h2, etc.). Results: 1) The part of MIFlowCyt that describes the Experimental Overview, including the specimen and substantial parts of several other major elements, has been implemented as CytometryML XML schemas (www.cytometryml.org). 2) The feasibility of using MIFlowCyt to provide the combination of an overview, table of contents, and/or an index of a scientific paper or a report has been demonstrated. Previously, a sample electronic publication, EPUB, was created that could contain both MIFlowCyt metadata and the binary data. Conclusions: The use of CytometryML technology together with XHTML5 and CSS permits the metadata to be directly formatted and, together with the binary data, to be stored in an EPUB container. This will facilitate formatting, data mining, presentation, data verification, and inclusion in structured research, clinical, and regulatory documents. It will also demonstrate a publication's adherence to the MIFlowCyt standard, promote interoperability, and should result in the textual and numeric data being published using web technology without any change in composition.
OAI-PMH repositories : quality issues regarding metadata and protocol compliance, tutorial 1

CERN Multimedia

CERN. Geneva; Cole, Tim

2005-01-01

This tutorial will provide an overview of emerging guidelines and best practices for OAI data providers and how they relate to expectations and needs of service providers. The audience should already be familiar with OAI protocol basics and have at least some experience with either data provider or service provider implementations. The speakers will present both protocol compliance best practices and general recommendations for creating and disseminating high-quality "shareable metadata". Protocol best practices discussion will include coverage of OAI identifiers, date-stamps, deleted records, sets, resumption tokens, about containers, branding, error conditions, HTTP server issues, and repository lifecycle issues. Discussion of what makes for good, shareable metadata will cover topics including character encoding, namespace and XML schema issues, metadata crosswalk issues, support of multiple metadata formats, general metadata authoring recommendations, specific recommendations for use of Dublin Core elemen...
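Several of the protocol items listed in the abstract above (resumption tokens, deleted records, error conditions) show up together in a basic harvesting loop. The sketch below is a minimal illustration rather than anything taken from the tutorial: the `fetch` callable is a hypothetical stand-in for the HTTP request, injected so that the paging logic can be exercised without a live repository.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(
    fetch: Callable[[Dict[str, str]], str],
    metadata_prefix: str = "oai_dc",
) -> Iterator[str]:
    """Yield identifiers of non-deleted records, following resumption tokens."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        error = root.find(f"{OAI_NS}error")
        if error is not None:
            raise RuntimeError(f"OAI-PMH error {error.get('code')}: {error.text}")
        for header in root.iter(f"{OAI_NS}header"):
            # Deleted records are flagged with status="deleted" on the header.
            if header.get("status") == "deleted":
                continue
            yield header.findtext(f"{OAI_NS}identifier")
        token = root.findtext(f".//{OAI_NS}resumptionToken")
        if not token:
            # A missing or empty resumptionToken marks the last page.
            return
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
```

A real harvester would wrap `fetch` around an HTTP client and add retry handling for the server issues the tutorial mentions; that part is left out here.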
Provenance metadata gathering and cataloguing of EFIT++ code execution

Energy Technology Data Exchange (ETDEWEB)

Lupelli, I., E-mail: ivan.lupelli@ccfe.ac.uk [CCFE, Culham Science Centre, Abingdon, Oxon OX14 3DB (United Kingdom); Muir, D.G.; Appel, L.; Akers, R.; Carr, M. [CCFE, Culham Science Centre, Abingdon, Oxon OX14 3DB (United Kingdom); Abreu, P. [Instituto de Plasmas e Fusão Nuclear, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisboa (Portugal)

2015-10-15

Highlights: • An approach for automatic gathering of provenance metadata has been presented. • A provenance metadata catalogue has been created. • The overhead in the code runtime is less than 10%. • The metadata/data size ratio is about ∼20%. • A visualization interface based on Gephi, has been presented. - Abstract: Journal publications, as the final product of research activity, are the result of an extensive complex modeling and data analysis effort. It is of paramount importance, therefore, to capture the origins and derivation of the published data in order to achieve high levels of scientific reproducibility, transparency, internal and external data reuse and dissemination. The consequence of the modern research paradigm is that high performance computing and data management systems, together with metadata cataloguing, have become crucial elements within the nuclear fusion scientific data lifecycle. This paper describes an approach to the task of automatically gathering and cataloguing provenance metadata, currently under development and testing at Culham Center for Fusion Energy. The approach is being applied to a machine-agnostic code that calculates the axisymmetric equilibrium force balance in tokamaks, EFIT++, as a proof of principle test. The proposed approach avoids any code instrumentation or modification. It is based on the observation and monitoring of input preparation, workflow and code execution, system calls, log file data collection and interaction with the version control system. Pre-processing, post-processing, and data export and storage are monitored during the code runtime. Input data signals are captured using a data distribution platform called IDAM. The final objective of the catalogue is to create a complete description of the modeling activity, including user comments, and the relationship between data output, the main experimental database and the execution environment. For an intershot or post-pulse analysis (∼1000
Provenance metadata gathering and cataloguing of EFIT++ code execution

International Nuclear Information System (INIS)

Lupelli, I.; Muir, D.G.; Appel, L.; Akers, R.; Carr, M.; Abreu, P.

2015-01-01

Highlights: • An approach for automatic gathering of provenance metadata has been presented. • A provenance metadata catalogue has been created. • The overhead in the code runtime is less than 10%. • The metadata/data size ratio is about ∼20%. • A visualization interface based on Gephi, has been presented. - Abstract: Journal publications, as the final product of research activity, are the result of an extensive complex modeling and data analysis effort. It is of paramount importance, therefore, to capture the origins and derivation of the published data in order to achieve high levels of scientific reproducibility, transparency, internal and external data reuse and dissemination. The consequence of the modern research paradigm is that high performance computing and data management systems, together with metadata cataloguing, have become crucial elements within the nuclear fusion scientific data lifecycle. This paper describes an approach to the task of automatically gathering and cataloguing provenance metadata, currently under development and testing at Culham Center for Fusion Energy. The approach is being applied to a machine-agnostic code that calculates the axisymmetric equilibrium force balance in tokamaks, EFIT++, as a proof of principle test. The proposed approach avoids any code instrumentation or modification. It is based on the observation and monitoring of input preparation, workflow and code execution, system calls, log file data collection and interaction with the version control system. Pre-processing, post-processing, and data export and storage are monitored during the code runtime. Input data signals are captured using a data distribution platform called IDAM. The final objective of the catalogue is to create a complete description of the modeling activity, including user comments, and the relationship between data output, the main experimental database and the execution environment. For an intershot or post-pulse analysis (∼1000
Fast processing of digital imaging and communications in medicine (DICOM) metadata using multiseries DICOM format.

Science.gov (United States)

Ismail, Mahmoud; Philbin, James

2015-04-01

The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication system use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes. This work promotes storing DICOM studies in MSD format to reduce the metadata processing time. A set of experiments are performed that update the metadata of a set of DICOM studies for deidentification and migration. The studies are stored in both the traditional single frame DICOM (SFD) format and the MSD format. The results show that it is faster to update studies' metadata in MSD format than in SFD format because the bulk data is separated in MSD and is not retrieved from the storage system. In addition, it is space efficient to store the deidentified studies in MSD format as it shares the same bulk data object with the original study. In summary, separation of metadata from pixel data using the MSD format provides fast metadata access and speeds up applications that process only the metadata.
The Use of Metadata Visualisation to Assist Information Retrieval

Science.gov (United States)

2007-10-01

album title, the track length and the genre of music . Again, any of these pieces of information can be used to quickly search and locate specific...that person. Music files also have metadata tags, in a format called ID3. This usually contains information such as the artist, the song title, the...tracks, to provide more information about the entire music collection, or to find similar or diverse tracks within the collection. Metadata is
Linked data for libraries, archives and museums how to clean, link and publish your metadata

CERN Document Server

Hooland, Seth van

2014-01-01

This highly practical handbook teaches you how to unlock the value of your existing metadata through cleaning, reconciliation, enrichment and linking and how to streamline the process of new metadata creation. Libraries, archives and museums are facing up to the challenge of providing access to fast growing collections whilst managing cuts to budgets. Key to this is the creation, linking and publishing of good quality metadata as Linked Data that will allow their collections to be discovered, accessed and disseminated in a sustainable manner. Metadata experts Seth van Hooland and Ruben Verborgh introduce the key concepts of metadata standards and Linked Data and how they can be practically applied to existing metadata, giving readers the tools and understanding to achieve maximum results with limited re...
Tools for automated acoustic monitoring within the R package monitoR

DEFF Research Database (Denmark)

Katz, Jonathan; Hafner, Sasha D.; Donovan, Therese

2016-01-01

The R package monitoR contains tools for managing an acoustic-monitoring program including survey metadata, template creation and manipulation, automated detection and results management. These tools are scalable for use with small projects as well as larger long-term projects and those with expansive spatial extents. Here, we describe typical workflow when using the tools in monitoR. Typical workflow utilizes a generic sequence of functions, with the option for either binary point matching or spectrogram cross-correlation detectors.
Tools for automated acoustic monitoring within the R package monitoR

Science.gov (United States)

Katz, Jonathan; Hafner, Sasha D.; Donovan, Therese

2016-01-01

The R package monitoR contains tools for managing an acoustic-monitoring program including survey metadata, template creation and manipulation, automated detection and results management. These tools are scalable for use with small projects as well as larger long-term projects and those with expansive spatial extents. Here, we describe typical workflow when using the tools in monitoR. Typical workflow utilizes a generic sequence of functions, with the option for either binary point matching or spectrogram cross-correlation detectors.
Normalized Metadata Generation for Human Retrieval Using Multiple Video Surveillance Cameras

Directory of Open Access Journals (Sweden)

Jaehoon Jung

2016-06-01

Full Text Available Since it is impossible for surveillance personnel to keep monitoring videos from a multiple camera-based surveillance system, an efficient technique is needed to help recognize important situations by retrieving the metadata of an object-of-interest. In a multiple camera-based surveillance system, an object detected in a camera has a different shape in another camera, which is a critical issue of wide-range, real-time surveillance systems. In order to address the problem, this paper presents an object retrieval method by extracting the normalized metadata of an object-of-interest from multiple, heterogeneous cameras. The proposed metadata generation algorithm consists of three steps: (i) generation of a three-dimensional (3D) human model; (ii) human object-based automatic scene calibration; and (iii) metadata generation. More specifically, an appropriately-generated 3D human model provides the foot-to-head direction information that is used as the input of the automatic calibration of each camera. The normalized object information is used to retrieve an object-of-interest in a wide-range, multiple-camera surveillance system in the form of metadata. Experimental results show that the 3D human model matches the ground truth, and automatic calibration-based normalization of metadata enables a successful retrieval and tracking of a human object in the multiple-camera video surveillance system.
Shared Geospatial Metadata Repository for Ontario University Libraries: Collaborative Approaches

Science.gov (United States)

Forward, Erin; Leahey, Amber; Trimble, Leanne

2015-01-01

Successfully providing access to special collections of digital geospatial data in academic libraries relies upon complete and accurate metadata. Creating and maintaining metadata using specialized standards is a formidable challenge for libraries. The Ontario Council of University Libraries' Scholars GeoPortal project, which created a shared…
Statistical Data Processing with R – Metadata Driven Approach

Directory of Open Access Journals (Sweden)

Rudi SELJAK

2016-06-01

Full Text Available In recent years the Statistical Office of the Republic of Slovenia has put a lot of effort into re-designing its statistical process. We replaced the classical stovepipe-oriented production system with general software solutions, based on the metadata-driven approach. This means that one general program code, which is parametrized with process metadata, is used for data processing for a particular survey. Currently, the general program code is entirely based on SAS macros, but in the future we would like to explore how successfully the statistical software R can be used for this approach. The paper describes the metadata-driven principle for data validation, the generic software solution, and the main issues connected with using R for this approach.
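The metadata-driven principle described above (one general program, parametrized per survey by process metadata) can be illustrated with a small validation sketch. It is written in Python for brevity and is not the Statistical Office's SAS or R code; the rule vocabulary (`required`, `range`, `in_set`) and the field names are assumptions made up for this example.

```python
from typing import Any, Dict, List

Rule = Dict[str, Any]


def validate(record: Dict[str, Any], rules: List[Rule]) -> List[str]:
    """Apply rules held as data to one record and return error messages."""
    errors: List[str] = []
    for rule in rules:
        field, check = rule["field"], rule["check"]
        value = record.get(field)
        if check == "required":
            if value is None or value == "":
                errors.append(f"{field}: missing")
        elif value is None:
            continue  # the remaining checks apply only to values that are present
        elif check == "range":
            if not rule["min"] <= value <= rule["max"]:
                errors.append(f"{field}: {value} outside [{rule['min']}, {rule['max']}]")
        elif check == "in_set":
            if value not in rule["values"]:
                errors.append(f"{field}: {value!r} not allowed")
        else:
            raise ValueError(f"unknown check {check!r}")
    return errors


# Hypothetical process metadata for one survey; another survey would supply
# a different list and reuse validate() unchanged.
SURVEY_RULES: List[Rule] = [
    {"field": "employees", "check": "required"},
    {"field": "turnover", "check": "range", "min": 0, "max": 1000000},
    {"field": "region", "check": "in_set", "values": {"east", "west"}},
]
```

The design point is that adding or changing a check for a survey means editing the rule list, which could live in a database or a file, rather than editing the program.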
A Comparative Study on Metadata Scheme of Chinese and American Open Data Platforms

Directory of Open Access Journals (Sweden)

Yang Sinan

2018-01-01

Full Text Available [Purpose/significance] Open government data is conducive to the rational development and utilization of data resources. It can encourage social innovation and promote economic development. To ensure effective utilization and added social value of open government data, high-quality metadata schemes are necessary. [Method/process] First, this paper analyzed related research on open government data in China and abroad. Then, it investigated the open metadata schemes of the data platforms of some major Chinese local governments and compared them with the metadata standard of American open government data. [Result/conclusion] This paper reveals several shortcomings of Chinese local government open data that reduce its usefulness: different governments use different metadata schemes, and dataset descriptions are too simple for further utilization and are usually presented as HTML web pages with low machine readability. Therefore, the Chinese government should develop a standardized metadata scheme by drawing on mature and effective international metadata standards, to meet the social need for high-quality, high-value data.
EUDAT B2FIND : A Cross-Discipline Metadata Service and Discovery Portal

Science.gov (United States)

Widmann, Heinrich; Thiemann, Hannes

2016-04-01

The European Data Infrastructure (EUDAT) project aims at a pan-European environment that supports a variety of multiple research communities and individuals to manage the rising tide of scientific data by advanced data management technologies. This led to the establishment of the community-driven Collaborative Data Infrastructure that implements common data services and storage resources to tackle the basic requirements and the specific challenges of international and interdisciplinary research data management. The metadata service B2FIND plays a central role in this context by providing a simple and user-friendly discovery portal to find research data collections stored in EUDAT data centers or in other repositories. For this we store the diverse metadata collected from heterogeneous sources in a comprehensive joint metadata catalogue and make them searchable in an open data portal. The implemented metadata ingestion workflow consists of three steps. First the metadata records - provided either by various research communities or via other EUDAT services - are harvested. Afterwards the raw metadata records are converted and mapped to unified key-value dictionaries as specified by the B2FIND schema. The semantic mapping of the non-uniform, community specific metadata to homogenous structured datasets is hereby the most subtle and challenging task. To assure and improve the quality of the metadata this mapping process is accompanied by • iterative and intense exchange with the community representatives, • usage of controlled vocabularies and community specific ontologies and • formal and semantic validation. Finally the mapped and checked records are uploaded as datasets to the catalogue, which is based on the open source data portal software CKAN. CKAN provides a rich RESTful JSON API and uses SOLR for dataset indexing that enables users to query and search in the catalogue. The homogenization of the community specific data models and vocabularies enables not
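The second step of the ingestion workflow described above, mapping heterogeneous community records onto a unified key-value dictionary and validating the result, can be sketched as follows. This is not the B2FIND implementation: the community names, source field names, target keys, and the `REQUIRED` list are all hypothetical and stand in for whatever the B2FIND schema actually specifies.

```python
from typing import Any, Dict, List

# Hypothetical per-community mapping tables: target key -> source field.
COMMUNITY_MAPS: Dict[str, Dict[str, str]] = {
    "community_a": {"title": "dc:title", "author": "dc:creator", "publication_year": "dc:date"},
    "community_b": {"title": "name", "author": "contact", "publication_year": "year"},
}

# Hypothetical minimal formal validation: keys every mapped record must carry.
REQUIRED: List[str] = ["title"]


def map_record(raw: Dict[str, Any], community: str) -> Dict[str, Any]:
    """Map one harvested record to the unified key-value form, then validate it."""
    mapping = COMMUNITY_MAPS[community]
    mapped = {target: raw[source] for target, source in mapping.items() if source in raw}
    missing = [key for key in REQUIRED if not mapped.get(key)]
    if missing:
        raise ValueError(f"record from {community} lacks required keys: {missing}")
    return mapped
```

For example, `map_record({"name": "Corpus X", "year": 2012}, "community_b")` yields a dictionary with the unified keys `title` and `publication_year`. The semantic part of the mapping (controlled vocabularies, ontologies, exchange with community representatives) is exactly what a table lookup like this cannot capture.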
Athens automation and control experiment project review meeting, Dallas, Texas, December 5-6, 1984

Energy Technology Data Exchange (ETDEWEB)

Detwiler, J.S.; Hu, P.S.; Lawler, J.S.; Markel, L.C.; McIntyre, J.M.; McKinley, K.F.; Monteen, L.D.; Purucker, S.L.; Reed, J.H.; Rizy, D.T.

1985-12-01

The AACE is an electric power distribution automation project involving research and development of both hardware and software. Equipment for the project is being installed on the electric distribution system of the Athens Utilities Board (AUB), located in Athens, Tennessee. Purposes of the AACE are to develop and test load control, volt/var control, and system reconfiguration capabilities on an electric distribution system and to transfer what is learned to the electric utility industry. Expected benefits include deferral of costly power generation plants and increased electric service reliability. A project review meeting was held to review the progress of the AACE and to communicate the objectives and experimental plans to the electric utility industry. At the time of the meeting, the experimental test plans were being written; much of the AACE field equipment had been received by AUB, and installation had begun. A computer system, the AACE Test System (AACETS), was already operational at ORNL. AACETS will be used to develop and test applications software and experimental control strategies prior to their implementation on the AUB system. The AACE experiments are scheduled to begin in October 1985 and to continue through October 1987.
FSA 2003-2004 Digital Orthophoto Metadata

Data.gov (United States)

Minnesota Department of Natural Resources — Metadata for the 2003-2004 FSA Color Orthophotos Layer. Each orthophoto is represented by a Quarter 24k Quad tile polygon. The polygon attributes contain the...
USGS Digital Orthophoto Quad (DOQ) Metadata

Data.gov (United States)

Minnesota Department of Natural Resources — Metadata for the USGS DOQ Orthophoto Layer. Each orthophoto is represented by a Quarter 24k Quad tile polygon. The polygon attributes contain the quarter-quad tile...
AUTOMATION, ITS MEANING FOR EDUCATIONAL ADMINISTRATION.

Science.gov (United States)

National Conference of Professors of Educational Administration.

A REPORT OF THE TENTH ANNUAL MEETING OF THE NATIONAL CONFERENCE OF PROFESSORS OF EDUCATIONAL ADMINISTRATION (NCPEA), WHICH WAS HELD TO DISCUSS AUTOMATION AND ITS IMPLICATIONS FOR THE PREPARATION OF SCHOOL ADMINISTRATORS, IS PRESENTED. THE CONFERENCE WAS UNDERTAKEN BECAUSE THE NCPEA BELIEVED THAT AUTOMATION IS SYMBOLIC OF VAST CHANGES AT WORK IN…
DESIGN AND PRACTICE ON METADATA SERVICE SYSTEM OF SURVEYING AND MAPPING RESULTS BASED ON GEONETWORK

Directory of Open Access Journals (Sweden)

Z. Zha

2012-08-01

Full Text Available Based on analysis of current geographic information sharing and metadata services, we design, develop and deploy a distributed metadata service system based on GeoNetwork covering more than 30 nodes in provincial units of China. By identifying the advantages of GeoNetwork, we design a distributed metadata service system for national surveying and mapping results. It consists of 31 network nodes, a central node and a portal. Network nodes are the direct source of system metadata and are distributed around the country. Each network node maintains a metadata service system, responsible for metadata uploading and management. The central node harvests metadata from network nodes using the OGC CSW 2.0.2 standard interface. The portal shows all metadata in the central node and provides users with a variety of methods and interfaces for metadata search or querying. It also provides management capabilities for connecting the central node and the network nodes together. GeoNetwork also has defects. Accordingly, we improved and optimized large-volume metadata uploading, synchronization and concurrent access. For metadata uploading and synchronization, by carefully analyzing the database and index operation logs, we successfully avoid the performance bottlenecks, and with a batch operation and dynamic memory management solution, data throughput and system performance are significantly improved. For concurrent access, query performance is greatly improved through a request coding and results cache solution. To smoothly respond to huge numbers of concurrent requests, a web cluster solution is deployed. This paper also gives an experimental analysis and compares the system performance before and after improvement and optimization. The design and practical results have been applied in the national metadata service system of surveying and mapping results. It proved that the improved GeoNetwork service architecture can effectively adapt to
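The abstract above does not spell out its "request coding and results cache" solution, so the following is only one plausible reading: equivalent queries are normalized to a single code, and results are cached under that code so that repeated requests skip the backend. The names `request_code` and `CachedSearch` are hypothetical, and nothing here is GeoNetwork API.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def request_code(params: Dict[str, Any]) -> str:
    """Encode a query so that equivalent requests map to the same key."""
    # Normalize key case and surrounding whitespace, then serialize in sorted order.
    normalized = {str(k).strip().lower(): str(v).strip() for k, v in params.items()}
    payload = json.dumps(normalized, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedSearch:
    """Wrap a search backend with an in-memory cache keyed by request code."""

    def __init__(self, backend: Callable[[Dict[str, Any]], Any]) -> None:
        self._backend = backend
        self._cache: Dict[str, Any] = {}
        self.backend_calls = 0

    def query(self, params: Dict[str, Any]) -> Any:
        key = request_code(params)
        if key not in self._cache:
            # Cache miss: run the real search once and remember the result.
            self.backend_calls += 1
            self._cache[key] = self._backend(params)
        return self._cache[key]
```

A production cache would also need eviction and invalidation when metadata is uploaded or synchronized; both are omitted from this sketch.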
Advancements in Large-Scale Data/Metadata Management for Scientific Data.

Science.gov (United States)

Guntupally, K.; Devarakonda, R.; Palanisamy, G.; Frame, M. T.

2017-12-01

Scientific data often comes with complex and diverse metadata which are critical for data discovery and users. The Online Metadata Editor (OME) tool, which was developed by an Oak Ridge National Laboratory team, effectively manages diverse scientific datasets across several federal data centers, such as DOE's Atmospheric Radiation Measurement (ARM) Data Center and USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L) project. This presentation will focus mainly on recent developments and future strategies for refining the OME tool within these centers. The ARM OME is a standards-based tool (https://www.archive.arm.gov/armome) that allows scientists to create and maintain metadata about their data products. The tool has been improved with new workflows that help metadata coordinators and submitting investigators to submit and review their data more efficiently. The ARM Data Center's newly upgraded Data Discovery Tool (http://www.archive.arm.gov/discovery) uses rich metadata generated by the OME to enable search and discovery of thousands of datasets, while also providing a citation generator and modern order-delivery techniques like Globus (using GridFTP), Dropbox and THREDDS. The Data Discovery Tool also supports incremental indexing, which allows users to find new data as and when they are added. The USGS CSAS&L search catalog employs a custom version of the OME (https://www1.usgs.gov/csas/ome), which has been upgraded with high-level Federal Geographic Data Committee (FGDC) validations and the ability to reserve and mint Digital Object Identifiers (DOIs). The USGS's Science Data Catalog (SDC) (https://data.usgs.gov/datacatalog) allows users to discover a myriad of science data holdings through a web portal. Recent major upgrades to the SDC and ARM Data Discovery Tool include improved harvesting performance and migration using new search software, such as Apache Solr 6.0 for serving up data/metadata to scientific communities. Our presentation will highlight

ncISO Facilitating Metadata and Scientific Data Discovery

Science.gov (United States)

Neufeld, D.; Habermann, T.

2011-12-01

Increasing the usability and availability of climate and oceanographic datasets for environmental research requires improved metadata and tools to rapidly locate and access relevant information for an area of interest. Because of the distributed nature of most environmental geospatial data, a common approach is to use catalog services that support queries on metadata harvested from remote map and data services. A key component to effectively using these catalog services is the availability of high quality metadata associated with the underlying data sets. In this presentation, we examine the use of ncISO and Geoportal as open source tools that can be used to document and facilitate access to ocean and climate data available from Thematic Realtime Environmental Distributed Data Services (THREDDS) data services. Many atmospheric and oceanographic spatial data sets are stored in the Network Common Data Form (netCDF) and served through the Unidata THREDDS Data Server (TDS). NetCDF and THREDDS are becoming increasingly accepted in both the scientific and geographic research communities as demonstrated by the recent adoption of netCDF as an Open Geospatial Consortium (OGC) standard. One important source for ocean and atmospheric based data sets is NOAA's Unified Access Framework (UAF) which serves over 3000 gridded data sets from across NOAA and NOAA-affiliated partners. Due to the large number of datasets, browsing the data holdings to locate data is impractical. Working with Unidata, we have created a new service for the TDS called "ncISO", which allows automatic generation of ISO 19115-2 metadata from attributes and variables in TDS datasets. The ncISO metadata records can be harvested by catalog services such as ESSI-labs GI-Cat catalog service, and ESRI's Geoportal which supports query through a number of services, including OpenSearch and Catalog Services for the Web (CSW). ESRI's Geoportal Server provides a number of user friendly search capabilities for end users
Fast processing of digital imaging and communications in medicine (DICOM) metadata using multiseries DICOM format

OpenAIRE

Ismail, Mahmoud; Philbin, James

2015-01-01

The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication system use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes...
Improving Earth Science Metadata: Modernizing ncISO

Science.gov (United States)

O'Brien, K.; Schweitzer, R.; Neufeld, D.; Burger, E. F.; Signell, R. P.; Arms, S. C.; Wilcox, K.

2016-12-01

ncISO is a package of tools developed at NOAA's National Centers for Environmental Information (NCEI) that facilitates the generation of ISO 19115-2 metadata from NetCDF data sources. The tool currently exists in two iterations: a command line utility and a web-accessible service within the THREDDS Data Server (TDS). Several projects, including NOAA's Unified Access Framework (UAF), depend upon ncISO to generate the ISO-compliant metadata from their data holdings and use the resulting information to populate discovery tools such as NCEI's ESRI Geoportal and NOAA's data.noaa.gov CKAN system. In addition to generating ISO 19115-2 metadata, the tool calculates a rubric score based on how well the dataset follows the Attribute Conventions for Dataset Discovery (ACDD). The result of this rubric calculation, along with information about what has been included and what is missing, is displayed in an HTML document generated by the ncISO software package. Recently ncISO has fallen behind in terms of supporting updates to conventions, such as updates to the ACDD. With the blessing of the original programmer, NOAA's UAF has been working to modernize the ncISO software base. In addition to upgrading ncISO to utilize version 1.3 of the ACDD, we have been working with partners at Unidata and IOOS to unify the tool's code base. In essence, we are merging the command line capabilities into the same software that will now be used by the TDS service, allowing easier updates when conventions such as ACDD are updated in the future. In this presentation, we will discuss the work the UAF project has done to support updated conventions within ncISO, as well as describe how the updated tool is helping to improve metadata throughout the earth and ocean sciences.
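The rubric idea in this abstract can be illustrated with a minimal Python sketch. This is not ncISO's scoring algorithm: the function name `acdd_score`, the equal weighting, and the example attribute values are assumptions made for illustration. The four attribute names are the ones ACDD 1.3 lists as highly recommended global attributes.

```python
# Hypothetical sketch of an ACDD-style completeness check; NOT ncISO's rubric.
# The four names below are the "highly recommended" global attributes in ACDD 1.3.
HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def acdd_score(global_attrs):
    """Return (score, missing) for a dict of netCDF global attributes.

    An attribute counts as present only if its value, converted to str and
    stripped of whitespace, is non-empty. Equal weighting is an assumption.
    """
    missing = [
        name
        for name in HIGHLY_RECOMMENDED
        if not str(global_attrs.get(name, "")).strip()
    ]
    present = len(HIGHLY_RECOMMENDED) - len(missing)
    return present / len(HIGHLY_RECOMMENDED), missing


# Invented example attributes: "summary" is empty and "keywords" is absent.
example_attrs = {
    "title": "Sea surface temperature analysis",
    "summary": "",
    "Conventions": "CF-1.6, ACDD-1.3",
}
score, missing = acdd_score(example_attrs)
print(f"score={score:.2f} missing={missing}")
```

Reporting the missing attribute names alongside the score mirrors the abstract's point that the output shows both what has been included and what is missing.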
Automation and Integration in Semiconductor Manufacturing

OpenAIRE

Liao, Da-Yin

2010-01-01

Semiconductor automation originates from the prevention and avoidance of frauds in daily fab operations. As semiconductor technology and business continuously advance and grow, manufacturing systems must aggressively evolve to meet the changing technical and business requirements in this industry. Semiconductor manufacturing has been suffering pains from islands of automation. The problems associated with these systems are limited
File and metadata management for BESIII distributed computing

International Nuclear Information System (INIS)

Nicholson, C; Zheng, Y H; Lin, L; Deng, Z Y; Li, W D; Zhang, X M

2012-01-01

The BESIII experiment at the Institute of High Energy Physics (IHEP), Beijing, uses the high-luminosity BEPCII e+e− collider to study physics in the τ-charm energy region around 3.7 GeV; BEPCII has produced the world's largest samples of J/ψ and ψ′ events to date. An order of magnitude increase in the data sample size over the 2011-2012 data-taking period demanded a move from a very centralized to a distributed computing environment, as well as the development of an efficient file and metadata management system. While BESIII is on a smaller scale than some other HEP experiments, this poses particular challenges for its distributed computing and data management system. These constraints include limited resources and manpower, and low quality of network connections to IHEP. Drawing on the rich experience of the HEP community, a system has been developed which meets these constraints. The design and development of the BESIII distributed data management system, including its integration with other BESIII distributed computing components, such as job management, are presented here.
Meeting report: GSC M5 roundtable at the 13th International Society for Microbial Ecology meeting in Seattle, WA, USA August 22-27, 2010

Science.gov (United States)

Gilbert, Jack A.; Meyer, Folker; Knight, Rob; Field, Dawn; Kyrpides, Nikos; Yilmaz, Pelin; Wooley, John

2010-01-01

This report summarizes the proceedings of the Metagenomics, Metadata, Metaanalysis, Models and Metainfrastructure (M5) Roundtable at the 13th International Society for Microbial Ecology Meeting in Seattle, WA, USA August 22-27, 2010. The Genomic Standards Consortium (GSC) hosted this meeting as a community engagement exercise to describe the GSC to the microbial ecology community during this important international meeting. The roundtable included five talks given by members of the GSC, and was followed by audience participation in the form of a roundtable discussion. This report summarizes this event. Further information on the GSC and its range of activities can be found at http://www.gensc.org. PMID:21304725
Predicting age groups of Twitter users based on language and metadata features.

Directory of Open Access Journals (Sweden)

Antonio A Morgan-Lopez

Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and user handles' metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fitted for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen's d effect sizes were calculated to examine the relative importance of significant features. Models containing both Tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1), while the model containing only Twitter metadata features was least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as "school" for youth and "college" for young adults. Overall, it was more challenging to predict older adults accurately. These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may
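For readers unfamiliar with the metrics reported in this record, the following minimal Python sketch shows how precision, recall, and F1 are computed for one age group treated as the positive class. The labels and predictions are invented toy data, not the study's, and the function name `precision_recall_f1` is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): true and predicted age groups for six handles.
y_true = ["youth", "youth", "adult", "young_adult", "youth", "adult"]
y_pred = ["youth", "adult", "adult", "youth", "youth", "adult"]

p, r, f1 = precision_recall_f1(y_true, y_pred, positive="youth")
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, which is why a model with equal precision and recall, as reported for the best model here, has the same value for all three.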
An Assistant for Loading Learning Object Metadata: An Ontology Based Approach

Science.gov (United States)

Casali, Ana; Deco, Claudia; Romano, Agustín; Tomé, Guillermo

2013-01-01

In recent years, the development of Learning Object repositories has increased. Users can retrieve these resources for reuse and personalization through searches in web repositories. High-quality metadata is key to successful retrieval. Learning Objects are described with metadata usually in the standard…
A metadata schema for data objects in clinical research.

Science.gov (United States)

Canham, Steve; Ohmann, Christian

2016-11-24

A large number of stakeholders have accepted the need for greater transparency in clinical research and, in the context of various initiatives and systems, have developed a diverse and expanding number of repositories for storing the data and documents created by clinical studies (collectively known as data objects). To make the best use of such resources, we assert that it is also necessary for stakeholders to agree and deploy a simple, consistent metadata scheme. The relevant data objects and their likely storage are described, and the requirements for metadata to support data sharing in clinical research are identified. Issues concerning persistent identifiers, for both studies and data objects, are explored. A scheme is proposed that is based on the DataCite standard, with extensions to cover the needs of clinical researchers, specifically to provide (a) study identification data, including links to clinical trial registries; (b) data object characteristics and identifiers; and (c) data covering location, ownership and access to the data object. The components of the metadata scheme are described. The metadata schema is proposed as a natural extension of a widely agreed standard to fill a gap not tackled by other standards related to clinical research (e.g., Clinical Data Interchange Standards Consortium, Biomedical Research Integrated Domain Group). The proposal could be integrated with, but is not dependent on, other moves to better structure data in clinical research.
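The three groups of metadata named in this abstract, (a) study identification, (b) data object characteristics and identifiers, and (c) location, ownership and access, can be pictured as a simple record type. The Python sketch below is only an illustration: every class name, field name, and value is hypothetical and is not taken from the proposed schema or from DataCite.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class StudyLink:
    """(a) Study identification, with an optional trial-registry link."""
    study_id: str
    registry_url: Optional[str] = None


@dataclass
class DataObjectRecord:
    """Hypothetical record grouping the three kinds of metadata."""
    object_id: str                      # (b) identifier of the data object
    object_type: str                    # (b) characteristic, e.g. a protocol
    studies: List[StudyLink] = field(default_factory=list)  # (a)
    location_url: Optional[str] = None  # (c) where the object is stored
    owner: Optional[str] = None         # (c) who owns it
    access_type: Optional[str] = None   # (c) how it can be accessed


# Placeholder values only; none of these identifiers are real.
record = DataObjectRecord(
    object_id="EXAMPLE-OBJECT-0001",
    object_type="study protocol",
    studies=[StudyLink("EXAMPLE-STUDY-0001", "https://registry.example.org/0001")],
    location_url="https://repository.example.org/objects/0001",
    owner="Example Trials Unit",
    access_type="on request",
)
print(asdict(record))
```

Keeping the study link as a separate type reflects the abstract's distinction between identifiers for studies and identifiers for data objects.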
Metadata-Driven SOA-Based Application for Facilitation of Real-Time Data Warehousing

Science.gov (United States)

Pintar, Damir; Vranić, Mihaela; Skočir, Zoran

Service-oriented architecture (SOA) has already been widely recognized as an effective paradigm for achieving integration of diverse information systems. SOA-based applications can cross boundaries of platforms, operating systems and proprietary data standards, commonly through the usage of Web Services technology. On the other hand, metadata is also commonly referred to as a potential integration tool, given that standardized metadata objects can provide useful information about specifics of unknown information systems with which one has an interest in communicating, using an approach commonly called "model-based integration". This paper presents the result of research regarding possible synergy between those two integration facilitators. This is accomplished with a vertical example of a metadata-driven SOA-based business process that provides ETL (Extraction, Transformation and Loading) and metadata services to a data warehousing system in need of real-time ETL support.
Ontology-based Metadata Portal for Unified Semantics

Data.gov (United States)

National Aeronautics and Space Administration — The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS) will extend the prototype Ontology-Driven Interactive Search Environment for Earth Sciences...
Automated measuring systems. Automatisierte Messsysteme

Energy Technology Data Exchange (ETDEWEB)

1985-01-01

Microprocessors have become a regular component of automated measuring systems. Experts offer their experience and basic information in 24 lectures and 10 poster presentations. The focus is on the following: automated measuring, computer and microprocessor use, sensor technology, actuator technology, communication, interfaces, man-system interaction, disturbance tolerance and availability, as well as applications. A discussion meeting is dedicated to the topic area of digital sensor signals, sensor interfaces and sensor buses.
A semantically rich and standardised approach enhancing discovery of sensor data and metadata

Science.gov (United States)

Kokkinaki, Alexandra; Buck, Justin; Darroch, Louise

2016-04-01

The marine environment plays an essential role in the earth's climate. To enhance the ability to monitor the health of this important system, innovative sensors are being produced and combined with state-of-the-art sensor technology. As the number of sensors deployed is continually increasing, it is a challenge for data users to find the data that meet their specific needs. Furthermore, users need to integrate diverse ocean datasets originating from the same or even different systems. Standards provide a solution to the above-mentioned challenges. The Open Geospatial Consortium (OGC) has created Sensor Web Enablement (SWE) standards that enable different sensor networks to establish syntactic interoperability. When combined with widely accepted controlled vocabularies, they become semantically rich and semantic interoperability is achievable. In addition, Linked Data is the recommended best practice for exposing, sharing and connecting information on the Semantic Web using Uniform Resource Identifiers (URIs), Resource Description Framework (RDF) and RDF Query Language (SPARQL). As part of the EU-funded SenseOCEAN project, the British Oceanographic Data Centre (BODC) is working on the standardisation of sensor metadata enabling 'plug and play' sensor integration. Our approach combines standards, controlled vocabularies and persistent URIs to publish sensor descriptions, their data and associated metadata as 5 star Linked Data and OGC SWE (SensorML, Observations & Measurements) standard. Thus sensors become readily discoverable, accessible and useable via the web. Content and context based searching is also enabled since sensor descriptions are understood by machines. Additionally, sensor data can be combined with other sensor or Linked Data datasets to form knowledge. This presentation will describe the work done in BODC to achieve syntactic and semantic interoperability in the sensor domain. It will illustrate the reuse and extension of the Semantic Sensor
Radiological dose and metadata management

International Nuclear Information System (INIS)

Walz, M.; Madsack, B.; Kolodziej, M.

2016-01-01

This article describes the features of management systems currently available in Germany for extraction, registration and evaluation of metadata from radiological examinations, particularly in the digital imaging and communications in medicine (DICOM) environment. In addition, the probable relevant developments in this area concerning radiation protection legislation, terminology, standardization and information technology are presented.
Meta-Data Objects as the Basis for System Evolution

CERN Document Server

Estrella, Florida; Tóth, N; Kovács, Z; Le Goff, J M; Clatchey, Richard Mc; Toth, Norbert; Kovacs, Zsolt; Goff, Jean-Marie Le

2001-01-01

One of the main factors driving object-oriented software development in the Web age is the need for systems to evolve as user requirements change. A crucial factor in the creation of adaptable systems dealing with changing requirements is the suitability of the underlying technology in allowing the evolution of the system. A reflective system utilizes an open architecture where implicit system aspects are reified to become explicit first-class (meta-data) objects. These implicit system aspects are often fundamental structures which are inaccessible and immutable, and their reification as meta-data objects can serve as the basis for changes and extensions to the system, making it self-describing. To address the evolvability issue, this paper proposes a reflective architecture based on two orthogonal abstractions - model abstraction and information abstraction. In this architecture the modeling abstractions allow for the separation of the description meta-data from the system aspects they represent so that th...
Nonanalytic Laboratory Automation: A Quarter Century of Progress.

Science.gov (United States)

Hawker, Charles D

2017-06-01

Clinical laboratory automation has blossomed since the 1989 AACC meeting, at which Dr. Masahide Sasaki first showed a western audience what his laboratory had implemented. Many diagnostics and other vendors are now offering a variety of automated options for laboratories of all sizes. Replacing manual processing and handling procedures with automation was embraced by the laboratory community because of the obvious benefits of labor savings and improvement in turnaround time and quality. Automation was also embraced by the diagnostics vendors who saw automation as a means of incorporating the analyzers purchased by their customers into larger systems in which the benefits of automation were integrated to the analyzers. This report reviews the options that are available to laboratory customers. These options include so-called task-targeted automation modules that range from single function devices that automate single tasks (e.g., decapping or aliquoting) to multifunction workstations that incorporate several of the functions of a laboratory sample processing department. The options also include total laboratory automation systems that use conveyors to link sample processing functions to analyzers and often include postanalytical features such as refrigerated storage and sample retrieval. Most importantly, this report reviews a recommended process for evaluating the need for new automation and for identifying the specific requirements of a laboratory and developing solutions that can meet those requirements. The report also discusses some of the practical considerations facing a laboratory in a new implementation and reviews the concept of machine vision to replace human inspections. © 2017 American Association for Clinical Chemistry.
Batch metadata assignment to archival photograph collections using facial recognition software

Directory of Open Access Journals (Sweden)

Kyle Banerjee

2013-07-01

Useful metadata is essential to giving individual images meaning and value within the context of a greater image collection as well as making them more discoverable. However, often little information is available about the photos themselves, so adding consistent metadata to large collections of digital and digitized photographs is a time-consuming process requiring highly experienced staff. By using facial recognition software, staff can identify individuals more quickly and reliably. Knowledge of individuals in photos helps staff determine when and where photos were taken and also improves understanding of the subject matter. This article demonstrates simple techniques for using facial recognition software and command line tools to assign, modify, and read metadata for large archival photograph collections.
phosphorus retention data and metadata

Science.gov (United States)

Phosphorus retention in wetlands data and metadata. This dataset is associated with the following publication: Lane, C., and B. Autrey. Phosphorus retention of forested and emergent marsh depressional wetlands in differing land uses in Florida, USA. Wetlands Ecology and Management. Springer Science and Business Media B.V; Formerly Kluwer Academic Publishers B.V., GERMANY, 24(1): 45-60, (2016).
NCPP's Use of Standard Metadata to Promote Open and Transparent Climate Modeling

Science.gov (United States)

Treshansky, A.; Barsugli, J. J.; Guentchev, G.; Rood, R. B.; DeLuca, C.

2012-12-01

The National Climate Predictions and Projections (NCPP) Platform is developing comprehensive regional and local information about the evolving climate to inform decision making and adaptation planning. This includes both creating and providing tools to create metadata about the models and processes used to create its derived data products. NCPP is using the Common Information Model (CIM), an ontology developed by a broad set of international partners in climate research, as its metadata language. This use of a standard ensures interoperability within the climate community as well as permitting access to the ecosystem of tools and services emerging alongside the CIM. The CIM itself is divided into a general-purpose (UML & XML) schema which structures metadata documents, and a project or community-specific (XML) Controlled Vocabulary (CV) which constrains the content of metadata documents. NCPP has already modified the CIM Schema to accommodate downscaling models, simulations, and experiments. NCPP is currently developing a CV for use by the downscaling community. Incorporating downscaling into the CIM will lead to several benefits: easy access to the existing CIM Documents describing CMIP5 models and simulations that are being downscaled, access to software tools that have been developed in order to search, manipulate, and visualize CIM metadata, and coordination with national and international efforts such as ES-DOC that are working to make climate model descriptions and datasets interoperable. Providing detailed metadata descriptions which include the full provenance of derived data products will contribute to making that data (and the models and processes which generated that data) more open and transparent to the user community.

OntoCheck: verifying ontology naming conventions and metadata completeness in Protégé 4.

Science.gov (United States)

Schober, Daniel; Tudose, Ilinca; Svatek, Vojtech; Boeker, Martin

2012-09-21

Although policy providers have outlined minimal metadata guidelines and naming conventions, ontologies of today still display inter- and intra-ontology heterogeneities in class labelling schemes and metadata completeness. This fact is at least partially due to missing or inappropriate tools. Software support can ease this situation and contribute to overall ontology consistency and quality by helping to enforce such conventions. We provide a plugin for the Protégé Ontology editor to allow for easy checks on compliance towards ontology naming conventions and metadata completeness, as well as curation in case of found violations. In a requirement analysis, derived from a prior standardization approach carried out within the OBO Foundry, we investigate the needed capabilities for software tools to check, curate and maintain class naming conventions. A Protégé tab plugin was implemented accordingly using the Protégé 4.1 libraries. The plugin was tested on six different ontologies. Based on these test results, the plugin could be refined, also by the integration of new functionalities. The new Protégé plugin, OntoCheck, allows for ontology tests to be carried out on OWL ontologies. In particular the OntoCheck plugin helps to clean up an ontology with regard to lexical heterogeneity, i.e. enforcing naming conventions and metadata completeness, meeting most of the requirements outlined for such a tool. Found test violations can be corrected to foster consistency in entity naming and meta-annotation within an artefact. Once specified, check constraints like name patterns can be stored and exchanged for later re-use. Here we describe a first version of the software, illustrate its capabilities and use within running ontology development efforts and briefly outline improvements resulting from its application. Further, we discuss OntoCheck's capabilities in the context of related tools and highlight potential future expansions. The OntoCheck plugin facilitates
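The kind of name-pattern check this abstract describes can be sketched in a few lines of Python. This is not OntoCheck's implementation, which is a Protégé plugin: the function name `find_label_violations`, the example convention (lowercase words separated by single spaces), and the sample labels are all assumptions chosen for illustration.

```python
import re


def find_label_violations(labels, pattern):
    """Return the labels that do not fully match the naming pattern."""
    regex = re.compile(pattern)
    return [label for label in labels if regex.fullmatch(label) is None]


# Assumed convention: lowercase words separated by exactly one space.
LOWERCASE_WORDS = r"[a-z]+( [a-z]+)*"

# Invented class labels; two of them break the assumed convention.
labels = ["cell membrane", "Cell_Nucleus", "mitochondrion", "plasma  membrane"]
violations = find_label_violations(labels, LOWERCASE_WORDS)
print(violations)
```

Keeping the pattern as plain data, separate from the checking function, corresponds to the abstract's remark that check constraints such as name patterns can be stored and exchanged for later re-use.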
Data catalog project—A browsable, searchable, metadata system

International Nuclear Information System (INIS)

Stillerman, Joshua; Fredian, Thomas; Greenwald, Martin; Manduchi, Gabriele

2016-01-01

Modern experiments are typically conducted by large, extended groups, where researchers rely on other team members to produce much of the data they use. The experiments record very large numbers of measurements that can be difficult for users to find, access and understand. We are developing a system for users to annotate their data products with structured metadata, providing data consumers with a discoverable, browsable data index. Machine understandable metadata captures the underlying semantics of the recorded data, which can then be consumed by both programs, and interactively by users. Collaborators can use these metadata to select and understand recorded measurements. The data catalog project is a data dictionary and index which enables users to record general descriptive metadata, use cases and rendering information as well as providing them a transparent data access mechanism (URI). Users describe their diagnostic including references, text descriptions, units, labels, example data instances, author contact information and data access URIs. The list of possible attribute labels is extensible, but limiting the vocabulary of names increases the utility of the system. The data catalog is focused on the data products and complements process-based systems like the Metadata Ontology Provenance project [Greenwald, 2012; Schissel, 2015]. This system can be coupled with MDSplus to provide a simple platform for data driven display and analysis programs. Sites which use MDSplus can describe tree branches, and if desired create ‘processed data trees’ with homogeneous node structures for measurements. Sites not currently using MDSplus can either use the database to reference local data stores, or construct an MDSplus tree whose leaves reference the local data store. A data catalog system can provide a useful roadmap of data acquired from experiments or simulations making it easier for researchers to find and access important data and understand the meaning of the
Title, Description, and Subject are the Most Important Metadata Fields for Keyword Discoverability

Directory of Open Access Journals (Sweden)

Laura Costello

2016-09-01

A Review of: Yang, L. (2016). Metadata effectiveness in internet discovery: An analysis of digital collection metadata elements and internet search engine keywords. College & Research Libraries, 77(1), 7-19. http://doi.org/10.5860/crl.77.1.7 Objective – To determine which metadata elements best facilitate discovery of digital collections. Design – Case study. Setting – A public research university serving over 32,000 graduate and undergraduate students in the Southwestern United States of America. Subjects – A sample of 22,559 keyword searches leading to the institution’s digital repository between August 1, 2013, and July 31, 2014. Methods – The author used Google Analytics to analyze 73,341 visits to the institution’s digital repository. He determined that 22,559 of these visits were due to keyword searches. Using Random Integer Generator, the author identified a random sample of 378 keyword searches. The author then matched the keywords with the Dublin Core and VRA Core metadata elements on the landing page in the digital repository to determine which metadata field had drawn the keyword searcher to that particular page. Many of these keywords matched to more than one metadata field, so the author also analyzed the metadata elements that generated unique keyword hits and those fields that were frequently matched together. Main Results – Title was the most matched metadata field with 279 matched keywords from searches. Description and Subject were also significant fields with 208 and 79 matches respectively. Slightly more than half of the results, 195 keywords, matched the institutional repository in one field only. Both Title and Description had significant match rates both independently and in conjunction with other elements, but Subject keywords were the sole match in only three of the sampled cases. Conclusion – The Dublin Core elements of Title, Description, and Subject were the most frequently matched fields in keyword
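The counts in this record are easier to compare as shares of the 378 sampled searches. The Python sketch below computes those shares and also checks the sample size against the standard finite-population formula at 95% confidence and a 5% margin of error. The review does not say how the figure of 378 was chosen, so that second part is only an observation that the numbers are consistent; the function name and default parameters are assumptions.

```python
import math


def finite_population_sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for a proportion, with finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


SAMPLE = 378
# Match counts per metadata field, as reported in the record above.
matches = {"Title": 279, "Description": 208, "Subject": 79}
shares = {name: round(100 * count / SAMPLE, 1) for name, count in matches.items()}

n = finite_population_sample_size(22559)
print(n, shares)
```

Because one keyword can match several fields, the shares are not expected to sum to 100%.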
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and ARM

Science.gov (United States)

Crow, M. C.; Devarakonda, R.; Killeffer, T.; Hook, L.; Boden, T.; Wullschleger, S.

2017-12-01

Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This poster describes tools being used in several projects at Oak Ridge National Laboratory (ORNL), with a focus on the U.S. Department of Energy's Next Generation Ecosystem Experiment in the Arctic (NGEE Arctic) and Atmospheric Radiation Measurements (ARM) project, and their usage at different stages of the data lifecycle. The Online Metadata Editor (OME) is used for the documentation and archival stages while a Data Search tool supports indexing, cataloging, and searching. The NGEE Arctic OME Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload while adhering to standard metadata formats. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The Data Search Tool conveniently displays each data record in a thumbnail containing the title, source, and date range, and features a quick view of the metadata associated with that record, as well as a direct link to the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for geo-searching. These tools are supported by the Mercury [2] consortium (funded by DOE, NASA, USGS, and ARM) and developed and managed at Oak Ridge National Laboratory. Mercury is a set of tools for collecting, searching, and retrieving metadata and data. Mercury collects metadata from contributing project servers, then indexes the metadata to make it searchable using Apache Solr, and provides access to retrieve it from the web page. Metadata standards that Mercury supports include: XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115.
System for Earth Sample Registration SESAR: Services for IGSN Registration and Sample Metadata Management

Science.gov (United States)

Chan, S.; Lehnert, K. A.; Coleman, R. J.

2011-12-01

SESAR, the System for Earth Sample Registration, is an online registry for physical samples collected for Earth and environmental studies. SESAR generates and administers the International Geo Sample Number (IGSN), a unique identifier for samples that is dramatically advancing interoperability amongst information systems for sample-based data. SESAR was developed to provide the complete range of registry services, including definition of IGSN syntax and metadata profiles, registration and validation of name spaces requested by users, tools for users to submit and manage sample metadata, validation of submitted metadata, generation and validation of the unique identifiers, archiving of sample metadata, and public or private access to the sample metadata catalog. With the development of SESAR v3, we placed particular emphasis on creating enhanced tools that make metadata submission easier and more efficient for users, and that provide superior functionality for users to manage metadata of their samples in their private workspace, MySESAR. For example, SESAR v3 includes a module where users can generate custom spreadsheet templates to enter metadata for their samples, then upload these templates online for sample registration. Once the content of the template is uploaded, it is displayed online in an editable grid format. Validation rules are executed in real time on the grid data to ensure data integrity. Other new features of SESAR v3 include the capability to transfer ownership of samples to other SESAR users, the ability to upload and store images and other files in a sample metadata profile, and the tracking of changes to sample metadata profiles. In the next version of SESAR (v3.5), we will further improve the discovery, sharing, and registration of samples. For example, we are developing a more comprehensive suite of web services that will allow discovery and registration access to SESAR from external systems. Both batch and individual registrations will be possible
Leveraging Metadata to Create Interactive Images... Today!

Science.gov (United States)

Hurt, Robert L.; Squires, G. K.; Llamas, J.; Rosenthal, C.; Brinkworth, C.; Fay, J.

2011-01-01

The image gallery for NASA's Spitzer Space Telescope has been newly rebuilt to fully support the Astronomy Visualization Metadata (AVM) standard to create a new user experience both on the website and in other applications. We encapsulate all the key descriptive information for a public image, including color representations and astronomical and sky coordinates and make it accessible in a user-friendly form on the website, but also embed the same metadata within the image files themselves. Thus, images downloaded from the site will carry with them all their descriptive information. Real-world benefits include display of general metadata when such images are imported into image editing software (e.g. Photoshop) or image catalog software (e.g. iPhoto). More advanced support in Microsoft's WorldWide Telescope can open a tagged image after it has been downloaded and display it in its correct sky position, allowing comparison with observations from other observatories. An increasing number of software developers are implementing AVM support in applications and an online image archive for tagged images is under development at the Spitzer Science Center. Tagging images following the AVM offers ever-increasing benefits to public-friendly imagery in all its standard forms (JPEG, TIFF, PNG). The AVM standard is one part of the Virtual Astronomy Multimedia Project (VAMP); http://www.communicatingastronomy.org
Comparative Study of Metadata Elements Used in the Website of Central Library of Universities Subordinate to the Ministry of Science, Research and Technology with the Dublin Core Metadata Elements

Directory of Open Access Journals (Sweden)

Kobra Babaei

2012-03-01

Full Text Available This research studied the use of metadata elements on the websites of the central libraries of universities subordinate to the Ministry of Science, Research and Technology and compared them with the Dublin Core standard elements. The study was a comparative survey in which 40 academic library websites were examined using the Internet Explorer browser. The HTML pages of these websites were viewed through the View Source menu, and the metadata elements of each website were extracted and entered in a checklist. The data were then analysed using descriptive statistics (frequency, percentage and mean). The findings showed that the reviewed websites did not use any Dublin Core metadata elements, and that general markup-language metadata was used in the design of all websites. In terms of the share of metadata elements used, the central libraries of Ferdowsi University of Mashhad and Iran Science and Industries ranked first with 57%, Shahid Beheshti University ranked second with 49%, and the International University of Imam Khomeini ranked third with 40%. The priorities of the web designers were also determined: the content of the source ranked first, the physical appearance of the source second, and ownership of the source third.
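The manual procedure described above (view the page source, list the metadata elements, compare them with Dublin Core) could be approximated programmatically. The following is a minimal sketch using only the Python standard library. It assumes that Dublin Core elements appear as `<meta>` tags whose `name` starts with `DC.`, which is a common convention rather than something the abstract states, and the sample page is invented.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect name/content pairs from the <meta> tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.elements[name.lower()] = attributes.get("content") or ""

def split_meta(html):
    """Return (dublin_core, general) dictionaries of the metadata elements found."""
    collector = MetaCollector()
    collector.feed(html)
    dublin_core = {k: v for k, v in collector.elements.items() if k.startswith("dc.")}
    general = {k: v for k, v in collector.elements.items() if not k.startswith("dc.")}
    return dublin_core, general

# Invented example page, not taken from any of the surveyed websites.
SAMPLE_PAGE = """<html><head>
<meta name="keywords" content="library, university">
<meta name="description" content="Central library home page">
<meta name="DC.title" content="Central Library">
</head><body></body></html>"""

dublin_core, general = split_meta(SAMPLE_PAGE)
print(sorted(general))
print(sorted(dublin_core))
```

Running the same function over each surveyed page would yield the per-site counts that a checklist like the one in the study records by hand.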
Dealing with metadata quality: the legacy of digital library efforts

OpenAIRE

Tani, Alice; Candela, Leonardo; Castelli, Donatella

2013-01-01

In this work, we elaborate on the meaning of metadata quality by surveying efforts and experiences matured in the digital library domain. In particular, an overview of the frameworks developed to characterize such a multi-faceted concept is presented. Moreover, the most common quality-related problems affecting metadata both during the creation and the aggregation phase are discussed together with the approaches, technologies and tools developed to mitigate them. This survey on digital librar...
Making Information Visible, Accessible, and Understandable: Meta-Data and Registries

Science.gov (United States)

2007-07-01

the data created, the length of play time, album name, and the genre. Without resource metadata, portable digital music players would not be so...notion of a catalog card in a library. An example of metadata is the description of a music file specifying the creator, the artist that performed the song...describe structure and formatting which are critical to interoperability and the management of databases. Going back to the portable music player example
Inselect: Automating the Digitization of Natural History Collections.

Directory of Open Access Journals (Sweden)

Lawrence N Hudson

Full Text Available The world's natural history collections constitute an enormous evidence base for scientific research on the natural world. To facilitate these studies and improve access to collections, many organisations are embarking on major programmes of digitization. This requires automated approaches to mass-digitization that support rapid imaging of specimens and associated data capture, in order to process the tens of millions of specimens common to most natural history collections. In this paper we present Inselect, a modular, easy-to-use, cross-platform suite of open-source software tools that supports the semi-automated processing of specimen images generated by natural history digitization programmes. The software is made up of a Windows, Mac OS X, and Linux desktop application, together with command-line tools that are designed for unattended operation on batches of images. Blending image visualisation algorithms that automatically recognise specimens together with workflows to support post-processing tasks such as barcode reading, label transcription and metadata capture, Inselect fills a critical gap to increase the rate of specimen digitization.
A Metadata Standard for Hydroinformatic Data Conforming to International Standards

Science.gov (United States)

Notay, Vikram; Carstens, Georg; Lehfeldt, Rainer

2017-04-01

The affordable availability of computing power and digital storage has been a boon for the scientific community. The hydroinformatics community has also benefitted from the so-called digital revolution, which has enabled the tackling of more and more complex physical phenomena using hydroinformatic models, instruments, sensors, etc. With models getting more and more complex, computational domains getting larger and the resolution of computational grids and measurement data getting finer, a large amount of data is generated and consumed in any hydroinformatics related project. The ubiquitous availability of internet also contributes to this phenomenon with data being collected through sensor networks connected to telecommunications networks and the internet long before the term Internet of Things existed. Although generally good, this exponential increase in the number of available datasets gives rise to the need to describe this data in a standardised way to not only be able to get a quick overview about the data but to also facilitate interoperability of data from different sources. The Federal Waterways Engineering and Research Institute (BAW) is a federal authority of the German Federal Ministry of Transport and Digital Infrastructure. BAW acts as a consultant for the safe and efficient operation of the German waterways. As part of its consultation role, BAW operates a number of physical and numerical models for sections of inland and marine waterways. In order to uniformly describe the data produced and consumed by these models throughout BAW and to ensure interoperability with other federal and state institutes on the one hand and with EU countries on the other, a metadata profile for hydroinformatic data has been developed at BAW. The metadata profile is composed in its entirety using the ISO 19115 international standard for metadata related to geographic information. Due to the widespread use of the ISO 19115 standard in the existing geodata infrastructure
Seeing the Wood for the Trees: Enhancing Metadata Subject Elements with Weights

Directory of Open Access Journals (Sweden)

Hong Zhang

2011-06-01

Full Text Available Subject indexing has been conducted in a dichotomous way in terms of what the information object is primarily about/of or not, corresponding to the presence or absence of a particular subject term, respectively. With more subject terms brought into information systems via social tagging, manual cataloging, or automated indexing, many more partially relevant results can be retrieved. Using examples from digital image collections and online library catalog systems, we explore the problem and advocate for adding a weighting mechanism to subject indexing and tagging to make web search and navigation more effective and efficient. We argue that the weighting of subject terms is more important than ever in today’s world of growing collections, more federated searching, and expansion of social tagging. Such a weighting mechanism needs to be considered and applied not only by indexers, catalogers, and taggers, but also needs to be incorporated into system functionality and metadata schemas.
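To make the contrast between dichotomous and weighted subject indexing concrete, here is a small hedged sketch. The record structure, the 0 to 1 weight scale, and the sample records are assumptions made for illustration; the abstract does not prescribe a particular weighting scheme.

```python
def rank_by_subject(records, term):
    """Return (title, weight) pairs for records tagged with the term, heaviest first.

    With dichotomous indexing every tagged record would be an equal hit;
    the weight lets records that are primarily about the term rank higher.
    """
    hits = [
        (record["title"], record["subjects"][term])
        for record in records
        if term in record["subjects"]
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical image records; each weight says how central the subject is.
RECORDS = [
    {"title": "Portrait of a sailor", "subjects": {"sailors": 0.9, "ships": 0.2}},
    {"title": "Harbour at dawn", "subjects": {"ships": 0.8, "harbours": 0.7}},
    {"title": "Street market", "subjects": {"markets": 1.0}},
]

ranked = rank_by_subject(RECORDS, "ships")
print(ranked)
```

Both images tagged "ships" are retrieved, but the one in which ships are the main subject is listed first instead of being mixed with partially relevant results.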
Structural Metadata Research in the Ears Program

National Research Council Canada - National Science Library

Liu, Yang; Shriberg, Elizabeth; Stolcke, Andreas; Peskin, Barbara; Ang, Jeremy; Hillard, Dustin; Ostendorf, Mari; Tomalin, Marcus; Woodland, Phil; Harper, Mary

2005-01-01

Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS rich transcription program...
Audit of the Reporting Requirements for Major Automated Information System Programs

National Research Council Canada - National Science Library

2000-01-01

.... There are 71 Major Automated Information System programs with total program costs of $26 billion. To qualify as a Major Automated Information System, the program must meet the following criteria...
Embedding Metadata and Other Semantics in Word Processing Documents

Directory of Open Access Journals (Sweden)

Peter Sefton

2009-10-01

Full Text Available This paper describes a technique for embedding document metadata, and potentially other semantic references, inline in word processing documents, which the authors have implemented with the help of a software development team. Several assumptions underlie the approach: it must be available across computing platforms and work with both Microsoft Word (because of its user base) and OpenOffice.org (because of its free availability). Further, the application needs to be acceptable to and usable by users, so the initial implementation covers only a small number of features, which will only be extended after user testing. Within these constraints the system provides a mechanism for encoding not only simple metadata, but for inferring hierarchical relationships between metadata elements from a 'flat' word processing file. The paper includes links to open source code implementing the techniques as part of a broader suite of tools for academic writing. This addresses tools and software, semantic web and data curation, integrating curation into research workflows and will provide a platform for integrating work on ontologies, vocabularies and folksonomies into word processing tools.
Virtual Environments for Visualizing Structural Health Monitoring Sensor Networks, Data, and Metadata.

Science.gov (United States)

Napolitano, Rebecca; Blyth, Anna; Glisic, Branko

2018-01-16

Visualization of sensor networks, data, and metadata is becoming one of the most pivotal aspects of the structural health monitoring (SHM) process. Without the ability to communicate efficiently and effectively between disparate groups working on a project, an SHM system can be underused, misunderstood, or even abandoned. For this reason, this work seeks to evaluate visualization techniques in the field, identify flaws in current practices, and devise a new method for visualizing and accessing SHM data and metadata in 3D. More precisely, the work presented here reflects a method and digital workflow for integrating SHM sensor networks, data, and metadata into a virtual reality environment by combining spherical imaging and informational modeling. Both intuitive and interactive, this method fosters communication on a project enabling diverse practitioners of SHM to efficiently consult and use the sensor networks, data, and metadata. The method is presented through its implementation on a case study, Streicker Bridge at Princeton University campus. To illustrate the efficiency of the new method, the time and data file size were compared to other potential methods used for visualizing and accessing SHM sensor networks, data, and metadata in 3D. Additionally, feedback from civil engineering students familiar with SHM is used for validation. Recommendations on how different groups working together on an SHM project can create SHM virtual environment and convey data to proper audiences, are also included.
Testing Metadata Existence of Web Map Services

Directory of Open Access Journals (Sweden)

Jan Růžička

2011-05-01

Full Text Available It is quite common for general users to use data sources available on the WWW. Almost all GIS software can use data sources available via the Web Map Service (an ISO/OGC standard interface). The opportunity to use and combine different sources brings many problems that have been discussed many times at conferences and in journal papers. One of these problems is the non-existence of metadata for published sources. The question was: were the discussions effective? The article is partly based on a comparison of the metadata situation between 2007 and 2010. The second part of the article focuses only on the situation in 2010. The paper was written in the context of research on intelligent map systems that can be used for automatic or semi-automatic map creation or map evaluation.
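One way such a metadata-existence test could be automated is to inspect the capabilities document a Web Map Service returns and check whether each named layer carries a `MetadataURL` element. The sketch below is an assumption about how this might be done, not the article's actual procedure. It targets the WMS 1.3.0 XML namespace and uses an invented capabilities fragment instead of calling a live server.

```python
import xml.etree.ElementTree as ET

WMS_NS = {"wms": "http://www.opengis.net/wms"}  # WMS 1.3.0 namespace

# Invented, heavily shortened capabilities document for illustration only.
SAMPLE_CAPABILITIES = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer>
        <Name>roads</Name>
        <MetadataURL type="ISO19115:2003">
          <Format>text/xml</Format>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                          xlink:href="http://example.org/metadata/roads.xml"/>
        </MetadataURL>
      </Layer>
      <Layer>
        <Name>rivers</Name>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""

def layers_metadata_status(capabilities_xml):
    """Map each named layer to True if it has its own MetadataURL child, else False."""
    root = ET.fromstring(capabilities_xml)
    status = {}
    for layer in root.iter("{http://www.opengis.net/wms}Layer"):
        name = layer.findtext("wms:Name", namespaces=WMS_NS)
        if name is None:
            continue  # layers without a Name are grouping containers
        status[name] = layer.find("wms:MetadataURL", WMS_NS) is not None
    return status

status = layers_metadata_status(SAMPLE_CAPABILITIES)
print(status)
```

A fuller check would also follow each `OnlineResource` link to confirm that the referenced metadata record actually exists.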
Generic Automated Multi-function Finger Design

Science.gov (United States)

Honarpardaz, M.; Tarkian, M.; Sirkett, D.; Ölvander, J.; Feng, X.; Elf, J.; Sjögren, R.

2016-11-01

Multi-function fingers that are able to handle multiple workpieces are crucial in improvement of a robot workcell. Design automation of multi-function fingers is highly demanded by robot industries to overcome the current iterative, time consuming and complex manual design process. However, the existing approaches for the multi-function finger design automation are unable to entirely meet the robot industries’ need. This paper proposes a generic approach for design automation of multi-function fingers. The proposed approach completely automates the design process and requires no expert skill. In addition, this approach executes the design process much faster than the current manual process. To validate the approach, multi-function fingers are successfully designed for two case studies. Further, the results are discussed and benchmarked with existing approaches.
How to assess sustainability in automated manufacturing

DEFF Research Database (Denmark)

Dijkman, Teunis Johannes; Rödger, Jan-Markus; Bey, Niki

2015-01-01

The aim of this paper is to describe how sustainability in automation can be assessed. The assessment method is illustrated using a case study of a robot. Three aspects of sustainability assessment in automation are identified. Firstly, we consider automation as part of a larger system...... that fulfills the market demand for a given functionality. Secondly, three aspects of sustainability have to be assessed: environment, economy, and society. Thirdly, automation is part of a system with many levels, with different actors on each level, resulting in meeting the market demand. In this system......, (sustainability) specifications move top-down, which helps avoiding sub-optimization and problem shifting. From these three aspects, sustainable automation is defined as automation that contributes to products that fulfill a market demand in a more sustainable way. The case study presents the carbon footprints...

Toward automated interpretation of integrated information: Managing "big data" for NDE

Science.gov (United States)

Gregory, Elizabeth; Lesthaeghe, Tyler; Holland, Stephen

2015-03-01

Large scale automation of NDE processes is rapidly maturing, thanks to recent improvements in robotics and the rapid growth of computer power over the last twenty years. It is fairly straightforward to automate NDE data collection itself, but the process of NDE remains largely manual. We will discuss three threads of technological needs that must be addressed before we are able to perform automated NDE. Spatial context, the first thread, means that each NDE measurement taken is accompanied by metadata that locates the measurement with respect to the 3D physical geometry of the specimen. In this way, the geometry of the specimen acts as a database key. Data context, the second thread, means that we record why the data was taken and how it was measured in addition to the NDE data itself. We will present our software tool that helps users interact with data in context, Databrowse. Condition estimation, the third thread, is maintaining the best possible knowledge of the condition (serviceability, degradation, etc.) of an object or part. In the NDE context, we can prospectively use Bayes' Theorem to integrate the data from each new NDE measurement with prior knowledge. These tools, combined with robotic measurements and automated defect analysis, will provide the information needed to make high-level life predictions and focus NDE measurements where they are needed most.
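The Bayesian condition-estimation idea can be sketched for the simplest case: a single binary state (flaw present or not) updated after each inspection. The detection and false-call probabilities and the prior below are invented numbers, and the authors' actual model is not described in the abstract.

```python
def update_flaw_probability(prior, p_detect, p_false_call, indication):
    """Apply Bayes' theorem to one inspection result.

    prior        -- probability of a flaw before this measurement
    p_detect     -- P(indication | flaw), the probability of detection
    p_false_call -- P(indication | no flaw)
    indication   -- True if the measurement reported an indication
    """
    if indication:
        like_flaw, like_clean = p_detect, p_false_call
    else:
        like_flaw, like_clean = 1.0 - p_detect, 1.0 - p_false_call
    evidence = like_flaw * prior + like_clean * (1.0 - prior)
    return like_flaw * prior / evidence

# Hypothetical sequence of three inspections at the same location.
belief = 0.05
for indication in (True, True, False):
    belief = update_flaw_probability(belief, p_detect=0.9, p_false_call=0.1,
                                     indication=indication)
    print(round(belief, 3))
```

Each posterior becomes the prior for the next measurement, which is how prior knowledge and new NDE data are combined over time.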
ORGANIZATION OF DIGITAL RESOURCES IN REPEC THROUGH REDIF METADATA

Directory of Open Access Journals (Sweden)

Salvador Enrique Vazquez Moctezuma

2018-06-01

Full Text Available Introduction: The disciplinary repository RePEc (Research Papers in Economics) provides access to a wide range of preprints, journal articles, books, book chapters and software in the economic and administrative sciences. The repository aggregates bibliographic records produced by universities, institutes, editors and authors who work collaboratively following the norms of documentary organization. Objective: In this paper we mainly identify and analyze how RePEc functions, including the organization of its files, which is characterized by the use of the Guildford protocol and of ReDIF (Research Documentation Information Format) metadata, the repository's own templates for documentary description. Methodology: Part of this research was a theoretical study of the literature; another part was carried out by observing a series of features visible on the RePEc website and in the archives of a journal that collaborates in this repository. Results: The repository is a decentralized collaborative project and it also provides several services derived from the metadata analysis. Conclusions: We conclude that the ReDIF templates and the Guildford communication protocol are key elements for organizing records in RePEc, and there is a similarity with the Dublin Core metadata
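ReDIF templates are plain-text records made of `Field-Name: value` lines, which makes the similarity with Dublin Core easy to illustrate. The sketch below parses one such template and maps a few fields to Dublin Core element names. The sample record is invented, and the field mapping is an illustrative assumption, not an official crosswalk.

```python
def parse_redif(text):
    """Parse one ReDIF template into a list of (field, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and fields:
            # An indented line continues the value of the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return fields

# Assumed, illustrative mapping from ReDIF fields to Dublin Core elements.
REDIF_TO_DC = {
    "Title": "title",
    "Author-Name": "creator",
    "Abstract": "description",
    "Handle": "identifier",
}

def to_dublin_core(fields):
    """Group the mapped ReDIF values under Dublin Core element names."""
    record = {}
    for name, value in fields:
        element = REDIF_TO_DC.get(name)
        if element:
            record.setdefault(element, []).append(value)
    return record

# Invented template; the handle does not refer to a real RePEc archive.
SAMPLE_TEMPLATE = """Template-Type: ReDIF-Article 1.0
Author-Name: Jane Doe
Author-Name: John Roe
Title: An example article
Abstract: A short abstract that
  continues on a second line.
Handle: RePEc:aaa:journl:v:1:y:2018:i:1:p:1-10
"""

dc_record = to_dublin_core(parse_redif(SAMPLE_TEMPLATE))
print(dc_record)
```

Splitting on the first colon only keeps the colon-separated handle intact as a single value.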
Development of an open metadata schema for prospective clinical research (openPCR) in China.

Science.gov (United States)

Xu, W; Guan, Z; Sun, J; Wang, Z; Geng, Y

2014-01-01

In China, deployment of electronic data capture (EDC) and clinical data management system (CDMS) for clinical research (CR) is in its very early stage, and about 90% of clinical studies collected and submitted clinical data manually. This work aims to build an open metadata schema for Prospective Clinical Research (openPCR) in China based on openEHR archetypes, in order to help Chinese researchers easily create specific data entry templates for registration, study design and clinical data collection. Singapore Framework for Dublin Core Application Profiles (DCAP) is used to develop openPCR and four steps such as defining the core functional requirements and deducing the core metadata items, developing archetype models, defining metadata terms and creating archetype records, and finally developing implementation syntax are followed. The core functional requirements are divided into three categories: requirements for research registration, requirements for trial design, and requirements for case report form (CRF). 74 metadata items are identified and their Chinese authority names are created. The minimum metadata set of openPCR includes 3 documents, 6 sections, 26 top level data groups, 32 lower data groups and 74 data elements. The top level container in openPCR is composed of public document, internal document and clinical document archetypes. A hierarchical structure of openPCR is established according to Data Structure of Electronic Health Record Architecture and Data Standard of China (Chinese EHR Standard). Metadata attributes are grouped into six parts: identification, definition, representation, relation, usage guides, and administration. OpenPCR is an open metadata schema based on research registration standards, standards of the Clinical Data Interchange Standards Consortium (CDISC) and Chinese healthcare related standards, and is to be publicly available throughout China. It considers future integration of EHR and CR by adopting data structure and data
The Benefits and Future of Standards: Metadata and Beyond

Science.gov (United States)

Stracke, Christian M.

This article discusses the benefits and future of standards and presents the generic multi-dimensional Reference Model. First the importance and the tasks of interoperability as well as quality development and their relationship are analyzed. Especially in e-Learning their connection and interdependence is evident: Interoperability is one basic requirement for quality development. In this paper, it is shown how standards and specifications are supporting these crucial issues. The upcoming ISO metadata standard MLR (Metadata for Learning Resource) will be introduced and used as example for identifying the requirements and needs for future standardization. In conclusion a vision of the challenges and potentials for e-Learning standardization is outlined.
Crowd-sourced BMS point matching and metadata maintenance with Babel

DEFF Research Database (Denmark)

Fürst, Jonathan; Chen, Kaifei; Katz, Randy H.

2016-01-01

Cyber-physical applications, deployed on top of Building Management Systems (BMS), promise energy saving and comfort improvement in non-residential buildings. Such applications are so far mainly deployed as research prototypes. The main roadblock to widespread adoption is the low quality of BMS...... systems. Such applications access sensors and actuators through BMS metadata in form of point labels. The naming of labels is however often inconsistent and incomplete. To tackle this problem, we introduce Babel, a crowd-sourced approach to the creation and maintenance of BMS metadata. In our system...
Automated Resource Classifier for agglomerative functional ...

Indian Academy of Sciences (India)

2007-06-16

Jun 16, 2007 ... Automated resource; functional classification; integrative biology ... which is an open source software meeting the user requirements of flexibility. ... entries into any of the 7 basic non-overlapping functional classes: Cell wall, ...
Deploying the ATLAS Metadata Interface (AMI) on the cloud with Jenkins

Science.gov (United States)

Lambert, F.; Odier, J.; Fulachier, J.; ATLAS Collaboration

2017-10-01

The ATLAS Metadata Interface (AMI) is a mature application of more than 15 years of existence. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. AMI is used by the ATLAS production system, therefore the service must guarantee a high level of availability. We describe our monitoring and administration systems, and the Jenkins-based strategy used to dynamically test and deploy cloud OpenStack nodes on demand.
Deploying the ATLAS Metadata Interface (AMI) on the cloud with Jenkins.

CERN Document Server

Lambert, F.; The ATLAS collaboration; Odier, Jerome; Fulachier, Jerome

2017-01-01

The ATLAS Metadata Interface (AMI) is a mature application of more than 15 years of existence. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. AMI is used by the ATLAS production system, therefore the service must guarantee a high level of availability. We describe our monitoring and administration systems, and the Jenkins-based strategy used to dynamically test and deploy cloud OpenStack nodes on demand.
Inconsistencies between Academic E-Book Platforms: A Comparison of Metadata and Search Results

Science.gov (United States)

Wiersma, Gabrielle; Tovstiadi, Esta

2017-01-01

This article presents the results of a study of academic e-books that compared the metadata and search results from major academic e-book platforms. The authors collected data and performed a series of test searches designed to produce the same result regardless of platform. Testing, however, revealed metadata-related errors and significant…
Metadata System Design for a Data Warehouse: A Case Study of Revenue Tracking at PT. Telkom Divre V Jawa Timur

Directory of Open Access Journals (Sweden)

Yudhi Purwananto

2004-07-01

Full Text Available A data warehouse is a store of enterprise data drawn from various systems that can be used for purposes such as analysis and reporting. PT Telkom Divre V Jawa Timur has built a data warehouse called the Regional Database. The Regional Database requires an essential data warehouse component: metadata. Put simply, metadata is "data about data". This study designs a metadata system, with Revenue Tracking as the case study, as an analysis and reporting component of the Regional Database. Metadata is essential for managing the data warehouse and for providing information about it. The processes within the data warehouse and the components related to it must be integrated with one another to realize the characteristics of a data warehouse: subject-oriented, integrated, time-variant, and non-volatile. Metadata must therefore also be able to exchange information between the components of the data warehouse. A web service is used as this exchange mechanism. Web services communicate using XML technology and the HTTP protocol. With a web service, each component
The Value of Data and Metadata Standardization for Interoperability in Giovanni

Science.gov (United States)

Smit, C.; Hegde, M.; Strub, R. F.; Bryant, K.; Li, A.; Petrenko, M.

2017-12-01

Giovanni (https://giovanni.gsfc.nasa.gov/giovanni/) is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization. This poster gives examples of how our metadata and data standards, both external and internal, have both simplified our code base and improved our users' experiences.
Improving the accessibility and re-use of environmental models through provision of model metadata - a scoping study

Science.gov (United States)

Riddick, Andrew; Hughes, Andrew; Harpham, Quillon; Royse, Katherine; Singh, Anubha

2014-05-01

There has been increasing interest from both academic and commercial organisations over recent years in developing hydrologic and other environmental models in response to some of the major challenges facing the environment, for example environmental change and its effects, and ensuring water resource security. This has resulted in a significant investment in modelling by many organisations, both in terms of financial resources and intellectual capital. To capitalise on the effort of producing models, the models must be both discoverable and appropriately described; otherwise that effort will be wasted. However, whilst there are some recognised metadata standards relating to datasets, these may not completely address the needs of modellers regarding input data, for example. There also appears to be a lack of metadata schemes configured to encourage the discovery and re-use of the models themselves. The lack of an established standard for model metadata is considered to be a factor inhibiting the more widespread use of environmental models, particularly the use of linked model compositions which fuse together hydrologic models with models from other environmental disciplines. This poster presents the results of a Natural Environment Research Council (NERC) funded scoping study to understand the requirements of modellers and other end users for metadata about data and models. A user consultation exercise using an on-line questionnaire has been undertaken to capture the views of a wide spectrum of stakeholders on how they are currently managing metadata for modelling. This has provided strong confirmation of our original supposition that there is a lack of systems and facilities to capture metadata about models. A number of specific gaps in current provision for data and model metadata were also identified, including a need for a standard means to record detailed information about the modelling
Flexible Authoring of Metadata for Learning : Assembling forms from a declarative data and view model

OpenAIRE

Enoksson, Fredrik

2011-01-01

With the vast amount of information in various formats that is produced today it becomes necessary for consumers of this information to be able to judge if it is relevant for them. One way to enable that is to provide information about each piece of information, i.e. provide metadata. When metadata is to be edited by a human being, a metadata editor needs to be provided. This thesis describes the design and practical use of a configuration mechanism for metadata editors called annotation profiles...
Making progress with the automation of systematic reviews: principles of the International Collaboration for the Automation of Systematic Reviews (ICASR).

Science.gov (United States)

Beller, Elaine; Clark, Justin; Tsafnat, Guy; Adams, Clive; Diehl, Heinz; Lund, Hans; Ouzzani, Mourad; Thayer, Kristina; Thomas, James; Turner, Tari; Xia, Jun; Robinson, Karen; Glasziou, Paul

2018-05-19

Systematic reviews (SR) are vital to health care, but have become complicated and time-consuming, due to the rapid expansion of evidence to be synthesised. Fortunately, many tasks of systematic reviews have the potential to be automated or may be assisted by automation. Recent advances in natural language processing, text mining and machine learning have produced new algorithms that can accurately mimic human endeavour in systematic review activity, faster and more cheaply. Automation tools need to be able to work together, to exchange data and results. Therefore, we initiated the International Collaboration for the Automation of Systematic Reviews (ICASR), to successfully put all the parts of automation of systematic review production together. The first meeting was held in Vienna in October 2015. We established a set of principles to enable tools to be developed and integrated into toolkits. This paper sets out the principles devised at that meeting, which cover the need for improvement in efficiency of SR tasks, automation across the spectrum of SR tasks, continuous improvement, adherence to high quality standards, flexibility of use and combining components, the need for a collaboration and varied skills, the desire for open source, shared code and evaluation, and a requirement for replicability through rigorous and open evaluation. Automation has a great potential to improve the speed of systematic reviews. Considerable work is already being done on many of the steps involved in a review. The 'Vienna Principles' set out in this paper aim to guide a more coordinated effort which will allow the integration of work by separate teams and build on the experience, code and evaluations done by the many teams working across the globe.
Towards an Ontology for the Global Geodynamics Project: Automated Extraction of Resource Descriptions from an XML-Based Data Model

Science.gov (United States)

Lumb, L. I.; Aldridge, K. D.

2005-12-01

Using the Earth Science Markup Language (ESML), an XML-based data model for the Global Geodynamics Project (GGP) was recently introduced [Lumb & Aldridge, Proc. HPCS 2005, Kotsireas & Stacey, eds., IEEE, 2005, 216-222]. This data model possesses several key attributes, i.e., it: makes use of XML schema; supports semi-structured ASCII format files; includes Earth Science affinities; and is on track for compliance with emerging Grid computing standards (e.g., the Global Grid Forum's Data Format Description Language, DFDL). Favorable attributes notwithstanding, metadata (i.e., data about data) was identified [Lumb & Aldridge, 2005] as a key challenge for progress in enabling the GGP for Grid computing. Even in projects of small-to-medium scale like the GGP, the manual introduction of metadata has the potential to be the rate-determining step for progress. Fortunately, an automated approach for metadata introduction has recently emerged. Based on Gleaning Resource Descriptions from Dialects of Languages (GRDDL, http://www.w3.org/2004/01/rdxh/spec), this bottom-up approach allows for the extraction of Resource Description Framework (RDF) representations from the XML-based data model (i.e., the ESML representation of GGP data) subject to rules of transformation articulated via eXtensible Stylesheet Language Transformations (XSLT). In addition to introducing relationships into the GGP data model, and thereby addressing the metadata requirement, the syntax and semantics of RDF comprise a requisite for a GGP ontology, i.e., "the common words and concepts (the meaning) used to describe and represent an area of knowledge" [Daconta et al., The Semantic Web, Wiley, 2003]. After briefly reviewing the XML-based model for the GGP, attention focuses on the automated extraction of an RDF representation via GRDDL with XSLT-delineated templates. This bottom-up approach, in tandem with a top-down approach based on the Protege integrated development environment for ontologies (http
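The bottom-up extraction described in this record can be illustrated with a minimal sketch. It is not the authors' GRDDL/XSLT pipeline: a plain Python tree walk stands in for the XSLT templates, and the element names (`ggp`, `station`, `name`, `sampling`), the sample values and the `http://example.org/ggp/` namespace are all hypothetical, chosen only to show how subject-predicate-object triples might be gleaned from an XML data model.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace used to mint subject and predicate URIs.
BASE = "http://example.org/ggp/"

# Hypothetical XML fragment; the real ESML representation of GGP data differs.
SAMPLE = """<ggp>
  <station id="S1">
    <name>Example Station</name>
    <sampling>1s</sampling>
  </station>
</ggp>"""


def glean_triples(xml_text):
    """Walk the XML tree and emit (subject, predicate, object) triples.

    Each <station> element becomes a subject; each of its child elements
    becomes one predicate/object pair attached to that subject.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for station in root.findall("station"):
        subject = f"{BASE}station/{station.get('id')}"
        for child in station:
            predicate = f"{BASE}{child.tag}"
            triples.append((subject, predicate, (child.text or "").strip()))
    return triples


if __name__ == "__main__":
    for s, p, o in glean_triples(SAMPLE):
        # Print in an N-Triples-like form with a literal object.
        print(f'<{s}> <{p}> "{o}" .')
```

In a GRDDL setting the mapping rules would live in an XSLT stylesheet referenced from the source document, so that the transformation travels with the data rather than being hard-coded as it is here.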
The Automation-by-Expertise-by-Training Interaction.

Science.gov (United States)

Strauch, Barry

2017-03-01

I introduce the automation-by-expertise-by-training interaction in automated systems and discuss its influence on operator performance. Transportation accidents that, across a 30-year interval demonstrated identical automation-related operator errors, suggest a need to reexamine traditional views of automation. I review accident investigation reports, regulator studies, and literature on human computer interaction, expertise, and training and discuss how failing to attend to the interaction of automation, expertise level, and training has enabled operators to commit identical automation-related errors. Automated systems continue to provide capabilities exceeding operators' need for effective system operation and provide interfaces that can hinder, rather than enhance, operator automation-related situation awareness. Because of limitations in time and resources, training programs do not provide operators the expertise needed to effectively operate these automated systems, requiring them to obtain the expertise ad hoc during system operations. As a result, many do not acquire necessary automation-related system expertise. Integrating automation with expected operator expertise levels, and within training programs that provide operators the necessary automation expertise, can reduce opportunities for automation-related operator errors. Research to address the automation-by-expertise-by-training interaction is needed. However, such research must meet challenges inherent to examining realistic sociotechnical system automation features with representative samples of operators, perhaps by using observational and ethnographic research. Research in this domain should improve the integration of design and training and, it is hoped, enhance operator performance.
USGS 24k Digital Raster Graphic (DRG) Metadata

Data.gov (United States)

Minnesota Department of Natural Resources — Metadata for the scanned USGS 24k Topographic Map Series (also known as 24k Digital Raster Graphic). Each scanned map is represented by a polygon in the layer and the...
Evaluation of Semi-Automatic Metadata Generation Tools: A Survey of the Current State of the Art

Directory of Open Access Journals (Sweden)

Jung-ran Park

2015-09-01

Full Text Available Assessment of the current landscape of semi-automatic metadata generation tools is particularly important considering the rapid development of digital repositories and the recent explosion of big data. Utilization of (semi)automatic metadata generation is critical in addressing these environmental changes and may be unavoidable in the future considering the costly and complex operation of manual metadata creation. To address such needs, this study examines the range of semi-automatic metadata generation tools (n=39) while providing an analysis of their techniques, features, and functions. The study focuses on open-source tools that can be readily utilized in libraries and other memory institutions. The challenges and current barriers to implementation of these tools were identified. The greatest area of difficulty lies in the fact that the piecemeal development of most semi-automatic generation tools only addresses part of the issue of semi-automatic metadata generation, providing solutions to one or a few metadata elements but not the full range of elements. This indicates that significant local efforts will be required to integrate the various tools into a coherent, working whole. Suggestions toward such efforts are presented for future developments that may assist information professionals with incorporation of semi-automatic tools within their daily workflows.
A renaissance in library metadata? The importance of community collaboration in a digital world

Directory of Open Access Journals (Sweden)

Sarah Bull

2016-07-01

Full Text Available This article summarizes a presentation given by Sarah Bull as part of the Association of Learned and Professional Society Publishers (ALPSP) seminar ‘Setting the Standard’ in November 2015. Representing the library community at the wide-ranging seminar, Sarah was tasked with making the topic of library metadata an engaging and informative one for a largely publisher audience. With help from co-author Amanda Quimby, this article is an attempt to achieve the same aim! It covers the importance of library metadata and standards in the supply chain and also reflects on the role of the community in successful standards development and maintenance. Special emphasis is given to the importance of quality in e-book metadata and the need for publisher and library collaboration to improve discovery, usage and the student experience. The article details the University of Birmingham experience of e-book metadata from a workflow perspective to highlight the complex integration issues which remain between content procurement and discovery.
Advances in Electrical Engineering and Automation

CERN Document Server

Huang, Xiong

2012-01-01

EEA2011 is an integrated conference focusing on Electrical Engineering and Automation. In the proceedings, you can learn much more about the Electrical Engineering and Automation work of researchers from all around the world. The main role of the proceedings is to serve as an exchange platform for researchers working in these fields. In order to meet the high quality standards of the Springer AISC series, the organization committee made the following efforts. Firstly, poor quality papers were rejected after review by anonymous expert referees. Secondly, review meetings were held among the reviewers about five times to exchange reviewing suggestions. Finally, the conference organizers held several preliminary sessions before the conference. Through the efforts of different people and departments, the conference will be successful and fruitful.

Building a semantic web-based metadata repository for facilitating detailed clinical modeling in cancer genome studies.

Science.gov (United States)

Sharma, Deepak K; Solbrig, Harold R; Tao, Cui; Weng, Chunhua; Chute, Christopher G; Jiang, Guoqian

2017-06-05

Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support clinical cancer data capturing and reporting, there is an emerging need to develop informatics solutions for standards-based clinical models in cancer study domains. The objective of the study is to develop and evaluate a cancer genome study metadata management system that serves as a key infrastructure in supporting clinical information modeling in cancer genome study domains. We leveraged a Semantic Web-based metadata repository enhanced with both ISO11179 metadata standard and Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary, and extracted the metadata of the CDEs using the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model is used to represent reusable model elements (mini-Archetypes). We produced a metadata repository with 38 clinical cancer genome study domains, comprising a rich collection of mini-Archetype pattern instances. We performed a case study of the domain "clinical pharmaceutical" in the TCGA data dictionary and demonstrated enriched data elements in the metadata repository are very useful in support of building detailed clinical models. Our informatics approach leveraging Semantic Web technologies provides an effective way to build a CIMI-compliant metadata repository that would facilitate the detailed clinical modeling to support use cases beyond TCGA in clinical cancer study domains.
iLOG: A Framework for Automatic Annotation of Learning Objects with Empirical Usage Metadata

Science.gov (United States)

Miller, L. D.; Soh, Leen-Kiat; Samal, Ashok; Nugent, Gwen

2012-01-01

Learning objects (LOs) are digital or non-digital entities used for learning, education or training commonly stored in repositories searchable by their associated metadata. Unfortunately, based on the current standards, such metadata is often missing or incorrectly entered making search difficult or impossible. In this paper, we investigate…
Integration of Phenotypic Metadata and Protein Similarity in Archaea Using a Spectral Bipartitioning Approach

Energy Technology Data Exchange (ETDEWEB)

Hooper, Sean D.; Anderson, Iain J; Pati, Amrita; Dalevi, Daniel; Mavromatis, Konstantinos; Kyrpides, Nikos C

2009-01-01

In order to simplify and meaningfully categorize large sets of protein sequence data, it is commonplace to cluster proteins based on the similarity of those sequences. However, it quickly becomes clear that the sequence flexibility allowed a given protein varies significantly among different protein families. The degree to which sequences are conserved not only differs for each protein family, but also is affected by the phylogenetic divergence of the source organisms. Clustering techniques that use similarity thresholds for protein families do not always allow for these variations and thus cannot be confidently used for applications such as automated annotation and phylogenetic profiling. In this work, we applied a spectral bipartitioning technique to all proteins from 53 archaeal genomes. Comparisons between different taxonomic levels allowed us to study the effects of phylogenetic distances on cluster structure. Likewise, by associating functional annotations and phenotypic metadata with each protein, we could compare our protein similarity clusters with both protein function and associated phenotype. Our clusters can be analyzed graphically and interactively online.
Automating the radiographic NDT process

International Nuclear Information System (INIS)

Aman, J.K.

1986-01-01

Automation, the removal of the human element in inspection, has not been generally applied to film radiographic NDT. The justification for automating is not only productivity but also reliability of results. Film remains in the automated system of the future because of its extremely high image content, approximately 8 x 10^9 bits per 14 x 17 (inch) film, the equivalent of 2200 computer floppy discs. Parts handling systems and robotics applied for manufacturing and some NDT modalities should now be applied to film radiographic NDT systems. Automatic film handling can be achieved with the daylight NDT film handling system. Automatic film processing is becoming the standard in industry and can be coupled to the daylight system. Robots offer the opportunity to automate fully the exposure step. Finally, computer aided interpretation appears on the horizon. A unit which laser scans a 14 x 17 (inch) film in 6 - 8 seconds can digitize film information for further manipulation and possible automatic interrogations (computer aided interpretation). The system called FDRS (for Film Digital Radiography System) is moving toward 50 micron (approximately 16 lines/mm) resolution. This is believed to meet the majority of image content needs. We expect the automated system to appear first in parts (modules) as certain operations are automated. The future will see it all come together in an automated film radiographic NDT system (author) [pt
Data Bookkeeping Service 3 - Providing event metadata in CMS

CERN Document Server

Giffels, Manuel; Riley, Daniel

2014-01-01

The Data Bookkeeping Service 3 provides a catalog of event metadata for Monte Carlo and recorded data of the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) at CERN, Geneva. It comprises all necessary information for tracking datasets, their processing history and associations between runs, files and datasets, on a large scale of about 200,000 datasets and more than 40 million files, which adds up to around 700 GB of metadata. The DBS is an essential part of the CMS Data Management and Workload Management (DMWM) systems; all kinds of data processing, such as Monte Carlo production, processing of recorded event data, and physics analysis done by users, rely heavily on the information stored in DBS.
Data Bookkeeping Service 3 - Providing Event Metadata in CMS

Energy Technology Data Exchange (ETDEWEB)

Giffels, Manuel [CERN; Guo, Y. [Fermilab; Riley, Daniel [Cornell U.

2014-01-01

The Data Bookkeeping Service 3 provides a catalog of event metadata for Monte Carlo and recorded data of the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) at CERN, Geneva. It comprises all necessary information for tracking datasets, their processing history and associations between runs, files and datasets, on a large scale of about 200,000 datasets and more than 40 million files, which adds up to around 700 GB of metadata. The DBS is an essential part of the CMS Data Management and Workload Management (DMWM) systems [1]; all kinds of data processing, such as Monte Carlo production, processing of recorded event data, and physics analysis done by users, rely heavily on the information stored in DBS.
Evolution of Web Services in EOSDIS: Search and Order Metadata Registry (ECHO)

Science.gov (United States)

Mitchell, Andrew; Ramapriyan, Hampapuram; Lowe, Dawn

2009-01-01

From 2005 through 2008, NASA defined and implemented a major evolutionary change in its Earth Observing System Data and Information System (EOSDIS) to modernize its capabilities. This implementation was based on a vision for 2015 developed during 2005. The EOSDIS 2015 Vision emphasizes increased end-to-end data system efficiency and operability; increased data usability; improved support for end users; and decreased operations costs. One key feature of the Evolution plan was achieving higher operational maturity (ingest, reconciliation, search and order, performance, error handling) for NASA's Earth Observing System Clearinghouse (ECHO). The ECHO system is an operational metadata registry through which the scientific community can easily discover and exchange NASA's Earth science data and services. ECHO contains metadata for 2,726 data collections comprising over 87 million individual data granules and 34 million browse images, consisting of NASA's EOSDIS Data Centers and the United States Geological Survey's Landsat Project holdings. ECHO is a middleware component based on a Service Oriented Architecture (SOA). The system comprises a set of infrastructure services that enable the fundamental SOA functions: publish, discover, and access Earth science resources. It also provides additional services such as user management, data access control, and order management. The ECHO system has a data registry and a services registry. The data registry enables organizations to publish EOS and other Earth-science related data holdings to a common metadata model. These holdings are described through metadata in terms of datasets (types of data) and granules (specific data items of those types). ECHO also supports browse images, which provide a visual representation of the data. The published metadata can be mapped to and from existing standards (e.g., FGDC, ISO 19115). With ECHO, users can find the metadata stored in the data registry and then access the data either
Deploying the ATLAS Metadata Interface (AMI) on the cloud with Jenkins

CERN Document Server

Lambert, Fabian; The ATLAS collaboration

2016-01-01

The ATLAS Metadata Interface (AMI) is a mature application of more than 15 years of existence. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. AMI is used by the ATLAS production system, therefore the service must guarantee a high level of availability. We describe our monitoring system and the Jenkins-based strategy used to dynamically test and deploy cloud OpenStack nodes on demand. Moreover, we describe how to switch to a distant replica in case of downtime.
Asymmetric Programming: A Highly Reliable Metadata Allocation Strategy for MLC NAND Flash Memory-Based Sensor Systems

Science.gov (United States)

Huang, Min; Liu, Zhaoqing; Qiao, Liyan

2014-01-01

While the NAND flash memory is widely used as the storage medium in modern sensor systems, the aggressive shrinking of process geometry and an increase in the number of bits stored in each memory cell will inevitably degrade the reliability of NAND flash memory. In particular, it's critical to enhance metadata reliability, which occupies only a small portion of the storage space, but maintains the critical information of the file system and the address translations of the storage system. Metadata damage will cause the system to crash or a large amount of data to be lost. This paper presents Asymmetric Programming, a highly reliable metadata allocation strategy for MLC NAND flash memory storage systems. Our technique exploits for the first time the property of the multi-page architecture of MLC NAND flash memory to improve the reliability of metadata. The basic idea is to keep metadata in most significant bit (MSB) pages which are more reliable than least significant bit (LSB) pages. Thus, we can achieve relatively low bit error rates for metadata. Based on this idea, we propose two strategies to optimize address mapping and garbage collection. We have implemented Asymmetric Programming on a real hardware platform. The experimental results show that Asymmetric Programming can achieve a reduction in the number of page errors of up to 99.05% with the baseline error correction scheme. PMID:25310473
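The basic idea stated in this abstract, routing metadata to MSB pages, can be shown with a toy allocator. This is a sketch under loud assumptions and not the paper's Asymmetric Programming scheme: the class name `ToyPageAllocator` is hypothetical, the rule that odd page indices are MSB pages and even ones are LSB pages is an illustrative stand-in for a device-specific page pairing, and real NAND constraints such as in-order page programming, address mapping and garbage collection are ignored.

```python
from collections import deque


class ToyPageAllocator:
    """Hand out pages of one flash block, reserving MSB pages for metadata."""

    def __init__(self, pages_per_block):
        # Assumption for illustration only: odd indices are MSB pages,
        # even indices are LSB pages. Real devices define their own pairing.
        self.msb_free = deque(p for p in range(pages_per_block) if p % 2 == 1)
        self.lsb_free = deque(p for p in range(pages_per_block) if p % 2 == 0)

    def allocate(self, is_metadata):
        """Return a free page index for a metadata or user-data write."""
        if is_metadata:
            # Metadata is only ever placed on pages labelled MSB.
            if not self.msb_free:
                raise RuntimeError("no MSB page left for metadata")
            return self.msb_free.popleft()
        # User data prefers LSB pages and falls back to MSB pages when
        # no LSB page remains.
        if self.lsb_free:
            return self.lsb_free.popleft()
        if self.msb_free:
            return self.msb_free.popleft()
        raise RuntimeError("block is full")


if __name__ == "__main__":
    alloc = ToyPageAllocator(pages_per_block=8)
    print("metadata ->", alloc.allocate(is_metadata=True))
    print("user data ->", alloc.allocate(is_metadata=False))
```

Letting user data spill onto MSB pages keeps the block fully usable but can starve later metadata writes; a stricter policy would open a new block instead.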
Asymmetric Programming: A Highly Reliable Metadata Allocation Strategy for MLC NAND Flash Memory-Based Sensor Systems

Directory of Open Access Journals (Sweden)

Min Huang

2014-10-01

Full Text Available While the NAND flash memory is widely used as the storage medium in modern sensor systems, the aggressive shrinking of process geometry and an increase in the number of bits stored in each memory cell will inevitably degrade the reliability of NAND flash memory. In particular, it’s critical to enhance metadata reliability, which occupies only a small portion of the storage space, but maintains the critical information of the file system and the address translations of the storage system. Metadata damage will cause the system to crash or a large amount of data to be lost. This paper presents Asymmetric Programming, a highly reliable metadata allocation strategy for MLC NAND flash memory storage systems. Our technique exploits for the first time the property of the multi-page architecture of MLC NAND flash memory to improve the reliability of metadata. The basic idea is to keep metadata in most significant bit (MSB) pages which are more reliable than least significant bit (LSB) pages. Thus, we can achieve relatively low bit error rates for metadata. Based on this idea, we propose two strategies to optimize address mapping and garbage collection. We have implemented Asymmetric Programming on a real hardware platform. The experimental results show that Asymmetric Programming can achieve a reduction in the number of page errors of up to 99.05% with the baseline error correction scheme.
Scalable Metadata Management for a Large Multi-Source Seismic Data Repository

Energy Technology Data Exchange (ETDEWEB)

Gaylord, J. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Dodge, D. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Magana-Zook, S. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Barno, J. G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Knapp, D. R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2017-04-11

In this work, we implemented the key metadata management components of a scalable seismic data ingestion framework to address limitations in our existing system, and to position it for anticipated growth in volume and complexity. We began the effort with an assessment of open source data flow tools from the Hadoop ecosystem. We then began the construction of a layered architecture that is specifically designed to address many of the scalability and data quality issues we experience with our current pipeline. This included implementing basic functionality in each of the layers, such as establishing a data lake, designing a unified metadata schema, tracking provenance, and calculating data quality metrics. Our original intent was to test and validate the new ingestion framework with data from a large-scale field deployment in a temporary network. This delivered somewhat unsatisfying results, since the new system immediately identified fatal flaws in the data relatively early in the pipeline. Although this is a correct result it did not allow us to sufficiently exercise the whole framework. We then widened our scope to process all available metadata from over a dozen online seismic data sources to further test the implementation and validate the design. This experiment also uncovered a higher than expected frequency of certain types of metadata issues that challenged us to further tune our data management strategy to handle them. Our result from this project is a greatly improved understanding of real world data issues, a validated design, and prototype implementations of major components of an eventual production framework. This successfully forms the basis of future development for the Geophysical Monitoring Program data pipeline, which is a critical asset supporting multiple programs. It also positions us very well to deliver valuable metadata management expertise to our sponsors, and has already resulted in an NNSA Office of Defense Nuclear Nonproliferation
INSPIRE: Managing Metadata in a Global Digital Library for High-Energy Physics

CERN Document Server

Martin Montull, Javier

2011-01-01

Four leading laboratories in the High-Energy Physics (HEP) field are collaborating to roll out the next-generation scientific information portal: INSPIRE. The goal of this project is to replace the popular 40-year-old SPIRES database. INSPIRE already provides access to about 1 million records and includes services such as fulltext search, automatic keyword assignment, ingestion and automatic display of LaTeX, citation analysis, automatic author disambiguation, metadata harvesting, extraction of figures from fulltext and search in figure captions. In order to achieve high quality metadata both automatic processing and manual curation are needed. The different tools available in the system use modern web technologies to provide the curators with maximum efficiency, while dealing with the MARC standard format. The project is under heavy development in order to provide new features including semantic analysis, crowdsourcing of metadata curation, user tagging, recommender systems, integration of OAIS standards a...
Parallel file system with metadata distributed across partitioned key-value store c

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron

2017-09-19

Improved techniques are provided for storing metadata associated with a plurality of sub-files associated with a single shared file in a parallel file system. The shared file is generated by a plurality of applications executing on a plurality of compute nodes. A compute node implements a Parallel Log Structured File System (PLFS) library to store at least one portion of the shared file generated by an application executing on the compute node and metadata for the at least one portion of the shared file on one or more object storage servers. The compute node is also configured to implement a partitioned data store for storing a partition of the metadata for the shared file, wherein the partitioned data store communicates with partitioned data stores on other compute nodes using a message passing interface. The partitioned data store can be implemented, for example, using Multidimensional Data Hashing Indexing Middleware (MDHIM).
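How metadata for a shared file might be spread across a partitioned key-value store can be sketched as follows. This is an illustration under stated assumptions, not the PLFS or MDHIM implementation: the names `partition_for` and `PartitionedMetadataStore` are hypothetical, the hash-based placement rule is one possible choice among several, and the message passing between compute nodes is replaced by direct indexing into in-process dictionaries.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a metadata key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedMetadataStore:
    """Toy stand-in for per-node partitions of a shared file's metadata."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One dictionary per partition; on a real system each partition
        # would live on a different compute node.
        self.partitions = [dict() for _ in range(num_partitions)]

    def _key(self, shared_file, offset):
        return f"{shared_file}:{offset}"

    def put(self, shared_file, offset, record):
        """Store the metadata record for the sub-file portion at `offset`."""
        key = self._key(shared_file, offset)
        self.partitions[partition_for(key, self.num_partitions)][key] = record

    def get(self, shared_file, offset):
        """Look up a record; returns None when the offset is unknown."""
        key = self._key(shared_file, offset)
        return self.partitions[partition_for(key, self.num_partitions)].get(key)


if __name__ == "__main__":
    store = PartitionedMetadataStore(num_partitions=4)
    store.put("shared.dat", 0, {"length": 4096, "writer_node": 2})
    print(store.get("shared.dat", 0))
```

Because the partition is derived from the key alone, any node can compute where a record lives without consulting a central index, which is the property a distributed metadata store of this kind relies on.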
Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

Science.gov (United States)

Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

2016-12-01

Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. We present lessons learned from applying CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality
Revision of IRIS/IDA Seismic Station Metadata

Science.gov (United States)

Xu, W.; Davis, P.; Auerbach, D.; Klimczak, E.

2017-12-01

Trustworthy data quality assurance has always been one of the goals of seismic network operators and data management centers. This task is considerably complex and evolving due to the huge quantities as well as the rapidly changing characteristics and complexities of seismic data. Published metadata usually reflect instrument response characteristics and their accuracies, which include zero frequency sensitivity for both seismometer and data logger as well as other, frequency-dependent elements. In this work, we are mainly focused on studying the variation of the seismometer sensitivity with time of IRIS/IDA seismic recording systems, with a goal to improve the metadata accuracy for the history of the network. There are several ways to measure the accuracy of seismometer sensitivity for the seismic stations in service. An effective practice recently developed is to collocate a reference seismometer in proximity to verify the in-situ sensors' calibration. For those stations with a secondary broadband seismometer, IRIS' MUSTANG metric computation system introduced a transfer function metric to reflect two sensors' gain ratios in the microseism frequency band. In addition, a simulation approach based on M2 tidal measurements has been proposed and proven to be effective. In this work, we compare and analyze the results from the three different methods, and conclude that the collocated-sensor method is the most stable and reliable, with the smallest uncertainties throughout. However, for epochs without both the collocated sensor and secondary seismometer, we rely on the analysis results from the tide method. For the data since 1992 on IDA stations, we computed over 600 revised seismometer sensitivities for all the IRIS/IDA network calibration epochs. Hopefully further revision procedures will help to guarantee that the data is accurately reflected by the metadata of these stations.
Metadata and network API aspects of a framework for storing and retrieving civil infrastructure monitoring data

Science.gov (United States)

Wong, John-Michael; Stojadinovic, Bozidar

2005-05-01

A framework has been defined for storing and retrieving civil infrastructure monitoring data over a network. The framework consists of two primary components: metadata and network communications. The metadata component provides the descriptions and data definitions necessary for cataloging and searching monitoring data. The communications component provides Java classes for remotely accessing the data. Packages of Enterprise JavaBeans and data handling utility classes are written to use the underlying metadata information to build real-time monitoring applications. The utility of the framework was evaluated using wireless accelerometers on a shaking table earthquake simulation test of a reinforced concrete bridge column. The NEESgrid data and metadata repository services were used as a backend storage implementation. A web interface was created to demonstrate the utility of the data model and provides an example health monitoring application.
The relevance of music information representation metadata from the perspective of expert users

Directory of Open Access Journals (Sweden)

Camila Monteiro de Barros

Full Text Available The general goal of this research was to verify which metadata elements of music information representation are relevant for its retrieval from the perspective of expert music users. Based on a bibliographical research, a comprehensive metadata set of music information representation was developed and transformed into a questionnaire for data collection, which was applied to students and professors of the Graduate Program in Music at the Federal University of Rio Grande do Sul. The results show that the most relevant information for expert music users is related to identification and authorship responsibilities. The respondents from Composition and Interpretative Practice areas agree with these results, while the respondents from Musicology/Ethnomusicology and Music Education areas also consider the metadata related to the historical context of composition relevant.
Practical management of heterogeneous neuroimaging metadata by global neuroimaging data repositories.

Science.gov (United States)

Neu, Scott C; Crawford, Karen L; Toga, Arthur W

2012-01-01

Rapidly evolving neuroimaging techniques are producing unprecedented quantities of digital data at the same time that many research studies are evolving into global, multi-disciplinary collaborations between geographically distributed scientists. While networked computers have made it almost trivial to transmit data across long distances, collecting and analyzing this data requires extensive metadata if the data is to be maximally shared. Though it is typically straightforward to encode text and numerical values into files and send content between different locations, it is often difficult to attach context and implicit assumptions to the content. As the number of and geographic separation between data contributors grows to national and global scales, the heterogeneity of the collected metadata increases and conformance to a single standardization becomes implausible. Neuroimaging data repositories must then not only accumulate data but must also consolidate disparate metadata into an integrated view. In this article, using specific examples from our experiences, we demonstrate how standardization alone cannot achieve full integration of neuroimaging data from multiple heterogeneous sources and why a fundamental change in the architecture of neuroimaging data repositories is needed instead.
Automation of Electrical Cable Harnesses Testing

Directory of Open Access Journals (Sweden)

Zhuming Bi

2017-12-01

Full Text Available Traditional automated systems, such as industrial robots, are applied in well-structured environments, and many automated systems have limited adaptability to deal with complexity and uncertainty; therefore, the applications of industrial robots in small- and medium-sized enterprises (SMEs) are very limited. The majority of manual operations in SMEs are too complicated for automation. Rapidly developing information technologies (IT) have brought new opportunities for the automation of manufacturing and assembly processes in ill-structured environments. Note that an automation solution should be designed to meet the given requirements of the specified application, and it differs from one application to another. In this paper, we look into the feasibility of automated testing for electric cable harnesses, and our focus is on some of the generic strategies for improving the adaptability of automation solutions. In particular, the concept of modularization is adopted in developing hardware and software to maximize system adaptability in testing a wide scope of products. A proposed system has been implemented, and its performance has been evaluated by executing tests on actual products. The testing experiments have shown that the automated system greatly outperformed manual operations in terms of cost saving, productivity and reliability. Because of its potential to increase system adaptability and reduce cost, the presented work has theoretical and practical significance for extension to other automation solutions in SMEs.
Circular dichroism spectral data and metadata in the Protein Circular Dichroism Data Bank (PCDDB): a tutorial guide to accession and deposition.

Science.gov (United States)

Janes, Robert W; Miles, A J; Woollett, B; Whitmore, L; Klose, D; Wallace, B A

2012-09-01

The Protein Circular Dichroism Data Bank (PCDDB) is a web-based resource containing circular dichroism (CD) and synchrotron radiation circular dichroism spectral and associated metadata located at http://pcddb.cryst.bbk.ac.uk. This resource provides a freely available, user-friendly means of accessing validated CD spectra and their associated experimental details and metadata, thereby enabling broad usage of this material and new developments across the structural biology, chemistry, and bioinformatics communities. The resource also enables researchers utilizing CD as an experimental technique to have a means of storing their data at a secure site from which it is easily retrievable, thereby making their results publicly accessible, a current requirement of many grant-funding agencies world-wide, as well as meeting the data-sharing requirements for journal publications. This tutorial provides extensive information on searching, accessing, and downloading procedures for those who wish to utilize the data available in the data bank, and detailed information on deposition procedures for creating and validating entries, including comprehensive explanations of their contents and formats, for those who wish to include their data in the data bank. Copyright © 2012 Wiley Periodicals, Inc.

Automated Conflict Resolution For Air Traffic Control

Science.gov (United States)

Erzberger, Heinz

2005-01-01

The ability to detect and resolve conflicts automatically is considered to be an essential requirement for the next generation air traffic control system. While systems for automated conflict detection have been used operationally by controllers for more than 20 years, automated resolution systems have so far not reached the level of maturity required for operational deployment. Analytical models and algorithms for automated resolution have been traffic conditions to demonstrate that they can handle the complete spectrum of conflict situations encountered in actual operations. The resolution algorithm described in this paper was formulated to meet the performance requirements of the Automated Airspace Concept (AAC). The AAC, which was described in a recent paper [1], is a candidate for the next generation air traffic control system. The AAC's performance objectives are to increase safety and airspace capacity and to accommodate user preferences in flight operations to the greatest extent possible. In the AAC, resolution trajectories are generated by an automation system on the ground and sent to the aircraft autonomously via data link. The algorithm generating the trajectories must take into account the performance characteristics of the aircraft, the route structure of the airway system, and be capable of resolving all types of conflicts for properly equipped aircraft without requiring supervision and approval by a controller. Furthermore, the resolution trajectories should be compatible with the clearances, vectors and flight plan amendments that controllers customarily issue to pilots in resolving conflicts. The algorithm described herein, although formulated specifically to meet the needs of the AAC, provides a generic engine for resolving conflicts. Thus, it can be incorporated into any operational concept that requires a method for automated resolution, including concepts for autonomous air to air resolution.
Metadata Quality Improvement : DASISH deliverable 5.2A

NARCIS (Netherlands)

L'Hours, Hervé; Offersgaard, Lene; Wittenberg, M.; Wloka, Bartholomäus

2014-01-01

The aim of this task was to analyse and compare the different metadata strategies of CLARIN, DARIAH and CESSDA, and to identify possibilities of cross-fertilization to take profit from each other solutions where possible. To have a better understanding in which stages of the research lifecycle
Reactor pressure vessel stud management automation strategies

International Nuclear Information System (INIS)

Biach, W.L.; Hill, R.; Hung, K.

1992-01-01

The adoption of hydraulic tensioner technology as the standard for bolting and unbolting the reactor pressure vessel (RPV) head 35 yr ago represented an incredible commitment to new technology, but the existing technology was so primitive as to be clearly unacceptable. Today, a variety of approaches for improvement make the decision more difficult. Automation in existing installations must meet complex physical, logistic, and financial parameters while addressing the demands of reduced exposure, reduced critical path, and extended plant life. There are two generic approaches to providing automated RPV stud engagement and disengagement: the multiple stud tensioner and automated individual tools. A variation of the latter would include the handling system. Each has its benefits and liabilities
Content-aware network storage system supporting metadata retrieval

Science.gov (United States)

Liu, Ke; Qin, Leihua; Zhou, Jingli; Nie, Xuejun

2008-12-01

Content-based network storage has become a focus of research in both academia and industry [1]. To address the decline in hit rate caused by migration and to support content-based queries, we developed a new content-aware storage system that supports metadata retrieval to improve query performance. First, we extend the SCSI command descriptor block so that the system can understand self-defined query requests. Second, the extracted metadata is encoded in Extensible Markup Language (XML) to improve its universality. Third, following the demands of information lifecycle management (ILM), we store data at different storage levels and use a corresponding query strategy to retrieve them. Fourth, because the file content identifier plays an important role in locating data and calculating block correlation, we use it to fetch files and sort query results through a user-friendly interface. Finally, the experiments indicate that the retrieval strategy and sort algorithm enhance retrieval efficiency and precision.
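The XML encoding step in the abstract above can be illustrated with a minimal sketch. The element names, the `contentId` attribute and the sample fields are assumptions made for illustration; the abstract does not describe the actual schema used by the storage system.

```python
import xml.etree.ElementTree as ET


def encode_metadata(content_id, fields):
    """Serialize extracted file metadata as an XML string.

    The <file> element and contentId attribute are hypothetical names,
    not the schema used by the system described in the abstract.
    """
    root = ET.Element("file", attrib={"contentId": content_id})
    for name, value in sorted(fields.items()):
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def decode_metadata(xml_text):
    """Parse the XML back into (content_id, fields); all values are strings."""
    root = ET.fromstring(xml_text)
    return root.get("contentId"), {child.tag: child.text for child in root}


# Hypothetical record: owner, size in bytes, and an ILM storage tier.
record = encode_metadata("a1b2c3", {"owner": "alice", "size": 4096, "tier": "nearline"})
```

Encoding to a text format like this is what makes the metadata portable between storage levels, at the cost of losing native types on the way back (the size is returned as a string).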
Conditions and configuration metadata for the ATLAS experiment

International Nuclear Information System (INIS)

Gallas, E J; Pachal, K E; Tseng, J C L; Albrand, S; Fulachier, J; Lambert, F; Zhang, Q

2012-01-01

In the ATLAS experiment, a system called COMA (Conditions/Configuration Metadata for ATLAS), has been developed to make globally important run-level metadata more readily accessible. It is based on a relational database storing directly extracted, refined, reduced, and derived information from system-specific database sources as well as information from non-database sources. This information facilitates a variety of unique dynamic interfaces and provides information to enhance the functionality of other systems. This presentation will give an overview of the components of the COMA system, enumerate its diverse data sources, and give examples of some of the interfaces it facilitates. We list important principles behind COMA schema and interface design, and how features of these principles create coherence and eliminate redundancy among the components of the overall system. In addition, we elucidate how interface logging data has been used to refine COMA content and improve the value and performance of end-user reports and browsers.
Conditions and configuration metadata for the ATLAS experiment

CERN Document Server

Gallas, E J; Albrand, S; Fulachier, J; Lambert, F; Pachal, K E; Tseng, J C L; Zhang, Q

2012-01-01

In the ATLAS experiment, a system called COMA (Conditions/Configuration Metadata for ATLAS), has been developed to make globally important run-level metadata more readily accessible. It is based on a relational database storing directly extracted, refined, reduced, and derived information from system-specific database sources as well as information from non-database sources. This information facilitates a variety of unique dynamic interfaces and provides information to enhance the functionality of other systems. This presentation will give an overview of the components of the COMA system, enumerate its diverse data sources, and give examples of some of the interfaces it facilitates. We list important principles behind COMA schema and interface design, and how features of these principles create coherence and eliminate redundancy among the components of the overall system. In addition, we elucidate how interface logging data has been used to refine COMA content and improve the value and performance of end-user...
Benefits of Record Management For Scientific Writing (Study of Metadata Reception of Zotero Reference Management Software in UIN Malang)

Directory of Open Access Journals (Sweden)

Moch Fikriansyah Wicaksono

2018-01-01

Full Text Available Record creation and management by individuals and organizations is growing rapidly, particularly with the shift from print to electronic formats, and this extends to the smallest unit of a record, its metadata. There is therefore a need to manage record metadata, particularly for students who need to record references and citations. Reference management software (RMS) helps manage references; Zotero is one example. The purpose of this article is to describe the benefits of record management for the writing of scientific papers by students, especially in the biology study program at UIN Malik Ibrahim Malang. The research is descriptive with a quantitative approach. To add depth to the respondents' answers, additional data were gathered through interviews. The population is 322 students from the classes of 2012 to 2014, sampled at random. These classes were chosen because reference management software, Zotero in particular, was introduced and came into use three years earlier. The study had 80 respondents, a number obtained from the Yamane formula. The results showed that 70% agreed that using reference management software saved time and energy in managing digital file metadata, 71% agreed that digital metadata can be stored quickly in the RMS, 65% agreed that storing metadata in the reference management software is easy, 70% agreed that it was easy to configure metadata for citations and bibliographies, 56.6% agreed that the metadata stored in reference management software could be edited, and 73.8% agreed that using metadata makes it easier to write citations and bibliographies.
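For readers unfamiliar with the Yamane formula mentioned above, the sample size is commonly computed as n = N / (1 + N e^2), where N is the population size and e the margin of error. The sketch below assumes that standard form and rounds up; the abstract does not state the margin of error the authors used, so the value passed in the example is an assumption and the result need not equal the 80 respondents reported.

```python
import math


def yamane_sample_size(population, margin_of_error):
    """Yamane's simplified sample-size formula: n = N / (1 + N * e^2).

    Rounded up, since a fractional respondent is not possible.
    """
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Assumed 10% margin of error for the 322-student population in the abstract.
sample = yamane_sample_size(322, 0.10)
```

With a 10% margin this gives a figure in the high seventies, close to but not the same as the reported sample, which suggests a slightly different margin or rounding convention in the study.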
CCR+: Metadata Based Extended Personal Health Record Data Model Interoperable with the ASTM CCR Standard.

Science.gov (United States)

Park, Yu Rang; Yoon, Young Jo; Jang, Tae Hun; Seo, Hwa Jeong; Kim, Ju Han

2014-01-01

Extension of the standard model while retaining compliance with it is a challenging issue because there is currently no method for semantically or syntactically verifying an extended data model. A metadata-based extended model, named CCR+, was designed and implemented to achieve interoperability between standard and extended models. Furthermore, a multilayered validation method was devised to validate the standard and extended models. The American Society for Testing and Materials (ASTM) Continuity of Care Record (CCR) standard was selected to evaluate the CCR+ model; two CCR and one CCR+ XML files were evaluated. In total, 188 metadata were extracted from the ASTM CCR standard; these metadata are semantically interconnected and registered in the metadata registry. An extended-data-model-specific validation file was generated from these metadata. This file can be used in a smartphone application (Health Avatar CCR+) as a part of a multilayered validation. The new CCR+ model was successfully evaluated via a patient-centric exchange scenario involving multiple hospitals, with the results supporting both syntactic and semantic interoperability between the standard CCR and extended, CCR+, model. A feasible method for delivering an extended model that complies with the standard model is presented herein. There is a great need to extend static standard models such as the ASTM CCR in various domains: the methods presented here represent an important reference for achieving interoperability between standard and extended models.
Automating Groundwater Sampling At Hanford, The Next Step

International Nuclear Information System (INIS)

Connell, C.W.; Conley, S.F.; Hildebrand, R.D.; Cunningham, D.E.

2010-01-01

Historically, the groundwater monitoring activities at the Department of Energy's Hanford Site in southeastern Washington State have been very 'people intensive.' Approximately 1500 wells are sampled each year by field personnel or 'samplers.' These individuals have been issued pre-printed forms showing information about the well(s) for a particular sampling evolution. This information is taken from 2 official electronic databases: the Hanford Well Information System (HWIS) and the Hanford Environmental Information System (HEIS). The samplers used these hardcopy forms to document the groundwater samples and well water-levels. After recording the entries in the field, the samplers turned the forms in at the end of the day and other personnel posted the collected information onto a spreadsheet that was then printed and included in a log book. The log book was then used to make manual entries of the new information into the software application(s) for the HEIS and HWIS databases. A pilot project for automating this extremely tedious process was launched in 2008. Initially, the automation was focused on water-level measurements. Now, the effort is being extended to automate the meta-data associated with collecting groundwater samples. The project allowed electronic forms produced in the field by samplers to be used in a work flow process where the data is transferred to the database and the electronic form is filed in managed records - thus eliminating manually completed forms. Eliminating the manual forms and streamlining the data entry not only improved the accuracy of the information recorded, but also enhanced the efficiency and sampling capacity of field office personnel.
The evolution of chondrichthyan research through a metadata ...

African Journals Online (AJOL)

We compiled metadata from Sharks Down Under (1991) and the two Sharks International conferences (2010 and 2014), spanning 23 years. Analysis of the data highlighted taxonomic biases towards charismatic species, a declining number of studies in fundamental science such as those related to taxonomy and basic life ...
Scaling the walls of discovery: using semantic metadata for integrative problem solving.

Science.gov (United States)

Manning, Maurice; Aggarwal, Amit; Gao, Kevin; Tucker-Kellogg, Greg

2009-03-01

Current data integration approaches by bioinformaticians frequently involve extracting data from a wide variety of public and private data repositories, each with a unique vocabulary and schema, via scripts. These separate data sets must then be normalized through the tedious and lengthy process of resolving naming differences and collecting information into a single view. Attempts to consolidate such diverse data using data warehouses or federated queries add significant complexity and have shown limitations in flexibility. The alternative of complete semantic integration of data requires a massive, sustained effort in mapping data types and maintaining ontologies. We focused instead on creating a data architecture that leverages semantic mapping of experimental metadata, to support the rapid prototyping of scientific discovery applications with the twin goals of reducing architectural complexity while still leveraging semantic technologies to provide flexibility, efficiency and more fully characterized data relationships. A metadata ontology was developed to describe our discovery process. A metadata repository was then created by mapping metadata from existing data sources into this ontology, generating RDF triples to describe the entities. Finally an interface to the repository was designed which provided not only search and browse capabilities but complex query templates that aggregate data from both RDF and RDBMS sources. We describe how this approach (i) allows scientists to discover and link relevant data across diverse data sources and (ii) provides a platform for development of integrative informatics applications.
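The step of mapping source metadata into an ontology and generating RDF triples can be sketched as follows. This is only an illustration under stated assumptions: the `example.org` IRIs, the field names and the `to_ntriples` helper are hypothetical, and the abstract does not say which RDF serialization or tooling the authors used. The output here is N-Triples with plain string literals.

```python
def to_ntriples(subject_iri, metadata, predicate_map):
    """Map a flat metadata record onto ontology predicates as N-Triples lines.

    Fields with no entry in predicate_map are skipped rather than guessed.
    """
    lines = []
    for key, value in metadata.items():
        predicate = predicate_map.get(key)
        if predicate is None:
            continue  # no ontology term mapped for this source field
        # Escape backslashes and double quotes for an N-Triples string literal.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_iri}> <{predicate}> "{escaped}" .')
    return lines


# Hypothetical ontology terms and experiment record.
predicate_map = {
    "assay": "http://example.org/onto#assayType",
    "organism": "http://example.org/onto#organism",
}
triples = to_ntriples(
    "http://example.org/experiment/42",
    {"assay": "microarray", "organism": "Homo sapiens", "lab_notes": "n/a"},
    predicate_map,
)
```

Keeping the field-to-predicate mapping in a separate table is what lets new data sources be added without changing the ontology itself, which matches the abstract's goal of reducing architectural complexity.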
Combined use of semantics and metadata to manage Research Data Life Cycle in Environmental Sciences

Science.gov (United States)

Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida

2017-04-01

The use of metadata to contextualize datasets is widespread in Earth System Sciences. There are initiatives and tools that help data managers choose the metadata standard that best fits their use cases, such as the DCC Metadata Directory (http://www.dcc.ac.uk/resources/metadata-standards). In our use case, we have been gathering physical, chemical and biological data from a water reservoir since 2010. A good metadata definition is crucial not only to contextualize our own data but also to integrate datasets from other sources such as satellites or meteorological agencies. That is why we have chosen EML (Ecological Metadata Language), which integrates many different elements to define a dataset, including the project context, instrumentation and parameter definitions, the software used for processing and quality control, and the publication details. These metadata elements help both humans and machines to understand and process the dataset. However, metadata alone are not enough to fully support the data life cycle, from the Data Management Plan definition to Publication and Re-use. To do so, we need to define not only metadata and attributes but also the relationships between them, so semantics are needed. Ontologies, being a knowledge representation, can help define the elements of a research data life cycle, including the DMP, datasets, software, etc. They can also define how the different elements relate to one another and how they interact. The first advantage of developing an ontology of a knowledge domain is that it provides a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain (either humans or machines). This way of using ontologies is one of the bases of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop an ontology we are using a graphical tool
Knowledge Support and Automation for Performance Analysis with PerfExplorer 2.0

Directory of Open Access Journals (Sweden)

Kevin A. Huck

2008-01-01

Full Text Available The integration of scalable performance analysis in parallel development tools is difficult. The potential size of data sets and the need to compare results from multiple experiments make it a challenge to manage and process the information. Simply characterizing the performance of parallel applications running on potentially hundreds of thousands of processor cores requires new scalable analysis techniques. Furthermore, many exploratory analysis processes are repeatable and could be automated, but are now implemented as manual procedures. In this paper, we will discuss the current version of PerfExplorer, a performance analysis framework which provides dimension reduction, clustering and correlation analysis of individual trials of large dimensions, and can perform relative performance analysis between multiple application executions. PerfExplorer analysis processes can be captured in the form of Python scripts, automating what would otherwise be time-consuming tasks. We will give examples of large-scale analysis results, and discuss the future development of the framework, including the encoding and processing of expert performance rules, and the increasing use of performance metadata.
A method to establish seismic noise baselines for automated station assessment

Science.gov (United States)

McNamara, D.E.; Hutt, C.R.; Gee, L.S.; Benz, H.M.; Buland, R.P.

2009-01-01

We present a method for quantifying station noise baselines and characterizing the spectral shape of out-of-nominal noise sources. Our intent is to automate this method in order to ensure that only the highest-quality data are used in rapid earthquake products at NEIC. In addition, the station noise baselines provide a valuable tool to support the quality control of GSN and ANSS backbone data and metadata. The procedures addressed here are currently in development at the NEIC, and work is underway to understand how quickly changes from nominal can be observed and used within the NEIC processing framework. The spectral methods and software used to compute station baselines and described herein (PQLX) can be useful to both permanent and portable seismic station operators. Applications include: general seismic station and data quality control (QC), evaluation of instrument responses, assessment of near real-time communication system performance, characterization of site cultural noise conditions, and evaluation of sensor vault design, as well as assessment of gross network capabilities (McNamara et al. 2005). Future PQLX development plans include incorporating station baselines for automated QC methods and automating station status report generation and notification based on user-defined QC parameters. The PQLX software is available through the USGS (http://earthquake.usgs.gov/research/software/pqlx.php) and IRIS (http://www.iris.edu/software/pqlx/).
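The idea of a station noise baseline used for automated QC can be sketched in a few lines. This is not the PQLX algorithm: it assumes, purely for illustration, that the baseline is the per-period median of past power spectral density (PSD) estimates in dB and that a new estimate is out of nominal where it departs from the baseline by more than a user-defined tolerance. The function names, sample values and default tolerance are all hypothetical.

```python
from statistics import median


def noise_baseline(psd_history):
    """Per-period-bin median of past PSD estimates.

    psd_history is a list of PSD estimates; each estimate is a list of
    power values in dB, one per period bin, in the same bin order.
    """
    return [median(bin_values) for bin_values in zip(*psd_history)]


def out_of_nominal(psd, baseline, tolerance_db=10.0):
    """Return indices of period bins deviating from the baseline by more than the tolerance."""
    return [
        i for i, (power, base) in enumerate(zip(psd, baseline))
        if abs(power - base) > tolerance_db
    ]


# Three hypothetical past PSD estimates over three period bins (dB).
history = [
    [-150.0, -140.0, -130.0],
    [-152.0, -141.0, -128.0],
    [-148.0, -139.0, -131.0],
]
baseline = noise_baseline(history)
flagged = out_of_nominal([-149.0, -120.0, -131.0], baseline)
```

A median is used here because it is insensitive to transient events such as earthquakes in the history; a status report could then be triggered whenever `flagged` is non-empty.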
A DDI3.2 Style for Data and Metadata Extracted from SAS

OpenAIRE

Hoyle, Larry

2014-01-01

Earlier work by Wackerow and Hoyle has shown that DDI can be a useful medium for interchange of data and metadata among statistical packages. DDI 3.2 has new features which enhance this capability, such as the ability to use UserAttributePairs to represent custom attributes. The metadata from a statistical package can also be represented in DDI3.2 using several different styles – embedded in a StudyUnit, in a Resource Package, or in a set of Fragments. The DDI Documentation for a Fragment sta...
A Flexible Online Metadata Editing and Management System

Energy Technology Data Exchange (ETDEWEB)

Aguilar, Raul [Arizona State University; Pan, Jerry Yun [ORNL; Gries, Corinna [Arizona State University; Inigo, Gil San [University of New Mexico, Albuquerque; Palanisamy, Giri [ORNL

2010-01-01

A metadata editing and management system is being developed employing state of the art XML technologies. A modular and distributed design was chosen for scalability, flexibility, options for customizations, and the possibility to add more functionality at a later stage. The system consists of a desktop design tool or schema walker used to generate code for the actual online editor, a native XML database, and an online user access management application. The design tool is a Java Swing application that reads an XML schema, provides the designer with options to combine input fields into online forms and give the fields user friendly tags. Based on design decisions, the tool generates code for the online metadata editor. The code generated is an implementation of the XForms standard using the Orbeon Framework. The design tool fulfills two requirements: First, data entry forms based on one schema may be customized at design time and second data entry applications may be generated for any valid XML schema without relying on custom information in the schema. However, the customized information generated at design time is saved in a configuration file which may be re-used and changed again in the design tool. Future developments will add functionality to the design tool to integrate help text, tool tips, project specific keyword lists, and thesaurus services. Additional styling of the finished editor is accomplished via cascading style sheets which may be further customized and different look-and-feels may be accumulated through the community process. The customized editor produces XML files in compliance with the original schema, however, data from the current page is saved into a native XML database whenever the user moves to the next screen or pushes the save button independently of validity. Currently the system uses the open source XML database eXist for storage and management, which comes with third party online and desktop management tools. However, access to
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and UrbIS

Science.gov (United States)

Crow, M. C.; Devarakonda, R.; Hook, L.; Killeffer, T.; Krassovski, M.; Boden, T.; King, A. W.; Wullschleger, S. D.

2016-12-01

Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This discussion describes tools being used in two different projects at Oak Ridge National Laboratory (ORNL), but at different stages of the data lifecycle. The Metadata Entry and Data Search Tool is being used for the documentation, archival, and data discovery stages for the Next Generation Ecosystem Experiment - Arctic (NGEE Arctic) project while the Urban Information Systems (UrbIS) Data Catalog is being used to support indexing, cataloging, and searching. The NGEE Arctic Online Metadata Entry Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The UrbIS Data Catalog is a data discovery tool supported by the Mercury cataloging framework [2] which aims to compile urban environmental data from around the world into one location, and be searchable via a user-friendly interface. Each data record conveniently displays its title, source, and date range, and features: (1) a button for a quick view of the metadata, (2) a direct link to the data and, for some data sets, (3) a button for visualizing the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for searching by area. References: [1] Devarakonda, Ranjeet, et al. "Use of a metadata documentation and search tool for large data volumes: The NGEE arctic example." Big Data (Big Data), 2015 IEEE International Conference on. IEEE, 2015. [2] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery
MetaRNA-Seq: An Interactive Tool to Browse and Annotate Metadata from RNA-Seq Studies

Directory of Open Access Journals (Sweden)

Pankaj Kumar

2015-01-01

Full Text Available The number of RNA-Seq studies has grown in recent years. The design of RNA-Seq studies varies from very simple (e.g., two-condition case-control) to very complicated (e.g., time series involving multiple samples at each time point with separate drug treatments). Most of these publicly available RNA-Seq studies are deposited in NCBI databases, but their metadata are scattered throughout four different databases: Sequence Read Archive (SRA), Biosample, Bioprojects, and Gene Expression Omnibus (GEO). Although the NCBI web interface is able to provide all of the metadata information, it often requires significant effort to retrieve study- or project-level information by traversing through multiple hyperlinks and going to another page. Moreover, project- and study-level metadata lack manual or automatic curation by categories, such as disease type, time series, case-control, or replicate type, which are vital to comprehending any RNA-Seq study. Here we describe “MetaRNA-Seq,” a new tool for interactively browsing, searching, and annotating RNA-Seq metadata with the capability of semiautomatic curation at the study level.
Semantic web technologies for video surveillance metadata

OpenAIRE

Poppe, Chris; Martens, Gaëtan; De Potter, Pieterjan; Van de Walle, Rik

2012-01-01

Video surveillance systems are growing in size and complexity. Such systems typically consist of integrated modules of different vendors to cope with the increasing demands on network and storage capacity, intelligent video analytics, picture quality, and enhanced visual interfaces. Within a surveillance system, relevant information (like technical details on the video sequences, or analysis results of the monitored environment) is described using metadata standards. However, different module...
Metadata: A user's view

Energy Technology Data Exchange (ETDEWEB)

Bretherton, F.P. [Univ. of Wisconsin, Madison, WI (United States)]; Singley, P.T. [Oak Ridge National Lab., TN (United States)]

1994-12-31

An analysis is presented of the uses of metadata from four aspects of database operations: (1) search, query, retrieval, (2) ingest, quality control, processing, (3) application to application transfer; (4) storage, archive. Typical degrees of database functionality ranging from simple file retrieval to interdisciplinary global query with metadatabase-user dialog and involving many distributed autonomous databases, are ranked in approximate order of increasing sophistication of the required knowledge representation. An architecture is outlined for implementing such functionality in many different disciplinary domains utilizing a variety of off the shelf database management subsystems and processor software, each specialized to a different abstract data model.

The Earthscope USArray Array Network Facility (ANF): Metadata, Network and Data Monitoring, Quality Assurance During the Second Year of Operations

Science.gov (United States)

Eakins, J. A.; Vernon, F. L.; Martynov, V.; Newman, R. L.; Cox, T. A.; Lindquist, K. L.; Hindley, A.; Foley, S.

2005-12-01

The Array Network Facility (ANF) for the Earthscope USArray Transportable Array seismic network is responsible for: the delivery of all Transportable Array stations (400 at full deployment) and telemetered Flexible Array stations (up to 200) to the IRIS Data Management Center; station command and control; verification and distribution of metadata; providing useful remotely accessible world wide web interfaces for personnel at the Array Operations Facility (AOF) to access state of health information; and quality control for all data. To meet these goals, we use the Antelope software package to facilitate data collection and transfer, generation and merging of the metadata, real-time monitoring of dataloggers, generation of station noise spectra, and analyst review of individual events. Recently, an Antelope extension to the PHP scripting language has been implemented which facilitates the dynamic presentation of the real-time data to local web pages. Metadata transfers have been simplified by the use of orb transfer technologies at the ANF and receiver end points. Web services are being investigated as a means to make a potentially complicated set of operations easy to follow and reproduce for each newly installed or decommissioned station. As part of the quality control process, daily analyst review has highlighted areas where neither the regional network bulletins nor the USGS global bulletin have published solutions. Currently four regional networks (Anza, BDSN, SCSN, and UNR) contribute data to the Transportable Array with additional contributors expected. The first 100 stations (42 new Earthscope stations) were operational by September 2005 with all but one of the California stations installed. By year's end, weather permitting, the total number of stations deployed is expected to be around 145. Visit http://anf.ucsd.edu for more information on the project and current status.
The ATLAS EventIndex: data flow and inclusion of other metadata

Science.gov (United States)

Barberis, D.; Cárdenas Zárate, S. E.; Favareto, A.; Fernandez Casani, A.; Gallas, E. J.; Garcia Montoro, C.; Gonzalez de la Hoz, S.; Hrivnac, J.; Malon, D.; Prokoshin, F.; Salt, J.; Sanchez, J.; Toebbicke, R.; Yuan, R.; ATLAS Collaboration

2016-10-01

The ATLAS EventIndex is the catalogue of the event-related metadata for the information collected from the ATLAS detector. The basic unit of this information is the event record, containing the event identification parameters, pointers to the files containing this event as well as trigger decision information. The main use case for the EventIndex is event picking, as well as data consistency checks for large production campaigns. The EventIndex employs the Hadoop platform for data storage and handling, as well as a messaging system for the collection of information. The information for the EventIndex is collected both at Tier-0, when the data are first produced, and from the Grid, when various types of derived data are produced. The EventIndex uses various types of auxiliary information from other ATLAS sources for data collection and processing: trigger tables from the condition metadata database (COMA), dataset information from the data catalogue AMI and the Rucio data management system and information on production jobs from the ATLAS production system. The ATLAS production system is also used for the collection of event information from the Grid jobs. EventIndex developments started in 2012 and in the middle of 2015 the system was commissioned and started collecting event metadata, as a part of ATLAS Distributed Computing operations.
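As an illustration of the event record and the event-picking use case described above, here is a minimal Python sketch. The field names (`run_number`, `event_number`, `file_guids`, `trigger_bits`), the in-memory dictionary, and the `pick_event` helper are hypothetical; the production EventIndex stores its records on Hadoop, and its actual schema is not given in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical event record: identification, file pointers, trigger info."""
    run_number: int
    event_number: int
    file_guids: tuple = ()                  # pointers to files containing the event
    trigger_bits: frozenset = frozenset()   # names of triggers that accepted the event


class EventCatalogue:
    """Toy in-memory catalogue keyed by (run, event); a stand-in for the real store."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[(record.run_number, record.event_number)] = record

    def pick_event(self, run_number, event_number):
        """Event picking: return the file pointers recorded for one event."""
        record = self._records.get((run_number, event_number))
        return list(record.file_guids) if record else []


catalogue = EventCatalogue()
catalogue.add(EventRecord(1, 42, ("guid-a", "guid-b"), frozenset({"trigger_x"})))
picked = catalogue.pick_event(1, 42)
```

Keying the catalogue on the (run, event) pair keeps event picking a single lookup, which is the access pattern the abstract names as the main use case.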
Indexing of ATLAS data management and analysis system metadata

CERN Document Server

Grigoryeva, Maria; The ATLAS collaboration

2017-01-01

This manuscript is devoted to the development of the system to manage metainformation of modern HENP experiments. The main purpose of the system is to provide scientists with transparent access to the actual and historical metadata related to data analysis, processing and modeling. The system design addresses the following goals: providing a flexible and fast search for metadata on various combinations of keywords, and generating aggregated reports, categorized according to selected parameters, such as the studied physical process, scientific topic, physical group, etc. The article presents the architecture of the developed indexing and search system, as well as the results of performance tests. A comparison of query execution speed in the developed system and in the original relational databases showed that the developed system returns results faster. The new system also allows much more complex search requests than the original storages.
ATLAS Metadata Infrastructure Evolution for Run 2 and Beyond

CERN Document Server

van Gemmeren, Peter; The ATLAS collaboration; Malon, David; Vaniachine, Alexandre

2015-01-01

ATLAS developed and employed for Run 1 of the Large Hadron Collider a sophisticated infrastructure for metadata handling in event processing jobs. This infrastructure profits from a rich feature set provided by the ATLAS execution control framework, including standardized interfaces and invocation mechanisms for tools and services, segregation of transient data stores with concomitant object lifetime management, and mechanisms for handling occurrences asynchronous to the control framework’s state machine transitions. This metadata infrastructure is evolving and being extended for Run 2 to allow its use and reuse in downstream physics analyses, analyses that may or may not utilize the ATLAS control framework. At the same time, multiprocessing versions of the control framework and the requirements of future multithreaded frameworks are leading to redesign of components that use an incident-handling approach to asynchrony. The increased use of scatter-gather architectures, both local and distributed, requires ...
Big Earth Data Initiative: Metadata Improvement: Case Studies

Science.gov (United States)

Kozimor, John; Habermann, Ted; Farley, John

2016-01-01

The Big Earth Data Initiative (BEDI) invests in standardizing and optimizing the collection, management and delivery of the U.S. Government's civil Earth observation data to improve discovery, access, use, and understanding of Earth observations by the broader user community. Complete and consistent standard metadata helps address all three goals.
Building a scalable event-level metadata service for ATLAS

International Nuclear Information System (INIS)

Cranshaw, J; Malon, D; Goosens, L; Viegas, F T A; McGlone, H

2008-01-01

The ATLAS TAG Database is a multi-terabyte event-level metadata selection system, intended to allow discovery, selection of and navigation to events of interest to an analysis. The TAG Database encompasses file- and relational-database-resident event-level metadata, distributed across all ATLAS Tiers. An Oracle-hosted global TAG relational database, containing all ATLAS events, will exist at Tier 0. Implementing a system that is both performant and manageable at this scale is a challenge. A 1 TB relational TAG Database has been deployed at Tier 0 using simulated tag data. The database contains one billion events, each described by two hundred event metadata attributes, and is currently undergoing extensive testing in terms of queries, population and manageability. These 1 TB tests aim to demonstrate and optimise the performance and scalability of an Oracle TAG Database on a global scale. Partitioning and indexing strategies are crucial to well-performing queries and manageability of the database and have implications for database population and distribution, so these are investigated. Physics query patterns are anticipated, but a crucial feature of the system must be to support a broad range of queries across all attributes. Concurrently, event tags from ATLAS Computing System Commissioning distributed simulations are accumulated in an Oracle-hosted database at CERN, providing an event-level selection service valuable for user experience and gathering information about physics query patterns. In this paper we describe the status of the Global TAG relational database scalability work and highlight areas of future direction.
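The figures quoted above imply a rough storage budget per attribute. The back-of-the-envelope calculation below assumes a decimal terabyte (10^12 bytes) and ignores index and storage overhead, neither of which is stated in the abstract.

```python
# Figures from the abstract; the decimal-terabyte convention is an assumption.
DATABASE_BYTES = 1 * 10**12      # 1 TB test database
EVENTS = 10**9                   # one billion events
ATTRIBUTES_PER_EVENT = 200       # two hundred metadata attributes per event

bytes_per_event = DATABASE_BYTES / EVENTS
bytes_per_attribute = bytes_per_event / ATTRIBUTES_PER_EVENT

print(f"{bytes_per_event:.0f} bytes per event")
print(f"{bytes_per_attribute:.0f} bytes per attribute on average")
```

Under these assumptions each event occupies about 1000 bytes, or about 5 bytes per attribute on average, which suggests why partitioning and indexing choices matter so much at this scale.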
The Geospatial Metadata Manager’s Toolbox: Three Techniques for Maintaining Records

Directory of Open Access Journals (Sweden)

Bruce Godfrey

2015-07-01

Full Text Available Managing geospatial metadata records requires a range of techniques. At the University of Idaho Library, we have tens of thousands of records which need to be maintained, as well as new records which need to be normalized and added to the collections. We show a graphical user interface (GUI) tool that was developed to make simple modifications, a simple XSLT that operates on complex metadata, and a Python script which enables parallel processing to make maintenance tasks more efficient. Throughout, we compare these techniques and discuss when they may be useful.
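A batch maintenance task of the kind described can be sketched in Python as follows. This is not the authors' script: the `normalize_title` rule, the sample records, and the choice of a thread pool are illustrative assumptions. For CPU-bound work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def normalize_title(xml_text):
    """Hypothetical maintenance rule: collapse whitespace in every <title> element."""
    root = ET.fromstring(xml_text)
    for title in root.iter("title"):
        title.text = " ".join((title.text or "").split())
    return ET.tostring(root, encoding="unicode")


def normalize_all(records, max_workers=4):
    """Apply the rule to many records concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(normalize_title, records))


# Made-up sample records, for illustration only.
records = [
    "<metadata><title>  Idaho   roads </title></metadata>",
    "<metadata><title>Snake River\n basin</title></metadata>",
]
cleaned = normalize_all(records)
```

Because `map` returns results in input order, the cleaned records can be written back alongside their source files without extra bookkeeping.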
Metafier - a Tool for Annotating and Structuring Building Metadata

DEFF Research Database (Denmark)

Holmegaard, Emil; Johansen, Aslak; Kjærgaard, Mikkel Baun

2017-01-01

in achieving this goal, but often they work as silos. Improving at scale the energy performance of buildings depends on applications breaking these silos and being portable among buildings. To enable portable building applications, the building instrumentation should be supported by a metadata layer...
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

Directory of Open Access Journals (Sweden)

Benjamin C Hitz

Full Text Available The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.
Towards a best practice of modeling unit of measure and related statistical metadata

CERN Document Server

Grossmann, Wilfried

2011-01-01

Data and metadata exchange between organizations requires a common language for describing structure and content of statistical data and metadata. The SDMX consortium develops content oriented guidelines (COG) recommending harmonized cross-domain concepts and terminology to increase the efficiency of (meta-) data exchange. A recent challenge is a recommended code list for the unit of measure. Based on examples from SDMX sponsor organizations this paper analyses the diversity of "unit of measure" as used in practice, including potential breakdowns and interdependencies of the respective meta-
Launch Control System Software Development System Automation Testing

Science.gov (United States)

Hwang, Andrew

2017-01-01

) tool to Brandon Echols, a fellow intern, and I. The purpose of the OCR tool is to analyze an image and find the coordinates of any group of text. Some issues that arose while installing the OCR tool included the absence of certain libraries needed to train the tool and an outdated software version. We eventually resolved the issues and successfully installed the OCR tool. Training the tool required many images and different fonts and sizes, but in the end the tool learned to accurately decipher the text in the images and their coordinates. The OCR tool produced a file that contained significant metadata for each section of text, but only the text and its coordinates were required for our purpose. The team made a script to parse the information we wanted from the OCR file to a different file that would be used by automation functions within the automated framework. Since a majority of development and testing for the automated test cases for the GUI in question has been done using live simulated data on the workstations at the Launch Control Center (LCC), a large amount of progress has been made. As of this writing, about 60% of all automated testing has been implemented. Additionally, the OCR tool will help make our automated tests more robust due to the tool's text recognition being highly scalable to different text fonts and text sizes. Soon we will have the whole test system automated, freeing more full-time engineers to work on development projects.
Automated Inadvertent Intruder Application

International Nuclear Information System (INIS)

Koffman, Larry D.; Lee, Patricia L.; Cook, James R.; Wilhite, Elmer L.

2008-01-01

The Environmental Analysis and Performance Modeling group of Savannah River National Laboratory (SRNL) conducts performance assessments of the Savannah River Site (SRS) low-level waste facilities to meet the requirements of DOE Order 435.1. These performance assessments, which result in limits on the amounts of radiological substances that can be placed in the waste disposal facilities, consider numerous potential exposure pathways that could occur in the future. One set of exposure scenarios, known as inadvertent intruder analysis, considers the impact on hypothetical individuals who are assumed to inadvertently intrude onto the waste disposal site. Inadvertent intruder analysis considers three distinct scenarios for exposure referred to as the agriculture scenario, the resident scenario, and the post-drilling scenario. Each of these scenarios has specific exposure pathways that contribute to the overall dose for the scenario. For the inadvertent intruder analysis, the calculation of dose for the exposure pathways is a relatively straightforward algebraic calculation that utilizes dose conversion factors. Prior to 2004, these calculations were performed using an Excel spreadsheet. However, design checks of the spreadsheet calculations revealed that errors could be introduced inadvertently when copying spreadsheet formulas cell by cell and finding these errors was tedious and time consuming. This weakness led to the specification of functional requirements to create a software application that would automate the calculations for inadvertent intruder analysis using a controlled source of input parameters. This software application, named the Automated Inadvertent Intruder Application, has undergone rigorous testing of the internal calculations and meets software QA requirements. The Automated Inadvertent Intruder Application was intended to replace the previous spreadsheet analyses with an automated application that was verified to produce the same calculations and
Making Information Visible, Accessible, and Understandable: Meta-Data and Registries

National Research Council Canada - National Science Library

Robinson, Clay

2007-01-01

... the interoperability, discovery, and utility of data assets throughout the Department of Defense (DoD). Proper use and understanding of metadata can substantially enhance the utility of data by making it more visible, accessible, and understandable...
Automated remedial assessment methodology software system

International Nuclear Information System (INIS)

Whiting, M.; Wilkins, M.; Stiles, D.

1994-11-01

The Automated Remedial Analysis Methodology (ARAM) software system has been developed by the Pacific Northwest Laboratory to assist the U.S. Department of Energy (DOE) in evaluating cleanup options for over 10,000 contaminated sites across the DOE complex. The automated methodology comprises modules for decision logic diagrams, technology applicability and effectiveness rules, mass balance equations, cost and labor estimating factors and equations, and contaminant stream routing. ARAM is used to select technologies for meeting cleanup targets; determine the effectiveness of the technologies in destroying, removing, or immobilizing contaminants; decide the nature and amount of secondary waste requiring further treatment; and estimate the cost and labor involved when applying technologies
Metadata Laws, Journalism and Resistance in Australia

Directory of Open Access Journals (Sweden)

Benedetta Brevini

2017-03-01

Full Text Available The intelligence leaks from Edward Snowden in 2013 unveiled the sophistication and extent of data collection by the United States’ National Security Agency and major global digital firms prompting domestic and international debates about the balance between security and privacy, openness and enclosure, accountability and secrecy. It is difficult not to see a clear connection with the Snowden leaks in the sharp acceleration of new national security legislations in Australia, a long term member of the Five Eyes Alliance. In October 2015, the Australian federal government passed controversial laws that require telecommunications companies to retain the metadata of their customers for a period of two years. The new acts pose serious threats for the profession of journalism as they enable government agencies to easily identify and pursue journalists’ sources. Bulk data collections of this type of information deter future whistleblowers from approaching journalists, making the performance of the latter’s democratic role a challenge. After situating this debate within the scholarly literature at the intersection between surveillance studies and communication studies, this article discusses the political context in which journalists are operating and working in Australia; assesses how metadata laws have affected journalism practices and addresses the possibility for resistance.
Remote fabrication of nuclear fuel: a secure automated fabrication overview

International Nuclear Information System (INIS)

Nyman, D.H.; Benson, E.M.; Yatabe, J.M.; Nagamoto, T.T.

1981-01-01

An automated line for the fabrication of breeder reactor fuel pins is being developed. The line will be installed in the Fuels and Materials Examination Facility (FMEF) presently under construction at the Hanford site near Richland, Washington. The application of automation and remote operations to fuel processing technology is needed to meet program requirements of reduced personnel exposure, enhanced safeguards, improved product quality, and increased productivity. Commercially available robots are being integrated into operations such as handling of radioactive material within a process operation. These and other automated equipment and chemistry analyses systems under development are described
Operations management system advanced automation: Fault detection isolation and recovery prototyping

Science.gov (United States)

Hanson, Matt

1990-01-01

The purpose of this project is to address the global fault detection, isolation and recovery (FDIR) requirements for Operations Management System (OMS) automation within the Space Station Freedom program. This shall be accomplished by developing a selected FDIR prototype for the Space Station Freedom distributed processing systems. The prototype shall be based on advanced automation methodologies in addition to traditional software methods to meet the requirements for automation. A secondary objective is to expand the scope of the prototyping to encompass multiple aspects of station-wide fault management (SWFM) as discussed in OMS requirements documentation.
CHARMe Commentary metadata for Climate Science: collecting, linking and sharing user feedback on climate datasets

Science.gov (United States)

Blower, Jon; Lawrence, Bryan; Kershaw, Philip; Nagni, Maurizio

2014-05-01

The research process can be thought of as an iterative activity, initiated based on prior domain knowledge, as well as on a number of external inputs, and producing a range of outputs including datasets, studies and peer reviewed publications. These outputs may describe the problem under study, the methodology used, the results obtained, etc. In any new publication, the author may cite or comment on other papers or datasets in order to support their research hypothesis. However, as their work progresses, the researcher may draw from many other latent channels of information. These could include, for example, a private conversation following a lecture or during a social dinner, or an opinion expressed concerning some significant event such as an earthquake or a satellite failure. In addition, public sources of grey literature, such as informal papers (for example, arXiv deposits), reports and studies, are important. The climate science community is no exception to this pattern; the CHARMe project, funded under the European FP7 framework, is developing an online system for collecting and sharing user feedback on climate datasets. This is to help users judge how suitable such climate data are for an intended application. The user feedback could be comments about assessments, citations, or provenance of the dataset, or other information such as descriptions of uncertainty or data quality. We define this as a distinct category of metadata called Commentary or C-metadata. We link C-metadata with target climate datasets using a Linked Data approach via the Open Annotation data model. In the context of Linked Data, C-metadata plays the role of a resource which, depending on its nature, may be accessed as simple text or as more structured content. The project is implementing a range of software tools to create, search or visualize C-metadata including a JavaScript plugin enabling this functionality to be integrated in situ with data provider portals
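To make the linking step concrete, the sketch below builds one commentary record as a JSON-LD-style dictionary using terms from the Open Annotation vocabulary (`oa:Annotation`, `oa:hasBody`, `oa:hasTarget`, `oa:motivatedBy`). The dataset URI, the comment text, and the `make_commentary` helper are hypothetical and do not reflect the CHARMe implementation.

```python
import json

OA_NS = "http://www.w3.org/ns/oa#"
CNT_NS = "http://www.w3.org/2011/content#"


def make_commentary(target_uri, comment_text):
    """Build a minimal annotation linking a free-text comment to a dataset URI."""
    return {
        "@context": {"oa": OA_NS, "cnt": CNT_NS},
        "@type": "oa:Annotation",
        "oa:motivatedBy": {"@id": "oa:commenting"},
        # Body: the commentary itself, carried inline as text.
        "oa:hasBody": {"@type": "cnt:ContentAsText", "cnt:chars": comment_text},
        # Target: the climate dataset the commentary is about.
        "oa:hasTarget": {"@id": target_uri},
    }


# Placeholder URI and comment, for illustration only.
annotation = make_commentary(
    "http://example.org/datasets/sst-v1",
    "Known calibration drift after the sensor change.",
)
serialized = json.dumps(annotation, indent=2)
```

Keeping the body and target as separate resources is what lets the same commentary be retrieved either as plain text or as structured content, as the abstract describes.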
Automated Security Testing of Web Widget Interactions

NARCIS (Netherlands)

Bezemer, C.P.; Mesbah, A.; Van Deursen, A.

2009-01-01

This paper is a pre-print of: Cor-Paul Bezemer, Ali Mesbah, and Arie van Deursen. Automated Security Testing of Web Widget Interactions. In Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering
GEO Label Web Services for Dynamic and Effective Communication of Geospatial Metadata Quality

Science.gov (United States)

Lush, Victoria; Nüst, Daniel; Bastin, Lucy; Masó, Joan; Lumsden, Jo

2014-05-01

We present demonstrations of the GEO label Web services and their integration into a prototype extension of the GEOSS portal (http://scgeoviqua.sapienzaconsulting.com/web/guest/geo_home), the GMU portal (http://gis.csiss.gmu.edu/GADMFS/) and a GeoNetwork catalog application (http://uncertdata.aston.ac.uk:8080/geonetwork/srv/eng/main.home). The GEO label is designed to communicate, and facilitate interrogation of, geospatial quality information with a view to supporting efficient and effective dataset selection on the basis of quality, trustworthiness and fitness for use. The GEO label which we propose was developed and evaluated according to a user-centred design (UCD) approach in order to maximise the likelihood of user acceptance once deployed. The resulting label is dynamically generated from producer metadata in ISO or FGDC format, and incorporates user feedback on dataset usage, ratings and discovered issues, in order to supply a highly informative summary of metadata completeness and quality. The label was easily incorporated into a community portal as part of the GEO Architecture Implementation Programme (AIP-6) and has been successfully integrated into a prototype extension of the GEOSS portal, as well as the popular metadata catalog and editor, GeoNetwork. The design of the GEO label was based on 4 user studies conducted to: (1) elicit initial user requirements; (2) investigate initial user views on the concept of a GEO label and its potential role; (3) evaluate prototype label visualizations; and (4) evaluate and validate physical GEO label prototypes. The results of these studies indicated that users and producers support the concept of a label with drill-down interrogation facility, combining eight geospatial data informational aspects, namely: producer profile, producer comments, lineage information, standards compliance, quality information, user feedback, expert reviews, and citations information. These are delivered as eight facets of a wheel

AUTOMATED INADVERTENT INTRUDER APPLICATION

International Nuclear Information System (INIS)

Koffman, L; Patricia Lee, P; Jim Cook, J; Elmer Wilhite, E

2007-01-01

The Environmental Analysis and Performance Modeling group of Savannah River National Laboratory (SRNL) conducts performance assessments of the Savannah River Site (SRS) low-level waste facilities to meet the requirements of DOE Order 435.1. These performance assessments, which result in limits on the amounts of radiological substances that can be placed in the waste disposal facilities, consider numerous potential exposure pathways that could occur in the future. One set of exposure scenarios, known as inadvertent intruder analysis, considers the impact on hypothetical individuals who are assumed to inadvertently intrude onto the waste disposal site. Inadvertent intruder analysis considers three distinct scenarios for exposure referred to as the agriculture scenario, the resident scenario, and the post-drilling scenario. Each of these scenarios has specific exposure pathways that contribute to the overall dose for the scenario. For the inadvertent intruder analysis, the calculation of dose for the exposure pathways is a relatively straightforward algebraic calculation that utilizes dose conversion factors. Prior to 2004, these calculations were performed using an Excel spreadsheet. However, design checks of the spreadsheet calculations revealed that errors could be introduced inadvertently when copying spreadsheet formulas cell by cell and finding these errors was tedious and time consuming. This weakness led to the specification of functional requirements to create a software application that would automate the calculations for inadvertent intruder analysis using a controlled source of input parameters. This software application, named the Automated Inadvertent Intruder Application, has undergone rigorous testing of the internal calculations and meets software QA requirements. The Automated Inadvertent Intruder Application was intended to replace the previous spreadsheet analyses with an automated application that was verified to produce the same calculations and
Latest developments for the IAGOS database: Interoperability and metadata

Science.gov (United States)

Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Schultz, Martin; van Velthoven, Peter; Broetz, Bjoern; Rauthe-Schöch, Armin; Brissebrat, Guillaume

2014-05-01

In-service Aircraft for a Global Observing System (IAGOS, http://www.iagos.org) aims at the provision of long-term, frequent, regular, accurate, and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access is handled by open access policy based on the submission of research requests which are reviewed by the PIs. Users can access the data through the following web sites: http://www.iagos.fr or http://www.pole-ether.fr as the IAGOS database is part of the French atmospheric chemistry data centre ETHER (CNES and CNRS). The database is in continuous development and improvement. In the framework of the IGAS project (IAGOS for GMES/COPERNICUS Atmospheric Service), major achievements will be reached, such as metadata and format standardisation in order to interoperate with international portals and other databases, QA/QC procedures and traceability, CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data integration within the central database, and the real-time data transmission. IGAS work package 2 aims at providing the IAGOS data to users in a standardized format including the necessary metadata and information on data processing, data quality and uncertainties. We are currently redefining and standardizing the IAGOS metadata for interoperable use within GMES/Copernicus. The metadata are compliant with the ISO 19115, INSPIRE and NetCDF-CF conventions. IAGOS data will be provided to users in NetCDF or NASA Ames format. We also are implementing interoperability between all the involved IAGOS data services, including the central IAGOS database, the former MOZAIC and CARIBIC databases, Aircraft Research DLR database and the Jülich WCS web application JOIN (Jülich OWS Interface) which combines model outputs with in situ data for
Training and Best Practice Guidelines: Implications for Metadata Creation

Science.gov (United States)

Chuttur, Mohammad Y.

2012-01-01

In response to the rapid development of digital libraries over the past decade, researchers have focused on the use of metadata as an effective means to support resource discovery within online repositories. With the increasing involvement of libraries in digitization projects and the growing number of institutional repositories, it is anticipated…
Recent international activity in cooperative vehicle-highway automation systems.

Science.gov (United States)

2012-12-01

This report summarizes the current state of the art in cooperative vehicle-highway automation systems in Europe and Asia, based on a series of meetings, demonstrations, and site visits, combined with the results of literature review. This review c...
Automation of the dicentric chromosome assay and related assays

International Nuclear Information System (INIS)

Balajee, Adayabalam S.; Dainiak, Nicholas

2016-01-01

Dicentric Chromosome Assay (DCA) is considered to be the 'gold standard' for personalized dose assessment in humans after accidental or incidental radiation exposure. Although this technique is superior to other cytogenetic assays in terms of specificity and sensitivity, its potential application to radiation mass casualty scenarios is highly restricted because DCA is time consuming and labor intensive when performed manually. Therefore, it is imperative to develop high throughput automation techniques to make DCA suitable for radiological triage scenarios. At the Cytogenetic Biodosimetry Laboratory in Oak Ridge, efforts are underway to develop high throughput automation of DCA. Current status on development of various automated cytogenetic techniques in meeting the biodosimetry needs of radiological/nuclear incident(s) will be discussed
Automating the radiographic NDT process

International Nuclear Information System (INIS)

Aman, J.K.

1988-01-01

Automation, the removal of the human element in inspection, has not been generally applied to film radiographic NDT. The justification for automation is not only productivity but also reliability of results. Film remains in the automated system of the future because of its extremely high image content, approximately 3x10^9 bits per 14x17 film. This is equivalent to 2200 computer floppy disks. Parts handling systems and robotics, already applied in manufacturing and some NDT modalities, should now be applied to film radiographic NDT systems. Automatic film handling can be achieved with the daylight NDT film handling system. Automatic film processing is becoming the standard in industry and can be coupled to the daylight system. Robots offer the opportunity to automate fully the exposure step. Finally, computer aided interpretation appears on the horizon. A unit which laser scans a 14x27 (inch) film in 6-8 seconds can digitize film information for further manipulation and possible automatic interrogation (computer aided interpretation). The system, called FDRS (for film digital radiography system), is moving toward 50 micron (16 lines/mm) resolution. This is believed to meet the majority of image content needs. (Author). 4 refs.; 21 figs
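As a rough check of the floppy-disk comparison in this abstract, the implied capacity per disk can be worked out from the two quoted figures. The disk capacity itself is not stated in the abstract; the only assumption added here is 8 bits per byte.

```latex
\[
\frac{3\times10^{9}\ \text{bits}}{8\ \text{bits/byte}} = 3.75\times10^{8}\ \text{bytes},
\qquad
\frac{3.75\times10^{8}\ \text{bytes}}{2200\ \text{disks}} \approx 1.7\times10^{5}\ \text{bytes per disk}
\]
```

That is, the comparison implies roughly 170 kB per floppy disk.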
Generation of Multiple Metadata Formats from a Geospatial Data Repository

Science.gov (United States)

Hudspeth, W. B.; Benedict, K. K.; Scott, S.

2012-12-01

The Earth Data Analysis Center (EDAC) at the University of New Mexico is partnering with the CYBERShARE and Environmental Health Group from the Center for Environmental Resource Management (CERM), located at the University of Texas, El Paso (UTEP), the Biodiversity Institute at the University of Kansas (KU), and the New Mexico Geo-Epidemiology Research Network (GERN) to provide a technical infrastructure that enables investigation of a variety of climate-driven human/environmental systems. Two significant goals of this NASA-funded project are: a) to increase the use of NASA Earth observational data at EDAC by various modeling communities through enabling better discovery, access, and use of relevant information, and b) to expose these communities to the benefits of provenance for improving understanding and usability of heterogeneous data sources and derived model products. To realize these goals, EDAC has leveraged the core capabilities of its Geographic Storage, Transformation, and Retrieval Engine (Gstore) platform, developed with support of the NSF EPSCoR Program. The Gstore geospatial services platform provides general purpose web services based upon the REST service model, and is capable of data discovery, access, and publication functions, metadata delivery functions, data transformation, and auto-generated OGC services for those data products that can support those services. Central to the NASA ACCESS project is the delivery of geospatial metadata in a variety of formats, including ISO 19115-2/19139, FGDC CSDGM, and the Proof Markup Language (PML). This presentation details the extraction and persistence of relevant metadata in the Gstore data store, and their transformation into multiple metadata formats that are increasingly utilized by the geospatial community to document not only core library catalog elements (e.g. title, abstract, publication data, geographic extent, projection information, and database elements), but also the processing steps used to
User interface development and metadata considerations for the Atmospheric Radiation Measurement (ARM) archive

Science.gov (United States)

Singley, P. T.; Bell, J. D.; Daugherty, P. F.; Hubbs, C. A.; Tuggle, J. G.

1993-01-01

This paper will discuss user interface development and the structure and use of metadata for the Atmospheric Radiation Measurement (ARM) Archive. The ARM Archive, located at Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee, is the data repository for the U.S. Department of Energy's (DOE's) ARM Project. After a short description of the ARM Project and the ARM Archive's role, we will consider the philosophy and goals, constraints, and prototype implementation of the user interface for the archive. We will also describe the metadata that are stored at the archive and support the user interface.
Competence Based Educational Metadata for Supporting Lifelong Competence Development Programmes

NARCIS (Netherlands)

Sampson, Demetrios; Fytros, Demetrios

2008-01-01

Sampson, D., & Fytros, D. (2008). Competence Based Educational Metadata for Supporting Lifelong Competence Development Programmes. In P. Diaz, Kinshuk, I. Aedo & E. Mora (Eds.), Proceedings of the 8th IEEE International Conference on Advanced Learning Technologies (ICALT 2008), pp. 288-292. July,
Large-Scale Data Collection Metadata Management at the National Computation Infrastructure

Science.gov (United States)

Wang, J.; Evans, B. J. K.; Bastrakova, I.; Ryder, G.; Martin, J.; Duursma, D.; Gohar, K.; Mackey, T.; Paget, M.; Siddeswara, G.

2014-12-01

Data Collection management has become an essential activity at the National Computation Infrastructure (NCI) in Australia. NCI's partners (CSIRO, Bureau of Meteorology, Australian National University, and Geoscience Australia), supported by the Australian Government and Research Data Storage Infrastructure (RDSI), have established a national data resource that is co-located with high-performance computing. This paper addresses the metadata management of these data assets over their lifetime. NCI manages 36 data collections (10+ PB) categorised as earth system sciences, climate and weather model data assets and products, earth and marine observations and products, geosciences, terrestrial ecosystem, water management and hydrology, astronomy, social science and biosciences. The data is largely sourced from NCI partners, the custodians of many of the national scientific records, and major research community organisations. The data is made available in an HPC and data-intensive environment - a ~56000 core supercomputer, virtual labs on a 3000 core cloud system, and data services. By assembling these large national assets, new opportunities have arisen to harmonise the data collections, making a powerful cross-disciplinary resource. To support the overall management, a Data Management Plan (DMP) has been developed to record the workflows, procedures, the key contacts and responsibilities. The DMP has fields that can be exported to the ISO19115 schema and to the collection level catalogue of GeoNetwork. The subset or file level metadata catalogues are linked with the collection level through parent-child relationship definition using UUID. A number of tools have been developed that support interactive metadata management, bulk loading of data, and support for computational workflows or data pipelines. NCI creates persistent identifiers for each of the assets. The data collection is tracked over its lifetime, and the recognition of the data providers, data owners, data
Energy Assessment of Automated Mobility Districts

Energy Technology Data Exchange (ETDEWEB)

Chen, Yuche [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

2017-08-03

Automated vehicles (AVs) are increasingly being discussed as the basis for on-demand mobility services, introducing a new paradigm in which a fleet of AVs displaces private automobiles for day-to-day travel in dense activity districts. This project examines such a concept to displace privately owned automobiles within a region containing dense activity generators (jobs, retail, entertainment, etc.), referred to as an automated mobility district (AMD). The project reviews several such districts including airports, college campuses, business parks, downtown urban cores, and military bases, with examples of previous attempts to meet the mobility needs apart from private automobiles, some with automated technology and others with more traditional transit based solutions. The issues and benefits of AMDs are framed within the perspective of intra-district, inter-district, and border issues, and the requirements for a modeling framework are identified to adequately reflect the breadth of mobility, energy, and emissions impact anticipated with AMDs.
Logic programming and metadata specifications

Science.gov (United States)

Lopez, Antonio M., Jr.; Saacks, Marguerite E.

1992-01-01

Artificial intelligence (AI) ideas and techniques are critical to the development of intelligent information systems that will be used to collect, manipulate, and retrieve the vast amounts of space data produced by 'Missions to Planet Earth.' Natural language processing, inference, and expert systems are at the core of this space application of AI. This paper presents logic programming as an AI tool that can support inference (the ability to draw conclusions from a set of complicated and interrelated facts). It reports on the use of logic programming in the study of metadata specifications for a small problem domain of airborne sensors, and the dataset characteristics and pointers that are needed for data access.
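The kind of inference this abstract describes, drawing conclusions from interrelated facts, can be illustrated with a minimal forward-chaining sketch. It is written in Python rather than a logic programming language, and the facts, predicate names, and the single rule are hypothetical, chosen only to echo the airborne-sensor domain; none of them come from the paper.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (predicate, subject, object)

# Hypothetical facts about airborne sensors and datasets (not from the paper).
FACTS: Set[Fact] = {
    ("carried_by", "sensor_a", "aircraft_1"),
    ("produces", "sensor_a", "dataset_x"),
    ("stored_at", "dataset_x", "archive_1"),
}


def infer(facts: Set[Fact]) -> Set[Fact]:
    """Forward-chain one illustrative rule until no new facts appear:
    produces(S, D) and stored_at(D, A)  =>  accessible_via(S, A)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (p1, s, d) in list(known):
            if p1 != "produces":
                continue
            for (p2, d2, a) in list(known):
                if p2 == "stored_at" and d2 == d:
                    new = ("accessible_via", s, a)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known
```

Calling `infer(FACTS)` adds the derived fact that `sensor_a` data can be reached through `archive_1`, which is the sort of "pointer needed for data access" a metadata specification might encode.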
NCI's national environmental research data collection: metadata management built on standards and preparing for the semantic web

Science.gov (United States)

Wang, Jingbo; Bastrakova, Irina; Evans, Ben; Gohar, Kashif; Santana, Fabiana; Wyborn, Lesley

2015-04-01

National Computational Infrastructure (NCI) manages national environmental research data collections (10+ PB) as part of its specialized high performance data node of the Research Data Storage Infrastructure (RDSI) program. We manage 40+ data collections using NCI's Data Management Plan (DMP), which is compatible with the ISO 19100 metadata standards. We utilize ISO standards to make sure our metadata is transferable and interoperable for sharing and harvesting. The DMP is used along with metadata from the data itself, to create a hierarchy of data collection, dataset and time series catalogues that is then exposed through GeoNetwork for standard discoverability. These hierarchical catalogues are linked using a parent-child relationship. The hierarchical infrastructure of our GeoNetwork catalogue system aims to address both discoverability and in-house administrative use-cases. At NCI, we are currently improving the metadata interoperability in our catalogue by linking with standardized community vocabulary services. These emerging vocabulary services are being established to help harmonise data from different national and international scientific communities. One such vocabulary service is currently being established by the Australian National Data Services (ANDS). Data citation is another important aspect of the NCI data infrastructure, which allows tracking of data usage and infrastructure investment, encourages data sharing, and increases trust in research that is reliant on these data collections. We incorporate the standard vocabularies into the data citation metadata so that the data citation becomes machine readable and semantically friendly for web-search purposes as well. By standardizing our metadata structure across our entire data corpus, we are laying the foundation to enable the application of appropriate semantic mechanisms to enhance discovery and analysis of NCI's national environmental research data information. We expect that this will further
SPASE, Metadata, and the Heliophysics Virtual Observatories

Science.gov (United States)

Thieman, James; King, Todd; Roberts, Aaron

2010-01-01

To provide data search and access capability in the field of Heliophysics (the study of the Sun and its effects on the Solar System, especially the Earth) a number of Virtual Observatories (VO) have been established both via direct funding from the U.S. National Aeronautics and Space Administration (NASA) and through other funding agencies in the U.S. and worldwide. At least 15 systems can be labeled as Virtual Observatories in the Heliophysics community, 9 of them funded by NASA. The problem is that different metadata and data search approaches are used by these VO's and a search for data relevant to a particular research question can involve consulting with multiple VO's - needing to learn a different approach for finding and acquiring data for each. The Space Physics Archive Search and Extract (SPASE) project is intended to provide a common data model for Heliophysics data and therefore a common set of metadata for searches of the VO's. The SPASE Data Model has been developed through the common efforts of the Heliophysics Data and Model Consortium (HDMC) representatives over a number of years. We currently have released Version 2.1 of the Data Model. The advantages and disadvantages of the Data Model will be discussed along with the plans for the future. Recent changes requested by new members of the SPASE community indicate some of the directions for further development.
Assuring the Quality of Agricultural Learning Repositories: Issues for the Learning Object Metadata Creation Process of the CGIAR

Science.gov (United States)

Zschocke, Thomas; Beniest, Jan

The Consultative Group on International Agricultural Research (CGIAR) has established a digital repository to share its teaching and learning resources along with descriptive educational information based on the IEEE Learning Object Metadata (LOM) standard. As a critical component of any digital repository, quality metadata are critical not only to enable users to find more easily the resources they require, but also for the operation and interoperability of the repository itself. Studies show that repositories have difficulties in obtaining good quality metadata from their contributors, especially when this process involves many different stakeholders as is the case with the CGIAR as an international organization. To address this issue the CGIAR began investigating the Open ECBCheck as well as the ISO/IEC 19796-1 standard to establish quality protocols for its training. The paper highlights the implications and challenges posed by strengthening the metadata creation workflow for disseminating learning objects of the CGIAR.
QualityML: a dictionary for quality metadata encoding

Science.gov (United States)

Ninyerola, Miquel; Sevillano, Eva; Serral, Ivette; Pons, Xavier; Zabala, Alaitz; Bastin, Lucy; Masó, Joan

2014-05-01

The scenario of rapidly growing geodata catalogues requires tools that help users choose among products. Having quality fields populated in metadata allows users to rank and then select the best fit-for-purpose products. In this direction, we have developed QualityML (http://qualityml.geoviqua.org), a dictionary that contains hierarchically structured concepts to precisely define and relate quality levels: from quality classes to quality measurements. Generically, a quality element is the path that goes from the highest level (quality class) to the lowest levels (statistics or quality metrics). This path is used to encode quality of datasets in the corresponding metadata schemas. The benefits of having encoded quality, in the case of data producers, are related to improvements in their product discovery and better transmission of their characteristics. In the case of data users, particularly decision-makers, they would find quality and uncertainty measures to make the best decisions as well as perform dataset intercomparison. It also allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable. On one hand, QualityML is a profile of the ISO geospatial metadata standards providing a set of rules for precisely documenting quality indicator parameters that is structured in 6 levels. On the other hand, QualityML includes semantics and vocabularies for the quality concepts. Whenever possible, it uses statistic expressions from the UncertML dictionary (http://www.uncertml.org) encoding. However, it also extends UncertML to provide a list of alternative metrics that are commonly used to quantify quality. A specific example, based on a temperature dataset, is shown below. The annual mean temperature map has been validated with independent in-situ measurements to obtain a global error of 0.5 °C. Level 0: Quality class (e.g., Thematic accuracy) Level 1: Quality indicator (e.g., Quantitative
Scalable Metadata Management for a Large Multi-Source Seismic Data Repository

Energy Technology Data Exchange (ETDEWEB)

Gaylord, J. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Dodge, D. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Magana-Zook, S. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Barno, J. G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Knapp, D. R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Thomas, J. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sullivan, D. S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Ruppert, S. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mellors, R. J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2017-05-26

In this work, we implemented the key metadata management components of a scalable seismic data ingestion framework to address limitations in our existing system, and to position it for anticipated growth in volume and complexity.
DataNet: A flexible metadata overlay over file resources

CERN Multimedia

CERN. Geneva

2014-01-01

Managing and sharing data stored in files results in a challenge due to data amounts produced by various scientific experiments [1]. While solutions such as Globus Online [2] focus on file transfer and synchronization, in this work we propose an additional layer of metadata over file resources which helps to categorize and structure the data, as well as to make it efficient in integration with web-based research gateways. A basic concept of the proposed solution [3] is a data model consisting of entities built from primitive types such as numbers, texts and also from files and relationships among different entities. This allows for building complex data structure definitions and mix metadata and file data into a single model tailored for a given scientific field. A data model becomes actionable after being deployed as a data repository which is done automatically by the proposed framework by using one of the available PaaS (platform-as-a-service) platforms and is exposed to the world as a REST service, which...
Semantic Web: Metadata, Linked Data, Open Data

Directory of Open Access Journals (Sweden)

Vanessa Russo

2015-12-01

Full Text Available What is the Semantic Web? What is it used for? The inventor of the Web, Tim Berners-Lee, describes it as a research methodology able to take advantage of the network to its maximum capacity. This metadata system represents the innovative element in the move from web 2.0 to web 3.0. In this context, we try to understand what the theoretical and informatic requirements of the Semantic Web are. Finally, we explain Linked Data applications for developing new tools for active citizenship.
Automated testing of electro-optical systems; Proceedings of the Meeting, Orlando, FL, Apr. 7, 8, 1988

International Nuclear Information System (INIS)

Nestler, J.; Richardson, P.I.

1988-01-01

Various papers on the automated testing of electrooptical systems are presented. Individual topics addressed include: simultaneous automated testing of Thematic Mapper dynamic spatial performance characteristics, results of objective automatic minimum resolvable temperature testing of thermal imagers using a proposed new figure of merit, test and manufacture of three-mirror laboratory telescope, calculation of apparent delta-T errors for band-limited detectors, and automated laser seeker performance evaluation system

Building a High Performance Metadata Broker using Clojure, NoSQL and Message Queues

Science.gov (United States)

Truslove, I.; Reed, S.

2013-12-01

In practice, Earth and Space Science Informatics often relies on getting more done with less: fewer hardware resources, less IT staff, fewer lines of code. As a capacity-building exercise focused on rapid development of high-performance geoinformatics software, the National Snow and Ice Data Center (NSIDC) built a prototype metadata brokering system using a new JVM language, modern database engines and virtualized or cloud computing resources. The metadata brokering system was developed with the overarching goals of (i) demonstrating a technically viable product with as little development effort as possible, (ii) using very new yet very popular tools and technologies in order to get the most value from the least legacy-encumbered code bases, and (iii) being a high-performance system by using scalable subcomponents, and implementation patterns typically used in web architectures. We implemented the system using the Clojure programming language (an interactive, dynamic, Lisp-like JVM language), Redis (a fast in-memory key-value store) as both the data store for original XML metadata content and as the provider for the message queueing service, and ElasticSearch for its search and indexing capabilities to generate search results. On evaluating the results of the prototyping process, we believe that the technical choices did in fact allow us to do more for less, due to the expressive nature of the Clojure programming language and its easy interoperability with Java libraries, and the successful reuse or re-application of high performance products or designs. This presentation will describe the architecture of the metadata brokering system, cover the tools and techniques used, and describe lessons learned, conclusions, and potential next steps.
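The store, queue, and index flow described in this abstract can be sketched in a few lines. This is only an illustration of the pattern, written in Python rather than Clojure: a plain dict stands in for the key-value store, `queue.Queue` for the message queue, and an in-memory inverted index for the search engine. The function names, the `<record><title>` XML layout, and the sample records are all hypothetical.

```python
import queue
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Set


def run_broker(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Store raw XML, queue one message per record, then index title words."""
    store: Dict[str, str] = {}                      # stand-in for the key-value store
    work: "queue.Queue[str]" = queue.Queue()        # stand-in for the message queue
    index: Dict[str, Set[str]] = defaultdict(set)   # stand-in for the search index

    # Ingest: persist the original XML and enqueue the record id.
    for doc_id, xml_text in documents.items():
        store[doc_id] = xml_text
        work.put(doc_id)

    # Worker: consume messages, parse the stored XML, update the index.
    while not work.empty():
        doc_id = work.get()
        root = ET.fromstring(store[doc_id])
        title = root.findtext("title", default="")
        for word in title.lower().split():
            index[word].add(doc_id)
    return dict(index)


def search(index: Dict[str, Set[str]], term: str) -> List[str]:
    """Return the ids of records whose title contains the term."""
    return sorted(index.get(term.lower(), set()))
```

Keeping the original XML in the store and passing only record ids through the queue mirrors the separation of content storage from messaging that the abstract describes, although the real system used one product for both roles.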
Exploring the Use of a Test Automation Framework

Science.gov (United States)

Cervantes, Alex

2009-01-01

It is known that software testers, more often than not, lack the time needed to fully test the delivered software product within the time period allotted to them. When problems in the implementation phase of a development project occur, it normally causes the software delivery date to slide. As a result, testers either need to work longer hours, or supplementary resources need to be added to the test team in order to meet aggressive test deadlines. One solution to this problem is to provide testers with a test automation framework to facilitate the development of automated test solutions.
On the communication of scientific data: The Full-Metadata Format

DEFF Research Database (Denmark)

Riede, Moritz; Schueppel, Rico; Sylvester-Hvid, Kristian O.

2010-01-01

In this paper, we introduce a scientific format for text-based data files, which facilitates storing and communicating tabular data sets. The so-called Full-Metadata Format builds on the widely used INI-standard and is based on four principles: readable self-documentation, flexible structure, fail...
Automated cleaning and uncertainty attribution of archival bathymetry based on a priori knowledge

Science.gov (United States)

Ladner, Rodney Wade; Elmore, Paul; Perkins, A. Louise; Bourgeois, Brian; Avera, Will

2017-09-01

Hydrographic offices hold large valuable historic bathymetric data sets, many of which were collected using older generation survey systems that contain little or no metadata and/or uncertainty estimates. These bathymetric data sets generally contain large outlier (errant) data points to clean, yet standard practice does not include rigorous automated procedures for systematic cleaning of these historical data sets and their subsequent conversion into reusable data formats. In this paper, we propose an automated method for this task. We utilize statistically diverse threshold tests, including a robust least trimmed squares method, to clean the data. We use LOESS weighted regression residuals together with a Student-t distribution to attribute uncertainty for each retained sounding; the resulting uncertainty values compare favorably with native estimates of uncertainty from co-located data sets which we use to estimate a point-wise goodness-of-fit measure. Storing a cleansed validated data set augmented with uncertainty in a re-usable format provides the details of this analysis for subsequent users. Our test results indicate that the method significantly improves the quality of the data set while concurrently providing confidence interval estimates and point-wise goodness-of-fit estimates as referenced to current hydrographic practices.
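To give a feel for how a trimmed estimator can drive a threshold test, here is a minimal univariate sketch. It is not the authors' procedure: it applies least trimmed squares to a single set of depth values rather than to a regression surface, and the `keep_fraction` and `threshold` values are assumptions picked for illustration.

```python
from statistics import mean
from typing import List, Tuple


def lts_location(values: List[float], keep_fraction: float = 0.75) -> Tuple[float, float]:
    """Univariate least trimmed squares.

    Among all contiguous windows of h sorted values, pick the window with the
    smallest sum of squared deviations from its own mean. Returns that mean
    and the RMS spread of the window (which understates the true spread,
    because the tails have been trimmed away).
    """
    if not values:
        raise ValueError("values must not be empty")
    xs = sorted(values)
    n = len(xs)
    h = min(n, max(2, int(n * keep_fraction)))
    best_ss, best_mean = None, None
    for i in range(n - h + 1):
        window = xs[i:i + h]
        m = mean(window)
        ss = sum((x - m) ** 2 for x in window)
        if best_ss is None or ss < best_ss:
            best_ss, best_mean = ss, m
    return best_mean, (best_ss / h) ** 0.5


def flag_outliers(values: List[float], threshold: float = 5.0,
                  keep_fraction: float = 0.75) -> List[bool]:
    """Flag values lying more than `threshold` trimmed spreads from the centre."""
    centre, scale = lts_location(values, keep_fraction)
    scale = max(scale, 1e-9)  # guard against a zero spread
    return [abs(v - centre) / scale > threshold for v in values]
```

For a hypothetical set of soundings such as `[100.1, 100.3, 99.8, 100.0, 100.2, 99.9, 250.0, 100.1]`, only the 250.0 value is flagged, because the trimmed centre and spread are computed from the tightly clustered subset and are not pulled toward the errant point.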
The importance of metadata to assess information content in digital reconstructions of neuronal morphology.

Science.gov (United States)

Parekh, Ruchi; Armañanzas, Rubén; Ascoli, Giorgio A

2015-04-01

Digital reconstructions of axonal and dendritic arbors provide a powerful representation of neuronal morphology in formats amenable to quantitative analysis, computational modeling, and data mining. Reconstructed files, however, require adequate metadata to identify the appropriate animal species, developmental stage, brain region, and neuron type. Moreover, experimental details about tissue processing, neurite visualization and microscopic imaging are essential to assess the information content of digital morphologies. Typical morphological reconstructions only partially capture the underlying biological reality. Tracings are often limited to certain domains (e.g., dendrites and not axons), may be incomplete due to tissue sectioning, imperfect staining, and limited imaging resolution, or can disregard aspects irrelevant to their specific scientific focus (such as branch thickness or depth). Gauging these factors is critical in subsequent data reuse and comparison. NeuroMorpho.Org is a central repository of reconstructions from many laboratories and experimental conditions. Here, we introduce substantial additions to the existing metadata annotation aimed to describe the completeness of the reconstructed neurons in NeuroMorpho.Org. These expanded metadata form a suitable basis for effective description of neuromorphological data.
Perancangan dan Implementasi Aplikasi Internet Radio Menggunakan Multimedia Database Melalui Penerapan Ontology dan Metadata

Directory of Open Access Journals (Sweden)

M. Rudy Erwansyah

2012-06-01

Full Text Available The study aims to analyze, design and implement the internet radio application used in managing the audio data on Heartline FM radio station. In this application, the audio data which has been managed can be used in a radio broadcast scheduling. The scheduled radio broadcast is then forwarded to the webcast server to be transmitted through the Internet. This research carries out analysis, design and implementation using Object Oriented Analysis and Design method and Lean Architecture for Agile Software Development. The program component design consists of: (1) software functional system, (2) user interface, (3) problem domain model, which in the internet radio application is divided into five subcomponents, namely: audio-indexing-retrieval, scheduling, reporting, user and ontology. In the implementation of this internet radio application, the audio data management uses a multimedia database by applying metadata and ontology, so that the process of indexing and retrieval can be reused quickly on the broadcast. This application can also be used in carrying out the radio broadcast automatically during specified hours. This internet radio application has been able to meet the needs of radio Heartline.
77 FR 20051 - Meeting of the CJIS Advisory Policy Board

Science.gov (United States)

2012-04-03

... meeting will take place at The Adams Mark Hotel, 120 Church Street, Buffalo, New York 14202, telephone... Integrated Automated Fingerprint Identification System/Next Generation Identification, Interstate...
Information resource description creating and managing metadata

CERN Document Server

Hider, Philip

2012-01-01

An overview of the field of information organization that examines resource description as both a product and process of the contemporary digital environment. This timely book employs the unifying mechanism of the semantic web and the resource description framework to integrate the various traditions and practices of information and knowledge organization. Uniquely, it covers both the domain-specific traditions and practices and the practices of the 'metadata movement' through a single lens: that of resource description in the broadest, semantic web sense. This approach more readily accommodate
Automated Information System (AIS) Alarm System

International Nuclear Information System (INIS)

Hunteman, W.

1997-01-01

The Automated Information Alarm System is a joint effort between Los Alamos National Laboratory, Lawrence Livermore National Laboratory, and Sandia National Laboratory to demonstrate and implement, on a small-to-medium sized local area network, an automated system that detects and automatically responds to attacks that use readily available tools and methodologies. The Alarm System will sense or detect, assess, and respond to suspicious activities that may be detrimental to information on the network or to continued operation of the network. The responses will allow stopping, isolating, or ejecting the suspicious activities. The number of sensors, the sensitivity of the sensors, the assessment criteria, and the desired responses may be set by the using organization to meet their local security policies
Automated Information System (AIS) Alarm System

Energy Technology Data Exchange (ETDEWEB)

Hunteman, W.

1997-05-01

The Automated Information Alarm System is a joint effort between Los Alamos National Laboratory, Lawrence Livermore National Laboratory, and Sandia National Laboratory to demonstrate and implement, on a small-to-medium sized local area network, an automated system that detects and automatically responds to attacks that use readily available tools and methodologies. The Alarm System will sense or detect, assess, and respond to suspicious activities that may be detrimental to information on the network or to continued operation of the network. The responses will allow stopping, isolating, or ejecting the suspicious activities. The number of sensors, the sensitivity of the sensors, the assessment criteria, and the desired responses may be set by the using organization to meet their local security policies.
Pilot program for an automated data collection system

International Nuclear Information System (INIS)

Burns, R.S.; Johnson, P.S.; Denny, E.C.

1984-01-01

This report describes the pilot program of an automated data collection system and presents some of the managerial experiences during its startup. The pilot program demonstrated that improvements can be made in data collection and handling, even when a key hardware item does not meet requirements. 2 figures, 1 table
Ready to put metadata on the post-2015 development agenda? Linking data publications to responsible innovation and science diplomacy.

Science.gov (United States)

Özdemir, Vural; Kolker, Eugene; Hotez, Peter J; Mohin, Sophie; Prainsack, Barbara; Wynne, Brian; Vayena, Effy; Coşkun, Yavuz; Dereli, Türkay; Huzair, Farah; Borda-Rodriguez, Alexander; Bragazzi, Nicola Luigi; Faris, Jack; Ramesar, Raj; Wonkam, Ambroise; Dandara, Collet; Nair, Bipin; Llerena, Adrián; Kılıç, Koray; Jain, Rekha; Reddy, Panga Jaipal; Gollapalli, Kishore; Srivastava, Sanjeeva; Kickbusch, Ilona

2014-01-01

Metadata refer to descriptions about data or as some put it, "data about data." Metadata capture what happens on the backstage of science, on the trajectory from study conception, design, funding, implementation, and analysis to reporting. Definitions of metadata vary, but they can include the context information surrounding the practice of science, or data generated as one uses a technology, including transactional information about the user. As the pursuit of knowledge broadens in the 21st century from traditional "science of whats" (data) to include "science of hows" (metadata), we analyze the ways in which metadata serve as a catalyst for responsible and open innovation, and by extension, science diplomacy. In 2015, the United Nations Millennium Development Goals (MDGs) will formally come to an end. Therefore, we propose that metadata, as an ingredient of responsible innovation, can help achieve the Sustainable Development Goals (SDGs) on the post-2015 agenda. Such responsible innovation, as a collective learning process, has become a key component, for example, of the European Union's 80 billion Euro Horizon 2020 R&D Program from 2014-2020. Looking ahead, OMICS: A Journal of Integrative Biology, is launching an initiative for a multi-omics metadata checklist that is flexible yet comprehensive, and will enable more complete utilization of single and multi-omics data sets through data harmonization and greater visibility and accessibility. The generation of metadata that shed light on how omics research is carried out, by whom and under what circumstances, will create an "intervention space" for integration of science with its socio-technical context. This will go a long way to addressing responsible innovation for a fairer and more transparent society. If we believe in science, then such reflexive qualities and commitments attained by availability of omics metadata are preconditions for a robust and socially attuned science, which can then remain broadly
The Genomic Observatories Metadatabase (GeOMe): A new repository for field and sampling event metadata associated with genetic samples

Science.gov (United States)

Deck, John; Gaither, Michelle R.; Ewing, Rodney; Bird, Christopher E.; Davies, Neil; Meyer, Christopher; Riginos, Cynthia; Toonen, Robert J.; Crandall, Eric D.

2017-01-01

The Genomic Observatories Metadatabase (GeOMe, http://www.geome-db.org/) is an open access repository for geographic and ecological metadata associated with biosamples and genetic data. Whereas public databases have served as vital repositories for nucleotide sequences, they do not accession all the metadata required for ecological or evolutionary analyses. GeOMe fills this need, providing a user-friendly, web-based interface for both data contributors and data recipients. The interface allows data contributors to create a customized yet standard-compliant spreadsheet that captures the temporal and geospatial context of each biosample. These metadata are then validated and permanently linked to archived genetic data stored in the National Center for Biotechnology Information’s (NCBI’s) Sequence Read Archive (SRA) via unique persistent identifiers. By linking ecologically and evolutionarily relevant metadata with publicly archived sequence data in a structured manner, GeOMe sets a gold standard for data management in biodiversity science. PMID:28771471
The Genomic Observatories Metadatabase (GeOMe): A new repository for field and sampling event metadata associated with genetic samples.

Directory of Open Access Journals (Sweden)

John Deck

2017-08-01

Full Text Available The Genomic Observatories Metadatabase (GeOMe, http://www.geome-db.org/) is an open access repository for geographic and ecological metadata associated with biosamples and genetic data. Whereas public databases have served as vital repositories for nucleotide sequences, they do not accession all the metadata required for ecological or evolutionary analyses. GeOMe fills this need, providing a user-friendly, web-based interface for both data contributors and data recipients. The interface allows data contributors to create a customized yet standard-compliant spreadsheet that captures the temporal and geospatial context of each biosample. These metadata are then validated and permanently linked to archived genetic data stored in the National Center for Biotechnology Information's (NCBI's) Sequence Read Archive (SRA) via unique persistent identifiers. By linking ecologically and evolutionarily relevant metadata with publicly archived sequence data in a structured manner, GeOMe sets a gold standard for data management in biodiversity science.
Metadata Access Tool for Climate and Health

Science.gov (United States)

Trtanji, J.

2012-12-01

The need for health information resources to support climate change adaptation and mitigation decisions is growing, both in the United States and around the world, as the manifestations of climate change become more evident and widespread. In many instances, these information resources are not specific to a changing climate, but have either been developed or are highly relevant for addressing health issues related to existing climate variability and weather extremes. To help address the need for more integrated data, the Interagency Cross-Cutting Group on Climate Change and Human Health, a working group of the U.S. Global Change Research Program, has developed the Metadata Access Tool for Climate and Health (MATCH). MATCH is a gateway to relevant information that can be used to solve problems at the nexus of climate science and public health by facilitating research, enabling scientific collaborations in a One Health approach, and promoting data stewardship that will enhance the quality and application of climate and health research. MATCH is a searchable clearinghouse of publicly available Federal metadata including monitoring and surveillance data sets, early warning systems, and tools for characterizing the health impacts of global climate change. Examples of relevant databases include the Centers for Disease Control and Prevention's Environmental Public Health Tracking System and NOAA's National Climate Data Center's national and state temperature and precipitation data. This presentation will introduce the audience to this new web-based geoportal and demonstrate its features and potential applications.
Definition of an ISO 19115 metadata profile for SeaDataNet II Cruise Summary Reports and its XML encoding

Science.gov (United States)

Boldrini, Enrico; Schaap, Dick M. A.; Nativi, Stefano

2013-04-01

SeaDataNet implements a distributed pan-European infrastructure for Ocean and Marine Data Management whose nodes are maintained by 40 national oceanographic and marine data centers from 35 countries riparian to all European seas. A unique portal makes possible distributed discovery, visualization and access of the available sea data across all the member nodes. Geographic metadata play an important role in such an infrastructure, enabling an efficient documentation and discovery of the resources of interest. In particular:
- Common Data Index (CDI) metadata describe the sea datasets, including identification information (e.g. product title, area of interest), evaluation information (e.g. data resolution, constraints) and distribution information (e.g. download endpoint, download protocol);
- Cruise Summary Reports (CSR) metadata describe cruises and field experiments at sea, including identification information (e.g. cruise title, name of the ship) and acquisition information (e.g. utilized instruments, number of samples taken).
In the context of the second phase of SeaDataNet (SeaDataNet 2 EU FP7 project, grant agreement 283607, started on October 1st, 2011 for a duration of 4 years) a major target is the setting, adoption and promotion of common international standards, to the benefit of outreach and interoperability with the international initiatives and communities (e.g. OGC, INSPIRE, GEOSS, …). A standardization effort conducted by CNR with the support of MARIS, IFREMER, STFC, BODC and ENEA has led to the creation of an ISO 19115 metadata profile of CDI and its XML encoding based on ISO 19139. The CDI profile is now in its stable version and is being implemented and adopted by the SeaDataNet community tools and software. The effort has then continued to produce an ISO-based metadata model and its XML encoding for CSR as well. The metadata elements included in the CSR profile belong to different models: - ISO 19115: E.g. cruise identification information, including
Utility of collecting metadata to manage a large scale conditions database in ATLAS

International Nuclear Information System (INIS)

Gallas, E J; Albrand, S; Borodin, M; Formica, A

2014-01-01

The ATLAS Conditions Database, based on the LCG Conditions Database infrastructure, contains a wide variety of information needed in online data taking and offline analysis. The total volume of ATLAS conditions data is in the multi-Terabyte range. Internally, the active data is divided into 65 separate schemas (each with hundreds of underlying tables) according to overall data taking type, detector subsystem, and whether the data is used offline or strictly online. While each schema has a common infrastructure, each schema's data is entirely independent of other schemas, except at the highest level, where sets of conditions from each subsystem are tagged globally for ATLAS event data reconstruction and reprocessing. The partitioned nature of the conditions infrastructure works well for most purposes, but metadata about each schema is problematic to collect in global tools from such a system because it is only accessible via LCG tools schema by schema. This makes it difficult to get an overview of all schemas, collect interesting and useful descriptive and structural metadata for the overall system, and connect it with other ATLAS systems. This type of global information is needed for time critical data preparation tasks for data processing and has become more critical as the system has grown in size and diversity. Therefore, a new system has been developed to collect metadata for the management of the ATLAS Conditions Database. The structure and implementation of this metadata repository will be described. In addition, we will report its usage since its inception during LHC Run 1, how it has been exploited in the process of conditions data evolution during LS1 (the current LHC long shutdown) in preparation for Run 2, and long term plans to incorporate more of its information into future ATLAS Conditions Database tools and the overall ATLAS information infrastructure.
Large-scale Rectangular Ruler Automated Verification Device

Science.gov (United States)

Chen, Hao; Chang, Luping; Xing, Minjian; Xie, Xie

2018-03-01

This paper introduces a large-scale rectangular ruler automated verification device, which consists of a photoelectric autocollimator, a self-designed mechanical drive car, and an automatic data acquisition system. The mechanical design of the device covers the optical axis, the drive, the fixture, and the wheels. The control system design covers hardware and software: the hardware mainly uses a single-chip system, and the software handles the photoelectric autocollimator and the automatic data acquisition process. The device can acquire vertical measurement data automatically. The reliability of the device is verified by experimental comparison. The conclusion meets the requirement of the right angle test procedure.
Practical automation for mature producing areas

International Nuclear Information System (INIS)

Luppens, J.C.

1995-01-01

Successful installation and operation of supervisory control and data acquisition (SCADA) systems on two US Gulf Coast platforms prompted the installation of the first SCADA, or automation, system in Oklahoma in 1989. The initial installation consisted of four remote terminal units (RTU's) at four beam-pumped leases and a PC-based control system communicating by means of a 900-MHz data repeater. This first installation was a building block for additional wells to be automated, and then additional systems, consisting of RTU's, a PC, and a data repeater, were installed. By the end of 1992 there were 98 RTU's operating on five separation systems and additional RTU's are being installed on a regular basis. This paper outlines the logical development of automation systems on properties in Oklahoma operated by Phillips Petroleum Co. Those factors critical to the success of the effort are (1) designing data-gathering and control capability in conjunction with the field operations staff to meet and not exceed their needs; (2) selection of a computer operating system and automation software package; (3) selection of computer, RTU, and end-device hardware; and (4) continuous involvement of the field operations staff in the installation, operation, and maintenance of the systems. Additionally, specific tangible and intangible results are discussed.
Autonomous Underwater Vehicle Data Management and Metadata Interoperability for Coastal Ocean Studies

Science.gov (United States)

McCann, M. P.; Ryan, J. P.; Chavez, F. P.; Rienecker, E.

2004-12-01

Data from over 1000 km of Autonomous Underwater Vehicle (AUV) surveys of Monterey Bay have been collected and cataloged in an ocean observatory data management system. The Monterey Bay Aquarium Research Institute's AUV is equipped with a suite of instruments that include a conductivity, temperature, depth (CTD) instrument, transmissometers, a fluorometer, a nitrate sensor, and an inertial navigation system. Data are logged on the vehicle and, upon completion of a survey, XML descriptions of the data are submitted to the Shore Side Data System (SSDS). Instrument data are then processed on shore to apply calibrations and produce scientifically useful data products. The SSDS employs a data model that tracks data from the instrument that created it through all the consuming processes that generate derived products. SSDS employs OPeNDAP and netCDF to provide data set interoperability at the data level. The core of SSDS is the metadata that is the catalog of these data sets and their relation to all other relevant data. The metadata is managed in a relational database and governed by an Enterprise Java Bean (EJB) server application. Cross-platform Java applications have been written to manage and visualize these data. A Java Swing application - the Hierarchical Ocean Observatory Visualization and Editing System (HOOVES) - has been developed to provide visualization of data set pedigree and data set variables. Because the SSDS data model is generalized according to "Data Producers" and "Data Containers", many different types of data can be represented in SSDS, allowing for interoperability at a metadata level. Comparisons of appropriate data sets, whether they are from an autonomous underwater vehicle or from a fixed mooring, are easily made using SSDS. The authors will present the SSDS data model and show examples of how the model helps organize data set metadata allowing for data discovery and interoperability. With improved discovery and interoperability the system is helping us
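The producer/container idea in this abstract can be illustrated with a minimal Python sketch. It is not the SSDS implementation (which the abstract says is a relational database behind an EJB server); the class names, fields, and the `pedigree` helper below are hypothetical and only show how a derived product might be traced back to the instrument that created the raw data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataContainer:
    """A stored data set, e.g. a log file or netCDF file (hypothetical model)."""
    name: str
    # The producer that created this container; None if the origin is unknown.
    created_by: Optional["DataProducer"] = None


@dataclass
class DataProducer:
    """An instrument or a processing step (hypothetical model)."""
    name: str
    inputs: List[DataContainer] = field(default_factory=list)
    outputs: List[DataContainer] = field(default_factory=list)

    def produce(self, name: str) -> DataContainer:
        """Create a new container and record this producer as its origin."""
        container = DataContainer(name=name, created_by=self)
        self.outputs.append(container)
        return container


def pedigree(container: DataContainer) -> List[str]:
    """Return producer names from the derived product back to the source."""
    chain: List[str] = []
    current: Optional[DataContainer] = container
    while current is not None and current.created_by is not None:
        producer = current.created_by
        chain.append(producer.name)
        # Simplification: follow only the first input; real pedigrees can branch.
        current = producer.inputs[0] if producer.inputs else None
    return chain


# Assumed example: a CTD instrument logs raw data, which is calibrated on shore.
ctd = DataProducer(name="CTD instrument")
raw = ctd.produce("raw_ctd.log")
calibration = DataProducer(name="shore-side calibration", inputs=[raw])
product = calibration.produce("ctd_calibrated.nc")
print(pedigree(product))
```

Because every container points at its producer and every producer lists its inputs, the same two types could describe AUV surveys and fixed moorings alike, which is the kind of metadata-level generality the abstract describes.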

Social media in radiology: early trends in Twitter microblogging at radiology's largest international meeting.

Science.gov (United States)

Hawkins, C Matthew; Duszak, Richard; Rawson, James V

2014-04-01

Twitter is a social media microblogging platform that allows rapid exchange of information between individuals. Despite its widespread acceptance and use at various other medical specialty meetings, there are no published data evaluating its use at radiology meetings. The purpose of this study is to quantitatively and qualitatively evaluate the use of Twitter as a microblogging platform at recent RSNA annual meetings. Twitter activity meta-data tagged with official meeting hashtags #RSNA11 and #RSNA12 were collected and analyzed. Multiple metrics were evaluated, including daily and hourly Twitter activity, frequency of microblogging activity over time, characteristics of the 100 most active Twitter users at each meeting, characteristics of meeting-related tweets, and the geographic origin of meeting microbloggers. The use of Twitter microblogging increased by at least 30% by all identifiable meaningful metrics between the 2011 and 2012 RSNA annual meetings, including total tweets, tweets per day, activity of the most active microbloggers, and total number of microbloggers. Similar increases were observed in numbers of North American and international microbloggers. Markedly increased use of the Twitter microblogging platform at recent RSNA annual meetings demonstrates the potential to leverage this technology to engage meeting attendees, improve scientific sessions, and promote improved collaboration at national radiology meetings. Copyright © 2014 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Web Approach for Ontology-Based Classification, Integration, and Interdisciplinary Usage of Geoscience Metadata

Directory of Open Access Journals (Sweden)

B Ritschel

2012-10-01

Full Text Available The Semantic Web is a W3C approach that integrates the different sources of semantics within documents and services using ontology-based techniques. The main objective of this approach in the geoscience domain is the improvement of understanding, integration, and usage of Earth and space science related web content in terms of data, information, and knowledge for machines and people. The modeling and representation of semantic attributes and relations within and among documents can be realized by human readable concept maps and machine readable OWL documents. The objectives for the usage of the Semantic Web approach in the GFZ data center ISDC project are the design of an extended classification of metadata documents for product types related to instruments, platforms, and projects as well as the integration of different types of metadata related to data product providers, users, and data centers. Sources of content and semantics for the description of Earth and space science product types and related classes are standardized metadata documents (e.g., DIF documents), publications, grey literature, and Web pages. Other sources are information provided by users, such as tagging data and social navigation information. The integration of controlled vocabularies as well as folksonomies plays an important role in the design of well formed ontologies.
Stereo vision based automated grasp planning

International Nuclear Information System (INIS)

Wilhelmsen, K.; Huber, L.; Silva, D.; Grasz, E.; Cadapan, L.

1995-02-01

The Department of Energy has a need for treating existing nuclear waste. Hazardous waste stored in old warehouses needs to be sorted and treated to meet environmental regulations. Lawrence Livermore National Laboratory is currently experimenting with automated manipulations of unknown objects for sorting, treating, and detailed inspection. To accomplish these tasks, three existing technologies were expanded to meet the increasing requirements. First, a binocular vision range sensor was combined with a surface modeling system to make virtual images of unknown objects. Then, using the surface model information, stable grasps of the unknown-shaped objects were planned algorithmically utilizing a limited set of robotic grippers. This paper is an expansion of previous work and will discuss the grasp planning algorithm.
An institutional repository initiative and issues concerning metadata

OpenAIRE

BAYRAM, Özlem; ATILGAN, Doğan; ARSLANTEKİN, Sacit

2006-01-01

Ankara University has become one of the first open access initiatives in Turkey. Ankara University Open Access Program (AUO) was formed as part of the Open Access project (http://acikarsiv.ankara.edu.tr) and supported by the University with an example of an open access institutional repository. As a further step, the system will require metadata tools to enable international recognition. According to Budapest Open Access Initiative, as suggested two strategies for open access t...
An Examination of the Adoption of Preservation Metadata in Cultural Heritage Institutions: An Exploratory Study Using Diffusion of Innovations Theory

Science.gov (United States)

Alemneh, Daniel Gelaw

2009-01-01

Digital preservation is a significant challenge for cultural heritage institutions and other repositories of digital information resources. Recognizing the critical role of metadata in any successful digital preservation strategy, the Preservation Metadata Implementation Strategies (PREMIS) has been extremely influential on providing a "core" set…
Geo-metadata design for the GIS of the pre-selected site for China's high-level radioactive waste repository

International Nuclear Information System (INIS)

Zhong Xia; Wang Ju; Huang Shutao; Wang Shuhong; Gao Min

2008-01-01

The information system for the geological disposal of high-level radioactive waste aims at the integrated management and full application of multi-source information in research on the geological disposal of high-level radioactive waste. The establishment and operation of the system require geo-metadata support for this multi-source information. In this paper, on the basis of geo-data analysis for the pre-selected disposal site, we apply the existing metadata standards and design the content, management pattern and application of geo-metadata for the multi-source information. (authors)
Overview of long-term field experiments in Germany - metadata visualization

Science.gov (United States)

Muqit Zoarder, Md Abdul; Heinrich, Uwe; Svoboda, Nikolai; Grosse, Meike; Hierold, Wilfried

2017-04-01

BonaRes ("soil as a sustainable resource for the bioeconomy") is collecting data and metadata from agricultural long-term field experiments (LTFE) in Germany. It is funded by the German Federal Ministry of Education and Research (BMBF) under the umbrella of the National Research Strategy BioEconomy 2030. BonaRes consists of ten interdisciplinary research project consortia and the 'BonaRes - Centre for Soil Research'. The BonaRes Data Centre is responsible for collecting all LTFE data and the associated metadata in a secured enterprise database, and for visualizing the data and metadata through a data portal. Within the BonaRes project, we are compiling an overview of long-term field experiments in Germany based on a literature review, the results of an online survey, and direct contacts with LTFE operators. Information about research topic, contact person, website, experiment setup and analyzed parameters is collected. Based on the collected LTFE data, an enterprise geodatabase has been developed and a GIS-based web information system about LTFE in Germany has been set up. Various aspects of the LTFE, such as experiment type, land-use type, agricultural category and duration of experiment, are presented in thematic maps. This information system is dynamically linked to the database, which means changes in the data directly affect the presentation. An easy search option using LTFE name, location or operator, together with dynamic layer selection, ensures a user-friendly web application. Dispersing and visualizing overlapping LTFE points on the overview map is also challenging; we have automated it at every zoom level, and it is an integral part of the application. The application provides both the spatial location and the meta-information of LTFEs, backed by an enterprise geodatabase, a GIS server for hosting map services, and a JavaScript API for web application development.
Metadata Harvesting in Regional Digital Libraries in the PIONIER Network

Science.gov (United States)

Mazurek, Cezary; Stroinski, Maciej; Werla, Marcin; Weglarz, Jan

2006-01-01

Purpose: The paper aims to present the concept of the functionality of metadata harvesting for regional digital libraries, based on the OAI-PMH protocol. This functionality is part of a regional digital libraries platform created in Poland. The platform was required to reach one of the main objectives of the Polish PIONIER Programme--to enrich the…
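For readers unfamiliar with OAI-PMH harvesting, the following Python sketch shows the general shape of a `ListRecords` exchange. It is only an illustration of the protocol, not the PIONIER platform's code: the base URL, the sample response, and the helper names are assumptions. The request arguments (`verb`, `metadataPrefix`, `resumptionToken`) and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's base URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: it replaces other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        title = None
        for element in record.iter("{%s}title" % DC_NS):
            title = element.text  # keep only the first dc:title
            break
        records.append((identifier, title))
    token = root.findtext(".//{%s}resumptionToken" % OAI_NS)
    return records, (token or None)


# Hand-written sample response (an assumption, not output from a real library).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai"))
print(parse_records(SAMPLE))
```

A harvester would repeat the request with each returned resumption token until the repository stops supplying one, then merge the collected records into its own index.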
An on-line Integrated Bookkeeping: electronic run log book and Meta-Data Repository for ATLAS

CERN Document Server

Barczyc, M.; Caprini, M.; Da Silva Conceicao, J.; Dobson, M.; Flammer, J.; Burckhart-Chromek, D.; Jones, R.; Kazarov, A.; Kolos, S.; Liko, D.; Mapelli, L.; Soloviev, I.; Hart, R.; Amorim, A.; Klose, D.; Lima, J.; Lucio, L.; Pedro, L.; Wolters, H.; Badescu, E.; Alexandrov, I.; Kotov, V.; Mineev, M.; Ryabov, Yu.

2003-01-01

In the context of the ATLAS experiment there is growing evidence of the importance of different kinds of Meta-data, including all the important details of the detector and data acquisition that are vital for the analysis of the acquired data. The Online BookKeeper (OBK) is a component of ATLAS online software that stores all information collected while running the experiment, including the Meta-data associated with the event acquisition, triggering and storage. The facilities for acquisition of control data within the on-line software framework, together with a fully functional Web interface, make the OBK a powerful tool containing all information needed for event analysis, including an electronic log book. In this paper we explain how OBK plays a role as one of the main collectors and managers of Meta-data produced on-line, and we also focus on the Web facilities already available. The usage of the web interface as an electronic run logbook is also explained, together with the future extensions. We describe...
Detection of Vandalism in Wikipedia using Metadata Features – Implementation in Simple English and Albanian sections

Directory of Open Access Journals (Sweden)

Arsim Susuri

2017-03-01

Full Text Available In this paper, we evaluate a list of classifiers in order to use them in the detection of vandalism by focusing on metadata features. Our work is focused on two low resource data sets (Simple English and Albanian) from Wikipedia. The aim of this research is to prove that this form of vandalism detection applied in one data set (language) can be extended into another data set (language). Article views data sets in Wikipedia have been used rarely for the purpose of detecting vandalism. We will show the benefits of using article views data set with features from the article revisions data set with the aim of improving the detection of vandalism. The key advantage of using metadata features is that these metadata features are language independent and simple to extract because they require minimal processing. This paper shows that application of vandalism models across low resource languages is possible, and vandalism can be detected through view patterns of articles.
The OceanLink Project

Science.gov (United States)

Narock, T.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Finin, T.; Hitzler, P.; Krisnadhi, A.; Raymond, L. M.; Shepherd, A.; Wiebe, P. H.

2014-12-01

A wide spectrum of maturing methods and tools, collectively characterized as the Semantic Web, is helping to vastly improve the dissemination of scientific research. Creating semantic integration requires input from both domain and cyberinfrastructure scientists. OceanLink, an NSF EarthCube Building Block, is demonstrating semantic technologies through the integration of geoscience data repositories, library holdings, conference abstracts, and funded research awards. Meeting project objectives involves applying semantic technologies to support data representation, discovery, sharing and integration. Our semantic cyberinfrastructure components include ontology design patterns, Linked Data collections, semantic provenance, and associated services to enhance data and knowledge discovery, interoperation, and integration. We discuss how these components are integrated, the continued automated and semi-automated creation of semantic metadata, and techniques we have developed to integrate ontologies, link resources, and preserve provenance and attribution.
Implementation of a metadata architecture and knowledge collection to support semantic interoperability in an enterprise data warehouse.

Science.gov (United States)

Dhaval, Rakesh; Borlawsky, Tara; Ostrander, Michael; Santangelo, Jennifer; Kamal, Jyoti; Payne, Philip R O

2008-11-06

In order to enhance interoperability between enterprise systems, and improve data validity and reliability throughout The Ohio State University Medical Center (OSUMC), we have initiated the development of an ontology-anchored metadata architecture and knowledge collection for our enterprise data warehouse. The metadata and corresponding semantic relationships stored in the OSUMC knowledge collection are intended to promote consistency and interoperability across the heterogeneous clinical, research, business and education information managed within the data warehouse.
Transforming and enhancing metadata for enduser discovery: a case study

Directory of Open Access Journals (Sweden)

Edward M. Corrado

2014-05-01

The Libraries’ workflow and portions of code will be shared; issues and challenges involved will be discussed. While this case study is specific to Binghamton University Libraries, examples of strategies used at other institutions will also be introduced. This paper should be useful to anyone interested in describing large quantities of photographs or other materials with preexisting embedded metadata.
The ATLAS EventIndex: data flow and inclusion of other metadata

CERN Document Server

Prokoshin, Fedor; The ATLAS collaboration; Cardenas Zarate, Simon Ernesto; Favareto, Andrea; Fernandez Casani, Alvaro; Gallas, Elizabeth; Garcia Montoro, Carlos; Gonzalez de la Hoz, Santiago; Hrivnac, Julius; Malon, David; Salt, Jose; Sanchez, Javier; Toebbicke, Rainer; Yuan, Ruijun

2016-01-01

The ATLAS EventIndex is the catalogue of the event-related metadata for the information obtained from the ATLAS detector. The basic unit of this information is the event record, containing the event identification parameters, pointers to the files containing this event as well as trigger decision information. The main use cases for the EventIndex are event picking, providing information for the Event Service, and data consistency checks for large production campaigns. The EventIndex employs the Hadoop platform for data storage and handling, as well as a messaging system for the collection of information. The information for the EventIndex is collected both at Tier-0, when the data are first produced, and from the GRID, when various types of derived data are produced. The EventIndex uses various types of auxiliary information from other ATLAS sources for data collection and processing: trigger tables from the condition metadata database (COMA), dataset information from the data catalog AMI and the Rucio data man...
The ATLAS EventIndex: data flow and inclusion of other metadata

CERN Document Server

AUTHOR|(INSPIRE)INSPIRE-00064378; Cardenas Zarate, Simon Ernesto; Favareto, Andrea; Fernandez Casani, Alvaro; Gallas, Elizabeth; Garcia Montoro, Carlos; Gonzalez de la Hoz, Santiago; Hrivnac, Julius; Malon, David; Prokoshin, Fedor; Salt, Jose; Sanchez, Javier; Toebbicke, Rainer; Yuan, Ruijun

2016-01-01

The ATLAS EventIndex is the catalogue of the event-related metadata for the information collected from the ATLAS detector. The basic unit of this information is the event record, containing the event identification parameters, pointers to the files containing this event as well as trigger decision information. The main use case for the EventIndex is event picking, as well as data consistency checks for large production campaigns. The EventIndex employs the Hadoop platform for data storage and handling, as well as a messaging system for the collection of information. The information for the EventIndex is collected both at Tier-0, when the data are first produced, and from the Grid, when various types of derived data are produced. The EventIndex uses various types of auxiliary information from other ATLAS sources for data collection and processing: trigger tables from the condition metadata database (COMA), dataset information from the data catalogue AMI and the Rucio data management system and information on p...
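The event record and the event-picking use case described in the abstract above can be illustrated with a small in-memory sketch. This is not the EventIndex implementation, which the abstract says is built on Hadoop. The names `EventRecord`, `build_index` and `pick_event`, and the choice of a run number plus an event number as the identification parameters, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical event record: identification, file pointers, trigger decisions."""
    run_number: int          # assumed identification parameter
    event_number: int        # assumed identification parameter
    file_pointers: tuple     # identifiers of the files that contain this event
    trigger_bits: frozenset  # names of the triggers that accepted the event


def build_index(records):
    """Key each record by its identification parameters for direct lookup."""
    return {(r.run_number, r.event_number): r for r in records}


def pick_event(index, run_number, event_number):
    """Event picking: return the record for one event, or None if it is not indexed."""
    return index.get((run_number, event_number))


records = [
    EventRecord(1001, 42, ("file_A", "file_B"), frozenset({"trig_mu"})),
    EventRecord(1001, 43, ("file_A",), frozenset()),
]
index = build_index(records)
picked = pick_event(index, 1001, 42)
```

A dictionary keyed by the identification parameters makes event picking a single lookup; a production catalogue would need persistent, distributed storage instead.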
Digital Libraries that Demonstrate High Levels of Mutual Complementarity in Collection-level Metadata Give a Richer Representation of their Content and Improve Subject Access for Users

Directory of Open Access Journals (Sweden)

Aoife Lawton

2014-12-01

A Review of: Zavalina, O. L. (2013). Complementarity in subject metadata in large-scale digital libraries: A comparative analysis. Cataloging & Classification Quarterly, 52(1), 77-89. http://dx.doi.org/10.1080/01639374.2013.848316 Abstract Objective – To determine how well digital library content is represented through free-text and subject headings. Specifically, to examine whether a combination of free-text description data and controlled vocabulary is more comprehensive than free-text description data alone in describing digital collections. Design – Qualitative content analysis and complementarity comparison. Setting – Three large-scale cultural heritage digital libraries: one in Europe and two in the United States of America. Methods – The researcher retrieved XML files of complete metadata records for two of the digital libraries, while the third library openly exposed its full metadata. The systematic samples obtained for all three libraries enabled qualitative content analysis to uncover how metadata values relate to each other at the collection level. The researcher retrieved 99 collection-level metadata records in total for analysis. The breakdown was 39, 33, and 27 records per digital library. When comparing metadata in the free-text Description metadata element with data in four controlled vocabulary elements, Subject, Geographic Coverage, Temporal Coverage and Object Type, the researcher observed three types of complementarity: one-way, two-way and multiple-complementarity. The author refers to complementarity as “describing a collection’s subject matter with mutually complementary data values in controlled vocabulary and free-text subject metadata elements” (Zavalina, 2013, p. 77). For example, within a Temporal Coverage metadata element the term “19th century” would complement a Description metadata element “1850–1899” in the same record. Main Results – The researcher found a high level of one
From drafting guideline to error detection: Automating style checking for legislative texts

OpenAIRE

Höfler Stefan; Sugisaki Kyoko

2012-01-01

This paper reports on the development of methods for the automated detection of violations of style guidelines for legislative texts, and their implementation in a prototypical tool. To this aim, the approach of error modelling employed in automated style checkers for technical writing is enhanced to meet the requirements of legislative editing. The paper identifies and discusses the two main sets of challenges that have to be tackled in this process: (i) the provision of domain-specific NLP ...
Report on the Global Data Assembly Center (GDAC) to the 12th GHRSST Science Team Meeting

Science.gov (United States)

Armstrong, Edward M.; Bingham, Andrew; Vazquez, Jorge; Thompson, Charles; Huang, Thomas; Finch, Chris

2011-01-01

In 2010/2011 the Global Data Assembly Center (GDAC) at NASA's Physical Oceanography Distributed Active Archive Center (PO.DAAC) continued its role as the primary clearinghouse and access node for operational Group for High Resolution Sea Surface Temperature (GHRSST) datastreams, as well as its collaborative role with the NOAA Long Term Stewardship and Reanalysis Facility (LTSRF) for archiving. Here we report on our data management activities and infrastructure improvements since the last science team meeting in June 2010. These include the implementation of all GHRSST datastreams in the new PO.DAAC Data Management and Archive System (DMAS) for more reliable and timely data access. GHRSST dataset metadata are now stored in a new database that has made the maintenance and quality improvement of metadata fields more straightforward. A content management system for a revised suite of PO.DAAC web pages allows dynamic access to a subset of these metadata fields for enhanced dataset description as well as discovery through a faceted search mechanism from the perspective of the user. From the discovery and metadata standpoint the GDAC has also implemented the NASA version of the OpenSearch protocol for searching for GHRSST granules and developed a web service to generate ISO 19115-2 compliant metadata records. Furthermore, the GDAC has continued to implement a new suite of tools and services for GHRSST datastreams including a Level 2 subsetter known as Dataminer, a revised POET Level 3/4 subsetter and visualization tool, and a Google Earth interface to selected daily global Level 2 and Level 4 data, and has experimented with a THREDDS catalog of GHRSST data collections. Finally we will summarize the expanding user and data statistics, and other metrics that we have collected over the last year demonstrating the broad user community and applications that the GHRSST project continues to serve via the GDAC distribution mechanisms. This report also serves by extension to summarize the
Design and development on automated control system of coated fuel particle fabrication process

International Nuclear Information System (INIS)

Liu Malin; Shao Youlin; Liu Bing

2013-01-01

With the trend toward large-scale production of HTR coated fuel particles, the original manual control system cannot meet requirements, and an industrial-grade automated control system for coated fuel particle fabrication needs to be developed. A comprehensive analysis of the successive 4-layer coating process of TRISO-type coated fuel particles was carried out. It was found that the coating process could be divided into five subsystems and nine operating states. A DCS-type (distributed control system) automation control system was proposed. According to the rigorous requirements of the preparation process for coated particles, the design considerations of the DCS were proposed, including the principles of coordinated control, safety and reliability, integration specification, practicality and ease of use, and openness and ease of updating. A complete automation control system for the coated fuel particle preparation process was built in accordance with these principles. The automated control system was put into operation in the production of irradiated samples for the HTRPM demonstration project. The experimental results show that the system achieves better control of the coated fuel particle preparation process and meets the requirements of factory-scale production. (authors)
E-health, phase two: the imperative to integrate process automation with communication automation for large clinical reference laboratories.

Science.gov (United States)

White, L; Terner, C

2001-01-01

The initial efforts of e-health have fallen far short of expectations. They were buoyed by the hype and excitement of the Internet craze but limited by their lack of understanding of important market and environmental factors. E-health now recognizes that legacy systems and processes are important, that there is a technology adoption process that needs to be followed, and that demonstrable value drives adoption. Initial e-health transaction solutions have targeted mostly low-cost problems. These solutions invariably are difficult to integrate into existing systems, typically requiring manual interfacing to supported processes. This limitation in particular makes them unworkable for large volume providers. To meet the needs of these providers, e-health companies must rethink their approaches, appropriately applying technology to seamlessly integrate all steps into existing business functions. E-automation is a transaction technology that automates steps, integration of steps, and information communication demands, resulting in comprehensive automation of entire business functions. We applied e-automation to create a billing management solution for clinical reference laboratories. Large volume, onerous regulations, small margins, and only indirect access to patients challenge large laboratories' billing departments. Couple these problems with outmoded, largely manual systems and it becomes apparent why most laboratory billing departments are in crisis. Our approach has been to focus on the most significant and costly problems in billing: errors, compliance, and system maintenance and management. The core of the design relies on conditional processing, a "universal" communications interface, and ASP technologies. The result is comprehensive automation of all routine processes, driving out errors and costs. Additionally, compliance management and billing system support and management costs are dramatically reduced. The implications of e-automated processes can extend

36 CFR 1192.173 - Automated guideway transit vehicles and systems.

Science.gov (United States)

2010-07-01

.... Vertical alignment may be accomplished by vehicle air suspension or other suitable means of meeting the... vehicles and systems. 1192.173 Section 1192.173 Parks, Forests, and Public Property ARCHITECTURAL AND... TRANSPORTATION VEHICLES Other Vehicles and Systems § 1192.173 Automated guideway transit vehicles and systems. (a...
49 CFR 38.173 - Automated guideway transit vehicles and systems.

Science.gov (United States)

2010-10-01

... accomplished by vehicle air suspension or other suitable means of meeting the requirement. (c) In stations... 49 Transportation 1 2010-10-01 2010-10-01 false Automated guideway transit vehicles and systems... DISABILITIES ACT (ADA) ACCESSIBILITY SPECIFICATIONS FOR TRANSPORTATION VEHICLES Other Vehicles and Systems § 38...
Using New Technologies for Design, Modernization and Automation of Systems and Processes Related with Nuclear Applications. Technical Report of an Experts' Meeting

International Nuclear Information System (INIS)

2016-04-01

In recent years the International Atomic Energy Agency has organized a series of technical cooperation projects aimed at expanding and strengthening the capacities of laboratories that work with nuclear instrumentation, as well as developing prototypes of instruments and interfaces which can respond to different needs in nuclear applications in the Latin America and Caribbean (RLA) region. The introduction of advanced technologies for the design of specialized instrumentation, interfaces and devices, and for the automation of systems or processes related to nuclear applications, has made it possible to make better use of the available instrumentation. Repair work and modernization are an alternative to counter the lack of maintenance services from suppliers for equipment that has exceeded the lifetime provided by manufacturers (planned obsolescence). Specialized designs not only solve specific needs, but often help reduce costs and shorten work deadlines. An experts' meeting was held under the regional project RLA1011 (Support Automation Systems and Processes in Nuclear Facilities) to prepare a technical report that would provide recommendations to Member States of the IAEA on the current state of development of several advanced technologies and their main fields of application, including examples of work in the region within the framework of this project. Eleven analytics entirely in Spanish from participating Member States are included in this publication
Definition of a CDI metadata profile and its ISO 19139 based encoding

Science.gov (United States)

Boldrini, Enrico; de Korte, Arjen; Santoro, Mattia; Schaap, Dick M. A.; Nativi, Stefano; Manzella, Giuseppe

2010-05-01

The Common Data Index (CDI) is the middleware service adopted by SeaDataNet for discovery and query. The primary goal of the EU-funded project SeaDataNet is to develop a system which provides transparent access to marine data sets and data products from 36 countries in and around Europe. The European context of SeaDataNet requires that the developed system comply with the European INSPIRE Directive. In order to assure the required conformity a GI-cat based solution is proposed. GI-cat is a broker service able to mediate between different metadata sources and publish them through a consistent and unified interface. In this case GI-cat is used as a front end to the SeaDataNet portal, publishing the original data, based on the CDI v.1 XML schema, through an ISO 19139 application profile catalog interface (OGC CSW AP ISO). The choice of ISO 19139 is supported and driven by the INSPIRE Implementing Rules, which have been used as a reference throughout the development process. A mapping from the CDI data model to ISO 19139 therefore had to be implemented in GI-cat, and a first draft was quickly developed, as both CDI v.1 and ISO 19139 happen to be XML implementations based on the same abstract data model (standard ISO 19115 - metadata about geographic information). This first draft mapping exposed the differences between the CDI metadata model and ISO 19115, as it was not possible to accommodate all the information contained in CDI v.1 in ISO 19139. Moreover some modifications were needed in order to reach INSPIRE compliance. The subsequent work consisted of defining the CDI metadata model as a profile of ISO 19115. This included checking all the metadata elements present in CDI and their cardinality. A comparison was made with ISO 19115 and possible extensions were identified. ISO 19139 was then chosen as a natural XML implementation of this new CDI metadata profile. The mapping and the profile definition processes were iteratively refined leading up to a
Automated Generation of the Alaska Coastline Using High-Resolution Satellite Imagery

Science.gov (United States)

Roth, G.; Porter, C. C.; Cloutier, M. D.; Clementz, M. E.; Reim, C.; Morin, P. J.

2015-12-01

Previous campaigns to map Alaska's coast at high resolution have relied on airborne, marine, or ground-based surveying and manual digitization. The coarse temporal resolution, inability to scale geographically, and high cost of field data acquisition make these campaigns inadequate for the scale and speed of recent coastal change in Alaska. Here, we leverage the Polar Geospatial Center (PGC) archive of DigitalGlobe, Inc. satellite imagery to produce a state-wide coastline at 2 meter resolution. We first select multispectral imagery based on time and quality criteria. We then extract the near-infrared (NIR) band from each processed image, and classify each pixel as water or land with a pre-determined NIR threshold value. Processing continues with vectorizing the water-land boundary, removing extraneous data, and attaching metadata. Final coastline raster and vector products maintain the original accuracy of the orthorectified satellite data, which is often within the local tidal range. The repeat frequency of coastline production can range from 1 month to 3 years, depending on factors such as satellite capacity, cloud cover, and floating ice. Shadows from trees or structures complicate the output and merit further data cleaning. The PGC's imagery archive, unique expertise, and computing resources enabled us to map the Alaskan coastline in a few months. The DigitalGlobe archive allows us to update this coastline as new imagery is acquired, and facilitates baseline data for studies of coastal change and improvement of topographic datasets. Our results are not simply a one-time coastline, but rather a system for producing multi-temporal, automated coastlines. Workflows and tools produced with this project can be freely distributed and utilized globally. Researchers and government agencies must now consider how they can incorporate and quality-control this high-frequency, high-resolution data to meet their mapping standards and research objectives.
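The thresholding step described in the abstract above can be sketched with NumPy. This is a minimal illustration, not the authors' workflow: the function names, the toy values and the threshold of 100 are assumptions, and the sketch assumes that pixels below the NIR threshold are water (water absorbs strongly in the near infrared). Vectorizing the boundary is replaced here by a simple raster mask of land pixels that touch water.

```python
import numpy as np


def classify_water(nir, threshold):
    """Return a boolean mask that is True where a pixel is classified as water.

    Assumption: NIR values below the threshold are water.
    """
    return np.asarray(nir) < threshold


def boundary_mask(water):
    """Mark land pixels that touch at least one water pixel (4-neighbourhood)."""
    water = np.asarray(water, dtype=bool)
    # Edge padding repeats the border pixels, so no artificial water is introduced.
    padded = np.pad(water, 1, mode="edge")
    neighbour_is_water = (
        padded[:-2, 1:-1]    # pixel above
        | padded[2:, 1:-1]   # pixel below
        | padded[1:-1, :-2]  # pixel to the left
        | padded[1:-1, 2:]   # pixel to the right
    )
    return ~water & neighbour_is_water


# Toy 3x3 NIR band: low values in the upper-left corner stand in for water.
nir = np.array([[10, 10, 200],
                [10, 200, 200],
                [200, 200, 200]])
water = classify_water(nir, threshold=100)
coast = boundary_mask(water)
```

In practice the boundary mask would be converted to vector geometry and cleaned, as the abstract notes for shadows and other extraneous data.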
Metadata and Tools for Integration and Preservation of Cultural Heritage 3D Information

Directory of Open Access Journals (Sweden)

Achille Felicetti

2011-12-01

In this paper we investigate many of the storage, portability and interoperability issues that arise among archaeologists and cultural heritage professionals when dealing with 3D technologies. On the one hand, the available digital repositories often seem unable to guarantee affordable features for managing 3D models and their metadata; on the other hand, most of the available data formats for 3D encoding do not seem to offer the portability that 3D information now requires across different systems. We propose a set of possible solutions to show how integration can be achieved through the use of well-known and widely accepted standards for data encoding and data storage. Using a set of 3D models acquired during various archaeological campaigns and a number of open source tools, we have implemented a straightforward encoding process to generate meaningful semantic data and metadata. We will also present the interoperability process carried out to integrate the encoded 3D models and the geographic features produced by the archaeologists. Finally we will report the preliminary (rather encouraging) development of a semantic-enabled and persistent digital repository, where 3D models (but also any kind of digital data and metadata) can easily be stored, retrieved and shared with the content of other digital archives.
An Intelligent Web Digital Image Metadata Service Platform for Social Curation Commerce Environment

Directory of Open Access Journals (Sweden)

Seong-Yong Hong

2015-01-01

Information management includes multimedia data management, knowledge management, collaboration, and agents, all of which are supporting technologies for XML. XML technologies have an impact on multimedia databases as well as collaborative technologies and knowledge management. That is, e-commerce documents are encoded in XML and are gaining much popularity for business-to-business or business-to-consumer transactions. Recently, internet sites such as e-commerce sites and shopping mall sites have come to deal with a lot of image and multimedia information. This paper proposes an intelligent web digital image information retrieval platform, which adopts XML technology for a social curation commerce environment. To support object-based content retrieval on product catalog images containing multiple objects, we describe multilevel metadata structures representing the local features, global features, and semantics of image data. To enable semantic-based and content-based retrieval on such image data, we design an XML-Schema for the proposed metadata. We also describe how to automatically transform the retrieval results into forms suitable for various user environments, such as a web browser or mobile device, using XSLT. The proposed scheme can be utilized to enable efficient e-catalog metadata sharing between systems, and it will contribute to improving retrieval correctness and user satisfaction in semantic-based web digital image information retrieval.
Automated aerosol sampling and analysis for the Comprehensive Test Ban Treaty

International Nuclear Information System (INIS)

Miley, H.S.; Bowyer, S.M.; Hubbard, C.W.; McKinnon, A.D.; Perkins, R.W.; Thompson, R.C.; Warner, R.A.

1998-01-01

Detecting nuclear debris from a nuclear weapon exploded in or substantially vented to the Earth's atmosphere constitutes the most certain indication that a violation of the Comprehensive Test Ban Treaty has occurred. For this reason, a radionuclide portion of the International Monitoring System is being designed and implemented. The IMS will monitor aerosols and gaseous xenon isotopes to detect atmospheric and underground tests, respectively. An automated system, the Radionuclide Aerosol Sampler/Analyzer (RASA), has been developed at Pacific Northwest National Laboratory to meet CTBT aerosol measurement requirements. This is achieved by the use of a novel sampling apparatus, a high-resolution germanium detector, and very sophisticated software. This system draws a large volume of air (~20,000 m³/day), performs automated gamma-ray spectral measurements (MDC( 140 Ba) 3 ), and communicates this and other data to a central data facility. Automated systems offer the added benefit of rigid controls, easily implemented QA/QC procedures, and centralized depot maintenance and operation. Other types of automated communication include pull or push transmission of State-Of-Health data, commands, and configuration data. In addition, a graphical user interface, Telnet, and other interactive communications are supported over ordinary phone or network lines. This system has been the subject of a USAF commercialization effort to meet US CTBT monitoring commitments. It will also be available to other CTBT signatories and the monitoring community for various governmental, environmental, or commercial needs. The current status of the commercialization is discussed
PLACE: an open-source python package for laboratory automation, control, and experimentation.

Science.gov (United States)

Johnson, Jami L; Tom Wörden, Henrik; van Wijk, Kasper

2015-02-01

In modern laboratories, software can drive the full experimental process from data acquisition to storage, processing, and analysis. The automation of laboratory data acquisition is an important consideration for every laboratory. When implementing a laboratory automation scheme, important parameters include its reliability, time to implement, adaptability, and compatibility with software used at other stages of experimentation. In this article, we present an open-source, flexible, and extensible Python package for Laboratory Automation, Control, and Experimentation (PLACE). The package uses modular organization and clear design principles; therefore, it can be easily customized or expanded to meet the needs of diverse laboratories. We discuss the organization of PLACE, data-handling considerations, and then present an example using PLACE for laser-ultrasound experiments. Finally, we demonstrate the seamless transition to post-processing and analysis with Python through the development of an analysis module for data produced by PLACE automation. © 2014 Society for Laboratory Automation and Screening.
Open Access Metadata, Catalogers, and Vendors: The Future of Cataloging Records

Science.gov (United States)

Flynn, Emily Alinder

2013-01-01

The open access (OA) movement is working to transform scholarly communication around the world, but this philosophy can also apply to metadata and cataloging records. While some notable, large academic libraries, such as Harvard University, the University of Michigan, and the University of Cambridge, released their cataloging records under OA…
Automation of the National Water Quality Laboratories, U. S. Geological Survey. I. Description of laboratory functions and definition of the automation project

Energy Technology Data Exchange (ETDEWEB)

Morris, W.F.; Ames, H.S.

1977-07-01

In January 1976, the Water Resources Division of the U.S. Geological Survey asked Lawrence Livermore Laboratory to conduct a feasibility study for automation of the National Water Quality (NWQ) Laboratory in Denver, Colorado (formerly Denver Central Laboratory). Results of the study were published in the Feasibility Study for Automation of the Central Laboratories, Lawrence Livermore Laboratory, Rept. UCRL-52001 (1976). Because the present system for processing water samples was found inadequate to meet the demands of a steadily increasing workload, new automation was recommended. In this document we present details necessary for future implementation of the new system, as well as descriptions of current laboratory automatic data processing and analytical facilities to better define the scope of the project and illustrate what the new system will accomplish. All pertinent inputs, outputs, and other operations that define the project are shown in functional designs.
Automation of the National Water Quality Laboratories, U.S. Geological Survey. I. Description of laboratory functions and definition of the automation project

International Nuclear Information System (INIS)

Morris, W.F.; Ames, H.S.

1977-01-01

In January 1976, the Water Resources Division of the U.S. Geological Survey asked Lawrence Livermore Laboratory to conduct a feasibility study for automation of the National Water Quality (NWQ) Laboratory in Denver, Colorado (formerly Denver Central Laboratory). Results of the study were published in the Feasibility Study for Automation of the Central Laboratories, Lawrence Livermore Laboratory, Rept. UCRL-52001 (1976). Because the present system for processing water samples was found inadequate to meet the demands of a steadily increasing workload, new automation was recommended. In this document we present details necessary for future implementation of the new system, as well as descriptions of current laboratory automatic data processing and analytical facilities to better define the scope of the project and illustrate what the new system will accomplish. All pertinent inputs, outputs, and other operations that define the project are shown in functional designs
75 FR 6214 - Notice of Meeting of the Advisory Committee on Commercial Operations of Customs and Border...

Science.gov (United States)

2010-02-08

.... Air Cargo Security Subcommittee 6. Automation Subcommittee 7. ACE/ITDS (Automated Commercial...-4290. Information on Services for Individuals With Disabilities For information on facilities or services for individuals with disabilities or to request special assistance at the meeting, contact Ms...
Impact of Metadata on Full-text Information Retrieval Performance: An Experimental Research on a Small Scale Turkish Corpus

Directory of Open Access Journals (Sweden)

Çağdaş Çapkın

2016-12-01

Information institutions use text-based information retrieval systems to store, index and retrieve metadata, full-text, or both metadata and full-text (hybrid) contents. The aim of this research was to evaluate the impact of these contents on information retrieval performance. For this purpose, metadata (MIR), full-text (FIR) and hybrid (HIR) content information retrieval systems were developed with the default Lucene information retrieval model for a small-scale Turkish corpus. In order to evaluate the performance of these three systems, “precision - recall” and “normalized recall” tests were conducted. Experimental findings showed that there were no significant differences between MIR and FIR in mean average precision (MAP) performance. On the other hand, the MAP performance of HIR was significantly higher in comparison to MIR and FIR. When information retrieval performance was evaluated from a user-centered perspective, the “normalized recall” performances of MIR and HIR were significantly higher than that of FIR. Additionally, there were no significant differences between the systems in the mean number of relevant documents retrieved. Processing different types of contents such as metadata and full-text had some advantages and disadvantages for information retrieval systems in terms of term management. These advantages were brought together in hybrid content processing (HIR), and information retrieval performance improved.
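Mean average precision (MAP), the measure used above to compare the three systems, can be computed as in the following sketch. It uses the standard definition of average precision over a ranked result list. The function names, document identifiers and queries are made up for illustration and are not taken from the study.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Two hypothetical queries against one system.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
```

Comparing systems such as MIR, FIR and HIR would mean running the same queries through each system and comparing the resulting MAP values with a significance test.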
The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology

Science.gov (United States)

Bountouri, Lina; Gergatsoulis, Manolis

2011-01-01

In this article we analyze the main semantics of archival description, expressed through Encoded Archival Description (EAD). Our main target is to map the semantics of EAD to the CIDOC Conceptual Reference Model (CIDOC CRM) ontology as part of a wider integration architecture of cultural heritage metadata. Through this analysis, it is concluded…
That obscure object of desire: multimedia metadata on the Web, part 2

NARCIS (Netherlands)

F.-M. Nack (Frank); J.R. van Ossenbruggen (Jacco); L. Hardman (Lynda)

2003-01-01

This article discusses the state of the art in metadata for audio-visual media in large semantic networks, such as the Semantic Web. Our discussion is predominantly motivated by the two most widely known approaches towards machine-processable and semantic-based content description,
That obscure object of desire: multimedia metadata on the Web, part 1

NARCIS (Netherlands)

F.-M. Nack (Frank); J.R. van Ossenbruggen (Jacco); L. Hardman (Lynda)

2003-01-01

This article discusses the state of the art in metadata for audio-visual media in large semantic networks, such as the Semantic Web. Our discussion is predominantly motivated by the two most widely known approaches towards machine-processable and semantic-based content description,
Safeguards and security considerations for automated and robotic systems

Energy Technology Data Exchange (ETDEWEB)

Jordan, S.E.; Jaeger, C.D.

1994-09-01

Within the reconfigured Nuclear Weapons Complex there will be a large number of automated and robotic (A&R) systems because of the many benefits derived from their use. To meet the overall security requirements of a facility, consideration must be given to those systems that handle and process nuclear material. Since automation and robotics is a relatively new technology, not widely applied to the Nuclear Weapons Complex, safeguards and security (S&S) issues related to these systems have not been extensively explored, and no guidance presently exists. The goal of this effort is to help integrate S&S into the design of future A&R systems. To this end, the authors first examined existing A&R systems from a security perspective to identify areas of concern and possible solutions to these problems. They were then able to develop generalized S&S guidance and design considerations for automation and robotics.
Twenty-first century metadata operations challenges, opportunities, directions

CERN Document Server

Lee Eden, Bradford

2014-01-01

It has long been apparent to academic library administrators that the current technical services operations within libraries need to be redirected and refocused in terms of both format priorities and human resources. A number of developments and directions have made this reorganization imperative, many of which have been accelerated by the current economic crisis. All of the chapters detail some aspect of technical services reorganization due to downsizing and/or reallocation of human resources, retooling professional and support staff in higher level duties and/or non-MARC metadata, ""value-a
Policy challenges of increasing automation in driving

Directory of Open Access Journals (Sweden)

Ata M. Khan

2012-03-01

The convergence of information and communication technologies (ICT) with automotive technologies has already resulted in automation features in road vehicles and this trend is expected to continue in the future owing to consumer demand, dropping costs of components, and improved reliability. While the automation features introduced so far are mainly in the form of information and driver warning technologies (classified as level I, pre-2010), future developments in the medium term (level II, 2010–2025) are expected to exhibit connected cognitive vehicle features and encompass an increasing degree of automation in the form of advanced driver assistance systems. Although autonomous vehicles have been developed for research purposes and are being tested in controlled driving missions, the autonomous driving case is only a long-term (level III, 2025+) scenario. This paper contributes knowledge on technological forecasts regarding automation, policy challenges for each level of technology development and application context, and the essential instrument of cost-effectiveness for policy analysis which enables policy decisions on the automation systems to be assessed in a consistent and balanced manner. The cost of a system per vehicle is viewed against its effectiveness in meeting policy objectives of improving safety, efficiency, mobility, convenience and reducing environmental effects. Example applications are provided that illustrate the contribution of the methodology in providing information for supporting policy decisions. Given the uncertainties in system costs as well as effectiveness, the tool for assessing policies for future generation features probabilistic and utility-theoretic analysis capability. The policy issues defined and the assessment framework enable the resolution of policy challenges while allowing worthy innovative automation in driving to enhance future road transportation.

The ANSS Station Information System: A Centralized Station Metadata Repository for Populating, Managing and Distributing Seismic Station Metadata

Science.gov (United States)

Thomas, V. I.; Yu, E.; Acharya, P.; Jaramillo, J.; Chowdhury, F.

2015-12-01

Maintaining and archiving accurate site metadata is critical for seismic network operations. The Advanced National Seismic System (ANSS) Station Information System (SIS) is a repository of seismic network field equipment, equipment response, and other site information. Currently, there are 187 different sensor models and 114 data-logger models in SIS. SIS has a web-based user interface that allows network operators to enter information about seismic equipment and assign response parameters to it. It allows users to log entries for sites, equipment, and data streams. Users can also track when equipment is installed, updated, and/or removed from sites. When seismic equipment configurations change for a site, SIS computes the overall gain of a data channel by combining the response parameters of the underlying hardware components. Users can then distribute this metadata in standardized formats such as FDSN StationXML or dataless SEED. One powerful advantage of SIS is that existing data in the repository can be leveraged: e.g., new instruments can be assigned response parameters from the Incorporated Research Institutions for Seismology (IRIS) Nominal Response Library (NRL), or from a similar instrument already in the inventory, thereby reducing the amount of time needed to determine parameters when new equipment (or models) are introduced into a network. SIS is also useful for managing field equipment that does not produce seismic data (e.g., power systems, telemetry devices or GPS receivers) and gives the network operator a comprehensive view of site field work. SIS allows users to generate field logs to document activities and inventory at sites. Thus, operators can also use SIS reporting capabilities to improve planning and maintenance of the network. Queries such as how many sensors of a certain model are installed or what pieces of equipment have active problem reports are just a few examples of the type of information that is available to SIS users.
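
The gain computation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the overall gain of a channel is the product of the gains of its hardware stages at a common reference frequency, which is a common convention for instrument responses; the abstract does not describe how SIS actually does this. The `Stage` class, the `overall_gain` function, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Stage:
    """One hardware stage in a data channel (hypothetical structure)."""
    name: str
    gain: float   # stage gain at the shared reference frequency
    units: str    # output units per input unit, for documentation only


def overall_gain(stages):
    """Multiply the stage gains to get the end-to-end channel gain.

    Assumption: all stage gains are quoted at the same reference frequency.
    """
    return prod(stage.gain for stage in stages)


# Hypothetical channel: sensor -> preamplifier -> digitizer.
channel = [
    Stage("sensor", 1500.0, "V/(m/s)"),
    Stage("datalogger preamp", 1.0, "V/V"),
    Stage("digitizer", 400000.0, "counts/V"),
]

total = overall_gain(channel)  # counts per (m/s)
print(f"overall gain: {total:.3e} counts/(m/s)")
```

Swapping one stage for a different model and calling `overall_gain` again mirrors the situation the abstract describes, where a configuration change at a site triggers a recomputation.
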
Remote management and programmable automata; Telegestion et automate programmable

Energy Technology Data Exchange (ETDEWEB)

NONE

2000-07-01

This document is the proceedings of the meeting organized by the French association of refrigeration (AFF) during the ELEC 2000 exhibition about the measurement, recording, data transmission, automation, diagnosis and management of refrigeration systems. An example of the energy saving made in a supermarket context thanks to the operation control of a refrigerating machinery is presented. (J.S.)
The PDS4 Data Dictionary Tool - Metadata Design for Data Preparers

Science.gov (United States)

Raugh, A.; Hughes, J. S.

2017-12-01

One of the major design goals of the PDS4 development effort was to create an extendable Information Model (IM) for the archive, and to allow mission data designers/preparers to create extensions for metadata definitions specific to their own contexts. This capability is critical for the Planetary Data System - an archive that deals with a data collection that is diverse along virtually every conceivable axis. Amid such diversity in the data itself, it is in the best interests of the PDS archive and its users that all extensions to the IM follow the same design techniques, conventions, and restrictions as the core implementation itself. But it is unrealistic to expect mission data designers to acquire expertise in information modeling, model-driven design, ontology, schema formulation, and PDS4 design conventions and philosophy in order to define their own metadata. To bridge that expertise gap and bring the power of information modeling to the data label designer, the PDS Engineering Node has developed the data dictionary creation tool known as "LDDTool". This tool incorporates the same software used to maintain and extend the core IM, packaged with an interface that enables a developer to create his extension to the IM using the same, standards-based metadata framework PDS itself uses. Through this interface, the novice dictionary developer has immediate access to the common set of data types and unit classes for defining attributes, and a straight-forward method for constructing classes. The more experienced developer, using the same tool, has access to more sophisticated modeling methods like abstraction and extension, and can define context-specific validation rules. We present the key features of the PDS Local Data Dictionary Tool, which both supports the development of extensions to the PDS4 IM, and ensures their compatibility with the IM.
What Information Does Your EHR Contain? Automatic Generation of a Clinical Metadata Warehouse (CMDW) to Support Identification and Data Access Within Distributed Clinical Research Networks.

Science.gov (United States)

Bruland, Philipp; Doods, Justin; Storck, Michael; Dugas, Martin

2017-01-01

Data dictionaries provide structural meta-information about data definitions in health information technology (HIT) systems. In this regard, reusing healthcare data for secondary purposes offers several advantages (e.g. reduced documentation times or increased data quality). Prerequisites for data reuse are its quality, availability and identical meaning of data. In diverse projects, research data warehouses serve as core components between heterogeneous clinical databases and various research applications. Given the complexity (high number of data elements) and dynamics (regular updates) of electronic health record (EHR) data structures, we propose a clinical metadata warehouse (CMDW) based on a metadata registry standard. Metadata of two large hospitals were automatically inserted into two CMDWs containing 16,230 forms and 310,519 data elements. Automatic updates of metadata are possible as well as semantic annotations. A CMDW allows metadata discovery, data quality assessment and similarity analyses. Common data models for distributed research networks can be established based on similarity analyses.
A Systems Approach to Information Technology (IT) Infrastructure Design for Utility Management Automation Systems

OpenAIRE

A. Fereidunian; H. Lesani; C. Lucas; M. Lehtonen; M. M. Nordman

2006-01-01

Almost all electric utility companies are planning to improve their management automation systems, in order to meet the changing requirements of the new liberalized energy market and to benefit from innovations in information and communication technology (ICT or IT). Architectural design of utility management automation (UMA) systems for their IT-enabling requires proper selection of IT choices for the UMA system, which leads to multi-criteria decision-making (MCDM). In resp...
Tags and self-organisation: a metadata ecology for learning resources in a multilingual context

OpenAIRE

Vuorikari, Riina Hannuli

2010-01-01

Vuorikari, R. (2009). Tags and self-organisation: a metadata ecology for learning resources in a multilingual context. Doctoral thesis. November, 13, 2009, Heerlen, The Netherlands: Open University of the Netherlands, CELSTEC.
Tags and self-organisation: a metadata ecology for learning resources in a multilingual context

NARCIS (Netherlands)

Vuorikari, Riina

2009-01-01

Vuorikari, R. (2009). Tags and self-organisation: a metadata ecology for learning resources in a multilingual context. Doctoral thesis. November, 13, 2009, Heerlen, The Netherlands: Open University of the Netherlands, CELSTEC.
Defining Linkages between the GSC and NSF's LTER Program: How the Ecological Metadata Language (EML) Relates to GCDML and Other Outcomes

Science.gov (United States)

Inigo San Gil; Wade Sheldon; Tom Schmidt; Mark Servilla; Raul Aguilar; Corinna Gries; Tanya Gray; Dawn Field; James Cole; Jerry Yun Pan; Giri Palanisamy; Donald Henshaw; Margaret O'Brien; Linda Kinkel; Kathrine McMahon; Renzo Kottmann; Linda Amaral-Zettler; John Hobbie; Philip Goldstein; Robert P. Guralnick; James Brunt; William K. Michener

2008-01-01

The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML)....
ICECAP: an integrated, general-purpose, automation-assisted IC50/EC50 assay platform.

Science.gov (United States)

Li, Ming; Chou, Judy; King, Kristopher W; Jing, Jing; Wei, Dong; Yang, Liyu

2015-02-01

IC50 and EC50 values are commonly used to evaluate drug potency. Mass spectrometry (MS)-centric bioanalytical and biomarker labs are now conducting IC50/EC50 assays, which, if done manually, are tedious and error-prone. Existing bioanalytical sample preparation automation systems cannot meet IC50/EC50 assay throughput demand. A general-purpose, automation-assisted IC50/EC50 assay platform was developed to automate the calculations of spiking solutions and the matrix solutions preparation scheme, the actual spiking and matrix solutions preparations, as well as the flexible sample extraction procedures after incubation. In addition, the platform also automates the data extraction, nonlinear regression curve fitting, computation of IC50/EC50 values, graphing, and reporting. The automation-assisted IC50/EC50 assay platform can process the whole class of assays of varying assay conditions. In each run, the system can handle up to 32 compounds and up to 10 concentration levels per compound, and it greatly improves IC50/EC50 assay experimental productivity and data processing efficiency. © 2014 Society for Laboratory Automation and Screening.
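
The curve-fitting step mentioned in the abstract above is not specified further. A common model for such dose-response fits is the four-parameter logistic (Hill) curve, and the sketch below only evaluates that model; it does not perform the regression. The function name `four_pl` and all parameter values are illustrative assumptions, not details of ICECAP.

```python
def four_pl(conc, bottom, top, ic50, hill):
    """Four-parameter logistic response at concentration `conc`.

    The response falls from `top` (at zero concentration) towards `bottom`
    (at saturating concentration). `ic50` is the concentration giving the
    half-way response and `hill` controls the steepness of the curve.
    """
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


# Hypothetical parameters: response in percent activity, concentration in uM.
params = dict(bottom=0.0, top=100.0, ic50=10.0, hill=1.5)

for conc in (0.0, 1.0, 10.0, 100.0):
    print(f"{conc:7.1f} uM -> {four_pl(conc, **params):6.2f} %")
```

By construction the response at `conc == ic50` lies midway between `top` and `bottom`, which is the property a nonlinear regression exploits when it reports the fitted `ic50` as the potency estimate.
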
A Metadata Model for E-Learning Coordination through Semantic Web Languages

Science.gov (United States)

Elci, Atilla

2005-01-01

This paper reports on a study aiming to develop a metadata model for e-learning coordination based on semantic web languages. A survey of e-learning modes is done initially in order to identify content such as phases, activities, data schema, rules and relations, etc. relevant for a coordination model. In this respect, the study looks into the…
Automated Test Methods for XML Metadata

Science.gov (United States)

2017-12-28

Secretariat, Range Commanders Council, ATTN: TEDT-WS-RCC, 1510 Headquarters Avenue, White Sands Missile Range, New Mexico 88002-5110 ... Method for Testing Syntax: The test method is as follows. 1. Initialize the programming environment. 2. Write test application code to use the
Feasibility study for automating the analytical laboratories of the Chemistry Branch, National Enforcement Investigation Center, Environmental Protection Agency

International Nuclear Information System (INIS)

Morris, W.F.; Fisher, E.R.; Barton, G.W. Jr.

1978-01-01

The feasibility of automating the analytical laboratories of the Chemistry Branch of the National Enforcement Investigation Center, Environmental Protection Agency, Denver, Colorado, is explored. The goals of the chemistry laboratory are defined, and instrumental methods and other tasks to be automated are described. Five optional automation systems are proposed to meet these goals and the options are evaluated in terms of cost effectiveness and other specified criteria. The instruments to be automated include (1) a Perkin-Elmer AA spectrophotometer 403, (2) a Perkin-Elmer AA spectrophotometer 306, (3) a Technicon AutoAnalyzer II, (4) a Mettler electronic balance, and (5) a Jarrell-Ash ICP emission spectrometer.
Development of an Automated Security Risk Assessment Methodology Tool for Critical Infrastructures.

Energy Technology Data Exchange (ETDEWEB)

Jaeger, Calvin Dell; Roehrig, Nathaniel S.; Torres, Teresa M.

2008-12-01

This document presents the security automated Risk Assessment Methodology (RAM) prototype tool developed by Sandia National Laboratories (SNL). This work leverages SNL's capabilities and skills in security risk analysis and the development of vulnerability assessment/risk assessment methodologies to develop an automated prototype security RAM tool for critical infrastructures (RAM-CI™). The prototype automated RAM tool provides a user-friendly, systematic, and comprehensive risk-based tool to assist CI sector and security professionals in assessing and managing security risk from malevolent threats. The current tool is structured on the basic RAM framework developed by SNL. It is envisioned that this prototype tool will be adapted to meet the requirements of different CI sectors and thereby provide additional capabilities.
Safeguards and security considerations for automated and robotic systems

International Nuclear Information System (INIS)

Jordan, S.E.; Jaeger, C.D.

1994-01-01

Within the reconfigured Nuclear Weapons Complex there will be a large number of automated and robotic (A&R) systems because of the many benefits derived from their use. To meet the overall security requirements of a facility, consideration must be given to those systems that handle and process nuclear material. Since automation and robotics is a relatively new technology, not widely applied to the Nuclear Weapons Complex, safeguards and security (S&S) issues related to these systems have not been extensively explored, and no guidance presently exists. The goal of this effort is to help integrate S&S into the design of future A&R systems. Towards this, the authors first examined existing A&R systems from a security perspective to identify areas of concern and possible solutions to these problems. They then were able to develop generalized S&S guidance and design considerations for automation and robotics.
Safeguards and security considerations for automated and robotic systems

International Nuclear Information System (INIS)

Jordan, S.E.; Jaeger, C.D.

1994-01-01

Within the reconfigured Nuclear Weapons Complex there will be a large number of automated and robotic (A&R) systems because of the many benefits derived from their use. To meet the overall security requirements of a facility, consideration must be given to those systems that handle and process nuclear material. Since automation and robotics is a relatively new technology, not widely applied to the Nuclear Weapons Complex, safeguards and security (S&S) issues related to these systems have not been extensively explored, and no guidance presently exists. The goal of this effort is to help integrate S&S into the design of future A&R systems. Towards this, we first examined existing A&R systems from a security perspective to identify areas of concern and possible solutions to these problems. We then were able to develop generalized S&S guidance and design considerations for automation and robotics.
Defense Virtual Library: Technical Metadata for the Long-Term Management of Digital Materials: Preliminary Guidelines

National Research Council Canada - National Science Library

Flynn, Marcy

2002-01-01

... of the digital materials being preserved. This report, prepared by Silver Image Management (SIM), proposes technical metadata elements appropriate for digital objects in the Defense Virtual Library...
Metadata requirements for results of diagnostic imaging procedures: a BIIF profile to support user applications

Science.gov (United States)

Brown, Nicholas J.; Lloyd, David S.; Reynolds, Melvin I.; Plummer, David L.

2002-05-01

A visible digital image is rendered from a set of digital image data. Medical digital image data can be stored as either: (a) pre-rendered format, corresponding to a photographic print, or (b) un-rendered format, corresponding to a photographic negative. The appropriate image data storage format and associated header data (metadata) required by a user of the results of a diagnostic procedure recorded electronically depends on the task(s) to be performed. The DICOM standard provides a rich set of metadata that supports the needs of complex applications. Many end user applications, such as simple report text viewing and display of a selected image, are not so demanding and generic image formats such as JPEG are sometimes used. However, these are lacking some basic identification requirements. In this paper we make specific proposals for minimal extensions to generic image metadata of value in various domains, which enable safe use in the case of two simple healthcare end user scenarios: (a) viewing of text and a selected JPEG image activated by a hyperlink and (b) viewing of one or more JPEG images together with superimposed text and graphics annotation using a file specified by a profile of the ISO/IEC Basic Image Interchange Format (BIIF).
Automating the Analytical Laboratories Section, Lewis Research Center, National Aeronautics and Space Administration: a feasibility study

International Nuclear Information System (INIS)

Boyle, W.G.; Barton, G.W.

1979-01-01

We studied the feasibility of computerized automation of the Analytical Laboratories Section at NASA's Lewis Research Center. Since that laboratory's duties are not routine, we set our automation goals with that in mind. We selected four instruments as the most likely automation candidates: an atomic absorption spectrophotometer, an emission spectrometer, an x-ray fluorescence spectrometer, and an x-ray diffraction unit. Our study describes two options for computer automation: a time-shared central computer and a system with microcomputers for each instrument connected to a central computer. A third option, presented for future planning, expands the microcomputer version. We determine costs and benefits for each option. We conclude that the microcomputer version best fits the goals and duties of the laboratory and that such an automated system is needed to meet the laboratory's future requirements
A network analysis using metadata to investigate innovation in clean-tech – Implications for energy policy

International Nuclear Information System (INIS)

Marra, Alessandro; Antonelli, Paola; Dell’Anna, Luca; Pozzi, Cesare

2015-01-01

Clean-technology (clean-tech) is a large and growing sector. Research and development (R&D) is the lifeline of the industry and innovation is fostered by a plethora of high-tech start-ups and small and medium-sized enterprises (SMEs). Any empirically based attempt to detect the pattern of technological innovation in the industry is challenging. This paper proposes an investigation of innovation in clean-tech using metadata provided by CrunchBase. Metadata reveal information on markets, products, services and technologies driving innovation in the clean-tech industry worldwide and for San Francisco, the leader in clean-tech innovation with more than two hundred specialised companies. The methodology employed is a network analysis using metadata, and the main metrics of the resulting networks are discussed from an economic point of view. The purpose of the paper is to understand the specializations and technological complementarities underlying innovative companies, detect emerging industrial clusters at the global and local/metropolitan level and, finally, suggest a way to determine whether observed start-ups, SMEs and clusters follow a technological path of complementary innovation and market opportunity or, instead, present a risk of lock-in. The discussion of the results of the network analysis shows interesting implications for energy policy, particularly useful from an operational point of view. - Highlights: • Metadata provide information on companies' products and technologies. • A network analysis enables detection of specializations and complementarities. • An investigation of the network makes it possible to identify emerging industrial clusters. • Metrics help to appreciate complementary innovation and market opportunity. • Results of the network analysis show interesting policy implications.
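
The abstract above does not say how the networks were built from the metadata. One plausible construction, shown here purely as an illustrative sketch, links two metadata tags whenever they describe the same company and weights the link by the number of companies sharing both. The company names, tags, and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical company -> metadata tags (not taken from CrunchBase).
companies = {
    "AlphaSolar": {"solar", "energy storage"},
    "BetaGrid": {"smart grid", "energy storage"},
    "GammaEV": {"electric vehicles", "energy storage", "smart grid"},
}


def cooccurrence_edges(records):
    """Count, for every pair of tags, how many companies carry both."""
    edges = Counter()
    for tags in records.values():
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct neighbours of each tag in the network."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


edges = cooccurrence_edges(companies)
deg = degree(edges)
for tag, d in deg.most_common():
    print(f"{tag}: degree {d}")
```

In a network like this, a tag with high degree connects many otherwise separate technologies, which is one simple way to read "complementarity" off the graph; heavier edge weights point to tag pairs that recur across companies and may indicate a cluster.
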
Nuclear power generation and automation technology

International Nuclear Information System (INIS)

Korei, Yoshiro

1985-01-01

The proportion of nuclear power in total electric power generation has been increasing year after year, and a stable supply is demanded of it. For the further development of nuclear power generation, important subjects are improving economic efficiency, which is the largest merit of nuclear power, and gaining public acceptance as a safe and stable electric power source. To address these subjects, various automation techniques have been applied in nuclear power generation for the purposes of improving reliability, saving labor and reducing radiation exposure. To meet the strong need for automation, computer-aided automation technology has been applied to the design, manufacture and construction, operation and maintenance of nuclear power plants. Computer-aided design and design examples for a reactor building, piping and a fuel assembly, an automatic all-position TIG welder for piping, a new central monitoring and control system, an automatic exchanger for control rod drive mechanisms, an automatic in-service inspection system for nozzles and piping, and a robot for steam generator maintenance are shown. The trend of technical development is explained, along with an intelligent mobile robot, a system maintenance robot and a four-legged walking robot. (Kako, I.)

Altering user' acceptance of automation through prior automation exposure.

Science.gov (United States)

Bekier, Marek; Molesworth, Brett R C

2017-06-01

Air navigation service providers worldwide see increased use of automation as one solution to overcome the capacity constraints imbedded in the present air traffic management (ATM) system. However, increased use of automation within any system is dependent on user acceptance. The present research sought to determine if the point at which an individual is no longer willing to accept or cooperate with automation can be manipulated. Forty participants underwent training on a computer-based air traffic control programme, followed by two ATM exercises (order counterbalanced), one with and one without the aid of automation. Results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation ('tipping point') decreased; suggesting it is indeed possible to alter automation acceptance. Practitioner Summary: This paper investigates whether the point at which a user of automation rejects automation (i.e. 'tipping point') is constant or can be manipulated. The results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation decreased; suggesting it is possible to alter automation acceptance.
Preliminary study of technical terminology for the retrieval of scientific book metadata records

DEFF Research Database (Denmark)

Larsen, Birger; Lioma, Christina; Frommholz, Ingo

2012-01-01

Books only represented by brief metadata (book records) are particularly hard to retrieve. One way of improving their retrieval is by extracting retrieval enhancing features from them. This work focusses on scientific (physics) book records. We ask if their technical terminology can be used...
mzML2ISA & nmrML2ISA: generating enriched ISA-Tab metadata files from metabolomics XML data.

Science.gov (United States)

Larralde, Martin; Lawson, Thomas N; Weber, Ralf J M; Moreno, Pablo; Haug, Kenneth; Rocca-Serra, Philippe; Viant, Mark R; Steinbeck, Christoph; Salek, Reza M

2017-08-15

Submission to the MetaboLights repository for metabolomics data currently places the burden of reporting instrument and acquisition parameters in ISA-Tab format on users, who have to do it manually, a process that is time consuming and prone to user input error. Since the large majority of these parameters are embedded in instrument raw data files, an opportunity exists to capture this metadata more accurately. Here we report a set of Python packages that can automatically generate ISA-Tab metadata file stubs from raw XML metabolomics data files. The parsing packages are separated into mzML2ISA (encompassing mzML and imzML formats) and nmrML2ISA (nmrML format only). Overall, the use of mzML2ISA & nmrML2ISA reduces the time needed to capture metadata substantially (capturing 90% of metadata on assay and sample levels), is much less prone to user input errors, improves compliance with minimum information reporting guidelines and facilitates more finely grained data exploration and querying of datasets. mzML2ISA & nmrML2ISA are available under version 3 of the GNU General Public Licence at https://github.com/ISA-tools. Documentation is available from http://2isa.readthedocs.io/en/latest/. Contact: reza.salek@ebi.ac.uk or isatools@googlegroups.com. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
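
The idea of harvesting parameters that are already embedded in an XML data file can be sketched with the Python standard library alone. The example below is not the mzML2ISA API; it simply walks a small, hand-written, simplified mzML-like fragment (not a complete or validated file) and collects the controlled-vocabulary `cvParam` entries. The function name `collect_cv_params` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by mzML documents.
MZML_NS = "{http://psi.hupo.org/ms/mzml}"

# Simplified, illustrative fragment; real files carry many more elements.
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <instrumentConfigurationList count="1">
    <instrumentConfiguration id="IC1">
      <cvParam cvRef="MS" accession="MS:1000031" name="instrument model" value=""/>
    </instrumentConfiguration>
  </instrumentConfigurationList>
  <run id="run1">
    <spectrumList count="1">
      <spectrum id="scan=1" index="0">
        <cvParam cvRef="MS" accession="MS:1000130" name="positive scan" value=""/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""


def collect_cv_params(xml_text):
    """Return {accession: name} for every cvParam element in the document."""
    root = ET.fromstring(xml_text)
    return {
        element.get("accession"): element.get("name")
        for element in root.iter(MZML_NS + "cvParam")
    }


params = collect_cv_params(SAMPLE)
for accession, name in params.items():
    print(accession, name)
```

A mapping like this is the raw material a converter would need before it could fill in the instrument and acquisition columns of an ISA-Tab stub; how the real packages organise that step is documented at the URLs given in the abstract.
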
An Observation Capability Metadata Model for EO Sensor Discovery in Sensor Web Enablement Environments

Directory of Open Access Journals (Sweden)

Chuli Hu

2014-10-01

Accurate and fine-grained discovery by diverse Earth observation (EO) sensors ensures a comprehensive response to collaborative observation-required emergency tasks. This discovery remains a challenge in an EO sensor web environment. In this study, we propose an EO sensor observation capability metadata model that reuses and extends the existing sensor observation-related metadata standards to enable the accurate and fine-grained discovery of EO sensors. The proposed model is composed of five sub-modules, namely, ObservationBreadth, ObservationDepth, ObservationFrequency, ObservationQuality and ObservationData. The model is applied to different types of EO sensors and is formalized by the Open Geospatial Consortium Sensor Model Language 1.0. The GeosensorQuery prototype retrieves the qualified EO sensors based on the provided geo-event. An actual application to flood emergency observation in the Yangtze River Basin in China is conducted, and the results indicate that sensor inquiry can accurately achieve fine-grained discovery of qualified EO sensors and obtain enriched observation capability information. In summary, the proposed model enables an efficient encoding system that ensures minimum unification to represent the observation capabilities of EO sensors. The model functions as a foundation for the efficient discovery of EO sensors. In addition, the definition and development of this proposed EO sensor observation capability metadata model is a helpful step in extending the Sensor Model Language (SensorML) 2.0 Profile for the description of the observation capabilities of EO sensors.
ARIADNE: a tracking system for relationships in LHCb metadata

International Nuclear Information System (INIS)

Shapoval, I; Clemencic, M; Cattaneo, M

2014-01-01

The data processing model of the LHCb experiment implies handling of an evolving set of heterogeneous metadata entities and relationships between them. The entities range from software and databases states to architecture specificators and software/data deployment locations. For instance, there is an important relationship between the LHCb Conditions Database (CondDB), which provides versioned, time dependent geometry and conditions data, and the LHCb software, which is the data processing applications (used for simulation, high level triggering, reconstruction and analysis of physics data). The evolution of CondDB and of the LHCb applications is a weakly-homomorphic process. It means that relationships between a CondDB state and LHCb application state may not be preserved across different database and application generations. These issues may lead to various kinds of problems in the LHCb production, varying from unexpected application crashes to incorrect data processing results. In this paper we present Ariadne – a generic metadata relationships tracking system based on the novel NoSQL Neo4j graph database. Its aim is to track and analyze many thousands of evolving relationships for cases such as the one described above, and several others, which would otherwise remain unmanaged and potentially harmful. The highlights of the paper include the system's implementation and management details, infrastructure needed for running it, security issues, first experience of usage in the LHCb production and potential of the system to be applied to a wider set of LHCb tasks.
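
The kind of relationship tracking described in the abstract above can be illustrated without a graph database. The sketch below keeps an in-memory mapping from application versions to the conditions-database tags they are known to work with; it is not Ariadne's data model or the Neo4j API, and the class name, method names, and version strings are all hypothetical.

```python
from collections import defaultdict


class RelationshipGraph:
    """Toy store of 'application version is compatible with CondDB tag' edges."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, app_version, conddb_tag):
        """Record that an application version works with a database tag."""
        self._edges[app_version].add(conddb_tag)

    def compatible_tags(self, app_version):
        """List the tags recorded for an application version, sorted by name."""
        return sorted(self._edges.get(app_version, set()))

    def is_compatible(self, app_version, conddb_tag):
        """True only if this exact pairing has been recorded."""
        return conddb_tag in self._edges.get(app_version, set())


graph = RelationshipGraph()
graph.relate("reco-v1", "cond-2012-a")
graph.relate("reco-v1", "cond-2012-b")
graph.relate("reco-v2", "cond-2012-b")

print(graph.compatible_tags("reco-v1"))
print(graph.is_compatible("reco-v2", "cond-2012-a"))
```

Because compatibility is stored per pairing and never inferred, a newer application version does not inherit the tags of an older one, which reflects the abstract's point that relationships may not be preserved across generations.
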
ARIADNE: a Tracking System for Relationships in LHCb Metadata

Science.gov (United States)

Shapoval, I.; Clemencic, M.; Cattaneo, M.

2014-06-01

The data processing model of the LHCb experiment implies handling of an evolving set of heterogeneous metadata entities and relationships between them. The entities range from software and databases states to architecture specificators and software/data deployment locations. For instance, there is an important relationship between the LHCb Conditions Database (CondDB), which provides versioned, time dependent geometry and conditions data, and the LHCb software, which is the data processing applications (used for simulation, high level triggering, reconstruction and analysis of physics data). The evolution of CondDB and of the LHCb applications is a weakly-homomorphic process. It means that relationships between a CondDB state and LHCb application state may not be preserved across different database and application generations. These issues may lead to various kinds of problems in the LHCb production, varying from unexpected application crashes to incorrect data processing results. In this paper we present Ariadne - a generic metadata relationships tracking system based on the novel NoSQL Neo4j graph database. Its aim is to track and analyze many thousands of evolving relationships for cases such as the one described above, and several others, which would otherwise remain unmanaged and potentially harmful. The highlights of the paper include the system's implementation and management details, infrastructure needed for running it, security issues, first experience of usage in the LHCb production and potential of the system to be applied to a wider set of LHCb tasks.
Complacency and Automation Bias in the Use of Imperfect Automation.

Science.gov (United States)

Wickens, Christopher D; Clegg, Benjamin A; Vieane, Alex Z; Sebok, Angelia L

2015-08-01

We examine the effects of two different kinds of decision-aiding automation errors on human-automation interaction (HAI), occurring at the first failure following repeated exposure to correctly functioning automation. The two errors are incorrect advice, triggering the automation bias, and missing advice, reflecting complacency. Contrasts between analogous automation errors in alerting systems, rather than decision aiding, have revealed that alerting false alarms are more problematic to HAI than alerting misses are. Prior research in decision aiding, although contrasting the two aiding errors (incorrect vs. missing), has confounded error expectancy. Participants performed an environmental process control simulation with and without decision aiding. For those with the aid, automation dependence was created through several trials of perfect aiding performance, and an unexpected automation error was then imposed in which automation was either gone (one group) or wrong (a second group). A control group received no automation support. The correct aid supported faster and more accurate diagnosis and lower workload. The aid failure degraded all three variables, but "automation wrong" had a much greater effect on accuracy, reflecting the automation bias, than did "automation gone," reflecting the impact of complacency. Some complacency was manifested for automation gone, by a longer latency and more modest reduction in accuracy. Automation wrong, creating the automation bias, appears to be a more problematic form of automation error than automation gone, reflecting complacency. Decision-aiding automation should indicate its lower degree of confidence in uncertain environments to avoid the automation bias. © 2015, Human Factors and Ergonomics Society.
The NOAA OneStop System: From Well-Curated Metadata to Data Discovery

Science.gov (United States)

McQuinn, E.; Jakositz, A.; Caldwell, A.; Delk, Z.; Neufeld, D.; Shapiro, J.; Partee, R.; Milan, A.

2017-12-01

The NOAA OneStop project is a pathfinder in the realm of enabling users to search for, discover, and access NOAA data. As the project continues along its path to maturity, it has become evident that three areas are of utmost importance to its success in the Earth science community: ensuring quality metadata, building a robust and scalable backend architecture, and keeping the user interface simple to use. Why is this the case? Because, simply put, we are dealing with all aspects of a Big Data problem: large volumes of disparate data needing to be quickly and easily processed and retrieved. In this presentation we discuss the three key aspects of OneStop architecture and how development in each area must be done through cross-team collaboration in order to succeed. We cover aspects of the web-based user interface and OneStop API and how metadata curators and software engineers have worked together to continually iterate on an ever-improving data discovery tool meant to be used by a variety of users searching across a broad assortment of data types.
SCDU Testbed Automated In-Situ Alignment, Data Acquisition and Analysis

Science.gov (United States)

Werne, Thomas A.; Wehmeier, Udo J.; Wu, Janet P.; An, Xin; Goullioud, Renaud; Nemati, Bijan; Shao, Michael; Shen, Tsae-Pyng J.; Wang, Xu; Weilert, Mark A.;

2010-01-01

In the course of fulfilling its mandate, the Spectral Calibration Development Unit (SCDU) testbed for SIM-Lite produces copious amounts of raw data. To effectively spend time attempting to understand the science driving the data, the team devised computerized automations to limit the time spent bringing the testbed to a healthy state and commanding it, and instead focus on analyzing the processed results. We developed a multi-layered scripting language that emphasized the scientific experiments we conducted, which drastically shortened our experiment scripts, improved their readability, and all-but-eliminated testbed operator errors. In addition to scientific experiment functions, we also developed a set of automated alignments that bring the testbed up to a well-aligned state with little more than the push of a button. These scripts were written in the scripting language, and in Matlab via an interface library, allowing all members of the team to augment the existing scripting language with complex analysis scripts. To keep track of these results, we created an easily-parseable state log in which we logged both the state of the testbed and relevant metadata. Finally, we designed a distributed processing system that allowed us to farm lengthy analyses to a collection of client computers which reported their results in a central log. Since these logs were parseable, we wrote query scripts that gave us an effortless way to compare results collected under different conditions. This paper serves as a case-study, detailing the motivating requirements for the decisions we made and explaining the implementation process.

GeoBoost: accelerating research involving the geospatial metadata of virus GenBank records.

Science.gov (United States)

Tahsin, Tasnia; Weissenbacher, Davy; O'Connor, Karen; Magge, Arjun; Scotch, Matthew; Gonzalez-Hernandez, Graciela

2018-05-01

GeoBoost is a command-line software package developed to address sparse or incomplete metadata in GenBank sequence records that relate to the location of the infected host (LOIH) of viruses. Given a set of GenBank accession numbers corresponding to virus GenBank records, GeoBoost extracts, integrates and normalizes geographic information reflecting the LOIH of the viruses using integrated information from GenBank metadata and related full-text publications. In addition, to facilitate probabilistic geospatial modeling, GeoBoost assigns probability scores for each possible LOIH. Binaries and resources required for running GeoBoost are packed into a single zipped file and freely available for download at https://tinyurl.com/geoboost. A video tutorial is included to help users quickly and easily install and run the software. The software is implemented in Java 1.8, and supported on MS Windows and Linux platforms. Contact: gragon@upenn.edu. Supplementary data are available at Bioinformatics online.
Phonion: Practical Protection of Metadata in Telephony Networks

Directory of Open Access Journals (Sweden)

Heuser Stephan

2017-01-01

The majority of people across the globe rely on telephony networks as their primary means of communication. As such, many of the most sensitive personal, corporate and government related communications pass through these systems every day. Unsurprisingly, such connections are subject to a wide range of attacks. Of increasing concern is the use of metadata contained in Call Detail Records (CDRs), which contain source, destination, start time and duration of a call. This information is potentially dangerous as the very act of two parties communicating can reveal significant details about their relationship and put them in the focus of targeted observation or surveillance, which is highly critical especially for journalists and activists. To address this problem, we develop the Phonion architecture to frustrate such attacks by separating call setup functions from call delivery. Specifically, Phonion allows users to preemptively establish call circuits across multiple providers and technologies before dialing into the circuit and does not require constant Internet connectivity. Since no single carrier can determine the ultimate destination of the call, it provides unlinkability for its users and helps them to avoid passive surveillance. We define and discuss a range of adversary classes and analyze why current obfuscation technologies fail to protect users against such metadata attacks. In our extensive evaluation we further analyze advanced anonymity technologies (e.g., VoIP over Tor), which do not preserve our functional requirements for high voice quality in the absence of constant broadband Internet connectivity and compatibility with landline and feature phones. Phonion is the first practical system to provide guarantees of unlinkable communication against a range of practical adversaries in telephony systems.
Taxonomic names, metadata, and the Semantic Web

Directory of Open Access Journals (Sweden)

Roderic D. M. Page

2006-01-01

Life Science Identifiers (LSIDs) offer an attractive solution to the problem of globally unique identifiers for digital objects in biology. However, I suggest that in the context of taxonomic names, the most compelling benefit of adopting these identifiers comes from the metadata associated with each LSID. By using existing vocabularies wherever possible, and using a simple vocabulary for taxonomy-specific concepts we can quickly capture the essential information about a taxonomic name in the Resource Description Framework (RDF) format. This opens up the prospect of using technologies developed for the Semantic Web to add "taxonomic intelligence" to biodiversity databases. This essay explores some of these ideas in the context of providing a taxonomic framework for the phylogenetic database TreeBASE.
Evolution of the ATLAS Metadata Interface (AMI)

CERN Document Server

Odier, Jerome; The ATLAS collaboration; Fulachier, Jerome; Lambert, Fabian

2015-01-01

The ATLAS Metadata Interface (AMI) can be considered to be a mature application because it has existed for at least 10 years. Over the years, the number of users and the number of functions provided for these users has increased. It has been necessary to adapt the hardware infrastructure in a seamless way so that the Quality of Service remains high. We will describe the evolution of the application from the initial one, using single server with a MySQL backend database, to the current state, where we use a cluster of Virtual Machines on the French Tier 1 Cloud at Lyon, an ORACLE database backend also at Lyon, with replication to CERN using ORACLE streams behind a back-up server.
ENHANCING SEISMIC CALIBRATION RESEARCH THROUGH SOFTWARE AUTOMATION AND SCIENTIFIC INFORMATION MANAGEMENT

Energy Technology Data Exchange (ETDEWEB)

Ruppert, S; Dodge, D A; Ganzberger, M D; Hauk, T F; Matzel, E M

2008-07-03

The National Nuclear Security Administration (NNSA) Ground-Based Nuclear Explosion Monitoring Research and Development (GNEMRD) Program at LLNL continues to make significant progress enhancing the process of deriving seismic calibrations and performing scientific integration, analysis, and information management with software automation tools. Our tool efforts address the problematic issues of very large datasets and varied formats encountered during seismic calibration research. New information management and analysis tools have resulted in demonstrated gains in efficiency of producing scientific data products and improved accuracy of derived seismic calibrations. The foundation of a robust, efficient data development and processing environment is comprised of many components built upon engineered versatile libraries. We incorporate proven industry 'best practices' throughout our code and apply source code and bug tracking management as well as automatic generation and execution of unit tests for our experimental, development and production lines. Significant software engineering and development efforts have produced an object-oriented framework that provides database centric coordination between scientific tools, users, and data. Over a half billion parameters, signals, measurements, and metadata entries are all stored in a relational database accessed by an extensive object-oriented multi-technology software framework that includes stored procedures, real-time transactional database triggers and constraints, as well as coupled Java and C++ software libraries to handle the information interchange and validation requirements. Significant resources were applied to schema design to enable management of processing methods and station parameters, responses and metadata. This allowed for the development of merged ground-truth (GT) data sets compiled by the NNSA labs and AFTAC that include hundreds of thousands of events and tens of millions of arrivals. The
There's Trouble in Paradise: Problems with Educational Metadata Encountered during the MALTED Project.

Science.gov (United States)

Monthienvichienchai, Rachada; Sasse, M. Angela; Wheeldon, Richard

This paper investigates the usability of educational metadata schemas with respect to the case of the MALTED (Multimedia Authoring Language Teachers and Educational Developers) project at University College London (UCL). The project aims to facilitate authoring of multimedia materials for language learning by allowing teachers to share multimedia…
Identity and privacy. Unique in the shopping mall: on the reidentifiability of credit card metadata.

Science.gov (United States)

de Montjoye, Yves-Alexandre; Radaelli, Laura; Singh, Vivek Kumar; Pentland, Alex Sandy

2015-01-30

Large-scale data sets of human behavior have the potential to fundamentally transform the way we fight diseases, design cities, or perform research. Metadata, however, contain sensitive information. Understanding the privacy of these data sets is key to their broad use and, ultimately, their impact. We study 3 months of credit card records for 1.1 million people and show that four spatiotemporal points are enough to uniquely reidentify 90% of individuals. We show that knowing the price of a transaction increases the risk of reidentification by 22%, on average. Finally, we show that even data sets that provide coarse information at any or all of the dimensions provide little anonymity and that women are more reidentifiable than men in credit card metadata. Copyright © 2015, American Association for the Advancement of Science.
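The reidentification measure described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy traces, the (shop, day) representation of a spatiotemporal point, and the function names `matches` and `unicity` are assumptions made for illustration only. The idea is to draw p points from a person's own trace and check whether that person is the only one in the data set whose trace contains all of them.

```python
import random


def matches(trace, points):
    """Return True if every known point appears in the trace."""
    return points <= trace


def unicity(traces, p, rng):
    """Fraction of users singled out by p points drawn from their own trace.

    Users with fewer than p points are skipped (an assumption of this sketch).
    """
    unique = 0
    eligible = 0
    for user, trace in traces.items():
        if len(trace) < p:
            continue
        eligible += 1
        # Sort first so sampling is reproducible for a seeded generator.
        known = set(rng.sample(sorted(trace), p))
        candidates = [u for u, t in traces.items() if matches(t, known)]
        if candidates == [user]:
            unique += 1
    return unique / eligible if eligible else 0.0


# Toy data: each point is a hypothetical (shop, day) pair.
traces = {
    "a": {("bakery", 1), ("cafe", 2), ("gym", 3)},
    "b": {("bakery", 1), ("cafe", 2), ("cinema", 4)},
    "c": {("bookshop", 5), ("cafe", 6), ("gym", 7)},
}
print(unicity(traces, 3, random.Random(0)))
```

With fewer known points, users "a" and "b" can become indistinguishable because they share two points, which is the effect the study quantifies at scale.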
Study on Information Management for the Conservation of Traditional Chinese Architectural Heritage - 3d Modelling and Metadata Representation

Science.gov (United States)

Yen, Y. N.; Weng, K. H.; Huang, H. Y.

2013-07-01

After over 30 years of practise and development, Taiwan's architectural conservation field is moving rapidly into digitalization and its applications. Compared to modern buildings, traditional Chinese architecture has considerably more complex elements and forms. To document and digitize these unique heritages in their conservation lifecycle is a new and important issue. This article takes the caisson ceiling of the Taipei Confucius Temple, octagonal with 333 elements in 8 types, as a case study for digitization practise. The application of metadata representation and 3D modelling are the two key issues to discuss. Both Revit and SketchUp were applied in this research to compare their effectiveness for metadata representation. Due to limitations of the Revit database, the final 3D models were built with SketchUp. The research found that, firstly, cultural heritage databases must convey that while many elements are similar in appearance, they are unique in value; although 3D simulations help the general understanding of architectural heritage, software such as Revit and SketchUp, at this stage, could only be used to model basic visual representations, and is ineffective in documenting additional critical data of individually unique elements. Secondly, when establishing conservation lifecycle information for application in management systems, a full and detailed presentation of the metadata must also be implemented; the existing applications of BIM in managing conservation lifecycles are still insufficient. Results of the research recommend SketchUp as a tool for present modelling needs, and BIM for sharing data between users, but the implementation of metadata representation is of the utmost importance.
Placing Music Artists and Songs in Time Using Editorial Metadata and Web Mining Techniques

NARCIS (Netherlands)

Bountouridis, D.; Veltkamp, R.C.; Balen, J.M.H. van

2013-01-01

This paper investigates the novel task of situating music artists and songs in time, thereby adding contextual information that typically correlates with an artist’s similarities, collaborations and influences. The proposed method makes use of editorial metadata in conjunction with web mining
Automated x-ray inspection of composites at northrop aircraft

International Nuclear Information System (INIS)

Murphy, W.J. Jr.; Nutter, R.L.; Patricelli, F.

1985-01-01

The Northrop automated x-ray inspection system (AXIS) has evolved from a research and development program initiated in 1981 to reduce increasing inspection costs and to reduce inspection times to stay abreast with increasing F/A-18A production. The goal of the program was to develop an automated production system that would meet existing inspection requirements; automate handling and alignment; and replace film for the inspection of F/A-18A composite assemblies and laminates. Originally, the program was supported completely by Northrop internal funding. However, in 1984 it became part of the Navy Industrial Modernization Incentive Program (IMIP) with joint funding. The program was selected by the Navy because of its great potential to reduce and stabilize costs associated with F/A-18A inspections. Currently the AXIS is in the last stage of development with final integration expected by the end of July 1985 and production implementation by the end of the year. This paper briefly describes the equipment, and operation of the AXIS. Slides will be presented at the conference which will further illustrate the system; including inspection results
Library Automation Report, 1996. Multimedia Computers in U.S. Public Schools, 1995-96.

Science.gov (United States)

Quality Education Data, Inc., Denver, CO.

District library media directors face dual demands including competition for limited educational dollars and the need to meet increasingly sophisticated student research requests. To solve these dilemmas, many districts are automating their schools' library media centers. Quality Education Data (QED) is an education research firm providing…

Improving the driver-automation interaction: an approach using automation uncertainty.

Science.gov (United States)

Beller, Johannes; Heesen, Matthias; Vollrath, Mark

2013-12-01

The aim of this study was to evaluate whether communicating automation uncertainty improves the driver-automation interaction. A false system understanding of infallibility may provoke automation misuse and can lead to severe consequences in case of automation failure. The presentation of automation uncertainty may prevent this false system understanding and, as was shown by previous studies, may have numerous benefits. Few studies, however, have clearly shown the potential of communicating uncertainty information in driving. The current study fills this gap. We conducted a driving simulator experiment, varying the presented uncertainty information between participants (no uncertainty information vs. uncertainty information) and the automation reliability (high vs. low) within participants. Participants interacted with a highly automated driving system while engaging in secondary tasks and were required to cooperate with the automation to drive safely. Quantile regressions and multilevel modeling showed that the presentation of uncertainty information increases the time to collision in the case of automation failure. Furthermore, the data indicated improved situation awareness and better knowledge of fallibility for the experimental group. Consequently, the automation with the uncertainty symbol received higher trust ratings and increased acceptance. The presentation of automation uncertainty through a symbol improves overall driver-automation cooperation. Most automated systems in driving could benefit from displaying reliability information. This display might improve the acceptance of fallible systems and further enhances driver-automation cooperation.
Semi-automated ontology generation and evolution

Science.gov (United States)

Stirtzinger, Anthony P.; Anken, Craig S.

2009-05-01

Extending the notion of data models or object models, ontology can provide rich semantic definition not only to the meta-data but also to the instance data of domain knowledge, making these semantic definitions available in machine readable form. However, the generation of an effective ontology is a difficult task involving considerable labor and skill. This paper discusses an Ontology Generation and Evolution Processor (OGEP) aimed at automating this process, only requesting user input when un-resolvable ambiguous situations occur. OGEP directly attacks the main barrier which prevents automated (or self learning) ontology generation: the ability to understand the meaning of artifacts and the relationships the artifacts have to the domain space. OGEP leverages existing lexical to ontological mappings in the form of WordNet, and Suggested Upper Merged Ontology (SUMO) integrated with a semantic pattern-based structure referred to as the Semantic Grounding Mechanism (SGM) and implemented as a Corpus Reasoner. The OGEP processing is initiated by a Corpus Parser performing a lexical analysis of the corpus, reading in a document (or corpus) and preparing it for processing by annotating words and phrases. After the Corpus Parser is done, the Corpus Reasoner uses the parts of speech output to determine the semantic meaning of a word or phrase. The Corpus Reasoner is the crux of the OGEP system, analyzing, extrapolating, and evolving data from free text into cohesive semantic relationships. The Semantic Grounding Mechanism provides a basis for identifying and mapping semantic relationships. By blending together the WordNet lexicon and SUMO ontological layout, the SGM is given breadth and depth in its ability to extrapolate semantic relationships between domain entities. The combination of all these components results in an innovative approach to user assisted semantic-based ontology generation. This paper will describe the OGEP technology in the context of the architectural
Effects of an Automated Maintenance Management System on organizational communication

International Nuclear Information System (INIS)

Bauman, M.B.; VanCott, H.P.

1988-01-01

The primary purpose of the project was to evaluate the effectiveness of two techniques for improving organizational communication: (1) an Automated Maintenance Management System (AMMS) and (2) Interdepartmental Coordination Meetings. Additional objectives concerned the preparation of functional requirements for an AMMS, and training modules to improve group communication skills. Four nuclear power plants participated in the evaluation. Two plants installed AMMSs, one plant instituted interdepartmental job coordination meetings, and the fourth plant served as a control for the evaluation. Questionnaires and interviews were used to collect evaluative data. The evaluation focused on five communication or information criteria: timeliness, redundancy, withholding or gatekeeping, feedback, and accuracy/amount
Transitioning Resolution Responsibility between the Controller and Automation Team in Simulated NextGen Separation Assurance

Science.gov (United States)

Cabrall, C.; Gomez, A.; Homola, J.; Hunt, S.; Martin, L.; Merccer, J.; Prevott, T.

2013-01-01

As part of an ongoing research effort on separation assurance and functional allocation in NextGen, a controller-in-the-loop study with ground-based automation was conducted at NASA Ames' Airspace Operations Laboratory in August 2012 to investigate the potential impact of introducing self-separating aircraft in progressively advanced NextGen timeframes. From this larger study, the current exploratory analysis of controller-automation interaction styles focuses on the last and most far-term time frame. Measurements were recorded that firstly verified the continued operational validity of this iteration of the ground-based functional allocation automation concept in forecast traffic densities up to 2x that of current day high altitude en-route sectors. Additionally, with greater levels of fully automated conflict detection and resolution as well as the introduction of intervention functionality, objective and subjective analyses showed a range of passive to active controller-automation interaction styles between the participants. Not only did the controllers work with the automation to meet their safety and capacity goals in the simulated future NextGen timeframe, they did so in different ways and with different attitudes of trust/use of the automation. Taken as a whole, the results showed that the prototyped controller-automation functional allocation framework was very flexible and successful overall.
Laboratory automation: a challenge for the 1990s.

Science.gov (United States)

Mordini, C

1994-01-01

There is tremendous pressure on industry and laboratories to develop increasingly complex products (for example catalysts, chiral chemicals, drugs and ceramics); conform to regulations; cope with increasingly severe competition; and meet steadily increasing costs. It is difficult, in this situation, to remain productive and competitive. It is vital to be equipped with, and be able to use appropriately, all the suitable methodologies and technologies. Working methods and personnel have to be appropriate. The future depends on three interdependent domains: automation in the broadest sense of the word, instrumentation and information systems. The easy work has already been done. Between 1984 and 1990, it was a question of going from nothing to something; now, it is necessary to increase and optimize. Therefore, the crucial question is now: 'how can we go quicker in experimentation and acquire more knowledge, while spending less money?' One solution is to use all the aspects of automation (robotics, instrumentation, data). Successful laboratory automation depends on: shortened time to market; improved efficiency/cost ratio; motivation/competence/expertise; communication; and knowledge acquisition. This paper examines some of the major technological areas of application.
Virtual commissioning of automated micro-optical assembly

Science.gov (United States)

Schlette, Christian; Losch, Daniel; Haag, Sebastian; Zontar, Daniel; Roßmann, Jürgen; Brecher, Christian

2015-02-01

In this contribution, we present a novel approach to enable virtual commissioning for process developers in micro-optical assembly. Our approach aims at supporting micro-optics experts to effectively develop assisted or fully automated assembly solutions without detailed prior experience in programming while at the same time enabling them to easily implement their own libraries of expert schemes and algorithms for handling optical components. Virtual commissioning is enabled by a 3D simulation and visualization system in which the functionalities and properties of automated systems are modeled, simulated and controlled based on multi-agent systems. For process development, our approach supports event-, state- and time-based visual programming techniques for the agents and allows for their kinematic motion simulation in combination with looped-in simulation results for the optical components. First results have been achieved for simply switching the agents to command the real hardware setup after successful process implementation and validation in the virtual environment. We evaluated and adapted our system to meet the requirements set by industrial partners: laser manufacturers as well as hardware suppliers of assembly platforms. The concept is applied to the automated assembly of optical components for optically pumped semiconductor lasers and positioning of optical components for beam-shaping
An Automatic Indicator of the Reusability of Learning Objects Based on Metadata That Satisfies Completeness Criteria

Science.gov (United States)

Sanz-Rodríguez, Javier; Margaritopoulos, Merkourios; Margaritopoulos, Thomas; Dodero, Juan Manuel; Sánchez-Alonso, Salvador; Manitsaris, Athanasios

The search for learning objects in open repositories is currently a tedious task, owing to the vast amount of resources available and the fact that most of them do not have associated ratings to help users make a choice. In order to tackle this problem, we propose a reusability indicator, which can be calculated automatically using the metadata that describes the objects, allowing us to select those materials most likely to be reused. In order for this reusability indicator to be applied, metadata records must reach a certain amount of completeness, guaranteeing that the material is adequately described. This reusability indicator is tested in two studies on the Merlot and eLera repositories, and results obtained offer evidence to support their effectiveness.
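The completeness requirement mentioned in this abstract can be sketched as a weighted share of filled metadata fields. This is only an illustration under stated assumptions: the field names, the weights, and the 0.5 threshold are hypothetical and are not taken from the paper, which does not give its formula here.

```python
def is_filled(value):
    """Treat None, empty strings and empty containers as missing."""
    return value is not None and value != "" and value != [] and value != {}


def completeness(record, weights):
    """Weighted share of metadata fields that carry a non-empty value."""
    total = sum(weights.values())
    filled = sum(w for field, w in weights.items() if is_filled(record.get(field)))
    return filled / total if total else 0.0


# Hypothetical field weights and record, for illustration only.
weights = {"title": 3, "description": 2, "keywords": 1}
record = {"title": "Intro to fractions", "description": "", "keywords": ["math"]}

score = completeness(record, weights)
# Hypothetical threshold: only sufficiently described records are scored further.
print(round(score, 3), score >= 0.5)
```

A record that fails the threshold would simply be excluded from the reusability calculation rather than given a low score, since its metadata says too little to judge it.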
A Geospatial Data Recommender System based on Metadata and User Behaviour

Science.gov (United States)

Li, Y.; Jiang, Y.; Yang, C. P.; Armstrong, E. M.; Huang, T.; Moroni, D. F.; Finch, C. J.; McGibbney, L. J.

2017-12-01

Earth observations are produced at high velocity through real-time sensors, reaching terabytes to petabytes of geospatial data daily. Discovering and accessing the right data in this massive volume is like finding a needle in a haystack. To help researchers find the right data for study and decision support, a considerable body of research on improving search performance has been proposed, including recommendation algorithms. However, few papers have discussed how to implement a recommendation algorithm in a geospatial data retrieval system. To address this problem, we propose a recommendation engine that improves the discovery of relevant geospatial data by mining and utilizing metadata and user behavior data: 1) metadata-based recommendation considers the correlation of each attribute (i.e., spatiotemporal, categorical, and ordinal) to the data to be found. In particular, a phrase extraction method is used to improve the accuracy of the description similarity; 2) user behavior data are utilized to predict the interest of a user through collaborative filtering; 3) an integration method is designed to combine the results of the above two methods to achieve better recommendations. Experiments show that in the hybrid recommendation list, all precisions are larger than 0.8 from position 1 to 10.
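The evaluation quoted at the end of this abstract refers to precision at each list position. A minimal sketch of that metric follows, together with one possible way to merge the two score sources. The weighted sum with `alpha` is an assumption of this sketch, since the abstract does not say how the integration method works, and the data set names and scores are invented.

```python
def hybrid_scores(metadata_sim, behaviour_sim, alpha=0.5):
    """Blend two score dictionaries with a weighted sum (assumed, not from the paper)."""
    items = set(metadata_sim) | set(behaviour_sim)
    return {
        item: alpha * metadata_sim.get(item, 0.0)
        + (1 - alpha) * behaviour_sim.get(item, 0.0)
        for item in items
    }


def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommended items that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / k


# Hypothetical similarity scores for four data sets.
metadata_sim = {"sst": 0.9, "wind": 0.4, "salinity": 0.2}
behaviour_sim = {"sst": 0.5, "wind": 0.8, "chlorophyll": 0.6}

scores = hybrid_scores(metadata_sim, behaviour_sim)
# Sort by descending score, breaking ties by name for a stable order.
ranked = sorted(scores, key=lambda item: (-scores[item], item))
relevant = {"sst", "wind", "salinity"}

for k in range(1, len(ranked) + 1):
    print(k, round(precision_at_k(ranked, relevant, k), 3))
```

Reporting the metric for every position from 1 to 10, as the abstract does, shows whether relevant items are concentrated at the top of the list or spread through it.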
A data automation system at Los Alamos National Laboratory

International Nuclear Information System (INIS)

Betts, S.E.; Schneider, C.M.; Pickrell, M.M.

2001-01-01

Idaho National Engineering and Environmental Laboratory (INEEL) has developed an automated computer program, Data Review Expert System (DRXS), for reviewing nondestructive assay (NDA) data. DRXS significantly reduces the data review time needed to meet characterization requirements for the Waste Isolation Pilot Plant (WIPP). Los Alamos National Laboratory (LANL) is in the process of developing a computer program, Software System Logic for Intelligent Certification (SSLIC), to automate other tasks associated with characterization of Transuranic Waste (TRU) samples. LANL has incorporated a version of DRXS specific to LANL's isotopic data into SSLIC. This version of SSLIC was audited by the National Transuranic Program on October 24, 2001. This paper will present the results of the audit, and discuss future plans for SSLIC including the integration of the INEEL/LANL-developed Rule Editor.
Remote handling and automation in back end of fuel cycle

International Nuclear Information System (INIS)

Nair, K.N.S.

2010-01-01

Full text: Indian nuclear programme is readying for a quantum leap and it is essential that technology is available for building advanced fuel recycle plants in the back end and for sustained operation of such plants. Remote technology and automation plays a big role to achieve this goal. With the introduction of advanced fuel cycles in indigenous programme and scenario of international cooperation it is essential to be ready with indigenous technology for meeting all challenges. Work has been progressing to develop locally support technology for remote handling and automation with good success. Essential RH tools such as master slave manipulators, power manipulators and hot cell viewing systems have been developed and commercial production has been established. Customised RH requirements for back end plants have been met and the designs have proven to be worthy for hot operations over the years. In the last few years stress has been on development of equipment and technology to meet the increasing demands of higher throughput plants. Substantial progress has been achieved in the head end and reconversion laboratory systems of reprocessing plants. Similarly successful efforts have also been made for establishing Thoria processing cells and also the RH in the reconversion operations. Custom designed equipment has been developed for decommissioning of ceramic melter, used glove boxes etc. Efforts are on hand to develop automated RH equipment for material handling in underground repositories. This paper aims at bringing out the theme based on some of our own experiences and some reports from plants in operation abroad. (author)
An open framework for automated chemical hazard assessment based on GreenScreen for Safer Chemicals: A proof of concept.

Science.gov (United States)

Wehage, Kristopher; Chenhansa, Panan; Schoenung, Julie M

2017-01-01

GreenScreen® for Safer Chemicals is a framework for comparative chemical hazard assessment. It is the first transparent, open and publicly accessible framework of its kind, allowing manufacturers and governmental agencies to make informed decisions about the chemicals and substances used in consumer products and buildings. In the GreenScreen® benchmarking process, chemical hazards are assessed and classified based on 18 hazard endpoints from up to 30 different sources. The result is a simple numerical benchmark score and accompanying assessment report that allows users to flag chemicals of concern and identify safer alternatives. Although the screening process is straightforward, aggregating and sorting hazard data is tedious, time-consuming, and prone to human error. In light of these challenges, the present work demonstrates the usage of automation to cull chemical hazard data from publicly available internet resources, assign metadata, and perform a GreenScreen® hazard assessment using the GreenScreen® "List Translator." The automated technique, written as a module in the Python programming language, generates GreenScreen® List Translation data for over 3000 chemicals in approximately 30 s. Discussion of the potential benefits and limitations of automated techniques is provided. By embedding the library into a web-based graphical user interface, the extensibility of the library is demonstrated. The accompanying source code is made available to the hazard assessment community. Integr Environ Assess Manag 2017;13:167-176. © 2016 SETAC.
Advancing haemostasis automation--successful implementation of robotic centrifugation and sample processing in a tertiary service hospital.

Science.gov (United States)

Sédille-Mostafaie, Nazanin; Engler, Hanna; Lutz, Susanne; Korte, Wolfgang

2013-06-01

Laboratories today face increasing pressure to automate operations due to increasing workloads and the need to reduce expenditure. Few studies to date have focussed on the laboratory automation of preanalytical coagulation specimen processing. In the present study, we examined whether a clinical chemistry automation protocol meets the preanalytical requirements for the analyses of coagulation. During the implementation of laboratory automation, we began to operate a pre- and postanalytical automation system. The preanalytical unit processes blood specimens for chemistry, immunology and coagulation by automated specimen processing. As the production of platelet-poor plasma is highly dependent on optimal centrifugation, we examined specimen handling under different centrifugation conditions in order to produce optimal platelet deficient plasma specimens. To this end, manually processed models centrifuged at 1500 g for 5 and 20 min were compared to an automated centrifugation model at 3000 g for 7 min. For analytical assays that are performed frequently enough to be targets for full automation, Passing-Bablok regression analysis showed close agreement between different centrifugation methods, with a correlation coefficient between 0.98 and 0.99 and a bias between -5% and +6%. For seldom performed assays that do not mandate full automation, the Passing-Bablok regression analysis showed acceptable to poor agreement between different centrifugation methods. A full automation solution is suitable and can be recommended for frequent haemostasis testing.
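The Passing-Bablok comparison mentioned in this record can be illustrated with a minimal sketch. It is not the authors' analysis: it is a simplified, point-estimate-only version of the method (no confidence intervals), it skips pairs with identical x values where the full method assigns infinite slopes, and the function name `passing_bablok` is chosen here for illustration.

```python
from itertools import combinations
from statistics import median


def passing_bablok(x, y):
    """Return (slope, intercept) of a simplified Passing-Bablok fit.

    x and y are paired measurements of the same samples by two methods.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0:
            # Simplification: the full method assigns +/- infinity here.
            continue
        s = dy / dx
        if s != -1:  # slopes of exactly -1 are discarded by the method
            slopes.append(s)
    if not slopes:
        raise ValueError("not enough distinct points to estimate a slope")

    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if k > n // 2:
        raise ValueError("too many slopes below -1; methods look anti-correlated")

    if n % 2:
        slope = slopes[(n - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])

    # The intercept is the median residual at the estimated slope.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept
```

Called with the results of one assay under manual and automated centrifugation, a slope near 1 and an intercept near 0 would indicate close agreement between the two methods.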
Using microwave Doppler radar in automated manufacturing applications

Science.gov (United States)

Smith, Gregory C.

Since the beginning of the Industrial Revolution, manufacturers worldwide have used automation to improve productivity, gain market share, and meet growing or changing consumer demand for manufactured products. To stimulate further industrial productivity, manufacturers need more advanced automation technologies: "smart" part handling systems, automated assembly machines, CNC machine tools, and industrial robots that use new sensor technologies, advanced control systems, and intelligent decision-making algorithms to "see," "hear," "feel," and "think" at the levels needed to handle complex manufacturing tasks without human intervention. The investigator's dissertation offers three methods that could help make "smart" CNC machine tools and industrial robots possible: (1) A method for detecting acoustic emission using a microwave Doppler radar detector, (2) A method for detecting tool wear on a CNC lathe using a Doppler radar detector, and (3) An online non-contact method for detecting industrial robot position errors using a microwave Doppler radar motion detector. The dissertation studies indicate that microwave Doppler radar could be quite useful in automated manufacturing applications. In particular, the methods developed may help solve two difficult problems that hinder further progress in automating manufacturing processes: (1) Automating metal-cutting operations on CNC machine tools by providing a reliable non-contact method for detecting tool wear, and (2) Fully automating robotic manufacturing tasks by providing a reliable low-cost non-contact method for detecting on-line position errors. In addition, the studies offer a general non-contact method for detecting acoustic emission that may be useful in many other manufacturing and non-manufacturing areas, as well (e.g., monitoring and nondestructively testing structures, materials, manufacturing processes, and devices). By advancing the state of the art in manufacturing automation, the studies may help
Constructing a Cross-Domain Resource Inventory: Key Components and Results of the EarthCube CINERGI Project.

Science.gov (United States)

Zaslavsky, I.; Richard, S. M.; Malik, T.; Hsu, L.; Gupta, A.; Grethe, J. S.; Valentine, D. W., Jr.; Lehnert, K. A.; Bermudez, L. E.; Ozyurt, I. B.; Whitenack, T.; Schachne, A.; Giliarini, A.

2015-12-01

While many geoscience-related repositories and data discovery portals exist, finding information about available resources remains a pervasive problem, especially when searching across multiple domains and catalogs. Inconsistent and incomplete metadata descriptions, disparate access protocols and semantic differences across domains, and troves of unstructured or poorly structured information which is hard to discover and use are major hindrances toward discovery, while metadata compilation and curation remain manual and time-consuming. We report on methodology, main results and lessons learned from an ongoing effort to develop a geoscience-wide catalog of information resources, with consistent metadata descriptions, traceable provenance, and automated metadata enhancement. Developing such a catalog is the central goal of CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability), an EarthCube building block project (earthcube.org/group/cinergi). The key novel technical contributions of the projects include: a) development of a metadata enhancement pipeline and a set of document enhancers to automatically improve various aspects of metadata descriptions, including keyword assignment and definition of spatial extents; b) Community Resource Viewers: online applications for crowdsourcing community resource registry development, curation and search, and channeling metadata to the unified CINERGI inventory, c) metadata provenance, validation and annotation services, d) user interfaces for advanced resource discovery; and e) geoscience-wide ontology and machine learning to support automated semantic tagging and faceted search across domains. We demonstrate these CINERGI components in three types of user scenarios: (1) improving existing metadata descriptions maintained by government and academic data facilities, (2) supporting work of several EarthCube Research Coordination Network projects in assembling information resources for their domains
Initial Assessment and Modeling Framework Development for Automated Mobility Districts: Preprint

Energy Technology Data Exchange (ETDEWEB)

Hou, Yi [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Young, Stanley E [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Garikapati, Venu [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Chen, Yuche [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Zhu, Lei [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

2018-02-07

Automated vehicles (AVs) are increasingly being discussed as the basis for on-demand mobility services, introducing a new paradigm in which a fleet of AVs displaces private automobiles for day-to-day travel in dense activity districts. This paper examines a concept to displace privately owned automobiles within a region containing dense activity generators (jobs, retail, entertainment, etc.), referred to as an automated mobility district (AMD). This paper reviews several such districts, including airports, college campuses, business parks, downtown urban cores, and military bases, with examples of previous attempts to meet the mobility needs apart from private automobiles, some with automated technology and others with more traditional transit-based solutions. The issues and benefits of AMDs are framed within the perspective of intra-district, inter-district, and border issues, and the requirements for a modeling framework are identified to adequately reflect the breadth of mobility, energy, and emissions impact anticipated with AMDs
Automated damage test facilities for materials development and production optic quality assurance at Lawrence Livermore National Laboratory

International Nuclear Information System (INIS)

Battersby, C.; Dickson, R.; Jennings, R.; Kimmons, J.; Kozlowski, M. R.; Maricle, S.; Mouser, R.; Runkel, M.; Schwartz, S.; Sheehan, L. M.; Weinzapfel, C.

1998-01-01

The Laser Program at LLNL has developed automated facilities for damage testing optics up to 1 meter in diameter. The systems were developed to characterize the statistical distribution of localized damage performance across large-aperture National Ignition Facility optics. Full aperture testing is a key component of the quality assurance program for several of the optical components. The primary damage testing methods used are R:1 mapping and raster scanning. Automation of these test methods was required to meet the optics manufacturing schedule. The automated activities include control and diagnosis of the damage-test laser beam as well as detection and characterization of damage events
Automated transport and sorting system in a large reference laboratory: part 1. Evaluation of needs and alternatives and development of a plan.

Science.gov (United States)

Hawker, Charles D; Garr, Susan B; Hamilton, Leslie T; Penrose, John R; Ashwood, Edward R; Weiss, Ronald L

2002-10-01

Our laboratory, a large, commercial, esoteric reference laboratory, sought some form of total laboratory automation to keep pace with rapid growth of specimen volumes as well as to meet competitive demands for cost reduction and improved turnaround time. We conducted a systematic evaluation of our needs, which led to the development of a plan to implement an automated transport and sorting system. We systematically analyzed and studied our specimen containers, test submission requirements and temperatures, and the workflow and movement of people, specimens, and information throughout the laboratory. We performed an intricate timing study that identified bottlenecks in our manual handling processes. We also evaluated various automation options. The automation alternative viewed to best meet our needs was a transport and sorting system from MDS AutoLab. Our comprehensive plan also included a new standardized transport tube; a centralized automated core laboratory for higher volume tests; a new "automation-friendly" software system for order entry, tracking, and process control; a complete reengineering of our order-entry, handling, and tracking processes; and remodeling of our laboratory facility and specimen processing area. The scope of this project and its potential impact on overall laboratory operations and performance justified the extensive time we invested (nearly 4 years) in a systematic approach to the evaluation, design, and planning of this project.
Event metadata records as a testbed for scalable data mining

International Nuclear Information System (INIS)

Gemmeren, P van; Malon, D

2010-01-01

At a data rate of 200 hertz, event metadata records ('TAGs,' in ATLAS parlance) provide fertile grounds for development and evaluation of tools for scalable data mining. It is easy, of course, to apply HEP-specific selection or classification rules to event records and to label such an exercise 'data mining,' but our interest is different. Advanced statistical methods and tools such as classification, association rule mining, and cluster analysis are common outside the high energy physics community. These tools can prove useful, not for discovery physics, but for learning about our data, our detector, and our software. A fixed and relatively simple schema makes TAG export to other storage technologies such as HDF5 straightforward. This simplifies the task of exploiting very-large-scale parallel platforms such as Argonne National Laboratory's BlueGene/P, currently the largest supercomputer in the world for open science, in the development of scalable tools for data mining. Using a domain-neutral scientific data format may also enable us to take advantage of existing data mining components from other communities. There is, further, a substantial literature on the topic of one-pass algorithms and stream mining techniques, and such tools may be inserted naturally at various points in the event data processing and distribution chain. This paper describes early experience with event metadata records from ATLAS simulation and commissioning as a testbed for scalable data mining tool development and evaluation.
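As an illustration of the one-pass algorithms mentioned in this record, the sketch below uses Welford's method to keep a running mean and variance of one numeric attribute while records stream past, without storing them. It is not taken from the paper; the class name `RunningStats` and the idea of feeding it a per-event attribute are assumptions made for the example.

```python
class RunningStats:
    """One-pass (streaming) mean and sample variance using Welford's method."""

    def __init__(self):
        self.n = 0          # number of values seen so far
        self.mean = 0.0     # running mean
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Sample variance of the values seen so far (0.0 if fewer than 2)."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0
```

Each record is touched exactly once and memory use is constant, which is what makes this kind of algorithm suitable for insertion at a point in a processing chain where data cannot be revisited.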
Social Web Content Enhancement in a Distance Learning Environment: Intelligent Metadata Generation for Resources

Science.gov (United States)

García-Floriano, Andrés; Ferreira-Santiago, Angel; Yáñez-Márquez, Cornelio; Camacho-Nieto, Oscar; Aldape-Pérez, Mario; Villuendas-Rey, Yenny

2017-01-01

Social networking potentially offers improved distance learning environments by enabling the exchange of resources between learners. The existence of properly classified content results in an enhanced distance learning experience in which appropriate materials can be retrieved efficiently; however, for this to happen, metadata needs to be present.…
Simplifying the Reuse and Interoperability of Geoscience Data Sets and Models with Semantic Metadata that is Human-Readable and Machine-actionable

Science.gov (United States)

Peckham, S. D.

2017-12-01

Standardized, deep descriptions of digital resources (e.g. data sets, computational models, software tools and publications) make it possible to develop user-friendly software systems that assist scientists with the discovery and appropriate use of these resources. Semantic metadata makes it possible for machines to take actions on behalf of humans, such as automatically identifying the resources needed to solve a given problem, retrieving them and then automatically connecting them (despite their heterogeneity) into a functioning workflow. Standardized model metadata also helps model users to understand the important details that underpin computational models and to compare the capabilities of different models. These details include simplifying assumptions on the physics, governing equations and the numerical methods used to solve them, discretization of space (the grid) and time (the time-stepping scheme), state variables (input or output), and model configuration parameters. This kind of metadata provides a "deep description" of a computational model that goes well beyond other types of metadata (e.g. author, purpose, scientific domain, programming language, digital rights, provenance, execution) and captures the science that underpins a model. A carefully constructed, unambiguous and rules-based schema that addresses this problem, called the Geoscience Standard Names ontology, will be presented; it utilizes Semantic Web best practices and technologies. It has also been designed to work across science domains and to be readable by both humans and machines.

A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.

Science.gov (United States)

Kothari, Cartik R; Payne, Philip R O

2015-01-01

In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.
A standard analysis method (SAM) for the automated analysis of polychlorinated biphenyls (PCBs) in soils using the chemical analysis automation (CAA) paradigm: validation and performance

International Nuclear Information System (INIS)

Rzeszutko, C.; Johnson, C.R.; Monagle, M.; Klatt, L.N.

1997-10-01

The Chemical Analysis Automation (CAA) program is developing a standardized modular automation strategy for chemical analysis. In this automation concept, analytical chemistry is performed with modular building blocks that correspond to individual elements of the steps in the analytical process. With a standardized set of behaviors and interactions, these blocks can be assembled in a 'plug and play' manner into a complete analysis system. These building blocks, which are referred to as Standard Laboratory Modules (SLMs), interface to a host control system that orchestrates the entire analytical process, from sample preparation through data interpretation. The integrated system is called a Standard Analysis Method (SAM). A SAM for the automated determination of polychlorinated biphenyls (PCBs) in soils, assembled in a mobile laboratory, is undergoing extensive testing and validation. The SAM consists of the following SLMs: a four channel Soxhlet extractor, a High Volume Concentrator, column clean up, a gas chromatograph, a PCB data interpretation module, a robot, and a human-computer interface. The SAM is configured to meet the requirements specified in U.S. Environmental Protection Agency's (EPA) SW-846 Methods 3541/3620A/8082 for the analysis of PCBs in soils. The PCB SAM will be described along with the developmental test plan. Performance data obtained during developmental testing will also be discussed
Low cost automation

International Nuclear Information System (INIS)

1987-03-01

This book covers how to build an automation plan; the design of automation facilities; automation of the chip (cutting) process, including the basics of cutting, NC processing machines and chip handling; automation units such as drilling, tapping, boring, milling and slide units; the characteristics of oil pressure and its application in basic oil pressure circuits; applications of pneumatics; and the kinds of automation and their application to processing, assembly, transportation, automatic machines and factory automation.
Proceedings of the distribution automation seminar. CD-ROM ed.

International Nuclear Information System (INIS)

2003-01-01

Electric utilities are being driven to improve the utilization of their distribution system assets while reducing life cycle costs. This seminar provided an opportunity for electric utilities to share their experience and knowledge about the constantly evolving technologies that apply to distributed automation. Customers and their representatives place increased priority on regulatory commissions to achieve reliability and push the conventional use of distribution automation into rural areas. Various options are under consideration by managers to incorporate a variety of distributed generation resources. Several papers highlighted technical aspects as they relate to applications to meet the changing needs of utilities. The latest products and technologies in the field were on display. The seminar sessions included: business cases; utility experience and applications; utility experience and projects; and, technology and equipment. Eight presentations were indexed separately for inclusion in this database
Production and quality assurance automation in the Goddard Space Flight Center Flight Dynamics Facility

Science.gov (United States)

Chapman, K. B.; Cox, C. M.; Thomas, C. W.; Cuevas, O. O.; Beckman, R. M.

1994-01-01

The Flight Dynamics Facility (FDF) at the NASA Goddard Space Flight Center (GSFC) generates numerous products for NASA-supported spacecraft, including the Tracking and Data Relay Satellites (TDRS's), the Hubble Space Telescope (HST), the Extreme Ultraviolet Explorer (EUVE), and the space shuttle. These products include orbit determination data, acquisition data, event scheduling data, and attitude data. In most cases, product generation involves repetitive execution of many programs. The increasing number of missions supported by the FDF has necessitated the use of automated systems to schedule, execute, and quality assure these products. This automation allows the delivery of accurate products in a timely and cost-efficient manner. To be effective, these systems must automate as many repetitive operations as possible and must be flexible enough to meet changing support requirements. The FDF Orbit Determination Task (ODT) has implemented several systems that automate product generation and quality assurance (QA). These systems include the Orbit Production Automation System (OPAS), the New Enhanced Operations Log (NEOLOG), and the Quality Assurance Automation Software (QA Tool). Implementation of these systems has resulted in a significant reduction in required manpower, elimination of shift work and most weekend support, and improved support quality, while incurring minimal development cost. This paper will present an overview of the concepts used and experiences gained from the implementation of these automation systems.
FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.

Science.gov (United States)

Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong

2015-11-01

Recently, the use of social media for health information exchange has been expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers have paid much attention to social media in medical education, patient-pharmacist communication, detection of adverse drug reactions, the impact of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to effectively extract useful knowledge from the huge amount of textual comments in social media. Therefore, this study proposes a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme that combines a Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rules (AR) discovery to extract knowledge from social media. In our FIR scheme, the Fuzzy ART network is first employed to segment comments. Next, for each customer segment, we use the LSI technique to retrieve important keywords. Then, to make the extracted keywords understandable, association rule mining is used to organize them into metadata. These extracted voices of customers are then transformed into design needs by using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques, which acquire too many keywords to get key points, our FIR scheme can extract understandable metadata from social media.
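The association-rule step described in this record can be illustrated with a minimal sketch. It is not the authors' FIR implementation: it only computes support and confidence for keyword pairs, assumes the keywords per comment have already been extracted, and uses a hypothetical function name and thresholds.

```python
from collections import Counter
from itertools import combinations


def keyword_rules(comments, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    comments is a list of keyword lists, one list per comment.
    """
    if not comments:
        return []
    n = len(comments)
    item_count = Counter()
    pair_count = Counter()
    for keywords in comments:
        unique = sorted(set(keywords))
        item_count.update(unique)
        pair_count.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of comments containing both keywords
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence: how often the consequent appears given the antecedent
            confidence = count / item_count[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)
```

A rule such as "battery implies price" with high confidence groups two otherwise isolated keywords into a more readable unit, which is the sense in which rule mining can make extracted keywords understandable.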
Radiological dose and metadata management; Radiologisches Dosis- und Metadatenmanagement

Energy Technology Data Exchange (ETDEWEB)

Walz, M.; Madsack, B. [TUeV SUeD Life Service GmbH, Aerztliche Stelle fuer Qualitaetssicherung in der Radiologie, Nuklearmedizin und Strahlentherapie Hessen, Frankfurt (Germany); Kolodziej, M. [INFINITT Europe GmbH, Frankfurt/M (Germany)

2016-12-15

This article describes the features of management systems currently available in Germany for extraction, registration and evaluation of metadata from radiological examinations, particularly in the Digital Imaging and Communications in Medicine (DICOM) environment. In addition, the probable relevant developments in this area concerning radiation protection legislation, terminology, standardization and information technology are presented. (orig.)
Improve data integration performance by employing metadata management utility

Energy Technology Data Exchange (ETDEWEB)

Wei, M.; Sung, A.H. [New Mexico Petroleum Recovery Research Center, Socorro, NM (United States)

2005-07-01

This conference paper explored ways of integrating petroleum and exploration data obtained from different sources in order to provide more comprehensive data for various analysis purposes and to improve the integrity and consistency of the integrated data. The paper proposes a methodology to enhance data integration performance in the oil and gas industry by having the data management utilities in the Microsoft SQL Server database management system (DBMS) work together, for small scale data integration without the support of commercial software. By semi-automatically capturing metadata, data sources are investigated in detail, data quality problems are partially cleansed, and the performance of data integration is improved. 20 refs., 7 tabs., 1 fig.
[Automated measurement of distance vision based on the DIN strategy].

Science.gov (United States)

Effert, R; Steinmetz, H; Jansen, W; Rau, G; Reim, M

1989-07-01

A method for automated measurement of far vision is described which meets the test requirements laid down in the new DIN standards. The subject sits 5 m from a high-resolution monitor on which either Landolt rings or Snellen's types are generated by a computer. By moving a joystick the subject indicates to the computer whether he can see the critical detail (e.g., the direction of opening of the Landolt ring). Depending on the subject's input and the course of the test so far, the computer generates the next test symbol until the threshold criterion is reached. The sequence of presentation of the symbols and the threshold criterion are also in accordance with the DIN standard. Initial measurements of far vision using this automated system produced similar results to those obtained by conventional methods.
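To show how a computer could size the Landolt rings for the 5 m viewing distance described in this record, the sketch below applies the standard optotype geometry: the gap subtends 1 arcminute at decimal acuity 1.0, and the outer diameter is five times the gap. The DIN presentation sequence and threshold criterion are not given in the abstract and are not modelled; the function name is hypothetical.

```python
import math


def landolt_ring_size_mm(decimal_acuity, distance_m=5.0):
    """Return (gap_mm, outer_diameter_mm) of a Landolt ring.

    The gap subtends 1 / decimal_acuity arcminutes at the viewing distance;
    the ring's outer diameter is five times the gap width.
    """
    if decimal_acuity <= 0 or distance_m <= 0:
        raise ValueError("acuity and distance must be positive")
    gap_arcmin = 1.0 / decimal_acuity
    gap_mm = distance_m * 1000.0 * math.tan(math.radians(gap_arcmin / 60.0))
    return gap_mm, 5.0 * gap_mm
```

At 5 m and acuity 1.0 the gap comes out at about 1.45 mm, which gives a sense of why the abstract calls for a high-resolution monitor.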
Exploring metadata standards for competence descriptions in the business & management domain

OpenAIRE

Stergioulas, L

2011-01-01

This paper explores the development and use of competency metadata standards. A number of standards have recently emerged to address the challenge of representing competencies, and there is a rising need for a common methodology, as well as methods and tools for developing, reusing, adapting and integrating such standards; this research is therefore becoming important and timely. We explore this within the context of the OpenScout project, which is building a federation of reposi...
Autonomy and Automation

Science.gov (United States)

Shively, Jay

2017-01-01

A significant level of debate and confusion has surrounded the meaning of the terms autonomy and automation. Automation is a multi-dimensional concept, and we propose that Remotely Piloted Aircraft Systems (RPAS) automation should be described with reference to the specific system and task that has been automated, the context in which the automation functions, and other relevant dimensions. In this paper, we present definitions of automation, pilot in the loop, pilot on the loop and pilot out of the loop. We further propose that in future, the International Civil Aviation Organization (ICAO) RPAS Panel avoids the use of the terms autonomy and autonomous when referring to automated systems on board RPA. Work Group 7 proposes to develop, in consultation with other workgroups, a taxonomy of Levels of Automation for RPAS.
Hyper Text Mark-up Language and Dublin Core metadata element set usage in websites of Iranian State Universities’ libraries

Science.gov (United States)

Zare-Farashbandi, Firoozeh; Ramezan-Shirazi, Mahtab; Ashrafi-Rizi, Hasan; Nouri, Rasool

2014-01-01

Introduction: Recent progress in providing innovative solutions for the organization of electronic resources, and research in this area, shows a global trend toward the use of new strategies such as metadata to facilitate the description, location, organization and retrieval of resources in the web environment. In this context, library metadata standards have a special place; therefore, the purpose of the present study was a comparative study of the Central Libraries' Websites of Iranian State Universities for Hyper Text Mark-up Language (HTML) and Dublin Core metadata element usage in 2011. Materials and Methods: The method of this study is applied-descriptive, and the data collection tool is a set of checklists created by the researchers. The statistical population includes 98 websites of the Iranian State Universities of the Ministry of Health and Medical Education and the Ministry of Science, Research and Technology, and the method of sampling is the census. Information was collected through observation and direct visits to websites, and data analysis was performed using Microsoft Excel software, 2011. Results: The results of this study indicate that none of the websites use Dublin Core (DC) metadata and that only a few of them use elements that overlap between HTML meta tags and Dublin Core (DC) elements. The percentage of overlapping DC elements in the Ministry of Health websites was 56% for both description and keywords and, in the Ministry of Science websites, 45% for keywords and 39% for description. However, HTML meta tags have a moderate presence in both Ministries: the most-used elements were keywords and description (56%) and the least-used elements were date and formatter (0%). Conclusion: It was observed that the Ministry of Health and the Ministry of Science follow the same path for using the Dublin Core standard on their websites in the future. Because Central Library Websites are an example of scientific web pages, special attention in designing them can help the researchers
Hyper Text Mark-up Language and Dublin Core metadata element set usage in websites of Iranian State Universities' libraries.

Science.gov (United States)

Zare-Farashbandi, Firoozeh; Ramezan-Shirazi, Mahtab; Ashrafi-Rizi, Hasan; Nouri, Rasool

2014-01-01

Recent progress in providing innovative solutions for the organization of electronic resources, and research in this area, shows a global trend toward the use of new strategies such as metadata to facilitate the description, location, organization and retrieval of resources in the web environment. In this context, library metadata standards have a special place; therefore, the purpose of the present study was a comparative study of the Central Libraries' Websites of Iranian State Universities for Hyper Text Mark-up Language (HTML) and Dublin Core metadata element usage in 2011. The method of this study is applied-descriptive, and the data collection tool is a set of checklists created by the researchers. The statistical population includes 98 websites of the Iranian State Universities of the Ministry of Health and Medical Education and the Ministry of Science, Research and Technology, and the method of sampling is the census. Information was collected through observation and direct visits to websites, and data analysis was performed using Microsoft Excel software, 2011. The results of this study indicate that none of the websites use Dublin Core (DC) metadata and that only a few of them use elements that overlap between HTML meta tags and Dublin Core (DC) elements. The percentage of overlapping DC elements in the Ministry of Health websites was 56% for both description and keywords and, in the Ministry of Science websites, 45% for keywords and 39% for description. However, HTML meta tags have a moderate presence in both Ministries: the most-used elements were keywords and description (56%) and the least-used elements were date and formatter (0%). It was observed that the Ministry of Health and the Ministry of Science follow the same path for using the Dublin Core standard on their websites in the future. Because Central Library Websites are an example of scientific web pages, special attention in designing them can help the researchers to achieve faster and more accurate information resources
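The kind of check this study performed by hand (which meta elements a page declares, and whether any are Dublin Core) can be sketched in code. This is not the researchers' checklist; it assumes the common convention that Dublin Core elements embedded in HTML use `name` values prefixed with `DC.` or `DCTERMS.`, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

DC_PREFIXES = ("dc.", "dcterms.")


class MetaTagCollector(HTMLParser):
    """Collect the lower-cased name attribute of every <meta> tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            name = dict(attrs).get("name")
            if name:  # skips <meta charset=...> and similar nameless tags
                self.names.append(name.strip().lower())


def classify_meta_tags(html):
    """Split a page's meta names into Dublin Core and plain HTML meta tags."""
    collector = MetaTagCollector()
    collector.feed(html)
    names = set(collector.names)
    dublin_core = sorted(n for n in names if n.startswith(DC_PREFIXES))
    html_meta = sorted(n for n in names if not n.startswith(DC_PREFIXES))
    return {"dublin_core": dublin_core, "html_meta": html_meta}
```

Run over the home page of each library website, the two lists would give the raw counts from which percentages like those reported above could be tabulated.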
LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics

DEFF Research Database (Denmark)

Hedegaard, Steffen; Houen, Søren; Simonsen, Jakob Grue

2009-01-01

We present LAIR: a domain-specific language that enables users to specify actions to be taken upon meeting specific semantic frames in a text, in particular to rephrase and redact the textual content. While LAIR presupposes superficial knowledge of frames and frame semantics, it requires only limited prior programming experience. It contains neither scripting nor I/O primitives, nor does it contain general loop constructions, and it is not Turing-complete. We have implemented a LAIR compiler and integrated it in a pipeline for automated redaction of web pages. We detail our experience with automated redaction of web pages for subjectively undesirable content; initial experiments suggest that using a small language based on semantic recognition of undesirable terms can be highly useful as a supplement to traditional methods of text sanitization.
EU Law and Mass Internet Metadata Surveillance in the Post-Snowden Era

Directory of Open Access Journals (Sweden)

Nora Ni Loideain

2015-09-01

Legal frameworks exist within democracies to prevent the misuse and abuse of personal data that law enforcement authorities obtain from private communication service providers. The fundamental rights to respect for private life and the protection of personal data underpin this framework within the European Union. Accordingly, the protection of the principles and safeguards required by these rights is key to ensuring that the oversight of State surveillance powers is robust and transparent. Furthermore, without the robust scrutiny of independent judicial review, the principles and safeguards guaranteed by these rights may become more illusory than real. Following the Edward Snowden revelations, major concerns have been raised worldwide regarding the legality, necessity and proportionality standards governing these laws. In 2014, the highest court in the EU struck down the legal framework that imposed a mandatory duty on communication service providers to undertake the mass retention of metadata for secret intelligence and law enforcement authorities across the EU. This article considers the influence of the Snowden revelations on this landmark judgment. Subsequently, the analysis explores the significance of this ruling for the future reform of EU law governing metadata surveillance and its contribution to the worldwide debate on indiscriminate and covert monitoring in the post-Snowden era.
A Sharable and Efficient Metadata Model for Heterogeneous Earth Observation Data Retrieval in Multi-Scale Flood Mapping

Directory of Open Access Journals (Sweden)

Nengcheng Chen

2015-07-01

Remote sensing plays an important role in flood mapping and is helping advance flood monitoring and management. Multi-scale flood mapping is necessary for dividing floods into several stages for comprehensive management. However, existing data systems are typically heterogeneous owing to the use of different access protocols and archiving metadata models. In this paper, we propose a sharable and efficient metadata model (APEOPM) for constructing an Earth observation (EO) data system to retrieve remote sensing data for flood mapping. The proposed model contains two sub-models, an access protocol model and an enhanced encoding model. The access protocol model helps unify heterogeneous access protocols and can achieve intelligent access via a semantic enhancement method. The enhanced encoding model helps unify heterogeneous archiving metadata models. Wuhan city, one of the most important cities in the Yangtze River Economic Belt in China, is selected as the study area for testing the retrieval of heterogeneous EO data and flood mapping. The torrential rain period from 25 March 2015 to 10 April 2015 is chosen as the temporal range of the study. To aid comprehensive management, mapping is conducted at different spatial and temporal scales. In addition, the efficiency of data retrieval is analyzed, and the flood maps are validated against actual precipitation. The results show that the flood maps coincide with the actual precipitation.
Automated Computer Access Request System

Science.gov (United States)

Snook, Bryan E.

2010-01-01

The Automated Computer Access Request (AutoCAR) system is a Web-based account provisioning application that replaces the time-consuming paper-based computer-access request process at Johnson Space Center (JSC). AutoCAR combines rules-based and role-based functionality in one application to provide a centralized system that is easily and widely accessible. The system features a work-flow engine that facilitates request routing, a user registration directory containing contact information and user metadata, an access request submission and tracking process, and a system administrator account management component. This provides full, end-to-end disposition approval chain accountability from the moment a request is submitted. By blending both rules-based and role-based functionality, AutoCAR has the flexibility to route requests based on a user's nationality, JSC affiliation status, and other export-control requirements, while ensuring a user's request is addressed by either a primary or backup approver. All user accounts that are tracked in AutoCAR are recorded and mapped to the native operating system schema on the target platform where user accounts reside. This allows for future extensibility for supporting creation, deletion, and account management directly on the target platforms by way of AutoCAR. The system's directory-based lookup and day-to-day change analysis of directory information determines personnel moves, deletions, and additions, and automatically notifies a user via e-mail to revalidate his/her account access as a result of such changes. AutoCAR is a Microsoft classic active server page (ASP) application hosted on a Microsoft Internet Information Server (IIS).
Metadata database and data analysis software for the ground-based upper atmospheric data developed by the IUGONET project

Science.gov (United States)

Hayashi, H.; Tanaka, Y.; Hori, T.; Koyama, Y.; Shinbori, A.; Abe, S.; Kagitani, M.; Kouno, T.; Yoshida, D.; Ueno, S.; Kaneda, N.; Yoneda, M.; Tadokoro, H.; Motoba, T.; Umemura, N.; Iugonet Project Team

2011-12-01

The Inter-university Upper atmosphere Global Observation NETwork (IUGONET) is a Japanese inter-university project by the National Institute of Polar Research (NIPR), Tohoku University, Nagoya University, Kyoto University, and Kyushu University to build a database of metadata for ground-based observations of the upper atmosphere. The IUGONET institutes/universities have been collecting various types of data with radars, magnetometers, photometers, radio telescopes, helioscopes, etc. at various locations all over the world and at various altitude layers from the Earth's surface to the Sun. The metadata database will be of great help to researchers in efficiently finding and obtaining these observational data spread over the institutes/universities. This should also facilitate synthetic analysis of multi-disciplinary data, which will lead to new types of research in the upper atmosphere. The project has also been developing software to help researchers download, visualize, and analyze the data provided by the IUGONET institutes/universities. The metadata database system is built on DSpace, an open source software platform for digital repositories. The data analysis software is written in the IDL language with the TDAS (THEMIS Data Analysis Software suite) library. These products have just been released for beta-testing.
INSPIRE: Managing Metadata in a Global Digital Library for High-Energy Physics

OpenAIRE

Martin Montull, Javier

2011-01-01

Four leading laboratories in the High-Energy Physics (HEP) field are collaborating to roll out the next-generation scientific information portal: INSPIRE. The goal of this project is to replace the popular 40-year-old SPIRES database. INSPIRE already provides access to about 1 million records and includes services such as fulltext search, automatic keyword assignment, ingestion and automatic display of LaTeX, citation analysis, automatic author disambiguation, metadata harvesting, extraction ...
Medical record automation at the Los Alamos Scientific Laboratory

International Nuclear Information System (INIS)

Hogle, G.O.; Grier, R.S.

1979-01-01

With the increase in population at the Los Alamos Scientific Laboratory and the growing concern over employee health, especially concerning the effects of the work environment, the Occupational Medicine Group decided to automate its medical record keeping system to meet these growing demands. With this computer system came not only the ability for long-term study of the work environment versus employee health, but other benefits such as more comprehensive records, increased legibility, reduced physician time, and better records management.
