WorldWideScience

Sample records for accessible scientific workflow

  1. Accelerating the scientific exploration process with scientific workflows

    International Nuclear Information System (INIS)

    Altintas, Ilkay; Barney, Oscar; Cheng, Zhengang; Critchlow, Terence; Ludaescher, Bertram; Parker, Steve; Shoshani, Arie; Vouk, Mladen

    2006-01-01

    Although an increasing amount of middleware has emerged in the last few years to achieve remote data access, distributed job execution, and data management, orchestrating these technologies with minimal overhead still remains a difficult task for scientists. Scientific workflow systems improve this situation by creating interfaces to a variety of technologies and automating the execution and monitoring of the workflows. Workflow systems provide domain-independent customizable interfaces and tools that combine different tools and technologies along with efficient methods for using them. As simulations and experiments move into the petascale regime, the orchestration of long running data and compute intensive tasks is becoming a major requirement for the successful steering and completion of scientific investigations. A scientific workflow is the process of combining data and processes into a configurable, structured set of steps that implement semi-automated computational solutions of a scientific problem. Kepler is a cross-project collaboration, co-founded by the SciDAC Scientific Data Management (SDM) Center, whose purpose is to develop a domain-independent scientific workflow system. It provides a workflow environment in which scientists design and execute scientific workflows by specifying the desired sequence of computational actions and the appropriate data flow, including required data transformations, between these steps. Currently deployed workflows range from local analytical pipelines to distributed, high-performance and high-throughput applications, which can be both data- and compute-intensive. The scientific workflow approach offers a number of advantages over traditional scripting-based approaches, including ease of configuration, improved reusability and maintenance of workflows and components (called actors), automated provenance management, 'smart' re-running of different versions of workflow instances, on-the-fly updateable parameters, monitoring

  2. Integrate Data into Scientific Workflows for Terrestrial Biosphere Model Evaluation through Brokers

    Science.gov (United States)

    Wei, Y.; Cook, R. B.; Du, F.; Dasgupta, A.; Poco, J.; Huntzinger, D. N.; Schwalm, C. R.; Boldrini, E.; Santoro, M.; Pearlman, J.; Pearlman, F.; Nativi, S.; Khalsa, S.

    2013-12-01

    Terrestrial biosphere models (TBMs) have become integral tools for extrapolating local observations and process-level understanding of land-atmosphere carbon exchange to larger regions. Model-model and model-observation intercomparisons are critical to understand the uncertainties within model outputs, to improve model skill, and to improve our understanding of land-atmosphere carbon exchange. The DataONE Exploration, Visualization, and Analysis (EVA) working group is evaluating TBMs using scientific workflows in UV-CDAT/VisTrails. This workflow-based approach promotes collaboration and improved tracking of evaluation provenance. But challenges still remain. The multi-scale and multi-discipline nature of TBMs makes it necessary to include diverse and distributed data resources in model evaluation. These include, among others, remote sensing data from NASA, flux tower observations from various organizations including DOE, and inventory data from US Forest Service. A key challenge is to make heterogeneous data from different organizations and disciplines discoverable and readily integrated for use in scientific workflows. This presentation introduces the brokering approach taken by the DataONE EVA to fill the gap between TBMs' evaluation scientific workflows and cross-organization and cross-discipline data resources. The DataONE EVA started the development of an Integrated Model Intercomparison Framework (IMIF) that leverages standards-based discovery and access brokers to dynamically discover, access, and transform (e.g. subset and resampling) diverse data products from DataONE, Earth System Grid (ESG), and other data repositories into a format that can be readily used by scientific workflows in UV-CDAT/VisTrails. The discovery and access brokers serve as an independent middleware that bridge existing data repositories and TBMs evaluation scientific workflows but introduce little overhead to either component. In the initial work, an OpenSearch-based discovery broker

  3. Widening the adoption of workflows to include human and human-machine scientific processes

    Science.gov (United States)

    Salayandia, L.; Pinheiro da Silva, P.; Gates, A. Q.

    2010-12-01

    Scientific workflows capture knowledge in the form of technical recipes to access and manipulate data that help scientists manage and reuse established expertise to conduct their work. Libraries of scientific workflows are being created in particular fields, e.g., Bioinformatics, where combined with cyber-infrastructure environments that provide on-demand access to data and tools, result in powerful workbenches for scientists of those communities. The focus in these particular fields, however, has been more on automating rather than documenting scientific processes. As a result, technical barriers have impeded a wider adoption of scientific workflows by scientific communities that do not rely as heavily on cyber-infrastructure and computing environments. Semantic Abstract Workflows (SAWs) are introduced to widen the applicability of workflows as a tool to document scientific recipes or processes. SAWs intend to capture a scientists’ perspective about the process of how she or he would collect, filter, curate, and manipulate data to create the artifacts that are relevant to her/his work. In contrast, scientific workflows describe the process from the point of view of how technical methods and tools are used to conduct the work. By focusing on a higher level of abstraction that is closer to a scientist’s understanding, SAWs effectively capture the controlled vocabularies that reflect a particular scientific community, as well as the types of datasets and methods used in a particular domain. From there on, SAWs provide the flexibility to adapt to different environments to carry out the recipes or processes. These environments range from manual fieldwork to highly technical cyber-infrastructure environments, i.e., such as those already supported by scientific workflows. Two cases, one from Environmental Science and another from Geophysics, are presented as illustrative examples.

  4. The PBase Scientific Workflow Provenance Repository

    Directory of Open Access Journals (Sweden)

    Víctor Cuevas-Vicenttín

    2014-10-01

    Full Text Available Scientific workflows and their supporting systems are becoming increasingly popular for compute-intensive and data-intensive scientific experiments. The advantages scientific workflows offer include rapid and easy workflow design, software and data reuse, scalable execution, sharing and collaboration, and other advantages that altogether facilitate “reproducible science”. In this context, provenance – information about the origin, context, derivation, ownership, or history of some artifact – plays a key role, since scientists are interested in examining and auditing the results of scientific experiments. However, in order to perform such analyses on scientific results as part of extended research collaborations, an adequate environment and tools are required. Concretely, the need arises for a repository that will facilitate the sharing of scientific workflows and their associated execution traces in an interoperable manner, also enabling querying and visualization. Furthermore, such functionality should be supported while taking performance and scalability into account. With this purpose in mind, we introduce PBase: a scientific workflow provenance repository implementing the ProvONE proposed standard, which extends the emerging W3C PROV standard for provenance data with workflow specific concepts. PBase is built on the Neo4j graph database, thus offering capabilities such as declarative and efficient querying. Our experiences demonstrate the power gained by supporting various types of queries for provenance data. In addition, PBase is equipped with a user friendly interface tailored for the visualization of scientific workflow provenance data, making the specification of queries and the interpretation of their results easier and more effective.

  5. Perti Net-Based Workflow Access Control Model

    Institute of Scientific and Technical Information of China (English)

    陈卓; 骆婷; 石磊; 洪帆

    2004-01-01

    Access control is an important protection mechanism for information systems. This paper shows how to make access control in workflow system. We give a workflow access control model (WACM) based on several current access control models. The model supports roles assignment and dynamic authorization. The paper defines the workflow using Petri net. It firstly gives the definition and description of the workflow, and then analyzes the architecture of the workflow access control model (WACM). Finally, an example of an e-commerce workflow access control model is discussed in detail.

  6. A method to mine workflows from provenance for assisting scientific workflow composition

    NARCIS (Netherlands)

    Zeng, R.; He, X.; Aalst, van der W.M.P.

    2011-01-01

    Scientific workflows have recently emerged as a new paradigm for representing and managing complex distributed scientific computations and are used to accelerate the pace of scientific discovery. In many disciplines, individual workflows are large and complicated due to the large quantities of data

  7. Similarity measures for scientific workflows

    OpenAIRE

    Starlinger, Johannes

    2016-01-01

    In Laufe der letzten zehn Jahre haben Scientific Workflows als Werkzeug zur Erstellung von reproduzierbaren, datenverarbeitenden in-silico Experimenten an Aufmerksamkeit gewonnen, in die sowohl lokale Skripte und Anwendungen, als auch Web-Services eingebunden werden können. Über spezialisierte Online-Bibliotheken, sogenannte Repositories, können solche Workflows veröffentlicht und wiederverwendet werden. Mit zunehmender Größe dieser Repositories werden Ähnlichkeitsmaße für Scientific Workfl...

  8. Scientific workflows as productivity tools for drug discovery.

    Science.gov (United States)

    Shon, John; Ohkawa, Hitomi; Hammer, Juergen

    2008-05-01

    Large pharmaceutical companies annually invest tens to hundreds of millions of US dollars in research informatics to support their early drug discovery processes. Traditionally, most of these investments are designed to increase the efficiency of drug discovery. The introduction of do-it-yourself scientific workflow platforms has enabled research informatics organizations to shift their efforts toward scientific innovation, ultimately resulting in a possible increase in return on their investments. Unlike the handling of most scientific data and application integration approaches, researchers apply scientific workflows to in silico experimentation and exploration, leading to scientific discoveries that lie beyond automation and integration. This review highlights some key requirements for scientific workflow environments in the pharmaceutical industry that are necessary for increasing research productivity. Examples of the application of scientific workflows in research and a summary of recent platform advances are also provided.

  9. Business and scientific workflows a web service-oriented approach

    CERN Document Server

    Tan, Wei

    2013-01-01

    Focuses on how to use web service computing and service-based workflow technologies to develop timely, effective workflows for both business and scientific fields Utilizing web computing and Service-Oriented Architecture (SOA), Business and Scientific Workflows: A Web Service-Oriented Approach focuses on how to design, analyze, and deploy web service-based workflows for both business and scientific applications in many areas of healthcare and biomedicine. It also discusses and presents the recent research and development results. This informative reference features app

  10. The Live Access Server Scientific Product Generation Through Workflow Orchestration

    Science.gov (United States)

    Hankin, S.; Calahan, J.; Li, J.; Manke, A.; O'Brien, K.; Schweitzer, R.

    2006-12-01

    The Live Access Server (LAS) is a well-established Web-application for display and analysis of geo-science data sets. The software, which can be downloaded and installed by anyone, gives data providers an easy way to establish services for their on-line data holdings, so their users can make plots; create and download data sub-sets; compare (difference) fields; and perform simple analyses. Now at version 7.0, LAS has been in operation since 1994. The current "Armstrong" release of LAS V7 consists of three components in a tiered architecture: user interface, workflow orchestration and Web Services. The LAS user interface (UI) communicates with the LAS Product Server via an XML protocol embedded in an HTTP "get" URL. Libraries (APIs) have been developed in Java, JavaScript and perl that can readily generate this URL. As a result of this flexibility it is common to find LAS user interfaces of radically different character, tailored to the nature of specific datasets or the mindset of specific users. When a request is received by the LAS Product Server (LPS -- the workflow orchestration component), business logic converts this request into a series of Web Service requests invoked via SOAP. These "back- end" Web services perform data access and generate products (visualizations, data subsets, analyses, etc.). LPS then packages these outputs into final products (typically HTML pages) via Jakarta Velocity templates for delivery to the end user. "Fine grained" data access is performed by back-end services that may utilize JDBC for data base access; the OPeNDAP "DAPPER" protocol; or (in principle) the OGC WFS protocol. Back-end visualization services are commonly legacy science applications wrapped in Java or Python (or perl) classes and deployed as Web Services accessible via SOAP. Ferret is the default visualization application used by LAS, though other applications such as Matlab, CDAT, and GrADS can also be used. Other back-end services may include generation of Google

  11. On the Support of Scientific Workflows over Pub/Sub Brokers

    Directory of Open Access Journals (Sweden)

    Edwin Cedeño

    2013-08-01

    Full Text Available The execution of scientific workflows is gaining importance as more computing resources are available in the form of grid environments. The Publish/Subscribe paradigm offers well-proven solutions for sustaining distributed scenarios while maintaining the high level of task decoupling required by scientific workflows. In this paper, we propose a new model for supporting scientific workflows that improves the dissemination of control events. The proposed solution is based on the mapping of workflow tasks to the underlying Pub/Sub event layer, and the definition of interfaces and procedures for execution on brokers. In this paper we also analyze the strengths and weaknesses of current solutions that are based on existing message exchange models for scientific workflows. Finally, we explain how our model improves the information dissemination, event filtering, task decoupling and the monitoring of scientific workflows.

  12. On the support of scientific workflows over Pub/Sub brokers.

    Science.gov (United States)

    Morales, Augusto; Robles, Tomas; Alcarria, Ramon; Cedeño, Edwin

    2013-08-20

    The execution of scientific workflows is gaining importance as more computing resources are available in the form of grid environments. The Publish/Subscribe paradigm offers well-proven solutions for sustaining distributed scenarios while maintaining the high level of task decoupling required by scientific workflows. In this paper, we propose a new model for supporting scientific workflows that improves the dissemination of control events. The proposed solution is based on the mapping of workflow tasks to the underlying Pub/Sub event layer, and the definition of interfaces and procedures for execution on brokers. In this paper we also analyze the strengths and weaknesses of current solutions that are based on existing message exchange models for scientific workflows. Finally, we explain how our model improves the information dissemination, event filtering, task decoupling and the monitoring of scientific workflows.

  13. A web accessible scientific workflow system for vadoze zone performance monitoring: design and implementation examples

    Science.gov (United States)

    Mattson, E.; Versteeg, R.; Ankeny, M.; Stormberg, G.

    2005-12-01

    Long term performance monitoring has been identified by DOE, DOD and EPA as one of the most challenging and costly elements of contaminated site remedial efforts. Such monitoring should provide timely and actionable information relevant to a multitude of stakeholder needs. This information should be obtained in a manner which is auditable, cost effective and transparent. Over the last several years INL staff has designed and implemented a web accessible scientific workflow system for environmental monitoring. This workflow environment integrates distributed, automated data acquisition from diverse sensors (geophysical, geochemical and hydrological) with server side data management and information visualization through flexible browser based data access tools. Component technologies include a rich browser-based client (using dynamic javascript and html/css) for data selection, a back-end server which uses PHP for data processing, user management, and result delivery, and third party applications which are invoked by the back-end using webservices. This system has been implemented and is operational for several sites, including the Ruby Gulch Waste Rock Repository (a capped mine waste rock dump on the Gilt Edge Mine Superfund Site), the INL Vadoze Zone Research Park and an alternative cover landfill. Implementations for other vadoze zone sites are currently in progress. These systems allow for autonomous performance monitoring through automated data analysis and report generation. This performance monitoring has allowed users to obtain insights into system dynamics, regulatory compliance and residence times of water. Our system uses modular components for data selection and graphing and WSDL compliant webservices for external functions such as statistical analyses and model invocations. Thus, implementing this system for novel sites and extending functionality (e.g. adding novel models) is relatively straightforward. As system access requires a standard webbrowser

  14. Exploring Two Approaches for an End-to-End Scientific Analysis Workflow

    Science.gov (United States)

    Dodelson, Scott; Kent, Steve; Kowalkowski, Jim; Paterno, Marc; Sehrish, Saba

    2015-12-01

    The scientific discovery process can be advanced by the integration of independently-developed programs run on disparate computing facilities into coherent workflows usable by scientists who are not experts in computing. For such advancement, we need a system which scientists can use to formulate analysis workflows, to integrate new components to these workflows, and to execute different components on resources that are best suited to run those components. In addition, we need to monitor the status of the workflow as components get scheduled and executed, and to access the intermediate and final output for visual exploration and analysis. Finally, it is important for scientists to be able to share their workflows with collaborators. We have explored two approaches for such an analysis framework for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC); the first one is based on the use and extension of Galaxy, a web-based portal for biomedical research, and the second one is based on a programming language, Python. In this paper, we present a brief description of the two approaches, describe the kinds of extensions to the Galaxy system we have found necessary in order to support the wide variety of scientific analysis in the cosmology community, and discuss how similar efforts might be of benefit to the HEP community.

  15. Scientific Workflow Management in Proteomics

    Science.gov (United States)

    de Bruin, Jeroen S.; Deelder, André M.; Palmblad, Magnus

    2012-01-01

    Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond. PMID:22411703

  16. What is needed for effective open access workflows?

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Institutions and funders are pushing forward open access with ever new guidelines and policies. Since institutional repositories are important maintainers of green open access, they should support easy and fast workflows for researchers and libraries to release publications. Based on the requirements specification of researchers, libraries and publishers, possible supporting software extensions are discussed. How does a typical workflow look like? What has to be considered by the researchers and by the editors in the library before releasing a green open access publication? Where and how can software support and improve existing workflows?

  17. Collaborative e-Science Experiments and Scientific Workflows

    NARCIS (Netherlands)

    Belloum, A.; Inda, M.A.; Vasunin, D.; Korkhov, V.; Zhao, Z.; Rauwerda, H.; Breit, T.M.; Bubak, M.; Hertzberger, L.O.

    2011-01-01

    Recent advances in Internet and grid technologies have greatly enhanced scientific experiments' life cycle. In addition to compute- and data-intensive tasks, large-scale collaborations involving geographically distributed scientists and e-infrastructure are now possible. Scientific workflows, which

  18. Scheduling Multilevel Deadline-Constrained Scientific Workflows on Clouds Based on Cost Optimization

    Directory of Open Access Journals (Sweden)

    Maciej Malawski

    2015-01-01

    Full Text Available This paper presents a cost optimization model for scheduling scientific workflows on IaaS clouds such as Amazon EC2 or RackSpace. We assume multiple IaaS clouds with heterogeneous virtual machine instances, with limited number of instances per cloud and hourly billing. Input and output data are stored on a cloud object store such as Amazon S3. Applications are scientific workflows modeled as DAGs as in the Pegasus Workflow Management System. We assume that tasks in the workflows are grouped into levels of identical tasks. Our model is specified using mathematical programming languages (AMPL and CMPL and allows us to minimize the cost of workflow execution under deadline constraints. We present results obtained using our model and the benchmark workflows representing real scientific applications in a variety of domains. The data used for evaluation come from the synthetic workflows and from general purpose cloud benchmarks, as well as from the data measured in our own experiments with Montage, an astronomical application, executed on Amazon EC2 cloud. We indicate how this model can be used for scenarios that require resource planning for scientific workflows and their ensembles.

  19. Comparison of Resource Platform Selection Approaches for Scientific Workflows

    Energy Technology Data Exchange (ETDEWEB)

    Simmhan, Yogesh; Ramakrishnan, Lavanya

    2010-03-05

    Cloud computing is increasingly considered as an additional computational resource platform for scientific workflows. The cloud offers opportunity to scale-out applications from desktops and local cluster resources. At the same time, it can eliminate the challenges of restricted software environments and queue delays in shared high performance computing environments. Choosing from these diverse resource platforms for a workflow execution poses a challenge for many scientists. Scientists are often faced with deciding resource platform selection trade-offs with limited information on the actual workflows. While many workflow planning methods have explored task scheduling onto different resources, these methods often require fine-scale characterization of the workflow that is onerous for a scientist. In this position paper, we describe our early exploratory work into using blackbox characteristics to do a cost-benefit analysis across of using cloud platforms. We use only very limited high-level information on the workflow length, width, and data sizes. The length and width are indicative of the workflow duration and parallelism. The data size characterizes the IO requirements. We compare the effectiveness of this approach to other resource selection models using two exemplar scientific workflows scheduled on desktops, local clusters, HPC centers, and clouds. Early results suggest that the blackbox model often makes the same resource selections as a more fine-grained whitebox model. We believe the simplicity of the blackbox model can help inform a scientist on the applicability of cloud computing resources even before porting an existing workflow.

  20. Conceptual-level workflow modeling of scientific experiments using NMR as a case study

    Directory of Open Access Journals (Sweden)

    Gryk Michael R

    2007-01-01

    Full Text Available Abstract Background Scientific workflows improve the process of scientific experiments by making computations explicit, underscoring data flow, and emphasizing the participation of humans in the process when intuition and human reasoning are required. Workflows for experiments also highlight transitions among experimental phases, allowing intermediate results to be verified and supporting the proper handling of semantic mismatches and different file formats among the various tools used in the scientific process. Thus, scientific workflows are important for the modeling and subsequent capture of bioinformatics-related data. While much research has been conducted on the implementation of scientific workflows, the initial process of actually designing and generating the workflow at the conceptual level has received little consideration. Results We propose a structured process to capture scientific workflows at the conceptual level that allows workflows to be documented efficiently, results in concise models of the workflow and more-correct workflow implementations, and provides insight into the scientific process itself. The approach uses three modeling techniques to model the structural, data flow, and control flow aspects of the workflow. The domain of biomolecular structure determination using Nuclear Magnetic Resonance spectroscopy is used to demonstrate the process. Specifically, we show the application of the approach to capture the workflow for the process of conducting biomolecular analysis using Nuclear Magnetic Resonance (NMR spectroscopy. Conclusion Using the approach, we were able to accurately document, in a short amount of time, numerous steps in the process of conducting an experiment using NMR spectroscopy. The resulting models are correct and precise, as outside validation of the models identified only minor omissions in the models. In addition, the models provide an accurate visual description of the control flow for conducting

  1. Elastic Scheduling of Scientific Workflows under Deadline Constraints in Cloud Computing Environments

    Directory of Open Access Journals (Sweden)

    Nazia Anwar

    2018-01-01

    Full Text Available Scientific workflow applications are collections of several structured activities and fine-grained computational tasks. Scientific workflow scheduling in cloud computing is a challenging research topic due to its distinctive features. In cloud environments, it has become critical to perform efficient task scheduling resulting in reduced scheduling overhead, minimized cost and maximized resource utilization while still meeting the user-specified overall deadline. This paper proposes a strategy, Dynamic Scheduling of Bag of Tasks based workflows (DSB, for scheduling scientific workflows with the aim to minimize financial cost of leasing Virtual Machines (VMs under a user-defined deadline constraint. The proposed model groups the workflow into Bag of Tasks (BoTs based on data dependency and priority constraints and thereafter optimizes the allocation and scheduling of BoTs on elastic, heterogeneous and dynamically provisioned cloud resources called VMs in order to attain the proposed method’s objectives. The proposed approach considers pay-as-you-go Infrastructure as a Service (IaaS clouds having inherent features such as elasticity, abundance, heterogeneity and VM provisioning delays. A trace-based simulation using benchmark scientific workflows representing real world applications, demonstrates a significant reduction in workflow computation cost while the workflow deadline is met. The results validate that the proposed model produces better success rates to meet deadlines and cost efficiencies in comparison to adapted state-of-the-art algorithms for similar problems.

  2. The Virtual Geophysics Laboratory (VGL): Scientific Workflows Operating Across Organizations and Across Infrastructures

    Science.gov (United States)

    Cox, S. J.; Wyborn, L. A.; Fraser, R.; Rankine, T.; Woodcock, R.; Vote, J.; Evans, B.

    2012-12-01

    The Virtual Geophysics Laboratory (VGL) is web portal that provides geoscientists with an integrated online environment that: seamlessly accesses geophysical and geoscience data services from the AuScope national geoscience information infrastructure; loosely couples these data to a variety of gesocience software tools; and provides large scale processing facilities via cloud computing. VGL is a collaboration between CSIRO, Geoscience Australia, National Computational Infrastructure, Monash University, Australian National University and the University of Queensland. The VGL provides a distributed system whereby a user can enter an online virtual laboratory to seamlessly connect to OGC web services for geoscience data. The data is supplied in open standards formats using international standards like GeoSciML. A VGL user uses a web mapping interface to discover and filter the data sources using spatial and attribute filters to define a subset. Once the data is selected the user is not required to download the data. VGL collates the service query information for later in the processing workflow where it will be staged directly to the computing facilities. The combination of deferring data download and access to Cloud computing enables VGL users to access their data at higher resolutions and to undertake larger scale inversions, more complex models and simulations than their own local computing facilities might allow. Inside the Virtual Geophysics Laboratory, the user has access to a library of existing models, complete with exemplar workflows for specific scientific problems based on those models. For example, the user can load a geological model published by Geoscience Australia, apply a basic deformation workflow provided by a CSIRO scientist, and have it run in a scientific code from Monash. Finally the user can publish these results to share with a colleague or cite in a paper. This opens new opportunities for access and collaboration as all the resources (models

  3. A Multi-Dimensional Classification Model for Scientific Workflow Characteristics

    Energy Technology Data Exchange (ETDEWEB)

    Ramakrishnan, Lavanya; Plale, Beth

    2010-04-05

    Workflows have been used to model repeatable tasks or operations in manufacturing, business process, and software. In recent years, workflows are increasingly used for orchestration of science discovery tasks that use distributed resources and web services environments through resource models such as grid and cloud computing. Workflows have disparate re uirements and constraints that affects how they might be managed in distributed environments. In this paper, we present a multi-dimensional classification model illustrated by workflow examples obtained through a survey of scientists from different domains including bioinformatics and biomedical, weather and ocean modeling, astronomy detailing their data and computational requirements. The survey results and classification model contribute to the high level understandingof scientific workflows.

  4. A method to build and analyze scientific workflows from provenance through process mining

    NARCIS (Netherlands)

    Zeng, R.; He, X.; Li, Jiafei; Liu, Zheng; Aalst, van der W.M.P.

    2011-01-01

    Scientific workflows have recently emerged as a new paradigm for representing and managing complex distributed scientific computations and are used to accelerate the pace of scientific discovery. In many disciplines, individual workflows are large due to the large quantities of data used. As

  5. Task Delegation Based Access Control Models for Workflow Systems

    Science.gov (United States)

    Gaaloul, Khaled; Charoy, François

    e-Government organisations are facilitated and conducted using workflow management systems. Role-based access control (RBAC) is recognised as an efficient access control model for large organisations. The application of RBAC in workflow systems cannot, however, grant permissions to users dynamically while business processes are being executed. We currently observe a move away from predefined strict workflow modelling towards approaches supporting flexibility on the organisational level. One specific approach is that of task delegation. Task delegation is a mechanism that supports organisational flexibility, and ensures delegation of authority in access control systems. In this paper, we propose a Task-oriented Access Control (TAC) model based on RBAC to address these requirements. We aim to reason about task from organisational perspectives and resources perspectives to analyse and specify authorisation constraints. Moreover, we present a fine grained access control protocol to support delegation based on the TAC model.

  6. Styx Grid Services: Lightweight Middleware for Efficient Scientific Workflows

    Directory of Open Access Journals (Sweden)

    J.D. Blower

    2006-01-01

    Full Text Available The service-oriented approach to performing distributed scientific research is potentially very powerful but is not yet widely used in many scientific fields. This is partly due to the technical difficulties involved in creating services and workflows and the inefficiency of many workflow systems with regard to handling large datasets. We present the Styx Grid Service, a simple system that wraps command-line programs and allows them to be run over the Internet exactly as if they were local programs. Styx Grid Services are very easy to create and use and can be composed into powerful workflows with simple shell scripts or more sophisticated graphical tools. An important feature of the system is that data can be streamed directly from service to service, significantly increasing the efficiency of workflows that use large data volumes. The status and progress of Styx Grid Services can be monitored asynchronously using a mechanism that places very few demands on firewalls. We show how Styx Grid Services can interoperate with with Web Services and WS-Resources using suitable adapters.

  7. Scientific Workflows and the Sensor Web for Virtual Environmental Observatories

    Science.gov (United States)

    Simonis, I.; Vahed, A.

    2008-12-01

    interfaces. All data sets and sensor communication follow well-defined abstract models and corresponding encodings, mostly developed by the OGC Sensor Web Enablement initiative. Scientific progress is currently accelerated by an emerging new concept called scientific workflows, which organize and manage complex distributed computations. A scientific workflow represents and records the highly complex processes that a domain scientist typically would follow in exploration, discovery and ultimately, transformation of raw data to publishable results. The challenge is now to integrate the benefits of scientific workflows with those provided by the Sensor Web in order to leverage all resources for scientific exploration, problem solving, and knowledge generation. Scientific workflows for the Sensor Web represent the next evolutionary step towards efficient, powerful, and flexible earth observation frameworks and platforms. Those platforms support the entire process from capturing data, sharing and integrating, to requesting additional observations. Multiple sites and organizations will participate on single platforms and scientists from different countries and organizations interact and contribute to large-scale research projects. Simultaneously, the data- and information overload becomes manageable, as multiple layers of abstraction will free scientists to deal with underlying data-, processing or storage peculiarities. The vision are automated investigation and discovery mechanisms that allow scientists to pose queries to the system, which in turn would identify potentially related resources, schedules processing tasks and assembles all parts in workflows that may satisfy the query.

  8. Coupling of a continuum ice sheet model and a discrete element calving model using a scientific workflow system

    Science.gov (United States)

    Memon, Shahbaz; Vallot, Dorothée; Zwinger, Thomas; Neukirchen, Helmut

    2017-04-01

    Scientific communities generate complex simulations through orchestration of semi-structured analysis pipelines which involves execution of large workflows on multiple, distributed and heterogeneous computing and data resources. Modeling ice dynamics of glaciers requires workflows consisting of many non-trivial, computationally expensive processing tasks which are coupled to each other. From this domain, we present an e-Science use case, a workflow, which requires the execution of a continuum ice flow model and a discrete element based calving model in an iterative manner. Apart from the execution, this workflow also contains data format conversion tasks that support the execution of ice flow and calving by means of transition through sequential, nested and iterative steps. Thus, the management and monitoring of all the processing tasks including data management and transfer of the workflow model becomes more complex. From the implementation perspective, this workflow model was initially developed on a set of scripts using static data input and output references. In the course of application usage when more scripts or modifications introduced as per user requirements, the debugging and validation of results were more cumbersome to achieve. To address these problems, we identified a need to have a high-level scientific workflow tool through which all the above mentioned processes can be achieved in an efficient and usable manner. We decided to make use of the e-Science middleware UNICORE (Uniform Interface to Computing Resources) that allows seamless and automated access to different heterogenous and distributed resources which is supported by a scientific workflow engine. Based on this, we developed a high-level scientific workflow model for coupling of massively parallel High-Performance Computing (HPC) jobs: a continuum ice sheet model (Elmer/Ice) and a discrete element calving and crevassing model (HiDEM). In our talk we present how the use of a high

  9. From Data to Knowledge to Discoveries: Artificial Intelligence and Scientific Workflows

    Directory of Open Access Journals (Sweden)

    Yolanda Gil

    2009-01-01

    Full Text Available Scientific computing has entered a new era of scale and sharing with the arrival of cyberinfrastructure facilities for computational experimentation. A key emerging concept is scientific workflows, which provide a declarative representation of complex scientific applications that can be automatically managed and executed in distributed shared resources. In the coming decades, computational experimentation will push the boundaries of current cyberinfrastructure in terms of inter-disciplinary scope and integrative models of scientific phenomena under study. This paper argues that knowledge-rich workflow environments will provide necessary capabilities for that vision by assisting scientists to validate and vet complex analysis processes and by automating important aspects of scientific exploration and discovery.

  10. Supporting the Construction of Workflows for Biodiversity Problem-Solving Accessing Secure, Distributed Resources

    Directory of Open Access Journals (Sweden)

    J.S. Pahwa

    2006-01-01

    Full Text Available In the Biodiversity World (BDW project we have created a flexible and extensible Web Services-based Grid environment for biodiversity researchers to solve problems in biodiversity and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources such as data sets and analytical tools are made available to be accessed and assembled by users into workflows to perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species using climate variables in order to explain past and future climate-related changes in species distribution. Data sources and analytical tools required for such analysis of species distribution are widely dispersed, available on heterogeneous platforms, present data in different formats and lack inherent interoperability. The present BDW system brings all these disparate units together so that the user can combine tools with little thought as to their original availability, data formats and interoperability. The new prototype BDW system architecture not only brings together heterogeneous resources but also enables utilisation of computational resources and provides a secure access to BDW resources via a federated security model. We describe features of the new BDW system and its security model which enable user authentication from a workflow application as part of workflow execution.

  11. The MPO API: A tool for recording scientific workflows

    Energy Technology Data Exchange (ETDEWEB)

    Wright, John C., E-mail: jcwright@mit.edu [MIT Plasma Science and Fusion Center, Cambridge, MA (United States); Greenwald, Martin; Stillerman, Joshua [MIT Plasma Science and Fusion Center, Cambridge, MA (United States); Abla, Gheni; Chanthavong, Bobby; Flanagan, Sean; Schissel, David; Lee, Xia [General Atomics, San Diego, CA (United States); Romosan, Alex; Shoshani, Arie [Lawrence Berkeley Laboratory, Berkeley, CA (United States)

    2014-05-15

    Highlights: • A description of a new framework and tool for recording scientific workflows, especially those resulting from simulation and analysis. • An explanation of the underlying technologies used to implement this web based tool. • Several examples of using the tool. - Abstract: Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for high-consequence applications. The Metadata, Provenance and Ontology (MPO) project builds on previous work [M. Greenwald, Fusion Eng. Des. 87 (2012) 2205–2208] and is focused on providing documentation of workflows, data provenance and the ability to data-mine large sets of results. While there are important design and development aspects to the data structures and user interfaces, we concern ourselves in this paper with the application programming interface (API) – the set of functions that interface with the data server. Our approach for the data server is to follow the Representational State Transfer (RESTful) software architecture style for client–server communication. At its core, the API uses the POST and GET methods of the HTTP protocol to transfer workflow information in message bodies to targets specified in the URL to and from the database via a web server. Higher level API calls are built upon this core API. This design facilitates implementation on different platforms and in different languages and is robust to changes in the underlying technologies used. The command line client implementation can communicate with the data server from any machine with HTTP access.

  12. The MPO API: A tool for recording scientific workflows

    International Nuclear Information System (INIS)

    Wright, John C.; Greenwald, Martin; Stillerman, Joshua; Abla, Gheni; Chanthavong, Bobby; Flanagan, Sean; Schissel, David; Lee, Xia; Romosan, Alex; Shoshani, Arie

    2014-01-01

    Highlights: • A description of a new framework and tool for recording scientific workflows, especially those resulting from simulation and analysis. • An explanation of the underlying technologies used to implement this web based tool. • Several examples of using the tool. - Abstract: Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for high-consequence applications. The Metadata, Provenance and Ontology (MPO) project builds on previous work [M. Greenwald, Fusion Eng. Des. 87 (2012) 2205–2208] and is focused on providing documentation of workflows, data provenance and the ability to data-mine large sets of results. While there are important design and development aspects to the data structures and user interfaces, we concern ourselves in this paper with the application programming interface (API) – the set of functions that interface with the data server. Our approach for the data server is to follow the Representational State Transfer (RESTful) software architecture style for client–server communication. At its core, the API uses the POST and GET methods of the HTTP protocol to transfer workflow information in message bodies to targets specified in the URL to and from the database via a web server. Higher level API calls are built upon this core API. This design facilitates implementation on different platforms and in different languages and is robust to changes in the underlying technologies used. The command line client implementation can communicate with the data server from any machine with HTTP access

  13. Beyond GIS with EO4V is Trails: a geospatio-temporal scientific workflow environment

    CSIR Research Space (South Africa)

    Van Zyl, T

    2012-10-01

    Full Text Available be accommodated at once. The scientific workflows approach has other advantages to such as provenance, repeatability and collaboration. The paper presents EO4VisTrails as an example of such a scientific workflows approach to integration and discusses the benefit...

  14. VisTrails is an open-source scientific workflow and provenance management system

    CSIR Research Space (South Africa)

    Mthombeni, Thabo DM

    2011-12-01

    Full Text Available VisTrails is an open-source scientific workflow and provenance management system that provides support for simulations, data exploration and visualization. Whereas workflows have been traditionally used to automate repetitive tasks, for applications...

  15. The Symbiotic Relationship between Scientific Workflow and Provenance (Invited)

    Science.gov (United States)

    Stephan, E.

    2010-12-01

    The purpose of this presentation is to describe the symbiotic nature of scientific workflows and provenance. We will also discuss the current trends and real world challenges facing these two distinct research areas. Although motivated differently, the needs of the international science communities are the glue that binds this relationship together. Understanding and articulating the science drivers to these communities is paramount as these technologies evolve and mature. Originally conceived for managing business processes, workflows are now becoming invaluable assets in both computational and experimental sciences. These reconfigurable, automated systems provide essential technology to perform complex analyses by coupling together geographically distributed disparate data sources and applications. As a result, workflows are capable of higher throughput in a shorter amount of time than performing the steps manually. Today many different workflow products exist; these could include Kepler and Taverna or similar products like MeDICI, developed at PNNL, that are standardized on the Business Process Execution Language (BPEL). Provenance, originating from the French term Provenir “to come from”, is used to describe the curation process of artwork as art is passed from owner to owner. The concept of provenance was adopted by digital libraries as a means to track the lineage of documents while standards such as the DublinCore began to emerge. In recent years the systems science community has increasingly expressed the need to expand the concept of provenance to formally articulate the history of scientific data. Communities such as the International Provenance and Annotation Workshop (IPAW) have formalized a provenance data model. The Open Provenance Model, and the W3C is hosting a provenance incubator group featuring the Proof Markup Language. Although both workflows and provenance have risen from different communities and operate independently, their mutual

  16. From the desktop to the grid: scalable bioinformatics via workflow conversion.

    Science.gov (United States)

    de la Garza, Luis; Veit, Johannes; Szolek, Andras; Röttig, Marc; Aiche, Stephan; Gesing, Sandra; Reinert, Knut; Kohlbacher, Oliver

    2016-03-12

    Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well defined tasks, each with well defined inputs, parameters, and outputs, offers the immediate benefit of identifying bottlenecks, pinpoint sections which could benefit from parallelization, among others. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. There are several engines that give users the ability to design and execute workflows. Each engine was created to address certain problems of a specific community, therefore each one has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free -an aspect that could potentially drive away members of the scientific community. We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of parameters, inputs, outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources. Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We

  17. A Scientific Workflow Platform for Generic and Scalable Object Recognition on Medical Images

    Science.gov (United States)

    Möller, Manuel; Tuot, Christopher; Sintek, Michael

    In the research project THESEUS MEDICO we aim at a system combining medical image information with semantic background knowledge from ontologies to give clinicians fully cross-modal access to biomedical image repositories. Therefore joint efforts have to be made in more than one dimension: Object detection processes have to be specified in which an abstraction is performed starting from low-level image features across landmark detection utilizing abstract domain knowledge up to high-level object recognition. We propose a system based on a client-server extension of the scientific workflow platform Kepler that assists the collaboration of medical experts and computer scientists during development and parameter learning.

  18. FOSS geospatial libraries in scientific workflow environments: experiences and directions

    CSIR Research Space (South Africa)

    McFerren, G

    2011-07-01

    Full Text Available of experiments. In context of three sets of research (wildfire research, flood modelling and the linking of disease outbreaks to multi-scale environmental conditions), we describe our efforts to provide geospatial capability for scientific workflow software...

  19. A Hybrid Metaheuristic for Multi-Objective Scientific Workflow Scheduling in a Cloud Environment

    Directory of Open Access Journals (Sweden)

    Nazia Anwar

    2018-03-01

    Full Text Available Cloud computing has emerged as a high-performance computing environment with a large pool of abstracted, virtualized, flexible, and on-demand resources and services. Scheduling of scientific workflows in a distributed environment is a well-known NP-complete problem and therefore intractable with exact solutions. It becomes even more challenging in the cloud computing platform due to its dynamic and heterogeneous nature. The aim of this study is to optimize multi-objective scheduling of scientific workflows in a cloud computing environment based on the proposed metaheuristic-based algorithm, Hybrid Bio-inspired Metaheuristic for Multi-objective Optimization (HBMMO. The strong global exploration ability of the nature-inspired metaheuristic Symbiotic Organisms Search (SOS is enhanced by involving an efficient list-scheduling heuristic, Predict Earliest Finish Time (PEFT, in the proposed algorithm to obtain better convergence and diversity of the approximate Pareto front in terms of reduced makespan, minimized cost, and efficient load balance of the Virtual Machines (VMs. The experiments using different scientific workflow applications highlight the effectiveness, practicality, and better performance of the proposed algorithm.

  20. A virtual data language and system for scientific workflow management in data grid environments

    Science.gov (United States)

    Zhao, Yong

    With advances in scientific instrumentation and simulation, scientific data is growing fast in both size and analysis complexity. So-called Data Grids aim to provide high performance, distributed data analysis infrastructure for data- intensive sciences, where scientists distributed worldwide need to extract information from large collections of data, and to share both data products and the resources needed to produce and store them. However, the description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed workflow notation called virtual data language, within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful virtual data system that supports distributed execution. The resulting language and system are capable of expressing complex workflows in a simple compact form, enacting those workflows in distributed environments, monitoring and recording the execution processes, and tracing the derivation history of data products. We describe the motivation, design, implementation, and evaluation of the virtual data language and system, and the application of the virtual data paradigm in various science disciplines, including astronomy, cognitive neuroscience.

  1. Perti Net-Based Workflow Access Control Model%基于Perti网的工作流访问控制模型研究

    Institute of Scientific and Technical Information of China (English)

    陈卓; 骆婷; 石磊; 洪帆

    2004-01-01

    Access control is an important protection mechanism for information systems.This paper shows how to make access control in workflow system.We give a workflow access control model (WACM) based on several current access control models.The model supports roles assignment and dynamic authorization.The paper defines the workflow using Petri net.It firstly gives the definition and description of the workflow, and then analyzes the architecture of the workflow access control model (WACM).Finally, an example of an e-commerce workflow access control model is discussed in detail.

  2. A proposal for the inclusion of accessibility criteria in the authoring workflow of images for scientific articles

    OpenAIRE

    Splendiani, Bruno

    2015-01-01

    This thesis investigates the problem of how to provide accessible images in academic articles in the research fields of Science, Technology, Engineering and Mathematics (STEM) and in particular in biomedicine. Currently, graphics in scientific articles are a critical information source and often provide essential information for a thorough understanding of scientific articles. People with visual and other impairments experience specific barriers that prevent them from accessing the informatio...

  3. Access Control with Delegated Authorization Policy Evaluation for Data-Driven Microservice Workflows

    Directory of Open Access Journals (Sweden)

    Davy Preuveneers

    2017-09-01

    Full Text Available Microservices offer a compelling competitive advantage for building data flow systems as a choreography of self-contained data endpoints that each implement a specific data processing functionality. Such a ‘single responsibility principle’ design makes them well suited for constructing scalable and flexible data integration and real-time data flow applications. In this paper, we investigate microservice based data processing workflows from a security point of view, i.e., (1 how to constrain data processing workflows with respect to dynamic authorization policies granting or denying access to certain microservice results depending on the flow of the data; (2 how to let multiple microservices contribute to a collective data-driven authorization decision and (3 how to put adequate measures in place such that the data within each individual microservice is protected against illegitimate access from unauthorized users or other microservices. Due to this multifold objective, enforcing access control on the data endpoints to prevent information leakage or preserve one’s privacy becomes far more challenging, as authorization policies can have dependencies and decision outcomes cross-cutting data in multiple microservices. To address this challenge, we present and evaluate a workflow-oriented authorization framework that enforces authorization policies in a decentralized manner and where the delegated policy evaluation leverages feature toggles that are managed at runtime by software circuit breakers to secure the distributed data processing workflows. The benefit of our solution is that, on the one hand, authorization policies restrict access to the data endpoints of the microservices, and on the other hand, microservices can safely rely on other data endpoints to collectively evaluate cross-cutting access control decisions without having to rely on a shared storage backend holding all the necessary information for the policy evaluation.

  4. HisT/PLIER: A two-fold Provenance Approach for Grid-enabled Scientific Workflows using WS-VLAM

    NARCIS (Netherlands)

    Gerhards, M.; Sander, V.; Belloum, A.; Vasunin, D.; Benabdelkader, A.; Jha, S.; Felde, N.G.; Buyya, R.; Fedak, G.

    2011-01-01

    Large scale scientific applications are frequently modeled as a workflow that is executed under the control of a workflow management system. One crucial requirement is the validation of the generated results, e.g. The trace ability of the experiment execution path. The automated tracking and storage

  5. Facilitating Stewardship of scientific data through standards based workflows

    Science.gov (United States)

    Bastrakova, I.; Kemp, C.; Potter, A. K.

    2013-12-01

    scientific data acquisition and analysis requirements and effective interoperable data management and delivery. This includes participating in national and international dialogue on development of standards, embedding data management activities in business processes, and developing scientific staff as effective data stewards. Similar approach is applied to the geophysical data. By ensuring the geophysical datasets at GA strictly follow metadata and industry standards we are able to implement a provenance based workflow where the data is easily discoverable, geophysical processing can be applied to it and results can be stored. The provenance based workflow enables metadata records for the results to be produced automatically from the input dataset metadata.

  6. Cloud Bursting with GlideinWMS: Means to satisfy ever increasing computing needs for Scientific Workflows

    International Nuclear Information System (INIS)

    Mhashilkar, Parag; Tiradani, Anthony; Holzman, Burt; Larson, Krista; Sfiligoi, Igor; Rynge, Mats

    2014-01-01

    Scientific communities have been in the forefront of adopting new technologies and methodologies in the computing. Scientific computing has influenced how science is done today, achieving breakthroughs that were impossible to achieve several decades ago. For the past decade several such communities in the Open Science Grid (OSG) and the European Grid Infrastructure (EGI) have been using GlideinWMS to run complex application workflows to effectively share computational resources over the grid. GlideinWMS is a pilot-based workload management system (WMS) that creates on demand, a dynamically sized overlay HTCondor batch system on grid resources. At present, the computational resources shared over the grid are just adequate to sustain the computing needs. We envision that the complexity of the science driven by 'Big Data' will further push the need for computational resources. To fulfill their increasing demands and/or to run specialized workflows, some of the big communities like CMS are investigating the use of cloud computing as Infrastructure-As-A-Service (IAAS) with GlideinWMS as a potential alternative to fill the void. Similarly, communities with no previous access to computing resources can use GlideinWMS to setup up a batch system on the cloud infrastructure. To enable this, the architecture of GlideinWMS has been extended to enable support for interfacing GlideinWMS with different Scientific and commercial cloud providers like HLT, FutureGrid, FermiCloud and Amazon EC2. In this paper, we describe a solution for cloud bursting with GlideinWMS. The paper describes the approach, architectural changes and lessons learned while enabling support for cloud infrastructures in GlideinWMS.

  7. Dynamic reusable workflows for ocean science

    Science.gov (United States)

    Signell, Richard; Fernandez, Filipe; Wilcox, Kyle

    2016-01-01

    Digital catalogs of ocean data have been available for decades, but advances in standardized services and software for catalog search and data access make it now possible to create catalog-driven workflows that automate — end-to-end — data search, analysis and visualization of data from multiple distributed sources. Further, these workflows may be shared, reused and adapted with ease. Here we describe a workflow developed within the US Integrated Ocean Observing System (IOOS) which automates the skill-assessment of water temperature forecasts from multiple ocean forecast models, allowing improved forecast products to be delivered for an open water swim event. A series of Jupyter Notebooks are used to capture and document the end-to-end workflow using a collection of Python tools that facilitate working with standardized catalog and data services. The workflow first searches a catalog of metadata using the Open Geospatial Consortium (OGC) Catalog Service for the Web (CSW), then accesses data service endpoints found in the metadata records using the OGC Sensor Observation Service (SOS) for in situ sensor data and OPeNDAP services for remotely-sensed and model data. Skill metrics are computed and time series comparisons of forecast model and observed data are displayed interactively, leveraging the capabilities of modern web browsers. The resulting workflow not only solves a challenging specific problem, but highlights the benefits of dynamic, reusable workflows in general. These workflows adapt as new data enters the data system, facilitate reproducible science, provide templates from which new scientific workflows can be developed, and encourage data providers to use standardized services. As applied to the ocean swim event, the workflow exposed problems with two of the ocean forecast products which led to improved regional forecasts once errors were corrected. While the example is specific, the approach is general, and we hope to see increased use of dynamic

  8. Dynamic Reusable Workflows for Ocean Science

    Directory of Open Access Journals (Sweden)

    Richard P. Signell

    2016-10-01

    Full Text Available Digital catalogs of ocean data have been available for decades, but advances in standardized services and software for catalog searches and data access now make it possible to create catalog-driven workflows that automate—end-to-end—data search, analysis, and visualization of data from multiple distributed sources. Further, these workflows may be shared, reused, and adapted with ease. Here we describe a workflow developed within the US Integrated Ocean Observing System (IOOS which automates the skill assessment of water temperature forecasts from multiple ocean forecast models, allowing improved forecast products to be delivered for an open water swim event. A series of Jupyter Notebooks are used to capture and document the end-to-end workflow using a collection of Python tools that facilitate working with standardized catalog and data services. The workflow first searches a catalog of metadata using the Open Geospatial Consortium (OGC Catalog Service for the Web (CSW, then accesses data service endpoints found in the metadata records using the OGC Sensor Observation Service (SOS for in situ sensor data and OPeNDAP services for remotely-sensed and model data. Skill metrics are computed and time series comparisons of forecast model and observed data are displayed interactively, leveraging the capabilities of modern web browsers. The resulting workflow not only solves a challenging specific problem, but highlights the benefits of dynamic, reusable workflows in general. These workflows adapt as new data enter the data system, facilitate reproducible science, provide templates from which new scientific workflows can be developed, and encourage data providers to use standardized services. As applied to the ocean swim event, the workflow exposed problems with two of the ocean forecast products which led to improved regional forecasts once errors were corrected. While the example is specific, the approach is general, and we hope to see increased

  9. An open source workflow for 3D printouts of scientific data volumes

    Science.gov (United States)

    Loewe, P.; Klump, J. F.; Wickert, J.; Ludwig, M.; Frigeri, A.

    2013-12-01

    As the amount of scientific data continues to grow, researchers need new tools to help them visualize complex data. Immersive data-visualisations are helpful, yet fail to provide tactile feedback and sensory feedback on spatial orientation, as provided from tangible objects. The gap in sensory feedback from virtual objects leads to the development of tangible representations of geospatial information to solve real world problems. Examples are animated globes [1], interactive environments like tangible GIS [2], and on demand 3D prints. The production of a tangible representation of a scientific data set is one step in a line of scientific thinking, leading from the physical world into scientific reasoning and back: The process starts with a physical observation, or from a data stream generated by an environmental sensor. This data stream is turned into a geo-referenced data set. This data is turned into a volume representation which is converted into command sequences for the printing device, leading to the creation of a 3D printout. As a last, but crucial step, this new object has to be documented and linked to the associated metadata, and curated in long term repositories to preserve its scientific meaning and context. The workflow to produce tangible 3D data-prints from science data at the German Research Centre for Geosciences (GFZ) was implemented as a software based on the Free and Open Source Geoinformatics tools GRASS GIS and Paraview. The workflow was successfully validated in various application scenarios at GFZ using a RapMan printer to create 3D specimens of elevation models, geological underground models, ice penetrating radar soundings for planetology, and space time stacks for Tsunami model quality assessment. While these first pilot applications have demonstrated the feasibility of the overall approach [3], current research focuses on the provision of the workflow as Software as a Service (SAAS), thematic generalisation of information content and

  10. Using SensorML to describe scientific workflows in distributed web service environments

    CSIR Research Space (South Africa)

    Van Zyl, TL

    2009-07-01

    Full Text Available for increased collaboration through workflow sharing. The Sensor Web is an open complex adaptive system the pervades the internet and provides access to sensor resources. One mechanism for describing sensor resources is through the use of Sensor ML. It is shown...

  11. Using SensorML to describe scientific workflows in distributed web service environments

    CSIR Research Space (South Africa)

    Van Zyl, TL

    2009-07-01

    Full Text Available for increased collaboration through workflow sharing. The Sensor Web is an open complex adaptive system the pervades the internet and provides access to sensor resources. One mechanism for describing sensor resources is through the use of SensorML. It is shown...

  12. EPUB as publication format in Open Access journals: Tools and workflow

    Directory of Open Access Journals (Sweden)

    Trude Eikebrokk

    2014-04-01

    Full Text Available In this article, we present a case study of how the main publishing format of an Open Access journal was changed from PDF to EPUB by designing a new workflow using JATS as the basic XML source format. We state the reasons and discuss advantages for doing this, how we did it, and the costs of changing an established Microsoft Word workflow. As an example, we use one typical sociology article with tables, illustrations and references. We then follow the article from JATS markup through different transformations resulting in XHTML, EPUB and MOBI versions. In the end, we put everything together in an automated XProc pipeline. The process has been developed on free and open source tools, and we describe and evaluate these tools in the article. The workflow is suitable for non-professional publishers, and all code is attached and free for reuse by others.

  13. The MPO system for automatic workflow documentation

    Energy Technology Data Exchange (ETDEWEB)

    Abla, G.; Coviello, E.N.; Flanagan, S.M. [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Greenwald, M. [Massachusetts Institute of Technology, Cambridge, MA 02139 (United States); Lee, X. [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Romosan, A. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Schissel, D.P., E-mail: schissel@fusion.gat.com [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Shoshani, A. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Stillerman, J.; Wright, J. [Massachusetts Institute of Technology, Cambridge, MA 02139 (United States); Wu, K.J. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States)

    2016-11-15

    Highlights: • Data model, infrastructure, and tools for data tracking, cataloging, and integration. • Automatically document workflow and data provenance in the widest sense. • Fusion Science as test bed but the system’s framework and data model is quite general. - Abstract: Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. This paper presents the Metadata, Provenance, and Ontology (MPO) System, the software that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.

  14. The MPO system for automatic workflow documentation

    International Nuclear Information System (INIS)

    Abla, G.; Coviello, E.N.; Flanagan, S.M.; Greenwald, M.; Lee, X.; Romosan, A.; Schissel, D.P.; Shoshani, A.; Stillerman, J.; Wright, J.; Wu, K.J.

    2016-01-01

    Highlights: • Data model, infrastructure, and tools for data tracking, cataloging, and integration. • Automatically document workflow and data provenance in the widest sense. • Fusion Science as test bed but the system’s framework and data model is quite general. - Abstract: Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. This paper presents the Metadata, Provenance, and Ontology (MPO) System, the software that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.

  15. wft4galaxy: a workflow testing tool for galaxy.

    Science.gov (United States)

    Piras, Marco Enrico; Pireddu, Luca; Zanetti, Gianluigi

    2017-12-01

    Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container-the latter reducing installation effort to a minimum. Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. marcoenrico.piras@crs4.it. © The Author 2017. Published by Oxford University Press.

  16. An Adaptable Seismic Data Format for Modern Scientific Workflows

    Science.gov (United States)

    Smith, J. A.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Podhorszki, N.; Tromp, J.

    2013-12-01

    Data storage, exchange, and access play a critical role in modern seismology. Current seismic data formats, such as SEED, SAC, and SEG-Y, were designed with specific applications in mind and are frequently a major bottleneck in implementing efficient workflows. We propose a new modern parallel format that can be adapted for a variety of seismic workflows. The Adaptable Seismic Data Format (ASDF) features high-performance parallel read and write support and the ability to store an arbitrary number of traces of varying sizes. Provenance information is stored inside the file so that users know the origin of the data as well as the precise operations that have been applied to the waveforms. The design of the new format is based on several real-world use cases, including earthquake seismology and seismic interferometry. The metadata is based on the proven XML schemas StationXML and QuakeML. Existing time-series analysis tool-kits are easily interfaced with this new format so that seismologists can use robust, previously developed software packages, such as ObsPy and the SAC library. ADIOS, netCDF4, and HDF5 can be used as the underlying container format. At Princeton University, we have chosen to use ADIOS as the container format because it has shown superior scalability for certain applications, such as dealing with big data on HPC systems. In the context of high-performance computing, we have implemented ASDF into the global adjoint tomography workflow on Oak Ridge National Laboratory's supercomputer Titan.

  17. A Component Based Approach to Scientific Workflow Management

    CERN Document Server

    Le Goff, Jean-Marie; Baker, Nigel; Brooks, Peter; McClatchey, Richard

    2001-01-01

    CRISTAL is a distributed scientific workflow system used in the manufacturing and production phases of HEP experiment construction at CERN. The CRISTAL project has studied the use of a description driven approach, using meta- modelling techniques, to manage the evolving needs of a large physics community. Interest from such diverse communities as bio-informatics and manufacturing has motivated the CRISTAL team to re-engineer the system to customize functionality according to end user requirements but maximize software reuse in the process. The next generation CRISTAL vision is to build a generic component architecture from which a complete software product line can be generated according to the particular needs of the target enterprise. This paper discusses the issues of adopting a component product line based approach and our experiences of software reuse.

  18. A component based approach to scientific workflow management

    International Nuclear Information System (INIS)

    Baker, N.; Brooks, P.; McClatchey, R.; Kovacs, Z.; LeGoff, J.-M.

    2001-01-01

    CRISTAL is a distributed scientific workflow system used in the manufacturing and production phases of HEP experiment construction at CERN. The CRISTAL project has studied the use of a description driven approach, using meta-modelling techniques, to manage the evolving needs of a large physics community. Interest from such diverse communities as bio-informatics and manufacturing has motivated the CRISTAL team to re-engineer the system to customize functionality according to end user requirements but maximize software reuse in the process. The next generation CRISTAL vision is to build a generic component architecture from which a complete software product line can be generated according to the particular needs of the target enterprise. This paper discusses the issues of adopting a component product line based approach and our experiences of software reuse

  19. Integrated Automatic Workflow for Phylogenetic Tree Analysis Using Public Access and Local Web Services.

    Science.gov (United States)

    Damkliang, Kasikrit; Tandayya, Pichaya; Sangket, Unitsa; Pasomsub, Ekawat

    2016-11-28

    At the present, coding sequence (CDS) has been discovered and larger CDS is being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for the phylogenetic tree inferring analysis using public access web services at European Bioinformatics Institute (EMBL-EBI) and Swiss Institute of Bioinformatics (SIB), and our own deployed local web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 numbers in bootstrapping replication. The workflow performs the tree inferring such as Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of EMBOSS PHYLIPNEW package based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed into two types using the Soaplab2 and Apache Axis2 deployment. There are SOAP and Java Web Service (JWS) providing WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, the performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 replicates of the bootstrapping numbers. This paper proposes a new integrated automatic workflow which will be beneficial to the bioinformaticians with an intermediate level of knowledge and experiences. All local services have been deployed at our portal http://bioservices.sci.psu.ac.th.

  20. A practical workflow for making anatomical atlases for biological research.

    Science.gov (United States)

    Wan, Yong; Lewis, A Kelsey; Colasanto, Mary; van Langeveld, Mark; Kardon, Gabrielle; Hansen, Charles

    2012-01-01

    The anatomical atlas has been at the intersection of science and art for centuries. These atlases are essential to biological research, but high-quality atlases are often scarce. Recent advances in imaging technology have made high-quality 3D atlases possible. However, until now there has been a lack of practical workflows using standard tools to generate atlases from images of biological samples. With certain adaptations, CG artists' workflow and tools, traditionally used in the film industry, are practical for building high-quality biological atlases. Researchers have developed a workflow for generating a 3D anatomical atlas using accessible artists' tools. They used this workflow to build a mouse limb atlas for studying the musculoskeletal system's development. This research aims to raise the awareness of using artists' tools in scientific research and promote interdisciplinary collaborations between artists and scientists. This video (http://youtu.be/g61C-nia9ms) demonstrates a workflow for creating an anatomical atlas.

  1. Analysing scientific workflows: Why workflows not only connect web services

    NARCIS (Netherlands)

    Wassink, I.; van der Vet, P.E.; Wolstencroft, K.; Neerincx, P.B.T.; Roos, M.; Rauwerda, H.; Breit, T.M.; Zhang, L.J.

    2009-01-01

    Life science workflow systems are developed to help life scientists to conveniently connect various programs and web services. In practice however, much time is spent on data conversion, because web services provided by different organisations use different data formats. We have analysed all the

  2. Analysing scientific workflows: why workflows not only connect web services

    NARCIS (Netherlands)

    Wassink, I.; van der Vet, P.E.; Wolstencroft, K.; Neerincx, P.B.T.; Roos, M.; Rauwerda, H.; Breit, T.M.; Zhang, LJ.

    2009-01-01

    Life science workflow systems are developed to help life scientists to conveniently connect various programs and web services. In practice however, much time is spent on data conversion, because web services provided by different organisations use different data formats. We have analysed all the

  3. Facilitating Scientific Research through Workflows and Provenance on the DataONE Cyberinfrastructure (Invited)

    Science.gov (United States)

    Ludaescher, B.; Cuevas-Vicenttín, V.; Missier, P.; Dey, S.; Kianmajd, P.; Wei, Y.; Koop, D.; Chirigati, F.; Altintas, I.; Belhajjame, K.; Bowers, S.

    2013-12-01

    ONE CI to be 'provenance-aware'. DataONE Member Nodes manage packages containing the data along with their science/system metadata, indexing them to facilitate search by the tools of the Information Toolkit (ITK) such as ONEMercury. Our extension is based on PBase, which can manage D-PROV provenance data efficiently. D-PROV extends the W3C PROV standard by enabling the explicit representation of workflows along with traces in a standardized manner. PBase consists of components for storage, indexing, and querying. First, it has a provenance data repository capable of dealing with the scalability and availability requirements expected from large datasets. A graph database such as Neo4j could fulfill this role. Second, PBase has a specialized index enabling efficient data access, built based on but also extending the indexing capabilities furnished by the underlying database. Finally, PBase contains a querying engine supported by the provenance index. This engine will enable easy and efficient access to the provenance data, primarily through a declarative language.

  4. A framework for integration of scientific applications into the OpenTopography workflow

    Science.gov (United States)

    Nandigam, V.; Crosby, C.; Baru, C.

    2012-12-01

    The NSF-funded OpenTopography facility provides online access to Earth science-oriented high-resolution LIDAR topography data, online processing tools, and derivative products. The underlying cyberinfrastructure employs a multi-tier service oriented architecture that is comprised of an infrastructure tier, a processing services tier, and an application tier. The infrastructure tier consists of storage, compute resources as well as supporting databases. The services tier consists of the set of processing routines each deployed as a Web service. The applications tier provides client interfaces to the system. (e.g. Portal). We propose a "pluggable" infrastructure design that will allow new scientific algorithms and processing routines developed and maintained by the community to be integrated into the OpenTopography system so that the wider earth science community can benefit from its availability. All core components in OpenTopography are available as Web services using a customized open-source Opal toolkit. The Opal toolkit provides mechanisms to manage and track job submissions, with the help of a back-end database. It allows monitoring of job and system status by providing charting tools. All core components in OpenTopography have been developed, maintained and wrapped as Web services using Opal by OpenTopography developers. However, as the scientific community develops new processing and analysis approaches this integration approach is not scalable efficiently. Most of the new scientific applications will have their own active development teams performing regular updates, maintenance and other improvements. It would be optimal to have the application co-located where its developers can continue to actively work on it while still making it accessible within the OpenTopography workflow for processing capabilities. We will utilize a software framework for remote integration of these scientific applications into the OpenTopography system. This will be accomplished by

  5. Distributed execution of aggregated multi domain workflows using an agent framework

    NARCIS (Netherlands)

    Zhao, Z.; Belloum, A.; de Laat, C.; Adriaans, P.; Hertzberger, B.; Zhang, L.J.; Watson, T.J.; Yang, J.; Hung, P.C.K.

    2007-01-01

    In e-Science, meaningful experiment processes and workflow engines emerge as important scientific resources. A complex experiment often involves services and processes developed in different scientific domains. Aggregating different workflows into one meta workflow avoids unnecessary rewriting of

  6. Federated Database Services for Wind Tunnel Experiment Workflows

    Directory of Open Access Journals (Sweden)

    A. Paventhan

    2006-01-01

    Full Text Available Enabling the full life cycle of scientific and engineering workflows requires robust middleware and services that support effective data management, near-realtime data movement and custom data processing. Many existing solutions exploit the database as a passive metadata catalog. In this paper, we present an approach that makes use of federation of databases to host data-centric wind tunnel application workflows. The user is able to compose customized application workflows based on database services. We provide a reference implementation that leverages typical business tools and technologies: Microsoft SQL Server for database services and Windows Workflow Foundation for workflow services. The application data and user's code are both hosted in federated databases. With the growing interest in XML Web Services in scientific Grids, and with databases beginning to support native XML types and XML Web services, we can expect the role of databases in scientific computation to grow in importance.

  7. It's All About the Data: Workflow Systems and Weather

    Science.gov (United States)

    Plale, B.

    2009-05-01

    under high-volume conditions, or the searchability and manageability of the resulting data products is disappointingly low. The provenance of a data product is a record of its lineage, or trace of the execution history that resulted in the product. The provenance of a forecast model result, e.g., captures information about the executable version of the model, configuration parameters, input data products, execution environment, and owner. Provenance enables data to be properly attributed and captures critical parameters about the model run so the quality of the result can be ascertained. Proper provenance is essential to providing reproducible scientific computing results. Workflow languages used in science discovery are complete programming languages, and in theory can support any logic expressible by a programming language. The execution environments supporting the workflow engines, on the other hand, are subject to constraints on physical resources, and hence in practice the workflow task graphs used in science utilize relatively few of the cataloged workflow patterns. It is important to note that these workflows are executed on demand, and are executed once. Into this context is introduced the need for science discovery that is responsive to real time information. If we can use simple programming models and abstractions to make scientific discovery involving real-time data accessible to specialists who share and utilize data across scientific domains, we bring science one step closer to solving the largest of human problems.

  8. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

    Science.gov (United States)

    Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

    Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  9. Deploying and sharing U-Compare workflows as web services.

    Science.gov (United States)

    Kontonatsios, Georgios; Korkontzelos, Ioannis; Kolluru, Balakrishna; Thompson, Paul; Ananiadou, Sophia

    2013-02-18

    U-Compare is a text mining platform that allows the construction, evaluation and comparison of text mining workflows. U-Compare contains a large library of components that are tuned to the biomedical domain. Users can rapidly develop biomedical text mining workflows by mixing and matching U-Compare's components. Workflows developed using U-Compare can be exported and sent to other users who, in turn, can import and re-use them. However, the resulting workflows are standalone applications, i.e., software tools that run and are accessible only via a local machine, and that can only be run with the U-Compare platform. We address the above issues by extending U-Compare to convert standalone workflows into web services automatically, via a two-click process. The resulting web services can be registered on a central server and made publicly available. Alternatively, users can make web services available on their own servers, after installing the web application framework, which is part of the extension to U-Compare. We have performed a user-oriented evaluation of the proposed extension, by asking users who have tested the enhanced functionality of U-Compare to complete questionnaires that assess its functionality, reliability, usability, efficiency and maintainability. The results obtained reveal that the new functionality is well received by users. The web services produced by U-Compare are built on top of open standards, i.e., REST and SOAP protocols, and therefore, they are decoupled from the underlying platform. Exported workflows can be integrated with any application that supports these open standards. We demonstrate how the newly extended U-Compare enhances the cross-platform interoperability of workflows, by seamlessly importing a number of text mining workflow web services exported from U-Compare into Taverna, i.e., a generic scientific workflow construction platform.

  10. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC environments

    Directory of Open Access Journals (Sweden)

    Athanassios M. Kintsakis

    2017-01-01

    Full Text Available Hermes introduces a new “describe once, run anywhere” paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  11. Multi-level meta-workflows: new concept for regularly occurring tasks in quantum chemistry.

    Science.gov (United States)

    Arshad, Junaid; Hoffmann, Alexander; Gesing, Sandra; Grunzke, Richard; Krüger, Jens; Kiss, Tamas; Herres-Pawlis, Sonja; Terstyanszky, Gabor

    2016-01-01

    In Quantum Chemistry, many tasks are reoccurring frequently, e.g. geometry optimizations, benchmarking series etc. Here, workflows can help to reduce the time of manual job definition and output extraction. These workflows are executed on computing infrastructures and may require large computing and data resources. Scientific workflows hide these infrastructures and the resources needed to run them. It requires significant efforts and specific expertise to design, implement and test these workflows. Many of these workflows are complex and monolithic entities that can be used for particular scientific experiments. Hence, their modification is not straightforward and it makes almost impossible to share them. To address these issues we propose developing atomic workflows and embedding them in meta-workflows. Atomic workflows deliver a well-defined research domain specific function. Publishing workflows in repositories enables workflow sharing inside and/or among scientific communities. We formally specify atomic and meta-workflows in order to define data structures to be used in repositories for uploading and sharing them. Additionally, we present a formal description focused at orchestration of atomic workflows into meta-workflows. We investigated the operations that represent basic functionalities in Quantum Chemistry, developed the relevant atomic workflows and combined them into meta-workflows. Having these workflows we defined the structure of the Quantum Chemistry workflow library and uploaded these workflows in the SHIWA Workflow Repository.Graphical AbstractMeta-workflows and embedded workflows in the template representation.

  12. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  13. A framework for streamlining research workflow in neuroscience and psychology

    Directory of Open Access Journals (Sweden)

    Jonas eKubilius

    2014-01-01

    Full Text Available Successful accumulation of knowledge is critically dependent on the ability to verify and replicate every part of scientific conduct. However, such principles are difficult to enact when researchers continue to resort on ad hoc workflows and with poorly maintained code base. In this paper I examine the needs of neuroscience and psychology community, and introduce psychopy_ext, a unifying framework that seamlessly integrates popular experiment building, analysis and manuscript preparation tools by choosing reasonable defaults and implementing relatively rigid patterns of workflow. This structure allows for automation of multiple tasks, such as generated user interfaces, unit testing, control analyses of stimuli, single-command access to descriptive statistics, and publication quality plotting. Taken together, psychopy_ext opens an exciting possibility for faster, more robust code development and collaboration for researchers.

  14. WS-VLAM: A GT4 based workflow management system

    NARCIS (Netherlands)

    Wibisono, A.; Vasyunin, D.; Korkhov, V.; Zhao, Z.; Belloum, A.; de Laat, C.; Adriaans, P.; Hertzberger, B.

    2007-01-01

    Generic Grid middleware, e.g., Globus Toolkit 4 (GT4), provides basic services for scientific workflow management systems to discover, store and integrate workflow components. Using the state of the art Grid services can advance the functionality of workflow engine in orchestrating distributed Grid

  15. The Frictionless Data Package: Data Containerization for Automated Scientific Workflows

    Science.gov (United States)

    Shepherd, A.; Fils, D.; Kinkade, D.; Saito, M. A.

    2017-12-01

    As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical questions facing data repositories is how data and supporting materials should be packaged for consumption. Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation. In attempts to shorten the time to science and access data resources across may disciplines, expectations for machines to mediate the process of discovery and access is challenging data repository infrastructure. This challenge is to find ways to deliver data and information in ways that enable machines to make better decisions by enabling them to understand the data and metadata of many data types. Additionally, once machines have recommended a data resource as relevant to an investigator's needs, the data resource should be easy to integrate into that investigator's toolkits for analysis and visualization. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) supports NSF-funded OCE and PLR investigators with their project's data management needs. These needs involve a number of varying data types some of which require multiple files with differing formats. Presently, BCO-DMO has described these data types and the important relationships between the type's data files through human-readable documentation on web pages. For machines directly accessing data files from BCO-DMO, this documentation could be overlooked and lead to misinterpreting the data. Instead, BCO-DMO is exploring the idea of data containerization, or packaging data and related information for easier transport, interpretation, and use. In researching the landscape of data containerization, the Frictionlessdata Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar

  16. Scientific workflow and support for high resolution global climate modeling at the Oak Ridge Leadership Computing Facility

    Science.gov (United States)

    Anantharaj, V.; Mayer, B.; Wang, F.; Hack, J.; McKenna, D.; Hartman-Baker, R.

    2012-04-01

    The Oak Ridge Leadership Computing Facility (OLCF) facilitates the execution of computational experiments that require tens of millions of CPU hours (typically using thousands of processors simultaneously) while generating hundreds of terabytes of data. A set of ultra high resolution climate experiments in progress, using the Community Earth System Model (CESM), will produce over 35,000 files, ranging in sizes from 21 MB to 110 GB each. The execution of the experiments will require nearly 70 Million CPU hours on the Jaguar and Titan supercomputers at OLCF. The total volume of the output from these climate modeling experiments will be in excess of 300 TB. This model output must then be archived, analyzed, distributed to the project partners in a timely manner, and also made available more broadly. Meeting this challenge would require efficient movement of the data, staging the simulation output to a large and fast file system that provides high volume access to other computational systems used to analyze the data and synthesize results. This file system also needs to be accessible via high speed networks to an archival system that can provide long term reliable storage. Ideally this archival system is itself directly available to other systems that can be used to host services making the data and analysis available to the participants in the distributed research project and to the broader climate community. The various resources available at the OLCF now support this workflow. The available systems include the new Jaguar Cray XK6 2.63 petaflops (estimated) supercomputer, the 10 PB Spider center-wide parallel file system, the Lens/EVEREST analysis and visualization system, the HPSS archival storage system, the Earth System Grid (ESG), and the ORNL Climate Data Server (CDS). The ESG features federated services, search & discovery, extensive data handling capabilities, deep storage access, and Live Access Server (LAS) integration. The scientific workflow enabled on

  17. Constructing Workflows from Script Applications

    Directory of Open Access Journals (Sweden)

    Mikołaj Baranowski

    2012-01-01

    Full Text Available For programming and executing complex applications on grid infrastructures, scientific workflows have been proposed as convenient high-level alternative to solutions based on general-purpose programming languages, APIs and scripts. GridSpace is a collaborative programming and execution environment, which is based on a scripting approach and it extends Ruby language with a high-level API for invoking operations on remote resources. In this paper we describe a tool which enables to convert the GridSpace application source code into a workflow representation which, in turn, may be used for scheduling, provenance, or visualization. We describe how we addressed the issues of analyzing Ruby source code, resolving variable and method dependencies, as well as building workflow representation. The solutions to these problems have been developed and they were evaluated by testing them on complex grid application workflows such as CyberShake, Epigenomics and Montage. Evaluation is enriched by representing typical workflow control flow patterns.

  18. Open access: changing global science publishing.

    Science.gov (United States)

    Gasparyan, Armen Yuri; Ayvazyan, Lilit; Kitas, George D

    2013-08-01

    The article reflects on open access as a strategy of changing the quality of science communication globally. Successful examples of open-access journals are presented to highlight implications of archiving in open digital repositories for the quality and citability of research output. Advantages and downsides of gold, green, and hybrid models of open access operating in diverse scientific environments are described. It is assumed that open access is a global trend which influences the workflow in scholarly journals, changing their quality, credibility, and indexability.

  19. Workflows for Full Waveform Inversions

    Science.gov (United States)

    Boehm, Christian; Krischer, Lion; Afanasiev, Michael; van Driel, Martin; May, Dave A.; Rietmann, Max; Fichtner, Andreas

    2017-04-01

    Despite many theoretical advances and the increasing availability of high-performance computing clusters, full seismic waveform inversions still face considerable challenges regarding data and workflow management. While the community has access to solvers which can harness modern heterogeneous computing architectures, the computational bottleneck has fallen to these often manpower-bounded issues that need to be overcome to facilitate further progress. Modern inversions involve huge amounts of data and require a tight integration between numerical PDE solvers, data acquisition and processing systems, nonlinear optimization libraries, and job orchestration frameworks. To this end we created a set of libraries and applications revolving around Salvus (http://salvus.io), a novel software package designed to solve large-scale full waveform inverse problems. This presentation focuses on solving passive source seismic full waveform inversions from local to global scales with Salvus. We discuss (i) design choices for the aforementioned components required for full waveform modeling and inversion, (ii) their implementation in the Salvus framework, and (iii) how it is all tied together by a usable workflow system. We combine state-of-the-art algorithms ranging from high-order finite-element solutions of the wave equation to quasi-Newton optimization algorithms using trust-region methods that can handle inexact derivatives. All is steered by an automated interactive graph-based workflow framework capable of orchestrating all necessary pieces. This naturally facilitates the creation of new Earth models and hopefully sparks new scientific insights. Additionally, and even more importantly, it enhances reproducibility and reliability of the final results.

  20. Funding scientific open access

    International Nuclear Information System (INIS)

    Canessa, E.; Fonda, C.; Zennaro, M.

    2006-11-01

    In order to reduce the knowledge divide, more Open Access Journals (OAJ) are needed in all languages and scholarly subject areas that exercise peer-review or editorial quality control. To finance needed costs, it is discussed why and how to sell target specific advertisement by associating ads to given scientific keywords. (author)

  1. ERINDA Scientific Results: Transnational Access Activities and Scientific Visits

    CERN Document Server

    Hambsch, Franz-Josef

    2014-01-01

    This paper gives an overview of the Transnational Access Activities and Scientific visits within the FP7 project ERINDA (European Research Infrastructures for Nuclear Data). It highlights the fact that nearly 3200 data - taking hours for external users were made available in the partner installations and 104 man weeks for scientific visits to par tner institutes. This is much more than the 2500 beam hours and 80 weeks promised in the Description of Work of the project.

  2. A data model for analyzing user collaborations in workflow-driven e-Science

    NARCIS (Netherlands)

    Altintas, I.; Anand, M.K.; Vuong, T.N.; Bowers, S.; Ludäscher, B.; Sloot, P.M.A.

    2011-01-01

    Scientific discoveries are often the result of methodical execution of many interrelated scientific workflows, where workflows and datasets published by one set of users can be used by other users to perform subsequent analyses, leading to implicit or explicit collaboration. In this paper, we

  3. Toward Exascale Seismic Imaging: Taming Workflow and I/O Issues

    Science.gov (United States)

    Lefebvre, M. P.; Bozdag, E.; Lei, W.; Rusmanugroho, H.; Smith, J. A.; Tromp, J.; Yuan, Y.

    2013-12-01

    Providing a better understanding of the physics and chemistry of Earth's interior through numerical simulations has always required tremendous computational resources. Post-petascale supercomputers are now available to solve complex scientific problems that were thought unreachable a few decades ago. They also bring a cohort of concerns on how to obtain optimum performance. Several issues are currently being investigated by the HPC community. To name a few, we can list energy consumption, fault resilience, scalability of the current parallel paradigms, large workflow management, I/O performance and feature extraction with large datasets. For this presentation, we focus on the last three issues. In the context of seismic imaging, in particular for simulations based on adjoint methods, workflows are well defined. They consist of a few collective steps (e.g., mesh generation or model updates) and of a large number of independent steps (e.g., forward and adjoint simulations of each seismic event, pre- and postprocessing of seismic traces). The greater goal is to reduce the time to solution, that is, obtaining a more precise representation of the subsurface as fast as possible. This brings us to consider both the workflow in its entirety and the parts composing it. The usual approach is to speedup the purely computational parts by code tuning in order to reach higher FLOPS and better memory usage. This still remains an important concern, but larger scale experiments show that the imaging workflow suffers from a severe I/O bottleneck. This limitation occurs both for purely computational data and seismic time series. The latter are dealt with by the introduction of a new Adaptable Seismic Data Format (ASDF). In both cases, a parallel I/O library, ORNL's ADIOS, is used to drastically lessen the weight of disk access. Moreover, parallel visualization tools, such as VisIt, are able to take advantage of the metadata included in our ADIOS outputs to extract features and

  4. Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics

    Science.gov (United States)

    Orcutt, J. A.; Rajasekar, A.; Moore, R. W.; Vernon, F.

    2015-12-01

    Sensor streams comprise an increasingly large part of Earth Science data. Analytics based on sensor data require an easy way to perform operations such as acquisition, conversion to physical units, metadata linking, sensor fusion, analysis and visualization on distributed sensor streams. Furthermore, embedding real-time sensor data into scientific workflows is of growing interest. We have implemented a scalable networked architecture that can be used to dynamically access packets of data in a stream from multiple sensors, and perform synthesis and analysis across a distributed network. Our system is based on the integrated Rule Oriented Data System (irods.org), which accesses sensor data from the Antelope Real Time Data System (brtt.com), and provides virtualized access to collections of data streams. We integrate real-time data streaming from different sources, collected for different purposes, on different time and spatial scales, and sensed by different methods. iRODS, noted for its policy-oriented data management, brings to sensor processing features and facilities such as single sign-on, third party access control lists ( ACLs), location transparency, logical resource naming, and server-side modeling capabilities while reducing the burden on sensor network operators. Rich integrated metadata support also makes it straightforward to discover data streams of interest and maintain data provenance. The workflow support in iRODS readily integrates sensor processing into any analytical pipeline. The system is developed as part of the NSF-funded Datanet Federation Consortium (datafed.org). APIs for selecting, opening, reaping and closing sensor streams are provided, along with other helper functions to associate metadata and convert sensor packets into NetCDF and JSON formats. Near real-time sensor data including seismic sensors, environmental sensors, LIDAR and video streams are available through this interface. A system for archiving sensor data and metadata in Net

  5. Job life cycle management libraries for CMS workflow management projects

    International Nuclear Information System (INIS)

    Lingen, Frank van; Wilkinson, Rick; Evans, Dave; Foulkes, Stephen; Afaq, Anzar; Vaandering, Eric; Ryu, Seangchan

    2010-01-01

    Scientific analysis and simulation requires the processing and generation of millions of data samples. These tasks are often comprised of multiple smaller tasks divided over multiple (computing) sites. This paper discusses the Compact Muon Solenoid (CMS) workflow infrastructure, and specifically the Python based workflow library which is used for so called task lifecycle management. The CMS workflow infrastructure consists of three layers: high level specification of the various tasks based on input/output data sets, life cycle management of task instances derived from the high level specification and execution management. The workflow library is the result of a convergence of three CMS sub projects that respectively deal with scientific analysis, simulation and real time data aggregation from the experiment. This will reduce duplication and hence development and maintenance costs.

  6. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

    Directory of Open Access Journals (Sweden)

    Raquel L. Costa

    2017-07-01

    Full Text Available There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were

  7. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive, ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download. Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  8. Scientific Data Management (SDM) Center for Enabling Technologies. 2007-2012

    Energy Technology Data Exchange (ETDEWEB)

    Ludascher, Bertram [Univ. of California, Davis, CA (United States); Altintas, Ilkay [Univ. of California, San Diego, CA (United States)

    2013-09-06

    Over the past five years, our activities have both established Kepler as a viable scientific workflow environment and demonstrated its value across multiple science applications. We have published numerous peer-reviewed papers on the technologies highlighted in this short paper and have given Kepler tutorials at SC06,SC07,SC08,and SciDAC 2007. Our outreach activities have allowed scientists to learn best practices and better utilize Kepler to address their individual workflow problems. Our contributions to advancing the state-of-the-art in scientific workflows have focused on the following areas. Progress in each of these areas is described in subsequent sections. Workflow development. The development of a deeper understanding of scientific workflows "in the wild" and of the requirements for support tools that allow easy construction of complex scientific workflows; Generic workflow components and templates. The development of generic actors (i.e.workflow components and processes) which can be broadly applied to scientific problems; Provenance collection and analysis. The design of a flexible provenance collection and analysis infrastructure within the workflow environment; and, Workflow reliability and fault tolerance. The improvement of the reliability and fault-tolerance of workflow environments.

  9. Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources

    Science.gov (United States)

    Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

    2014-12-01

    The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent Cyber

  10. CMS Distributed Computing Workflow Experience

    CERN Document Server

    Haas, Jeffrey David

    2010-01-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simul...

  11. Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, Wucherl [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Koo, Michelle [Univ. of California, Berkeley, CA (United States); Cao, Yu [California Inst. of Technology (CalTech), Pasadena, CA (United States); Sim, Alex [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Nugent, Peter [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Wu, Kesheng [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2016-09-17

    Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze the performance involving terabytes or petabytes of workflow data or measurement data of the executions, from complex workflows over a large number of nodes and multiple parallel task executions. To help identify performance bottlenecks or debug the performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework, using state-ofthe- art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply the most sophisticated statistical tools and data mining methods on the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze a large amount of different types of logs and measurements. To illustrate the functionality of the big data analysis framework, we conduct case studies on the workflows from an astronomy project known as the Palomar Transient Factory (PTF) and the job logs from the genome analysis scientific cluster. Our study processed many terabytes of system logs and application performance measurements collected on the HPC systems at NERSC. The implementation of our tool is generic enough to be used for analyzing the performance of other HPC systems and Big Data workows.

  12. Open access: the changing face of scientific publishing.

    Science.gov (United States)

    Chatterjee, Pranab; Biswas, Tamoghna; Mishra, Vishala

    2013-04-01

    The debate on open access to scientific literature that has been raging in scholarly circles for quite some time now has been fueled further by the recent developments in the realm of the open access movement. This article is a short commentary on the current scenario, challenges, and the future of the open access movement.

  13. NeuroManager: A workflow analysis based simulation management engine for computational neuroscience

    Directory of Open Access Journals (Sweden)

    David Bruce Stockton

    2015-10-01

    Full Text Available We developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach 1 provides flexibility to adapt to a variety of neuroscience simulators, 2 simplifies the use of heterogeneous computational resources, from desktops to super computer clusters, and 3 improves tracking of simulator/simulation evolution. We implemented NeuroManager in Matlab, a widely used engineering and scientific language, for its signal and image processing tools, prevalence in electrophysiology analysis, and increasing use in college Biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This resulted in twenty-two stages of simulation submission workflow. The software incorporates progress notification, automatic organization, labeling, and time-stamping of data and results, and integrated access to Matlab's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project.

  14. ERROR HANDLING IN INTEGRATION WORKFLOWS

    Directory of Open Access Journals (Sweden)

    Alexey M. Nazarenko

    2017-01-01

    Full Text Available Simulation experiments performed while solving multidisciplinary engineering and scientific problems require joint usage of multiple software tools. Further, when following a preset plan of experiment or searching for optimum solu- tions, the same sequence of calculations is run multiple times with various simulation parameters, input data, or conditions while overall workflow does not change. Automation of simulations like these requires implementing of a workflow where tool execution and data exchange is usually controlled by a special type of software, an integration environment or plat- form. The result is an integration workflow (a platform-dependent implementation of some computing workflow which, in the context of automation, is a composition of weakly coupled (in terms of communication intensity typical subtasks. These compositions can then be decomposed back into a few workflow patterns (types of subtasks interaction. The pat- terns, in their turn, can be interpreted as higher level subtasks.This paper considers execution control and data exchange rules that should be imposed by the integration envi- ronment in the case of an error encountered by some integrated software tool. An error is defined as any abnormal behavior of a tool that invalidates its result data thus disrupting the data flow within the integration workflow. The main requirementto the error handling mechanism implemented by the integration environment is to prevent abnormal termination of theentire workflow in case of missing intermediate results data. Error handling rules are formulated on the basic pattern level and on the level of a composite task that can combine several basic patterns as next level subtasks. The cases where workflow behavior may be different, depending on user's purposes, when an error takes place, and possible error handling op- tions that can be specified by the user are also noted in the work.

  15. Enabling Structured Exploration of Workflow Performance Variability in Extreme-Scale Environments

    Energy Technology Data Exchange (ETDEWEB)

    Kleese van Dam, Kerstin; Stephan, Eric G.; Raju, Bibi; Altintas, Ilkay; Elsethagen, Todd O.; Krishnamoorthy, Sriram

    2015-11-15

    Workflows are taking an Workflows are taking an increasingly important role in orchestrating complex scientific processes in extreme scale and highly heterogeneous environments. However, to date we cannot reliably predict, understand, and optimize workflow performance. Sources of performance variability and in particular the interdependencies of workflow design, execution environment and system architecture are not well understood. While there is a rich portfolio of tools for performance analysis, modeling and prediction for single applications in homogenous computing environments, these are not applicable to workflows, due to the number and heterogeneity of the involved workflow and system components and their strong interdependencies. In this paper, we investigate workflow performance goals and identify factors that could have a relevant impact. Based on our analysis, we propose a new workflow performance provenance ontology, the Open Provenance Model-based WorkFlow Performance Provenance, or OPM-WFPP, that will enable the empirical study of workflow performance characteristics and variability including complex source attribution.

  16. An open source approach to enable the reproducibility of scientific workflows in the ocean sciences

    Science.gov (United States)

    Di Stefano, M.; Fox, P. A.; West, P.; Hare, J. A.; Maffei, A. R.

    2013-12-01

    Every scientist should be able to rerun data analyses conducted by his or her team and regenerate the figures in a paper. However, all too often the correct version of a script goes missing, or the original raw data is filtered by hand and the filtering process is undocumented, or there is lack of collaboration and communication among scientists working in a team. Here we present 3 different use cases in ocean sciences in which end-to-end workflows are tracked. The main tool that is deployed to address these use cases is based on a web application (IPython Notebook) that provides the ability to work on very diverse and heterogeneous data and information sources, providing an effective way to share the and track changes to source code used to generate data products and associated metadata, as well as to track the overall workflow provenance to allow versioned reproducibility of a data product. Use cases selected for this work are: 1) A partial reproduction of the Ecosystem Status Report (ESR) for the Northeast U.S. Continental Shelf Large Marine Ecosystem. Our goal with this use case is to enable not just the traceability but also the reproducibility of this biannual report, keeping track of all the processes behind the generation and validation of time-series and spatial data and information products. An end-to-end workflow with code versioning is developed so that indicators in the report may be traced back to the source datasets. 2) Realtime generation of web pages to be able to visualize one of the environmental indicators from the Ecosystem Advisory for the Northeast Shelf Large Marine Ecosystem web site. 3) Data and visualization integration for ocean climate forecasting. In this use case, we focus on a workflow to describe how to provide access to online data sources in the NetCDF format and other model data, and make use of multicore processing to generate video animation from time series of gridded data. For each use case we show how complete workflows

  17. Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds

    NARCIS (Netherlands)

    Abrishami, S.; Naghibzadeh, M.; Epema, D.H.J.

    2013-01-01

    The advent of Cloud computing as a new model of service provisioning in distributed systems encourages researchers to investigate its benefits and drawbacks on executing scientific applications such as workflows. One of the most challenging problems in Clouds is workflow scheduling, i.e., the

  18. BioInfra.Prot: A comprehensive proteomics workflow including data standardization, protein inference, expression analysis and data publication.

    Science.gov (United States)

    Turewicz, Michael; Kohl, Michael; Ahrens, Maike; Mayer, Gerhard; Uszkoreit, Julian; Naboulsi, Wael; Bracht, Thilo; Megger, Dominik A; Sitek, Barbara; Marcus, Katrin; Eisenacher, Martin

    2017-11-10

    The analysis of high-throughput mass spectrometry-based proteomics data must address the specific challenges of this technology. To this end, the comprehensive proteomics workflow offered by the de.NBI service center BioInfra.Prot provides indispensable components for the computational and statistical analysis of this kind of data. These components include tools and methods for spectrum identification and protein inference, protein quantification, expression analysis as well as data standardization and data publication. All particular methods of the workflow which address these tasks are state-of-the-art or cutting edge. As has been shown in previous publications, each of these methods is adequate to solve its specific task and gives competitive results. However, the methods included in the workflow are continuously reviewed, updated and improved to adapt to new scientific developments. All of these particular components and methods are available as stand-alone BioInfra.Prot services or as a complete workflow. Since BioInfra.Prot provides manifold fast communication channels to get access to all components of the workflow (e.g., via the BioInfra.Prot ticket system: bioinfraprot@rub.de) users can easily benefit from this service and get support by experts. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Workflows for microarray data processing in the Kepler environment

    Science.gov (United States)

    2012-01-01

    Background Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. Results We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or

  20. Workflows for microarray data processing in the Kepler environment

    Directory of Open Access Journals (Sweden)

    Stropp Thomas

    2012-05-01

    Full Text Available Abstract Background Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. Results We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data and therefore are close to

  1. Workflows for microarray data processing in the Kepler environment.

    Science.gov (United States)

    Stropp, Thomas; McPhillips, Timothy; Ludäscher, Bertram; Bieda, Mark

    2012-05-17

    Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R

  2. Open access to scientific publishing

    Directory of Open Access Journals (Sweden)

    Janne Beate Reitan

    2016-12-01

    Full Text Available Interest in open access (OA to scientific publications is steadily increasing, both in Norway and internationally. From the outset, FORMakademisk has been published as a digital journal, and it was one of the first to offer OA in Norway. We have since the beginning used Open Journal Systems (OJS as publishing software. OJS is part of the Public Knowledge Project (PKP, which was created by Canadian John Willinsky and colleagues at the Faculty of Education at the University of British Columbia in 1998. The first version of OJS came as an open source software in 2001. The programme is free for everyone to use and is part of a larger collective movement wherein knowledge is shared. When FORMakademisk started in 2008, we received much help from the journal Acta Didactic (n.d. at the University of Oslo, which had started the year before us. They had also translated the programme to Norwegian. From the start, we were able to publish in both Norwegian and English. Other journals have used FORMakademisk as a model and source of inspiration when starting or when converting from subscription-based print journals to electronic OA, including the Journal of Norwegian Media Researchers [Norsk medietidsskrift]. It is in this way that the movement around PKP works and continues to grow to provide free access to research. As the articles are OA, they are also easily accessible to non-scientists. We also emphasise that the language should be readily available, although it should maintain a high scientific quality. Often there may be two sides of the same coin. We on the editorial team are now looking forward to adopting the newly developed OJS 3 this spring, with many new features and an improved design for users, including authors, peer reviewers, editors and readers.

  3. From shared data to sharing workflow: Merging PACS and teleradiology

    International Nuclear Information System (INIS)

    Benjamin, Menashe; Aradi, Yinon; Shreiber, Reuven

    2010-01-01

    Due to a host of technological, interface, operational and workflow limitations, teleradiology and PACS/RIS were historically developed as separate systems serving different purposes. PACS/RIS handled local radiology storage and workflow management while teleradiology addressed remote access to images. Today advanced PACS/RIS support complete site radiology workflow for attending physicians, whether on-site or remote. In parallel, teleradiology has emerged into a service of providing remote, off-hours, coverage for emergency radiology and to a lesser extent subspecialty reading to subscribing sites and radiology groups. When attending radiologists use teleradiology for remote access to a site, they may share all relevant patient data and participate in the site's workflow like their on-site peers. The operation gets cumbersome and time consuming when these radiologists serve multi-sites, each requiring a different remote access, or when the sites do not employ the same PACS/RIS/Reporting Systems and do not share the same ownership. The least efficient operation is of teleradiology companies engaged in reading for multiple facilities. As these services typically employ non-local radiologists, they are allowed to share some of the available patient data necessary to provide an emergency report but, by enlarge, they do not share the workflow of the sites they serve. Radiology stakeholders usually prefer to have their own radiologists perform all radiology tasks including interpretation of off-hour examinations. It is possible with current technology to create a system that combines the benefits of local radiology services to multiple sites with the advantages offered by adding subspecialty and off-hours emergency services through teleradiology. Such a system increases efficiency for the radiology groups by enabling all users, regardless of location, to work 'local' and fully participate in the workflow of every site. We refer to such a system as SuperPACS.

  4. Text mining meets workflow: linking U-Compare with Taverna

    Science.gov (United States)

    Kano, Yoshinobu; Dobson, Paul; Nakanishi, Mio; Tsujii, Jun'ichi; Ananiadou, Sophia

    2010-01-01

    Summary: Text mining from the biomedical literature is of increasing importance, yet it is not easy for the bioinformatics community to create and run text mining workflows due to the lack of accessibility and interoperability of the text mining resources. The U-Compare system provides a wide range of bio text mining resources in a highly interoperable workflow environment where workflows can very easily be created, executed, evaluated and visualized without coding. We have linked U-Compare to Taverna, a generic workflow system, to expose text mining functionality to the bioinformatics community. Availability: http://u-compare.org/taverna.html, http://u-compare.org Contact: kano@is.s.u-tokyo.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20709690

  5. Automatic support for product based workflow design : generation of process models from a product data model

    NARCIS (Netherlands)

    Vanderfeesten, I.T.P.; Reijers, H.A.; Aalst, van der W.M.P.; Vogelaar, J.J.C.L.; Meersman, R.; Dillon, T.; Herrero, P.

    2010-01-01

    Product Based Workflow Design (PBWD) is one of the few scientific methodologies for the (re)design of workflow processes. It is based on an analysis of the product that is produced in the workflow process and derives a process model from the product structure. Until now this derivation has been a

  6. AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery

    International Nuclear Information System (INIS)

    Tsai, Yingssu; McPhillips, Scott E.; González, Ana; McPhillips, Timothy M.; Zinn, Daniel; Cohen, Aina E.; Feese, Michael D.; Bushnell, David; Tiefenbrunn, Theresa; Stout, C. David; Ludaescher, Bertram; Hedman, Britt; Hodgson, Keith O.; Soltis, S. Michael

    2013-01-01

    New software has been developed for automating the experimental and data-processing stages of fragment-based drug discovery at a macromolecular crystallography beamline. A new workflow-automation framework orchestrates beamline-control and data-analysis software while organizing results from multiple samples. AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data, performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully

  7. VLAM-G: Interactive Data Driven Workflow Engine for Grid-Enabled Resources

    Directory of Open Access Journals (Sweden)

    Vladimir Korkhov

    2007-01-01

    Full Text Available Grid brings the power of many computers to scientists. However, the development of Grid-enabled applications requires knowledge about Grid infrastructure and low-level API to Grid services. In turn, workflow management systems provide a high-level environment for rapid prototyping of experimental computing systems. Coupling Grid and workflow paradigms is important for the scientific community: it makes the power of the Grid easily available to the end user. The paradigm of data driven workflow execution is one of the ways to enable distributed workflow on the Grid. The work presented in this paper is carried out in the context of the Virtual Laboratory for e-Science project. We present the VLAM-G workflow management system and its core component: the Run-Time System (RTS. The RTS is a dataflow driven workflow engine which utilizes Grid resources, hiding the complexity of the Grid from a scientist. Special attention is paid to the concept of dataflow and direct data streaming between distributed workflow components. We present the architecture and components of the RTS, describe the features of VLAM-G workflow execution, and evaluate the system by performance measurements and a real life use case.

  8. Low Latency Workflow Scheduling and an Application of Hyperspectral Brightness Temperatures

    Science.gov (United States)

    Nguyen, P. T.; Chapman, D. R.; Halem, M.

    2012-12-01

    New system analytics for Big Data computing holds the promise of major scientific breakthroughs and discoveries from the exploration and mining of the massive data sets becoming available to the science community. However, such data intensive scientific applications face severe challenges in accessing, managing and analyzing petabytes of data. While the Hadoop MapReduce environment has been successfully applied to data intensive problems arising in business, there are still many scientific problem domains where limitations in the functionality of MapReduce systems prevent its wide adoption by those communities. This is mainly because MapReduce does not readily support the unique science discipline needs such as special science data formats, graphic and computational data analysis tools, maintaining high degrees of computational accuracies, and interfacing with application's existing components across heterogeneous computing processors. We address some of these limitations by exploiting the MapReduce programming model for satellite data intensive scientific problems and address scalability, reliability, scheduling, and data management issues when dealing with climate data records and their complex observational challenges. In addition, we will present techniques to support the unique Earth science discipline needs such as dealing with special science data formats (HDF and NetCDF). We have developed a Hadoop task scheduling algorithm that improves latency by 2x for a scientific workflow including the gridding of the EOS AIRS hyperspectral Brightness Temperatures (BT). This workflow processing algorithm has been tested at the Multicore Computing Center private Hadoop based Intel Nehalem cluster, as well as in a virtual mode under the Open Source Eucalyptus cloud. The 55TB AIRS hyperspectral L1b Brightness Temperature record has been gridded at the resolution of 0.5x1.0 degrees, and we have computed a 0.9 annual anti-correlation to the El Nino Southern oscillation in

  9. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  10. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    Full Text Available Abstract Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS, can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical

  11. Scientific Production on Open Access: A Worldwide Bibliometric Analysis in the Academic and Scientific Context

    Directory of Open Access Journals (Sweden)

    Sandra Miguel

    2016-01-01

    Full Text Available This research aims to diachronically analyze the worldwide scientific production on open access, in the academic and scientific context, in order to contribute to knowledge and visualization of its main actors. As a method, bibliographical, descriptive and analytical research was used, with the contribution of bibliometric studies, especially the production indicators, scientific collaboration and indicators of thematic co-occurrence. The Scopus database was used as a source to retrieve the articles on the subject, with a resulting corpus of 1179 articles. Using Bibexcel software, frequency tables were constructed for the variables, and Pajek software was used to visualize the collaboration network and VoSViewer for the construction of the keywords’ network. As for the results, the most productive researchers come from countries such as the United States, Canada, France and Spain. Journals with higher impact in the academic community have disseminated the new constructed knowledge. A collaborative network with a few subnets where co-authors are from different countries has been observed. As conclusions, this study allows identifying the themes of debates that mark the development of open access at the international level, and it is possible to state that open access is one of the new emerging and frontier fields of library and information science.

  12. USING THE INTERNATIONAL SCIENTOMETRIC DATABASES OF OPEN ACCESS IN SCIENTIFIC RESEARCH

    Directory of Open Access Journals (Sweden)

    O. Galchevska

    2015-05-01

    Full Text Available In the article the problem of the use of international scientometric databases in research activities as web-oriented resources and services that are the means of publication and dissemination of research results is considered. Selection criteria of scientometric platforms of open access in conducting scientific researches (coverage Ukrainian scientific periodicals and publications, data accuracy, general characteristics of international scientometrics database, technical, functional characteristics and their indexes are emphasized. The review of the most popular scientometric databases of open access Google Scholar, Russian Scientific Citation Index (RSCI, Scholarometer, Index Copernicus (IC, Microsoft Academic Search is made. Advantages of usage of International Scientometrics database Google Scholar in conducting scientific researches and prospects of research that are in the separation of cloud information and analytical services of the system are determined.

  13. Canadian National Consultation on Access to Scientific Research Data

    Directory of Open Access Journals (Sweden)

    Michel Sabourin

    2007-06-01

    Full Text Available In June 2004, an expert Task Force, appointed by the National Research Council Canada and chaired by Dr. David Strong, came together in Ottawa to plan a National Forum as the focus of the National Consultation on Access to Scientific Research Data. The Forum, which was held in November 2004, brought together more than seventy Canadian leaders in scientific research, data management, research administration, intellectual property and other pertinent areas. This article presents a comprehensive review of the issues, and the opportunities and the challenges identified during the Forum. Complex and rich arrays of scientific databases are changing how research is conducted, speeding the discovery and creation of new concepts. Increased access will accelerate such changes even more, creating other new opportunities. With the combination of databases within and among disciplines and countries, fundamental leaps in knowledge will occur that will transform our understanding of life, the world and the universe. The Canadian research community is concerned by the need to take swift action to adapt to the substantial changes required by the scientific enterprise. Because no national data preservation organization exists, may experts believe that a national strategy on data access or policies needs to be developed, and that a "Data Task Force" be created to prepare a full national implementation strategy. Once such a national strategy is broadly supported, it is proposed that a dedicated national infrastructure, tentatively called "Data Canada", be established, to assume overall leadership in the development and execution of a strategic plan.

  14. A workflow to preserve genome-quality tissue samples from plants in botanical gardens and arboreta1

    Science.gov (United States)

    Gostel, Morgan R.; Kelloff, Carol; Wallick, Kyle; Funk, Vicki A.

    2016-01-01

    Premise of the study: Internationally, gardens hold diverse living collections that can be preserved for genomic research. Workflows have been developed for genomic tissue sampling in other taxa (e.g., vertebrates), but are inadequate for plants. We outline a workflow for tissue sampling intended for two audiences: botanists interested in genomics research and garden staff who plan to voucher living collections. Methods and Results: Standard herbarium methods are used to collect vouchers, label information and images are entered into a publicly accessible database, and leaf tissue is preserved in silica and liquid nitrogen. A five-step approach for genomic tissue sampling is presented for sampling from living collections according to current best practices. Conclusions: Collecting genome-quality samples from gardens is an economical and rapid way to make available for scientific research tissue from the diversity of plants on Earth. The Global Genome Initiative will facilitate and lead this endeavor through international partnerships. PMID:27672517

  15. Tavaxy: integrating Taverna and Galaxy workflows with cloud computing support.

    Science.gov (United States)

    Abouelhoda, Mohamed; Issa, Shadi Alaa; Ghanem, Moustafa

    2012-05-04

    Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts. In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure. Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis.The system can be accessed either through a

  16. Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support

    Directory of Open Access Journals (Sweden)

    Abouelhoda Mohamed

    2012-05-01

    Full Text Available Abstract Background Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts. Results In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure. Conclusions Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub- workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and

  17. Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support

    Science.gov (United States)

    2012-01-01

    Background Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts. Results In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure. Conclusions Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system

  18. Automated data reduction workflows for astronomy. The ESO Reflex environment

    Science.gov (United States)

    Freudling, W.; Romaniello, M.; Bramich, D. M.; Ballester, P.; Forchi, V.; García-Dabló, C. E.; Moehler, S.; Neeser, M. J.

    2013-11-01

    Context. Data from complex modern astronomical instruments often consist of a large number of different science and calibration files, and their reduction requires a variety of software tools. The execution chain of the tools represents a complex workflow that needs to be tuned and supervised, often by individual researchers that are not necessarily experts for any specific instrument. Aims: The efficiency of data reduction can be improved by using automatic workflows to organise data and execute a sequence of data reduction steps. To realize such efficiency gains, we designed a system that allows intuitive representation, execution and modification of the data reduction workflow, and has facilities for inspection and interaction with the data. Methods: The European Southern Observatory (ESO) has developed Reflex, an environment to automate data reduction workflows. Reflex is implemented as a package of customized components for the Kepler workflow engine. Kepler provides the graphical user interface to create an executable flowchart-like representation of the data reduction process. Key features of Reflex are a rule-based data organiser, infrastructure to re-use results, thorough book-keeping, data progeny tracking, interactive user interfaces, and a novel concept to exploit information created during data organisation for the workflow execution. Results: Automated workflows can greatly increase the efficiency of astronomical data reduction. In Reflex, workflows can be run non-interactively as a first step. Subsequent optimization can then be carried out while transparently re-using all unchanged intermediate products. We found that such workflows enable the reduction of complex data by non-expert users and minimizes mistakes due to book-keeping errors. Conclusions: Reflex includes novel concepts to increase the efficiency of astronomical data processing. While Reflex is a specific implementation of astronomical scientific workflows within the Kepler workflow

  19. Data at Risk Initiative: Examining and Facilitating the Scientific Process in Relation to Endangered Data

    Directory of Open Access Journals (Sweden)

    Angela P Murillo

    2014-01-01

    Full Text Available Examining the scientific process in relation to endangered data, data reuse, and sharing is crucial in facilitating scientific workflow. Deterioration, format obsolescence, and insufficient metadata for discovery are significant problems leading to loss of scientific data. The research presented in this paper considers these potentially lost data. Four one-hour focus groups and a demographic survey were conducted with 14 scientists to learn about their attitudes toward endangered data, data sharing, data reuse, and their opinions of the DARI inventory. The results indicate that unavailability, lack of context, accessibility issues, and potential endangerment are key concerns to scientists.

  20. A Kepler Workflow Tool for Reproducible AMBER GPU Molecular Dynamics.

    Science.gov (United States)

    Purawat, Shweta; Ieong, Pek U; Malmstrom, Robert D; Chan, Garrett J; Yeung, Alan K; Walker, Ross C; Altintas, Ilkay; Amaro, Rommie E

    2017-06-20

    With the drive toward high throughput molecular dynamics (MD) simulations involving ever-greater numbers of simulation replicates run for longer, biologically relevant timescales (microseconds), the need for improved computational methods that facilitate fully automated MD workflows gains more importance. Here we report the development of an automated workflow tool to perform AMBER GPU MD simulations. Our workflow tool capitalizes on the capabilities of the Kepler platform to deliver a flexible, intuitive, and user-friendly environment and the AMBER GPU code for a robust and high-performance simulation engine. Additionally, the workflow tool reduces user input time by automating repetitive processes and facilitates access to GPU clusters, whose high-performance processing power makes simulations of large numerical scale possible. The presented workflow tool facilitates the management and deployment of large sets of MD simulations on heterogeneous computing resources. The workflow tool also performs systematic analysis on the simulation outputs and enhances simulation reproducibility, execution scalability, and MD method development including benchmarking and validation. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  1. CMS distributed computing workflow experience

    Science.gov (United States)

    Adelman-McCarthy, Jennifer; Gutsche, Oliver; Haas, Jeffrey D.; Prosper, Harrison B.; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao, Junhui; Pin, Arnaud; Schul, Nicolas; De Lentdecker, Gilles; McCartin, Joseph; Vanelderen, Lukas; Janssen, Xavier; Tsyganov, Andrey; Barge, Derek; Lahiff, Andrew

    2011-12-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources and we the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

  2. CMS distributed computing workflow experience

    International Nuclear Information System (INIS)

    Adelman-McCarthy, Jennifer; Gutsche, Oliver; Haas, Jeffrey D; Prosper, Harrison B; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao Junhui; Pin, Arnaud; Schul, Nicolas; Lentdecker, Gilles De; McCartin, Joseph; Vanelderen, Lukas; Janssen, Xavier; Tsyganov, Andrey

    2011-01-01

    The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources and we the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

  3. Nationwide Buildings Energy Research enabled through an integrated Data Intensive Scientific Workflow and Advanced Analysis Environment

    Energy Technology Data Exchange (ETDEWEB)

    Kleese van Dam, Kerstin [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Lansing, Carina S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Elsethagen, Todd O. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Hathaway, John E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Guillen, Zoe C. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Dirks, James A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Skorski, Daniel C. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Stephan, Eric G. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Gorrissen, Willy J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Gorton, Ian [Carnegie Mellon Univ., Pittsburgh, PA (United States); Liu, Yan [Concordia Univ., Montreal, QC (Canada)

    2014-01-28

    Modern workflow systems enable scientists to run ensemble simulations at unprecedented scales and levels of complexity, allowing them to study system sizes previously impossible to achieve, due to the inherent resource requirements needed for the modeling work. However as a result of these new capabilities the science teams suddenly also face unprecedented data volumes that they are unable to analyze with their existing tools and methodologies in a timely fashion. In this paper we will describe the ongoing development work to create an integrated data intensive scientific workflow and analysis environment that offers researchers the ability to easily create and execute complex simulation studies and provides them with different scalable methods to analyze the resulting data volumes. The integration of simulation and analysis environments is hereby not only a question of ease of use, but supports fundamental functions in the correlated analysis of simulation input, execution details and derived results for multi-variant, complex studies. To this end the team extended and integrated the existing capabilities of the Velo data management and analysis infrastructure, the MeDICi data intensive workflow system and RHIPE the R for Hadoop version of the well-known statistics package, as well as developing a new visual analytics interface for the result exploitation by multi-domain users. The capabilities of the new environment are demonstrated on a use case that focusses on the Pacific Northwest National Laboratory (PNNL) building energy team, showing how they were able to take their previously local scale simulations to a nationwide level by utilizing data intensive computing techniques not only for their modeling work, but also for the subsequent analysis of their modeling results. As part of the PNNL research initiative PRIMA (Platform for Regional Integrated Modeling and Analysis) the team performed an initial 3 year study of building energy demands for the US Eastern

  4. Scientific Data Management (SDM) Center for Enabling Technologies. Final Report, 2007-2012

    Energy Technology Data Exchange (ETDEWEB)

    Ludascher, Bertram [Univ. of California, Davis, CA (United States); Altintas, Ilkay [Univ. of California, San Diego, CA (United States)

    2013-09-06

    Our contributions to advancing the State of the Art in scientific workflows have focused on the following areas: Workflow development; Generic workflow components and templates; Provenance collection and analysis; and, Workflow reliability and fault tolerance.

  5. DIaaS: Data-Intensive workflows as a service - Enabling easy composition and deployment of data-intensive workflows on Virtual Research Environments

    Science.gov (United States)

    Filgueira, R.; Ferreira da Silva, R.; Deelman, E.; Atkinson, M.

    2016-12-01

    We present the Data-Intensive workflows as a Service (DIaaS) model for enabling easy data-intensive workflow composition and deployment on clouds using containers. DIaaS model backbone is Asterism, an integrated solution for running data-intensive stream-based applications on heterogeneous systems, which combines the benefits of dispel4py with Pegasus workflow systems. The stream-based executions of an Asterism workflow are managed by dispel4py, while the data movement between different e-Infrastructures, and the coordination of the application execution are automatically managed by Pegasus. DIaaS combines Asterism framework with Docker containers to provide an integrated, complete, easy-to-use, portable approach to run data-intensive workflows on distributed platforms. Three containers integrate the DIaaS model: a Pegasus node, and an MPI and an Apache Storm clusters. Container images are described as Dockerfiles (available online at http://github.com/dispel4py/pegasus_dispel4py), linked to Docker Hub for providing continuous integration (automated image builds), and image storing and sharing. In this model, all required software (workflow systems and execution engines) for running scientific applications are packed into the containers, which significantly reduces the effort (and possible human errors) required by scientists or VRE administrators to build such systems. The most common use of DIaaS will be to act as a backend of VREs or Scientific Gateways to run data-intensive applications, deploying cloud resources upon request. We have demonstrated the feasibility of DIaaS using the data-intensive seismic ambient noise cross-correlation application (Figure 1). The application preprocesses (Phase1) and cross-correlates (Phase2) traces from several seismic stations. The application is submitted via Pegasus (Container1), and Phase1 and Phase2 are executed in the MPI (Container2) and Storm (Container3) clusters respectively. Although both phases could be executed

  6. Climate Data Analytics Workflow Management

    Science.gov (United States)

    Zhang, J.; Lee, S.; Pan, L.; Mattmann, C. A.; Lee, T. J.

    2016-12-01

    In this project we aim to pave a novel path to create a sustainable building block toward Earth science big data analytics and knowledge sharing. Closely studying how Earth scientists conduct data analytics research in their daily work, we have developed a provenance model to record their activities, and to develop a technology to automatically generate workflows for scientists from the provenance. On top of it, we have built the prototype of a data-centric provenance repository, and establish a PDSW (People, Data, Service, Workflow) knowledge network to support workflow recommendation. To ensure the scalability and performance of the expected recommendation system, we have leveraged the Apache OODT system technology. The community-approved, metrics-based performance evaluation web-service will allow a user to select a metric from the list of several community-approved metrics and to evaluate model performance using the metric as well as the reference dataset. This service will facilitate the use of reference datasets that are generated in support of the model-data intercomparison projects such as Obs4MIPs and Ana4MIPs. The data-centric repository infrastructure will allow us to catch richer provenance to further facilitate knowledge sharing and scientific collaboration in the Earth science community. This project is part of Apache incubator CMDA project.

  7. Workflow Lexicons in Healthcare: Validation of the SWIM Lexicon.

    Science.gov (United States)

    Meenan, Chris; Erickson, Bradley; Knight, Nancy; Fossett, Jewel; Olsen, Elizabeth; Mohod, Prerna; Chen, Joseph; Langer, Steve G

    2017-06-01

    For clinical departments seeking to successfully navigate the challenges of modern health reform, obtaining access to operational and clinical data to establish and sustain goals for improving quality is essential. More broadly, health delivery organizations are also seeking to understand performance across multiple facilities and often across multiple electronic medical record (EMR) systems. Interpreting operational data across multiple vendor systems can be challenging, as various manufacturers may describe different departmental workflow steps in different ways and sometimes even within a single vendor's installed customer base. In 2012, The Society for Imaging Informatics in Medicine (SIIM) recognized the need for better quality and performance data standards and formed SIIM's Workflow Initiative for Medicine (SWIM), an initiative designed to consistently describe workflow steps in radiology departments as well as defining operational quality metrics. The SWIM lexicon was published as a working model to describe operational workflow steps and quality measures. We measured the prevalence of the SWIM lexicon workflow steps in both academic and community radiology environments using real-world patient observations and correlated that information with automatically captured workflow steps from our clinical information systems. Our goal was to measure frequency of occurrence of workflow steps identified by the SWIM lexicon in a real-world clinical setting, as well as to correlate how accurately departmental information systems captured patient flow through our health facility.

  8. RMCgui: a new interface for the workflow associated with running Reverse Monte Carlo simulations

    International Nuclear Information System (INIS)

    Dove, Martin T; Rigg, Gary

    2013-01-01

    The Reverse Monte Carlo method enables construction and refinement of large atomic models of materials that are tuned to give best agreement with experimental data such as neutron and x-ray total scattering data, capturing both the average structure and fluctuations. The practical drawback with the current implementations of this approach is the relatively complex workflow required, from setting up the configuration and simulation details through to checking the final outputs and analysing the resultant configurations. In order to make this workflow more accessible to users, we have developed an end-to-end workflow wrapped within a graphical user interface—RMCgui—designed to make the Reverse Monte Carlo more widely accessible. (paper)

  9. Addressing informatics challenges in Translational Research with workflow technology.

    Science.gov (United States)

    Beaulah, Simon A; Correll, Mick A; Munro, Robin E J; Sheldon, Jonathan G

    2008-09-01

    Interest in Translational Research has been growing rapidly in recent years. In this collision of different data, technologies and cultures lie tremendous opportunities for the advancement of science and business for organisations that are able to integrate, analyse and deliver this information effectively to users. Workflow-based integration and analysis systems are becoming recognised as a fast and flexible way to build applications that are tailored to scientific areas, yet are built on a common platform. Workflow systems are allowing organisations to meet the key informatics challenges in Translational Research and improve disease understanding and patient care.

  10. Application of Workflow Technology for Big Data Analysis Service

    Directory of Open Access Journals (Sweden)

    Bin Zhang

    2018-04-01

    Full Text Available This study presents a lightweight representational state transfer-based cloud workflow system to construct a big data intelligent software-as-a-service (SaaS platform. The system supports the dynamic construction and operation of an intelligent data analysis application, and realizes rapid development and flexible deployment of the business analysis process that can improve the interaction and response time of the process. The proposed system integrates offline-batch and online-streaming analysis models that allow users to conduct batch and streaming computing simultaneously. Users can rend cloud capabilities and customize a set of big data analysis applications in the form of workflow processes. This study elucidates the architecture and application modeling, customization, dynamic construction, and scheduling of a cloud workflow system. A chain workflow foundation mechanism is proposed to combine several analysis components into a chain component that can promote efficiency. Four practical application cases are provided to verify the analysis capability of the system. Experimental results show that the proposed system can support multiple users in accessing the system concurrently and effectively uses data analysis algorithms. The proposed SaaS workflow system has been used in network operators and has achieved good results.

  11. Implementation of Cyberinfrastructure and Data Management Workflow for a Large-Scale Sensor Network

    Science.gov (United States)

    Jones, A. S.; Horsburgh, J. S.

    2014-12-01

    Monitoring with in situ environmental sensors and other forms of field-based observation presents many challenges for data management, particularly for large-scale networks consisting of multiple sites, sensors, and personnel. The availability and utility of these data in addressing scientific questions relies on effective cyberinfrastructure that facilitates transformation of raw sensor data into functional data products. It also depends on the ability of researchers to share and access the data in useable formats. In addition to addressing the challenges presented by the quantity of data, monitoring networks need practices to ensure high data quality, including procedures and tools for post processing. Data quality is further enhanced if practitioners are able to track equipment, deployments, calibrations, and other events related to site maintenance and associate these details with observational data. In this presentation we will describe the overall workflow that we have developed for research groups and sites conducting long term monitoring using in situ sensors. Features of the workflow include: software tools to automate the transfer of data from field sites to databases, a Python-based program for data quality control post-processing, a web-based application for online discovery and visualization of data, and a data model and web interface for managing physical infrastructure. By automating the data management workflow, the time from collection to analysis is reduced and sharing and publication is facilitated. The incorporation of metadata standards and descriptions and the use of open-source tools enhances the sustainability and reusability of the data. We will describe the workflow and tools that we have developed in the context of the iUTAH (innovative Urban Transitions and Aridregion Hydrosustainability) monitoring network. The iUTAH network consists of aquatic and climate sensors deployed in three watersheds to monitor Gradients Along Mountain to Urban

  12. Open Access to Scientific Data: Promoting Science and Innovation

    Directory of Open Access Journals (Sweden)

    Guan-Hua Xu

    2007-06-01

    Full Text Available As an important part of the science and technology infrastructure platform of China, the Ministry of Science and Technology launched the Scientific Data Sharing Program in 2002. Twenty-four government agencies now participate in the Program. After five years of hard work, great progress has been achieved in the policy and legal framework, data standards, pilot projects, and international cooperation. By the end of 2005, one-third of the existing public-interest and basic scientific databases in China had been integrated and upgraded. By 2020, China is expected to build a more user-friendly scientific data management and sharing system, with 80 percent of scientific data available to the general public. In order to realize this objective, the emphases of the project are to perfect the policy and legislation system, improve the quality of data resources, expand and establish national scientific data centers, and strengthen international cooperation. It is believed that with the opening up of access to scientific data in China, the Program will play a bigger role in promoting science and national innovation.

  13. News from the Library: Scientific American and Nature increasingly accessible online!

    CERN Multimedia

    CERN Library

    2011-01-01

    Since a few weeks, the CERN Library has been offering online access to "Scientific American" and "Nature" within a longer timespan. This is part of a long-term plan to extend our e-collections in order to include prestigious scientific journals from the beginning of publication.   CERN users can browse and read the complete archives of "Scientific American" since 1845. Among the many interesting articles now readable online, you can find Einstein's account of research on a generalized theory of gravitation. A small, though important addition to the Library's online collections: "Nature" online is now reaching back to 1987. You can now read online the "Nature" news column reporting about the first anti-atom discovered at CERN. We plan to further expand online access to "Nature", but in the meantime you can rely on the Library's paper collection...

  14. News from the Library: Scientific American and Nature increasingly accessible online!

    CERN Multimedia

    CERN Library

    2012-01-01

    Since a few weeks, the CERN Library has been offering online access to "Scientific American" and "Nature" within a longer timespan. This is part of a long-term plan to extend our e-collections in order to include prestigious scientific journals from the beginning of publication.   CERN users can browse and read the complete archives of "Scientific American" since 1845. Among the many interesting articles now readable online, you can find Einstein's account of research on a generalized theory of gravitation. A small, though important addition to the Library's online collections: "Nature" online is now reaching back to 1987. You can now read online the "Nature" news column reporting about the first anti-atom discovered at CERN. We plan to further expand online access to "Nature", but in the meantime you can rely on the Library's paper collection...

  15. It’s the workflows, stupid! What is required to make ‘offsetting’ work for the open access transition

    Directory of Open Access Journals (Sweden)

    Kai Geschuhn

    2017-11-01

    Full Text Available This paper makes the case for stronger engagement of libraries and consortia when it comes to negotiating and drafting offsetting agreements. Two workshops organized by the Efficiencies and Standards for Article Charges (ESAC initiative in 2016 and 2017 have shown a clear need for an improvement of the current workflows and processes between academic institutions (and libraries and the publishers they use in terms of author identification, metadata exchange and invoicing. Publishers need to invest in their editorial systems, while institutions need to get a clearer understanding of the strategic goal of offsetting. To this purpose, strategic and practical elements, which should be included in the agreements, will be introduced. Firstly, the 'Joint Understanding of Offsetting', launched in 2016, will be discussed. This introduces the ‘pay-as-you-publish’ model as a transitional pathway for the agreements. Secondly, this paper proposes a set of recommendations for article workflows and services between institutions and publishers, based on a draft document which was produced as part of the 2nd ESAC Offsetting Workshop in March 2017. These recommendations should be seen as a minimum set of practical and formal requirements for offsetting agreements and are necessary to make any publication-based open access business model work.

  16. Support for Taverna workflows in the VPH-Share cloud platform.

    Science.gov (United States)

    Kasztelnik, Marek; Coto, Ernesto; Bubak, Marian; Malawski, Maciej; Nowakowski, Piotr; Arenas, Juan; Saglimbeni, Alfredo; Testi, Debora; Frangi, Alejandro F

    2017-07-01

    To address the increasing need for collaborative endeavours within the Virtual Physiological Human (VPH) community, the VPH-Share collaborative cloud platform allows researchers to expose and share sequences of complex biomedical processing tasks in the form of computational workflows. The Taverna Workflow System is a very popular tool for orchestrating complex biomedical & bioinformatics processing tasks in the VPH community. This paper describes the VPH-Share components that support the building and execution of Taverna workflows, and explains how they interact with other VPH-Share components to improve the capabilities of the VPH-Share platform. Taverna workflow support is delivered by the Atmosphere cloud management platform and the VPH-Share Taverna plugin. These components are explained in detail, along with the two main procedures that were developed to enable this seamless integration: workflow composition and execution. 1) Seamless integration of VPH-Share with other components and systems. 2) Extended range of different tools for workflows. 3) Successful integration of scientific workflows from other VPH projects. 4) Execution speed improvement for medical applications. The presented workflow integration provides VPH-Share users with a wide range of different possibilities to compose and execute workflows, such as desktop or online composition, online batch execution, multithreading, remote execution, etc. The specific advantages of each supported tool are presented, as are the roles of Atmosphere and the VPH-Share plugin within the VPH-Share project. The combination of the VPH-Share plugin and Atmosphere engenders the VPH-Share infrastructure with far more flexible, powerful and usable capabilities for the VPH-Share community. As both components can continue to evolve and improve independently, we acknowledge that further improvements are still to be developed and will be described. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. LQCD workflow execution framework: Models, provenance and fault-tolerance

    International Nuclear Information System (INIS)

    Piccoli, Luciano; Simone, James N; Kowalkowlski, James B; Dubey, Abhishek

    2010-01-01

    Large computing clusters used for scientific processing suffer from systemic failures when operated over long continuous periods for executing workflows. Diagnosing job problems and faults leading to eventual failures in this complex environment is difficult, specifically when the success of an entire workflow might be affected by a single job failure. In this paper, we introduce a model-based, hierarchical, reliable execution framework that encompass workflow specification, data provenance, execution tracking and online monitoring of each workflow task, also referred to as participants. The sequence of participants is described in an abstract parameterized view, which is translated into a concrete data dependency based sequence of participants with defined arguments. As participants belonging to a workflow are mapped onto machines and executed, periodic and on-demand monitoring of vital health parameters on allocated nodes is enabled according to pre-specified rules. These rules specify conditions that must be true pre-execution, during execution and post-execution. Monitoring information for each participant is propagated upwards through the reflex and healing architecture, which consists of a hierarchical network of decentralized fault management entities, called reflex engines. They are instantiated as state machines or timed automatons that change state and initiate reflexive mitigation action(s) upon occurrence of certain faults. We describe how this cluster reliability framework is combined with the workflow execution framework using formal rules and actions specified within a structure of first order predicate logic that enables a dynamic management design that reduces manual administrative workload, and increases cluster-productivity.

  18. Presentation of the paper “Open access repositories as channel of publication scientific grey literature”

    OpenAIRE

    Ferreras Fernández, Tránsito; García-Peñalvo, Francisco José; Merlo Vega, José Antonio

    2015-01-01

    This is the presentation of the paper entitled “Open access repositories as channel of publication scientific grey literature” in the TEEM 2015 International Conference held in Porto (Portugal) in October 7-9, 2015. In this paper we describe how the open access repositories are valid channels for the publication of scientific grey literature. Technological development facilitates the communication of scientific knowledge, allowing expand distribution channels and significantly reducing tra...

  19. [The Open Access Initiative (OAI) in the scientific literature].

    Science.gov (United States)

    Sánchez-Martín, Francisco M; Millán Rodríguez, Félix; Villavicencio Mavrich, Humberto

    2009-01-01

    According to the declaration of the Budapest Open Access Initiative (OAI) is defined as a editorial model in which access to scientific journal literature and his use are free. Free flow of information allowed by Internet has been the basis of this initiative. The Bethesda and the Berlin declarations, supported by some international agencies, proposes to require researchers to deposit copies of all articles published in a self-archive or an Open Access repository, and encourage researchers to publish their research papers in journals Open Access. This paper reviews the keys of the OAI, with their strengths and controversial aspects; and it discusses the position of databases, search engines and repositories of biomedical information, as well as the attitude of the scientists, publishers and journals. So far the journal Actas Urológicas Españolas (Act Urol Esp) offer their contents on Open Access as On Line in Spanish and English.

  20. Enabling scientific workflows in virtual reality

    Science.gov (United States)

    Kreylos, O.; Bawden, G.; Bernardin, T.; Billen, M.I.; Cowgill, E.S.; Gold, R.D.; Hamann, B.; Jadamec, M.; Kellogg, L.H.; Staadt, O.G.; Sumner, D.Y.

    2006-01-01

    To advance research and improve the scientific return on data collection and interpretation efforts in the geosciences, we have developed methods of interactive visualization, with a special focus on immersive virtual reality (VR) environments. Earth sciences employ a strongly visual approach to the measurement and analysis of geologic data due to the spatial and temporal scales over which such data ranges, As observations and simulations increase in size and complexity, the Earth sciences are challenged to manage and interpret increasing amounts of data. Reaping the full intellectual benefits of immersive VR requires us to tailor exploratory approaches to scientific problems. These applications build on the visualization method's strengths, using both 3D perception and interaction with data and models, to take advantage of the skills and training of the geological scientists exploring their data in the VR environment. This interactive approach has enabled us to develop a suite of tools that are adaptable to a range of problems in the geosciences and beyond. Copyright ?? 2008 by the Association for Computing Machinery, Inc.

  1. Access to scientific publications: the scientist's perspective.

    Directory of Open Access Journals (Sweden)

    Yegor Voronin

    Full Text Available BACKGROUND: Scientific publishing is undergoing significant changes due to the growth of online publications, increases in the number of open access journals, and policies of funders and universities requiring authors to ensure that their publications become publicly accessible. Most studies of the impact of these changes have focused on the growth of articles available through open access or the number of open-access journals. Here, we investigated access to publications at a number of institutes and universities around the world, focusing on publications in HIV vaccine research--an area of biomedical research with special importance to the developing world. METHODS AND FINDINGS: We selected research papers in HIV vaccine research field, creating: 1 a first set of 50 most recently published papers with keywords "HIV vaccine" and 2 a second set of 200 articles randomly selected from those cited in the first set. Access to the majority (80% of the recently published articles required subscription, while cited literature was much more accessible (67% freely available online. Subscriptions at a number of institutions around the world were assessed for providing access to subscription-only articles from the two sets. The access levels varied widely, ranging among institutions from 20% to 90%. Through the WHO-supported HINARI program, institutes in low-income countries had access comparable to that of institutes in the North. Finally, we examined the response rates for reprint requests sent to corresponding authors, a method commonly used before internet access became widespread. Contacting corresponding authors with requests for electronic copies of articles by email resulted in a 55-60% success rate, although in some cases it took up to 1.5 months to get a response. CONCLUSIONS: While research articles are increasingly available on the internet in open access format, institutional subscriptions continue to play an important role. However

  2. [Project evidência [evidence]: research and education about accessing scientific databases in Azores].

    Science.gov (United States)

    Soares, Hélia; Pereira, Sandra M; Neves, Ajuda; Gomes, Amy; Teixeira, Bruno; Oliveira, Carolina; Sousa, Fábio; Tavares, Márcio; Tavares, Patrícia; Dutra, Raquel; Pereira, Hélder Rocha

    2013-04-01

    Project Evidência [Evidence] intends to promote the use of scientific databases among nurses. This study aims to design educational interventions that facilitate nurses' access to these databases, to determine nurses' habits regarding the use of scientific databases, and to determine the impact that educational interventions on scientific databases have on Azorean nurses who volunteered for this project. An intervention project was conducted, and a quantitative descriptive survey was designed to evaluate the impact two and five months after the educational intervention. This impact was investigated considering certain aspects, namely, the nurses' knowledge, habits and reasons for using scientific databases. A total of 192 nurses participated in this study, and the primary results indicate that the educational intervention had a positive impact based not only on the increased frequency of using platforms or databases of scientific information (DSIs) s but also on the competence and self-awareness regarding its use and consideration of the reasons for accessing this information.

  3. Sources of variation in primary care clinical workflow: implications for the design of cognitive support.

    Science.gov (United States)

    Militello, Laura G; Arbuckle, Nicole B; Saleem, Jason J; Patterson, Emily; Flanagan, Mindy; Haggstrom, David; Doebbeling, Bradley N

    2014-03-01

    This article identifies sources of variation in clinical workflow and implications for the design and implementation of electronic clinical decision support. Sources of variation in workflow were identified via rapid ethnographic observation, focus groups, and interviews across a total of eight medical centers in both the Veterans Health Administration and academic medical centers nationally regarded as leaders in developing and using clinical decision support. Data were reviewed for types of variability within the social and technical subsystems and the external environment as described in the sociotechnical systems theory. Two researchers independently identified examples of variation and their sources, and then met with each other to discuss them until consensus was reached. Sources of variation were categorized as environmental (clinic staffing and clinic pace), social (perception of health information technology and real-time use with patients), or technical (computer access and information access). Examples of sources of variation within each of the categories are described and discussed in terms of impact on clinical workflow. As technologies are implemented, barriers to use become visible over time as users struggle to adapt workflow and work practices to accommodate new technologies. Each source of variability identified has implications for the effective design and implementation of useful health information technology. Accommodating moderate variability in workflow is anticipated to avoid brittle and inflexible workflow designs, while also avoiding unnecessary complexity for implementers and users.

  4. Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery

    Directory of Open Access Journals (Sweden)

    Yongyao Jiang

    2016-04-01

    Full Text Available Big geospatial data are archived and made available through online web discovery and access. However, finding the right data for scientific research and application development is still a challenge. This paper aims to improve the data discovery by mining the user knowledge from log files. Specifically, user web session reconstruction is focused upon in this paper as a critical step for extracting usage patterns. However, reconstructing user sessions from raw web logs has always been difficult, as a session identifier tends to be missing in most data portals. To address this problem, we propose two session identification methods, including time-clustering-based and time-referrer-based methods. We also present the workflow of session reconstruction and discuss the approach of selecting appropriate thresholds for relevant steps in the workflow. The proposed session identification methods and workflow are proven to be able to extract data access patterns for further pattern analyses of user behavior and improvement of data discovery for more relevancy data ranking, suggestion, and navigation.

  5. Computer network access to scientific information systems for minority universities

    Science.gov (United States)

    Thomas, Valerie L.; Wakim, Nagi T.

    1993-08-01

    The evolution of computer networking technology has lead to the establishment of a massive networking infrastructure which interconnects various types of computing resources at many government, academic, and corporate institutions. A large segment of this infrastructure has been developed to facilitate information exchange and resource sharing within the scientific community. The National Aeronautics and Space Administration (NASA) supports both the development and the application of computer networks which provide its community with access to many valuable multi-disciplinary scientific information systems and on-line databases. Recognizing the need to extend the benefits of this advanced networking technology to the under-represented community, the National Space Science Data Center (NSSDC) in the Space Data and Computing Division at the Goddard Space Flight Center has developed the Minority University-Space Interdisciplinary Network (MU-SPIN) Program: a major networking and education initiative for Historically Black Colleges and Universities (HBCUs) and Minority Universities (MUs). In this paper, we will briefly explain the various components of the MU-SPIN Program while highlighting how, by providing access to scientific information systems and on-line data, it promotes a higher level of collaboration among faculty and students and NASA scientists.

  6. Multi-Objective Approach for Energy-Aware Workflow Scheduling in Cloud Computing Environments

    Directory of Open Access Journals (Sweden)

    Sonia Yassa

    2013-01-01

    Full Text Available We address the problem of scheduling workflow applications on heterogeneous computing systems like cloud computing infrastructures. In general, the cloud workflow scheduling is a complex optimization problem which requires considering different criteria so as to meet a large number of QoS (Quality of Service requirements. Traditional research in workflow scheduling mainly focuses on the optimization constrained by time or cost without paying attention to energy consumption. The main contribution of this study is to propose a new approach for multi-objective workflow scheduling in clouds, and present the hybrid PSO algorithm to optimize the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS technique to minimize energy consumption. This technique allows processors to operate in different voltage supply levels by sacrificing clock frequencies. This multiple voltage involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach.

  7. Multi-Objective Approach for Energy-Aware Workflow Scheduling in Cloud Computing Environments

    Science.gov (United States)

    Kadima, Hubert; Granado, Bertrand

    2013-01-01

    We address the problem of scheduling workflow applications on heterogeneous computing systems like cloud computing infrastructures. In general, the cloud workflow scheduling is a complex optimization problem which requires considering different criteria so as to meet a large number of QoS (Quality of Service) requirements. Traditional research in workflow scheduling mainly focuses on the optimization constrained by time or cost without paying attention to energy consumption. The main contribution of this study is to propose a new approach for multi-objective workflow scheduling in clouds, and present the hybrid PSO algorithm to optimize the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption. This technique allows processors to operate in different voltage supply levels by sacrificing clock frequencies. This multiple voltage involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach. PMID:24319361

  8. Electronic Health Record-Driven Workflow for Diagnostic Radiologists.

    Science.gov (United States)

    Geeslin, Matthew G; Gaskin, Cree M

    2016-01-01

    In most settings, radiologists maintain a high-throughput practice in which efficiency is crucial. The conversion from film-based to digital study interpretation and data storage launched the era of PACS-driven workflow, leading to significant gains in speed. The advent of electronic health records improved radiologists' access to patient data; however, many still find this aspect of workflow to be relatively cumbersome. Nevertheless, the ability to guide a diagnostic interpretation with clinical information, beyond that provided in the examination indication, can add significantly to the specificity of a radiologist's interpretation. Responsibilities of the radiologist include, but are not limited to, protocoling examinations, interpreting studies, chart review, peer review, writing notes, placing orders, and communicating with referring providers. Most of the aforementioned activities are not PACS-centric and require a login to one or more additional applications. Consolidation of these tasks for completion through a single interface can simplify workflow, save time, and potentially reduce the incidence of errors. Here, the authors describe diagnostic radiology workflow that leverages the electronic health record to significantly add to a radiologist's ability to be part of the health care team, provide relevant interpretations, and improve efficiency and quality. Copyright © 2016 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  9. geoKepler Workflow Module for Computationally Scalable and Reproducible Geoprocessing and Modeling

    Science.gov (United States)

    Cowart, C.; Block, J.; Crawl, D.; Graham, J.; Gupta, A.; Nguyen, M.; de Callafon, R.; Smarr, L.; Altintas, I.

    2015-12-01

    The NSF-funded WIFIRE project has developed an open-source, online geospatial workflow platform for unifying geoprocessing tools and models for for fire and other geospatially dependent modeling applications. It is a product of WIFIRE's objective to build an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. geoKepler includes a set of reusable GIS components, or actors, for the Kepler Scientific Workflow System (https://kepler-project.org). Actors exist for reading and writing GIS data in formats such as Shapefile, GeoJSON, KML, and using OGC web services such as WFS. The actors also allow for calling geoprocessing tools in other packages such as GDAL and GRASS. Kepler integrates functions from multiple platforms and file formats into one framework, thus enabling optimal GIS interoperability, model coupling, and scalability. Products of the GIS actors can be fed directly to models such as FARSITE and WRF. Kepler's ability to schedule and scale processes using Hadoop and Spark also makes geoprocessing ultimately extensible and computationally scalable. The reusable workflows in geoKepler can be made to run automatically when alerted by real-time environmental conditions. Here, we show breakthroughs in the speed of creating complex data for hazard assessments with this platform. We also demonstrate geoKepler workflows that use Data Assimilation to ingest real-time weather data into wildfire simulations, and for data mining techniques to gain insight into environmental conditions affecting fire behavior. Existing machine learning tools and libraries such as R and MLlib are being leveraged for this purpose in Kepler, as well as Kepler's Distributed Data Parallel (DDP) capability to provide a framework for scalable processing. geoKepler workflows can be executed via an iPython notebook as a part of a Jupyter hub at UC San Diego for sharing and reporting of the scientific analysis and results from

  10. Towards an Intelligent Workflow Designer based on the Reuse of Workflow Patterns

    NARCIS (Netherlands)

    Iochpe, Cirano; Chiao, Carolina; Hess, Guillermo; Nascimento, Gleison; Thom, Lucinéia; Reichert, Manfred

    2007-01-01

    In order to perform process-aware information systems we need sophisticated methods and concepts for designing and modeling processes. Recently, research on workflow patterns has emerged in order to increase the reuse of recurring workflow structures. However, current workflow modeling tools do not

  11. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most

  12. High performance workflow implementation for protein surface characterization using grid technology

    Directory of Open Access Journals (Sweden)

    Clematis Andrea

    2005-12-01

    Full Text Available Abstract Background This study concerns the development of a high performance workflow that, using grid technology, correlates different kinds of Bioinformatics data, starting from the base pairs of the nucleotide sequence to the exposed residues of the protein surface. The implementation of this workflow is based on the Italian Grid.it project infrastructure, that is a network of several computational resources and storage facilities distributed at different grid sites. Methods Workflows are very common in Bioinformatics because they allow to process large quantities of data by delegating the management of resources to the information streaming. Grid technology optimizes the computational load during the different workflow steps, dividing the more expensive tasks into a set of small jobs. Results Grid technology allows efficient database management, a crucial problem for obtaining good results in Bioinformatics applications. The proposed workflow is implemented to integrate huge amounts of data and the results themselves must be stored into a relational database, which results as the added value to the global knowledge. Conclusion A web interface has been developed to make this technology accessible to grid users. Once the workflow has started, by means of the simplified interface, it is possible to follow all the different steps throughout the data processing. Eventually, when the workflow has been terminated, the different features of the protein, like the amino acids exposed on the protein surface, can be compared with the data present in the output database.

  13. Data Workflow - A Workflow Model for Continuous Data Processing

    NARCIS (Netherlands)

    Wombacher, Andreas

    2010-01-01

    Online data or streaming data are getting more and more important for enterprise information systems, e.g. by integrating sensor data and workflows. The continuous flow of data provided e.g. by sensors requires new workflow models addressing the data perspective of these applications, since

  14. BioMoby extensions to the Taverna workflow management and enactment software

    Directory of Open Access Journals (Sweden)

    Senger Martin

    2006-11-01

    Full Text Available Abstract Background As biology becomes an increasingly computational science, it is critical that we develop software tools that support not only bioinformaticians, but also bench biologists in their exploration of the vast and complex data-sets that continue to build from international genomic, proteomic, and systems-biology projects. The BioMoby interoperability system was created with the goal of facilitating the movement of data from one Web-based resource to another to fulfill the requirements of non-expert bioinformaticians. In parallel with the development of BioMoby, the European myGrid project was designing Taverna, a bioinformatics workflow design and enactment tool. Here we describe the marriage of these two projects in the form of a Taverna plug-in that provides access to many of BioMoby's features through the Taverna interface. Results The exposed BioMoby functionality aids in the design of "sensible" BioMoby workflows, aids in pipelining BioMoby and non-BioMoby-based resources, and ensures that end-users need only a minimal understanding of both BioMoby, and the Taverna interface itself. Users are guided through the construction of syntactically and semantically correct workflows through plug-in calls to the Moby Central registry. Moby Central provides a menu of only those BioMoby services capable of operating on the data-type(s that exist at any given position in the workflow. Moreover, the plug-in automatically and correctly connects a selected service into the workflow such that users are not required to understand the nature of the inputs or outputs for any service, leaving them to focus on the biological meaning of the workflow they are constructing, rather than the technical details of how the services will interoperate. Conclusion With the availability of the BioMoby plug-in to Taverna, we believe that BioMoby-based Web Services are now significantly more useful and accessible to bench scientists than are more traditional

  15. Developing a workflow to identify inconsistencies in volunteered geographic information: a phenological case study

    Science.gov (United States)

    Mehdipoor, Hamed; Zurita-Milla, Raul; Rosemartin, Alyssa; Gerst, Katharine L.; Weltzin, Jake F.

    2015-01-01

    Recent improvements in online information communication and mobile location-aware technologies have led to the production of large volumes of volunteered geographic information. Widespread, large-scale efforts by volunteers to collect data can inform and drive scientific advances in diverse fields, including ecology and climatology. Traditional workflows to check the quality of such volunteered information can be costly and time consuming as they heavily rely on human interventions. However, identifying factors that can influence data quality, such as inconsistency, is crucial when these data are used in modeling and decision-making frameworks. Recently developed workflows use simple statistical approaches that assume that the majority of the information is consistent. However, this assumption is not generalizable, and ignores underlying geographic and environmental contextual variability that may explain apparent inconsistencies. Here we describe an automated workflow to check inconsistency based on the availability of contextual environmental information for sampling locations. The workflow consists of three steps: (1) dimensionality reduction to facilitate further analysis and interpretation of results, (2) model-based clustering to group observations according to their contextual conditions, and (3) identification of inconsistent observations within each cluster. The workflow was applied to volunteered observations of flowering in common and cloned lilac plants (Syringa vulgaris and Syringa x chinensis) in the United States for the period 1980 to 2013. About 97% of the observations for both common and cloned lilacs were flagged as consistent, indicating that volunteers provided reliable information for this case study. Relative to the original dataset, the exclusion of inconsistent observations changed the apparent rate of change in lilac bloom dates by two days per decade, indicating the importance of inconsistency checking as a key step in data quality

  16. Developing a Workflow to Identify Inconsistencies in Volunteered Geographic Information: A Phenological Case Study.

    Science.gov (United States)

    Mehdipoor, Hamed; Zurita-Milla, Raul; Rosemartin, Alyssa; Gerst, Katharine L; Weltzin, Jake F

    2015-01-01

    Recent improvements in online information communication and mobile location-aware technologies have led to the production of large volumes of volunteered geographic information. Widespread, large-scale efforts by volunteers to collect data can inform and drive scientific advances in diverse fields, including ecology and climatology. Traditional workflows to check the quality of such volunteered information can be costly and time consuming as they heavily rely on human interventions. However, identifying factors that can influence data quality, such as inconsistency, is crucial when these data are used in modeling and decision-making frameworks. Recently developed workflows use simple statistical approaches that assume that the majority of the information is consistent. However, this assumption is not generalizable, and ignores underlying geographic and environmental contextual variability that may explain apparent inconsistencies. Here we describe an automated workflow to check inconsistency based on the availability of contextual environmental information for sampling locations. The workflow consists of three steps: (1) dimensionality reduction to facilitate further analysis and interpretation of results, (2) model-based clustering to group observations according to their contextual conditions, and (3) identification of inconsistent observations within each cluster. The workflow was applied to volunteered observations of flowering in common and cloned lilac plants (Syringa vulgaris and Syringa x chinensis) in the United States for the period 1980 to 2013. About 97% of the observations for both common and cloned lilacs were flagged as consistent, indicating that volunteers provided reliable information for this case study. Relative to the original dataset, the exclusion of inconsistent observations changed the apparent rate of change in lilac bloom dates by two days per decade, indicating the importance of inconsistency checking as a key step in data quality

  17. Workflow in Almaraz NPP

    International Nuclear Information System (INIS)

    Gonzalez Crego, E.; Martin Lopez-Suevos, C.

    2000-01-01

    Almaraz NPP decided to incorporate Workflow into its information system in response to the need to provide exhaustive follow-up and monitoring of each phase of the different procedures it manages. Oracle's Workflow was chosen for this purpose and it was integrated with previously developed applications. The objectives to be met in the incorporation of Workflow were as follows: Strict monitoring of procedures and processes. Detection of bottlenecks in the flow of information. Notification of those affected by pending tasks. Flexible allocation of tasks to user groups. Improved monitoring of management procedures. Improved communication. Similarly, special care was taken to: Integrate workflow processes with existing control panels. Synchronize workflow with installation procedures. Ensure that the system reflects use of paper forms. At present the Corrective Maintenance Request module is being operated using Workflow and the Work Orders and Notice of Order modules are about to follow suit. (Author)

  18. How to plan workflow changes: a practical quality improvement tool used in an outpatient hospital pharmacy.

    Science.gov (United States)

    Aguilar, Christine; Chau, Connie; Giridharan, Neha; Huh, Youchin; Cooley, Janet; Warholak, Terri L

    2013-06-01

    A quality improvement tool is provided to improve pharmacy workflow with the goal of minimizing errors caused by workflow issues. This study involved workflow evaluation and reorganization, and staff opinions of these proposed changes. The study pharmacy was an outpatient pharmacy in the Tucson area. However, the quality improvement tool may be applied in all pharmacy settings, including but not limited to community, hospital, and independent pharmacies. This tool can help the user to identify potential workflow problem spots, such as high-traffic areas through the creation of current and proposed workflow diagrams. Creating a visual representation can help the user to identify problem spots and to propose changes to optimize workflow. It may also be helpful to assess employees' opinions of these changes. The workflow improvement tool can be used to assess where improvements are needed in a pharmacy's floor plan and workflow. Suggestions for improvements in the study pharmacy included increasing the number of verification points and decreasing high traffic areas in the workflow. The employees of the study pharmacy felt that the proposed changes displayed greater continuity, sufficiency, accessibility, and space within the pharmacy.

  19. An Integrated Workflow For Secondary Use of Patient Data for Clinical Research.

    Science.gov (United States)

    Bouzillé, Guillaume; Sylvestre, Emmanuelle; Campillo-Gimenez, Boris; Renault, Eric; Ledieu, Thibault; Delamarre, Denis; Cuggia, Marc

    2015-01-01

    This work proposes an integrated workflow for secondary use of medical data to serve feasibility studies, and the prescreening and monitoring of research studies. All research issues are initially addressed by the Clinical Research Office through a research portal and subsequently redirected to relevant experts in the determined field of concentration. For secondary use of data, the workflow is then based on the clinical data warehouse of the hospital. A datamart with potentially eligible research candidates is constructed. Datamarts can either produce aggregated data, de-identified data, or identified data, according to the kind of study being treated. In conclusion, integrating the secondary use of data process into a general research workflow allows visibility of information technologies and improves the accessability of clinical data.

  20. RABIX: AN OPEN-SOURCE WORKFLOW EXECUTOR SUPPORTING RECOMPUTABILITY AND INTEROPERABILITY OF WORKFLOW DESCRIPTIONS.

    Science.gov (United States)

    Kaushik, Gaurav; Ivkovic, Sinisa; Simonovic, Janko; Tijanic, Nebojsa; Davis-Dusenbery, Brandi; Kural, Deniz

    2017-01-01

    As biomedical data has become increasingly easy to generate in large quantities, the methods used to analyze it have proliferated rapidly. Reproducible and reusable methods are required to learn from large volumes of data reliably. To address this issue, numerous groups have developed workflow specifications or execution engines, which provide a framework with which to perform a sequence of analyses. One such specification is the Common Workflow Language, an emerging standard which provides a robust and flexible framework for describing data analysis tools and workflows. In addition, reproducibility can be furthered by executors or workflow engines which interpret the specification and enable additional features, such as error logging, file organization, optim1izations to computation and job scheduling, and allow for easy computing on large volumes of data. To this end, we have developed the Rabix Executor, an open-source workflow engine for the purposes of improving reproducibility through reusability and interoperability of workflow descriptions.

  1. DataSpaces: An Interaction and Coordination Framework for Coupled Simulation Workflows

    International Nuclear Information System (INIS)

    Docan, Ciprian; Klasky, Scott A.; Parashar, Manish

    2010-01-01

    Emerging high-performance distributed computing environments are enabling new end-to-end formulations in science and engineering that involve multiple interacting processes and data-intensive application workflows. For example, current fusion simulation efforts are exploring coupled models and codes that simultaneously simulate separate application processes, such as the core and the edge turbulence, and run on different high performance computing resources. These components need to interact, at runtime, with each other and with services for data monitoring, data analysis and visualization, and data archiving. As a result, they require efficient support for dynamic and flexible couplings and interactions, which remains a challenge. This paper presents Data-Spaces, a flexible interaction and coordination substrate that addresses this challenge. DataSpaces essentially implements a semantically specialized virtual shared space abstraction that can be associatively accessed by all components and services in the application workflow. It enables live data to be extracted from running simulation components, indexes this data online, and then allows it to be monitored, queried and accessed by other components and services via the space using semantically meaningful operators. The underlying data transport is asynchronous, low-overhead and largely memory-to-memory. The design, implementation, and experimental evaluation of DataSpaces using a coupled fusion simulation workflow is presented.

  2. Eleven quick tips for architecting biomedical informatics workflows with cloud computing

    Science.gov (United States)

    Moore, Jason H.

    2018-01-01

    Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world’s largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction. PMID:29596416

  3. Eleven quick tips for architecting biomedical informatics workflows with cloud computing.

    Science.gov (United States)

    Cole, Brian S; Moore, Jason H

    2018-03-01

    Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world's largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction.

  4. Eleven quick tips for architecting biomedical informatics workflows with cloud computing.

    Directory of Open Access Journals (Sweden)

    Brian S Cole

    2018-03-01

    Full Text Available Cloud computing has revolutionized the development and operations of hardware and software across diverse technological arenas, yet academic biomedical research has lagged behind despite the numerous and weighty advantages that cloud computing offers. Biomedical researchers who embrace cloud computing can reap rewards in cost reduction, decreased development and maintenance workload, increased reproducibility, ease of sharing data and software, enhanced security, horizontal and vertical scalability, high availability, a thriving technology partner ecosystem, and much more. Despite these advantages that cloud-based workflows offer, the majority of scientific software developed in academia does not utilize cloud computing and must be migrated to the cloud by the user. In this article, we present 11 quick tips for architecting biomedical informatics workflows on compute clouds, distilling knowledge gained from experience developing, operating, maintaining, and distributing software and virtualized appliances on the world's largest cloud. Researchers who follow these tips stand to benefit immediately by migrating their workflows to cloud computing and embracing the paradigm of abstraction.

  5. NASA Plan for Increasing Access to the Results of Scientific Research

    Science.gov (United States)

    2014-01-01

    This plan is issued in response to the Executive Office of the President's February 22, 2013, Office of Science and Technology Policy (OSTP) Memorandum for the Heads of Executive Departments and Agencies, "Increasing Access to the Results of Federally Funded Scientific Research." Through this memorandum, OSTP directed all agencies with more than $100 million in annual research and development expenditures to prepare a plan for improving the public's access to the results of federally funded research. The National Aeronautics and Space Administration (NASA) invests on the order of $3 billion annually in fundamental and applied research and technology development1 across a broad range of topics, including space and Earth sciences, life and physical sciences, human health, aeronautics, and technology. Promoting the full and open sharing of data with research communities, private industry, academia, and the general public is one of NASA's longstanding core values. For example, NASA's space and suborbital mission personnel routinely process, archive, and distribute their data to researchers around the globe. This plan expands the breadth of NASA's open-access culture to include data and publications for all of the scientific research that the Agency sponsors.

  6. SwinDeW-C: A Peer-to-Peer Based Cloud Workflow System

    Science.gov (United States)

    Liu, Xiao; Yuan, Dong; Zhang, Gaofeng; Chen, Jinjun; Yang, Yun

    Workflow systems are designed to support the process automation of large scale business and scientific applications. In recent years, many workflow systems have been deployed on high performance computing infrastructures such as cluster, peer-to-peer (p2p), and grid computing (Moore, 2004; Wang, Jie, & Chen, 2009; Yang, Liu, Chen, Lignier, & Jin, 2007). One of the driving forces is the increasing demand of large scale instance and data/computation intensive workflow applications (large scale workflow applications for short) which are common in both eBusiness and eScience application areas. Typical examples (will be detailed in Section 13.2.1) include such as the transaction intensive nation-wide insurance claim application process; the data and computation intensive pulsar searching process in Astrophysics. Generally speaking, instance intensive applications are those processes which need to be executed for a large number of times sequentially within a very short period or concurrently with a large number of instances (Liu, Chen, Yang, & Jin, 2008; Liu et al., 2010; Yang et al., 2008). Therefore, large scale workflow applications normally require the support of high performance computing infrastructures (e.g. advanced CPU units, large memory space and high speed network), especially when workflow activities are of data and computation intensive themselves. In the real world, to accommodate such a request, expensive computing infrastructures including such as supercomputers and data servers are bought, installed, integrated and maintained with huge cost by system users

  7. Realization of Best-in-Class Workflows Using Open Spirit Technology

    International Nuclear Information System (INIS)

    Hauser, K.

    2002-01-01

    Open Spirit is the realization of a long sought after dream to create an open systems approach to G and G computing. The focus is on developing a plug-and-play, platform independent, vendor neutral application framework to enable workflow optimization for the oil and gas industry. Through Open Spirit, Oil and gas clients can either develop or purchase Open Spirit enabled applications and combine those applications together to optimize a particular workflow. Currently Open Spirit supports GeoQuest's Geo Frame and Landmark's Open Works project data stores. There are three primary benefits to using Open Spirit enabled applications. Users of Open Spirit enabled applications can access data from a variety of data sources without having to move or reformat the data. This reduces the time to get the information into the appropriate application and eliminates the need to have multiple copies of the same data, simplifying data management efforts. Second, Open Spirit bridges the gap between Unix and PC applications. Through Open Spirit, any applications developed for the NT platform can access information residing in a Unix project data store. Third, Open Spirit supports the concept of a virtual project set. A user can combine any number project data stores (GeoFrame, Open Works, Finder, RECALL and PDS/Tigress) and use the combined project sets as if they were a single project. The data is not moved, but instead accessed dynamically through Open Spirit from the appropriate native data store. Open Spirit allows interpreters to develop their own Best-in-Class workflow and to mix and match the applications they determine to be the best of their interpretation teams independent of vendor

  8. Progress in digital color workflow understanding in the International Color Consortium (ICC) Workflow WG

    Science.gov (United States)

    McCarthy, Ann

    2006-01-01

    The ICC Workflow WG serves as the bridge between ICC color management technologies and use of those technologies in real world color production applications. ICC color management is applicable to and is used in a wide range of color systems, from highly specialized digital cinema color special effects to high volume publications printing to home photography. The ICC Workflow WG works to align ICC technologies so that the color management needs of these diverse use case systems are addressed in an open, platform independent manner. This report provides a high level summary of the ICC Workflow WG objectives and work to date, focusing on the ways in which workflow can impact image quality and color systems performance. The 'ICC Workflow Primitives' and 'ICC Workflow Patterns and Dimensions' workflow models are covered in some detail. Consider the questions, "How much of dissatisfaction with color management today is the result of 'the wrong color transformation at the wrong time' and 'I can't get to the right conversion at the right point in my work process'?" Put another way, consider how image quality through a workflow can be negatively affected when the coordination and control level of the color management system is not sufficient.

  9. "The architecture of access to scientific knowledge: just how badly we have messed this up"

    CERN Multimedia

    CERN. Geneva

    2011-01-01

    In this talk, Professor Lessig will review the evolution of access to scientific scholarship, and evaluate the success of this system of access against a background norm of universal access.While copyright battles involving artists has gotten most of the public's attention, the real battle should be over access to knowledge, not culture. That battle we are losing.

  10. Managing and Communicating Operational Workflow: Designing and Implementing an Electronic Outpatient Whiteboard.

    Science.gov (United States)

    Steitz, Bryan D; Weinberg, Stuart T; Danciu, Ioana; Unertl, Kim M

    2016-01-01

    Healthcare team members in emergency department contexts have used electronic whiteboard solutions to help manage operational workflow for many years. Ambulatory clinic settings have highly complex operational workflow, but are still limited in electronic assistance to communicate and coordinate work activities. To describe and discuss the design, implementation, use, and ongoing evolution of a coordination and collaboration tool supporting ambulatory clinic operational workflow at Vanderbilt University Medical Center (VUMC). The outpatient whiteboard tool was initially designed to support healthcare work related to an electronic chemotherapy order-entry application. After a highly successful initial implementation in an oncology context, a high demand emerged across the organization for the outpatient whiteboard implementation. Over the past 10 years, developers have followed an iterative user-centered design process to evolve the tool. The electronic outpatient whiteboard system supports 194 separate whiteboards and is accessed by over 2800 distinct users on a typical day. Clinics can configure their whiteboards to support unique workflow elements. Since initial release, features such as immunization clinical decision support have been integrated into the system, based on requests from end users. The success of the electronic outpatient whiteboard demonstrates the usefulness of an operational workflow tool within the ambulatory clinic setting. Operational workflow tools can play a significant role in supporting coordination, collaboration, and teamwork in ambulatory healthcare settings.

  11. Promoting Access to Public Research Data for Scientific, Economic, and Social Development

    Directory of Open Access Journals (Sweden)

    P Arzberger

    2006-01-01

    Full Text Available Access to and sharing of data are essential for the conduct and advancement of science. This article argues that publicly funded research data should be openly available to the maximum extent possible. To seize upon advancements of cyberinfrastructure and the explosion of data in a range of scientific disciplines, this access to and sharing of publicly funded data must be advanced within an international framework, beyond technological solutions. The authors, members of an OECD Follow-up Group, present their research findings, based closely on their report to OECD, on key issues in data access, as well as operating principles and management aspects necessary to successful data access regimes.

  12. Insightful Workflow For Grid Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Charles Earl

    2008-10-09

    We developed a workflow adaptation and scheduling system for Grid workflow. The system currently interfaces with and uses the Karajan workflow system. We developed machine learning agents that provide the planner/scheduler with information needed to make decisions about when and how to replan. The Kubrick restructures workflow at runtime, making it unique among workflow scheduling systems. The existing Kubrick system provides a platform on which to integrate additional quality of service constraints and in which to explore the use of an ensemble of scheduling and planning algorithms. This will be the principle thrust of our Phase II work.

  13. Interacting with the National Database for Autism Research (NDAR) via the LONI Pipeline workflow environment.

    Science.gov (United States)

    Torgerson, Carinna M; Quinn, Catherine; Dinov, Ivo; Liu, Zhizhong; Petrosyan, Petros; Pelphrey, Kevin; Haselgrove, Christian; Kennedy, David N; Toga, Arthur W; Van Horn, John Darrell

    2015-03-01

    Under the umbrella of the National Database for Clinical Trials (NDCT) related to mental illnesses, the National Database for Autism Research (NDAR) seeks to gather, curate, and make openly available neuroimaging data from NIH-funded studies of autism spectrum disorder (ASD). NDAR has recently made its database accessible through the LONI Pipeline workflow design and execution environment to enable large-scale analyses of cortical architecture and function via local, cluster, or "cloud"-based computing resources. This presents a unique opportunity to overcome many of the customary limitations to fostering biomedical neuroimaging as a science of discovery. Providing open access to primary neuroimaging data, workflow methods, and high-performance computing will increase uniformity in data collection protocols, encourage greater reliability of published data, results replication, and broaden the range of researchers now able to perform larger studies than ever before. To illustrate the use of NDAR and LONI Pipeline for performing several commonly performed neuroimaging processing steps and analyses, this paper presents example workflows useful for ASD neuroimaging researchers seeking to begin using this valuable combination of online data and computational resources. We discuss the utility of such database and workflow processing interactivity as a motivation for the sharing of additional primary data in ASD research and elsewhere.

  14. In ... and out: open access publishing in scientific journals.

    Science.gov (United States)

    Boumil, Marcia M; Salem, Deeb N

    2014-01-01

    Open access (OA) journals are a growing phenomenon largely of the past decade wherein readers can access the content of scientific journals without paying for a subscription. The costs are borne by authors (or their institutions) who pay a fee to be published, thus allowing readers to access, search, print, and cite the journals without cost. Although the OA model, in and of itself, need not diminish scientific rigor, selectivity, or peer review, the "author pays" model creates an inherent conflict of interest: it operates with the incentive on the part of the journal to publish more and reject less. This is coupled with cost containment measures that affect the journals' ability to engage experienced editors and professional staff to scrutinize data, data analyses, and author conflicts of interest. While some OA journals appear to be comparable to their print competitors, others are "predatory" and have no legitimacy at all. Two recent "scams"--one recently published in Science--highlight the urgency of addressing the issues raised by OA publication so that OA does not lose its credibility just as it begins to gather substantial momentum. High-quality journals develop their reputations over time, and OA outlets will be no exception. For this to occur, however, the OA audience will need to be satisfied that OA can deliver high-quality publications utilizing rigorous peer review, editing, and conflict of interest scrutiny. Academic tenure and promotion committees that review scholarly credentials are understandably skeptical of publications in unrecognized journals, and the large number of new OA outlets contributes to this urgency from their perspective as well.

  15. Model-as-you-go for Choreographies : Rewinding and Repeating Scientific Choreographies

    NARCIS (Netherlands)

    Weiss, Andreass; Andrikopoulos, Vasilios; Hahn, Michael; Karastoyanova, Dimka

    2017-01-01

    Scientists are increasingly using the workflow technology as a means for modeling and execution of scientific experiments. Despite being a very powerful paradigm workflows still lack support for trial-and-error modeling, as well as flexibility mechanisms that enable the ad hoc repetition of

  16. Implementing Oracle Workflow

    CERN Document Server

    Mathieson, D W

    1999-01-01

    CERN (see [CERN]) is the world's largest physics research centre. Currently there are around 5,000 people working at the CERN site located on the border of France and Switzerland near Geneva along with another 4,000 working remotely at institutes situated all around the globe. CERN is currently working on the construction of our newest scientific instrument called the Large Hadron Collider (LHC); the construction alone of this 27-kilometre particle accelerator will not complete until 2005. Like many businesses in the current economic climate CERN is expected to continue growing, yet staff numbers are planned to fall in the coming years. In essence, do more with less. In an environment such as this, it is critical that the administration is as efficient as possible. One of the ways that administrative procedures are streamlined is by the use of an organisation-wide workflow system.

  17. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Grid workflow validation using ontology-based tacit knowledge: A case study for quantitative remote sensing applications

    Science.gov (United States)

    Liu, Jia; Liu, Longli; Xue, Yong; Dong, Jing; Hu, Yingcui; Hill, Richard; Guang, Jie; Li, Chi

    2017-01-01

    Workflow for remote sensing quantitative retrieval is the ;bridge; between Grid services and Grid-enabled application of remote sensing quantitative retrieval. Workflow averts low-level implementation details of the Grid and hence enables users to focus on higher levels of application. The workflow for remote sensing quantitative retrieval plays an important role in remote sensing Grid and Cloud computing services, which can support the modelling, construction and implementation of large-scale complicated applications of remote sensing science. The validation of workflow is important in order to support the large-scale sophisticated scientific computation processes with enhanced performance and to minimize potential waste of time and resources. To research the semantic correctness of user-defined workflows, in this paper, we propose a workflow validation method based on tacit knowledge research in the remote sensing domain. We first discuss the remote sensing model and metadata. Through detailed analysis, we then discuss the method of extracting the domain tacit knowledge and expressing the knowledge with ontology. Additionally, we construct the domain ontology with Protégé. Through our experimental study, we verify the validity of this method in two ways, namely data source consistency error validation and parameters matching error validation.

  19. Querying Workflow Logs

    Directory of Open Access Journals (Sweden)

    Yan Tang

    2018-01-01

    Full Text Available A business process or workflow is an assembly of tasks that accomplishes a business goal. Business process management is the study of the design, configuration/implementation, enactment and monitoring, analysis, and re-design of workflows. The traditional methodology for the re-design and improvement of workflows relies on the well-known sequence of extract, transform, and load (ETL, data/process warehousing, and online analytical processing (OLAP tools. In this paper, we study the ad hoc queryiny of process enactments for (data-centric business processes, bypassing the traditional methodology for more flexibility in querying. We develop an algebraic query language based on “incident patterns” with four operators inspired from Business Process Model and Notation (BPMN representation, allowing the user to formulate ad hoc queries directly over workflow logs. A formal semantics of this query language, a preliminary query evaluation algorithm, and a group of elementary properties of the operators are provided.

  20. INSPIRE: a new scientific information system for HEP

    CERN Document Server

    Ivanov, R; CERN. Geneva. IT Department

    2010-01-01

    The status of high-energy physics (HEP) information systems has been jointly analyzed by the libraries of CERN, DESY, Fermilab and SLAC. As a result, the four laboratories have started the INSPIRE project – a new platform built by moving the successful SPIRES features and content, curated at DESY, Fermilab and SLAC, into the open-source CDS Invenio digital library software that was developed at CERN. INSPIRE will integrate current acquisition workflows and databases to host the entire body of the HEP literature (about one million records), aiming to become the reference HEP scientific information platform worldwide. It will provide users with fast access to full text journal articles and preprints, but also material such as conference slides and multimedia. INSPIRE will empower scientists with new tools to discover and access the results most relevant to their research, enable novel text- and data-mining applications, and deploy new metrics to assess the impact of articles and authors. In addition, it will ...

  1. INSPIRE: a new scientific information system for HEP

    CERN Multimedia

    Ivanov, R

    2009-01-01

    The status of high-energy physics (HEP) information systems has been jointly analyzed by the libraries of CERN, DESY, Fermilab and SLAC. As a result, the four laboratories have started the INSPIRE project – a new platform built by moving the successful SPIRES features and content, curated at DESY, Fermilab and SLAC, into the open-source CDS Invenio digital library software that was developed at CERN. INSPIRE will integrate present acquisition workflows and databases to host the entire body of the HEP literature (about one million records), aiming to become the reference HEP scientific information platform worldwide. It will provide users with fast access to full-text journal articles and preprints, but also material such as conference slides and multimedia. INSPIRE will empower scientists with new tools to discover and access the results most relevant to their research, enable novel text- and data-mining applications, and deploy new metrics to assess the impact of articles and authors. In addition, it will ...

  2. Office 2010 Workflow Developing Collaborative Solutions

    CERN Document Server

    Mann, David; Enterprises, Creative

    2010-01-01

    Workflow is the glue that binds information worker processes, users, and artifacts. Without workflow, information workers are just islands of data and potential. Office 2010 Workflow details how to implement workflow in SharePoint 2010 and the client Microsoft Office 2010 suite to help information workers share data, enforce processes and business rules, and work more efficiently together or solo. This book covers everything you need to know-from what workflow is all about to creating new activities; from the SharePoint Designer to Visual Studio 2010; from out-of-the-box workflows to state mac

  3. Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud

    Directory of Open Access Journals (Sweden)

    Bahar Sateli

    2015-12-01

    Full Text Available Motivation. Finding relevant scientific literature is one of the essential tasks researchers are facing on a daily basis. Digital libraries and web information retrieval techniques provide rapid access to a vast amount of scientific literature. However, no further automated support is available that would enable fine-grained access to the knowledge ‘stored’ in these documents. The emerging domain of Semantic Publishing aims at making scientific knowledge accessible to both humans and machines, by adding semantic annotations to content, such as a publication’s contributions, methods, or application domains. However, despite the promises of better knowledge access, the manual annotation of existing research literature is prohibitively expensive for wide-spread adoption. We argue that a novel combination of three distinct methods can significantly advance this vision in a fully-automated way: (i Natural Language Processing (NLP for Rhetorical Entity (RE detection; (ii Named Entity (NE recognition based on the Linked Open Data (LOD cloud; and (iii automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in documents with the machine-readable LOD cloud.Results. We present a complete workflow to transform scientific literature into a semantic knowledge base, based on the W3C standards RDF and RDFS. A text mining pipeline, implemented based on the GATE framework, automatically extracts rhetorical entities of type Claims and Contributions from full-text scientific literature. These REs are further enriched with named entities, represented as URIs to the linked open data cloud, by integrating the DBpedia Spotlight tool into our workflow. Text mining results are stored in a knowledge base through a flexible export process that provides for a dynamic mapping of semantic annotations to LOD vocabularies through rules stored in the knowledge base. We created a gold standard corpus from computer

  4. DEWEY: the DICOM-enabled workflow engine system.

    Science.gov (United States)

    Erickson, Bradley J; Langer, Steve G; Blezek, Daniel J; Ryan, William J; French, Todd L

    2014-06-01

    Workflow is a widely used term to describe the sequence of steps to accomplish a task. The use of workflow technology in medicine and medical imaging in particular is limited. In this article, we describe the application of a workflow engine to improve workflow in a radiology department. We implemented a DICOM-enabled workflow engine system in our department. We designed it in a way to allow for scalability, reliability, and flexibility. We implemented several workflows, including one that replaced an existing manual workflow and measured the number of examinations prepared in time without and with the workflow system. The system significantly increased the number of examinations prepared in time for clinical review compared to human effort. It also met the design goals defined at its outset. Workflow engines appear to have value as ways to efficiently assure that complex workflows are completed in a timely fashion.

  5. Modeling workflow to design machine translation applications for public health practice.

    Science.gov (United States)

    Turner, Anne M; Brownstein, Megumu K; Cole, Kate; Karasz, Hilary; Kirchhoff, Katrin

    2015-02-01

    Provide a detailed understanding of the information workflow processes related to translating health promotion materials for limited English proficiency individuals in order to inform the design of context-driven machine translation (MT) tools for public health (PH). We applied a cognitive work analysis framework to investigate the translation information workflow processes of two large health departments in Washington State. Researchers conducted interviews, performed a task analysis, and validated results with PH professionals to model translation workflow and identify functional requirements for a translation system for PH. The study resulted in a detailed description of work related to translation of PH materials, an information workflow diagram, and a description of attitudes towards MT technology. We identified a number of themes that hold design implications for incorporating MT in PH translation practice. A PH translation tool prototype was designed based on these findings. This study underscores the importance of understanding the work context and information workflow for which systems will be designed. Based on themes and translation information workflow processes, we identified key design guidelines for incorporating MT into PH translation work. Primary amongst these is that MT should be followed by human review for translations to be of high quality and for the technology to be adopted into practice. The time and costs of creating multilingual health promotion materials are barriers to translation. PH personnel were interested in MT's potential to improve access to low-cost translated PH materials, but expressed concerns about ensuring quality. We outline design considerations and a potential machine translation tool to best fit MT systems into PH practice. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. SegMine workflows for semantic microarray data analysis in Orange4WS

    Directory of Open Access Journals (Sweden)

    Kulovesi Kimmo

    2011-10-01

    Full Text Available Abstract Background In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases. Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. Results We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. Conclusions Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.

  7. Identifying Challenges Associated With the Care Transition Workflow From Hospital to Skilled Home Health Care: Perspectives of Home Health Care Agency Providers.

    Science.gov (United States)

    Nasarwanji, Mahiyar; Werner, Nicole E; Carl, Kimberly; Hohl, Dawn; Leff, Bruce; Gurses, Ayse P; Arbaje, Alicia I

    2015-01-01

    Older adults discharged from the hospital to skilled home health care (SHHC) are at high risk for experiencing suboptimal transitions. Using the human factors approach of shadowing and contextual inquiry, we studied the workflow for transitioning older adults from the hospital to SHHC. We created a representative diagram of the hospital to SHHC transition workflow, we examined potential workflow variations, we categorized workflow challenges, and we identified artifacts developed to manage variations and challenges. We identified three overarching challenges to optimal care transitions-information access, coordination, and communication/teamwork. Future investigations could test whether redesigning the transition from hospital to SHHC, based on our findings, improves workflow and care quality.

  8. Modeling Complex Workflow in Molecular Diagnostics

    Science.gov (United States)

    Gomah, Mohamed E.; Turley, James P.; Lu, Huimin; Jones, Dan

    2010-01-01

    One of the hurdles to achieving personalized medicine has been implementing the laboratory processes for performing and reporting complex molecular tests. The rapidly changing test rosters and complex analysis platforms in molecular diagnostics have meant that many clinical laboratories still use labor-intensive manual processing and testing without the level of automation seen in high-volume chemistry and hematology testing. We provide here a discussion of design requirements and the results of implementation of a suite of lab management tools that incorporate the many elements required for use of molecular diagnostics in personalized medicine, particularly in cancer. These applications provide the functionality required for sample accessioning and tracking, material generation, and testing that are particular to the evolving needs of individualized molecular diagnostics. On implementation, the applications described here resulted in improvements in the turn-around time for reporting of more complex molecular test sets, and significant changes in the workflow. Therefore, careful mapping of workflow can permit design of software applications that simplify even the complex demands of specialized molecular testing. By incorporating design features for order review, software tools can permit a more personalized approach to sample handling and test selection without compromising efficiency. PMID:20007844

  9. Integrating a work-flow engine within a commercial SCADA to build end users applications in a scientific environment

    International Nuclear Information System (INIS)

    Ounsy, M.; Pierre-Joseph Zephir, S.; Saintin, K.; Abeille, G.; Ley, E. de

    2012-01-01

    To build integrated high-level applications, SOLEIL is using an original component-oriented approach based on GlobalSCREEN, an industrial Java SCADA. The aim of this integrated development environment is to give SOLEIL's scientific and technical staff a way to develop GUI (Graphical User Interface) applications for external users of beamlines. These GUI applications must address the two following needs: monitoring and supervision of a control system and development and execution of automated processes (as beamline alignment, data collection and on-line data analysis). The first need is now completely answered through a rich set of Java graphical components based on the COMETE library and providing a high level of service for data logging, scanning and so on. To reach the same quality of service for process automation, a big effort has been made for more seamless integration of PASSERELLE, a work-flow engine with dedicated user-friendly interfaces for end users, packaged as JavaBeans in GlobalSCREEN components library. Starting with brief descriptions of software architecture of the PASSERELLE and GlobalSCREEN environments, we will then present the overall system integration design as well as the current status of deployment on SOLEIL beamlines. (authors)

  10. A Semi-Automated Workflow Solution for Data Set Publication

    Directory of Open Access Journals (Sweden)

    Suresh Vannan

    2016-03-01

    Full Text Available To address the need for published data, considerable effort has gone into formalizing the process of data publication. From funding agencies to publishers, data publication has rapidly become a requirement. Digital Object Identifiers (DOI and data citations have enhanced the integration and availability of data. The challenge facing data publishers now is to deal with the increased number of publishable data products and most importantly the difficulties of publishing diverse data products into an online archive. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC, a NASA-funded data center, faces these challenges as it deals with data products created by individual investigators. This paper summarizes the challenges of curating data and provides a summary of a workflow solution that ORNL DAAC researcher and technical staffs have created to deal with publication of the diverse data products. The workflow solution presented here is generic and can be applied to data from any scientific domain and data located at any data center.

  11. Open access versus subscription journals: a comparison of scientific impact.

    Science.gov (United States)

    Björk, Bo-Christer; Solomon, David

    2012-07-17

    In the past few years there has been an ongoing debate as to whether the proliferation of open access (OA) publishing would damage the peer review system and put the quality of scientific journal publishing at risk. Our aim was to inform this debate by comparing the scientific impact of OA journals with subscription journals, controlling for journal age, the country of the publisher, discipline and (for OA publishers) their business model. The 2-year impact factors (the average number of citations to the articles in a journal) were used as a proxy for scientific impact. The Directory of Open Access Journals (DOAJ) was used to identify OA journals as well as their business model. Journal age and discipline were obtained from the Ulrich's periodicals directory. Comparisons were performed on the journal level as well as on the article level where the results were weighted by the number of articles published in a journal. A total of 610 OA journals were compared with 7,609 subscription journals using Web of Science citation data while an overlapping set of 1,327 OA journals were compared with 11,124 subscription journals using Scopus data. Overall, average citation rates, both unweighted and weighted for the number of articles per journal, were about 30% higher for subscription journals. However, after controlling for discipline (medicine and health versus other), age of the journal (three time periods) and the location of the publisher (four largest publishing countries versus other countries) the differences largely disappeared in most subcategories except for journals that had been launched prior to 1996. OA journals that fund publishing with article processing charges (APCs) are on average cited more than other OA journals. In medicine and health, OA journals founded in the last 10 years are receiving about as many citations as subscription journals launched during the same period. Our results indicate that OA journals indexed in Web of Science and/or Scopus are

  12. On Lifecycle Constraints of Artifact-Centric Workflows

    Science.gov (United States)

    Kucukoguz, Esra; Su, Jianwen

    Data plays a fundamental role in modeling and management of business processes and workflows. Among the recent "data-aware" workflow models, artifact-centric models are particularly interesting. (Business) artifacts are the key data entities that are used in workflows and can reflect both the business logic and the execution states of a running workflow. The notion of artifacts succinctly captures the fluidity aspect of data during workflow executions. However, much of the technical dimension concerning artifacts in workflows is not well understood. In this paper, we study a key concept of an artifact "lifecycle". In particular, we allow declarative specifications/constraints of artifact lifecycle in the spirit of DecSerFlow, and formulate the notion of lifecycle as the set of all possible paths an artifact can navigate through. We investigate two technical problems: (Compliance) does a given workflow (schema) contain only lifecycle allowed by a constraint? And (automated construction) from a given lifecycle specification (constraint), is it possible to construct a "compliant" workflow? The study is based on a new formal variant of artifact-centric workflow model called "ArtiNets" and two classes of lifecycle constraints named "regular" and "counting" constraints. We present a range of technical results concerning compliance and automated construction, including: (1) compliance is decidable when workflow is atomic or constraints are regular, (2) for each constraint, we can always construct a workflow that satisfies the constraint, and (3) sufficient conditions where atomic workflows can be constructed.

  13. Open innovation: Towards sharing of data, models and workflows.

    Science.gov (United States)

    Conrado, Daniela J; Karlsson, Mats O; Romero, Klaus; Sarr, Céline; Wilkins, Justin J

    2017-11-15

    Sharing of resources across organisations to support open innovation is an old idea, but which is being taken up by the scientific community at increasing speed, concerning public sharing in particular. The ability to address new questions or provide more precise answers to old questions through merged information is among the attractive features of sharing. Increased efficiency through reuse, and increased reliability of scientific findings through enhanced transparency, are expected outcomes from sharing. In the field of pharmacometrics, efforts to publicly share data, models and workflow have recently started. Sharing of individual-level longitudinal data for modelling requires solving legal, ethical and proprietary issues similar to many other fields, but there are also pharmacometric-specific aspects regarding data formats, exchange standards, and database properties. Several organisations (CDISC, C-Path, IMI, ISoP) are working to solve these issues and propose standards. There are also a number of initiatives aimed at collecting disease-specific databases - Alzheimer's Disease (ADNI, CAMD), malaria (WWARN), oncology (PDS), Parkinson's Disease (PPMI), tuberculosis (CPTR, TB-PACTS, ReSeqTB) - suitable for drug-disease modelling. Organized sharing of pharmacometric executable model code and associated information has in the past been sparse, but a model repository (DDMoRe Model Repository) intended for the purpose has recently been launched. In addition several other services can facilitate model sharing more generally. Pharmacometric workflows have matured over the last decades and initiatives to more fully capture those applied to analyses are ongoing. In order to maximize both the impact of pharmacometrics and the knowledge extracted from clinical data, the scientific community needs to take ownership of and create opportunities for open innovation. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. Nanocuration workflows: Establishing best practices for identifying, inputting, and sharing data to inform decisions on nanomaterials

    Directory of Open Access Journals (Sweden)

    Christina M. Powers

    2015-09-01

    Full Text Available There is a critical opportunity in the field of nanoscience to compare and integrate information across diverse fields of study through informatics (i.e., nanoinformatics. This paper is one in a series of articles on the data curation process in nanoinformatics (nanocuration. Other articles in this series discuss key aspects of nanocuration (temporal metadata, data completeness, database integration, while the focus of this article is on the nanocuration workflow, or the process of identifying, inputting, and reviewing nanomaterial data in a data repository. In particular, the article discusses: 1 the rationale and importance of a defined workflow in nanocuration, 2 the influence of organizational goals or purpose on the workflow, 3 established workflow practices in other fields, 4 current workflow practices in nanocuration, 5 key challenges for workflows in emerging fields like nanomaterials, 6 examples to make these challenges more tangible, and 7 recommendations to address the identified challenges. Throughout the article, there is an emphasis on illustrating key concepts and current practices in the field. Data on current practices in the field are from a group of stakeholders active in nanocuration. In general, the development of workflows for nanocuration is nascent, with few individuals formally trained in data curation or utilizing available nanocuration resources (e.g., ISA-TAB-Nano. Additional emphasis on the potential benefits of cultivating nanomaterial data via nanocuration processes (e.g., capability to analyze data from across research groups and providing nanocuration resources (e.g., training will likely prove crucial for the wider application of nanocuration workflows in the scientific community.

  15. SYRMEP Tomo Project: a graphical user interface for customizing CT reconstruction workflows.

    Science.gov (United States)

    Brun, Francesco; Massimi, Lorenzo; Fratini, Michela; Dreossi, Diego; Billé, Fulvio; Accardo, Agostino; Pugliese, Roberto; Cedola, Alessia

    2017-01-01

    When considering the acquisition of experimental synchrotron radiation (SR) X-ray CT data, the reconstruction workflow cannot be limited to the essential computational steps of flat fielding and filtered back projection (FBP). More refined image processing is often required, usually to compensate artifacts and enhance the quality of the reconstructed images. In principle, it would be desirable to optimize the reconstruction workflow at the facility during the experiment (beamtime). However, several practical factors affect the image reconstruction part of the experiment and users are likely to conclude the beamtime with sub-optimal reconstructed images. Through an example of application, this article presents SYRMEP Tomo Project (STP), an open-source software tool conceived to let users design custom CT reconstruction workflows. STP has been designed for post-beamtime (off-line use) and for a new reconstruction of past archived data at user's home institution where simple computing resources are available. Releases of the software can be downloaded at the Elettra Scientific Computing group GitHub repository https://github.com/ElettraSciComp/STP-Gui.

  16. From Requirements via Colored Workflow Nets to an Implementation in Several Workflow Systems

    DEFF Research Database (Denmark)

    Mans, Ronnie S:; van der Aalst, Wil M.P.; Bakker, Piet J.M.

    2007-01-01

    care process of the Academic Medical Center (AMC) hospital is used as reference process. The process consists of hundreds of activities. These have been modeled and analyzed using an EUC and a CWN. Moreover, based on the CWN, the process has been implemented using four different workflow systems......Care organizations, such as hospitals, need to support complex and dynamic workflows. More- over, many disciplines are involved. This makes it important to avoid the typical disconnect between requirements and the actual implementation of the system. This paper proposes an approach where...... an Executable Use Case (EUC) and Colored Workflow Net (CWN) are used to close the gap between the given requirements specification and the realization of these requirements with the help of a workflow system. This paper describes a large case study where the diagnostic tra jectory of the gynaecological oncology...

  17. Digitizing specimens in a small herbarium: A viable workflow for collections working with limited resources.

    Science.gov (United States)

    Harris, Kari M; Marsico, Travis D

    2017-04-01

    Small herbaria represent a significant portion of herbaria in the United States, but many are not digitizing their collections. At the Arkansas State University Herbarium (STAR), we have created a viable workflow to help small herbaria begin the digitization process, including suggestions for publishing data on the Internet. We calculated hourly rates of each phase of the digitization process. We also mapped accessions at the county level to determine geographic strengths in the collection. All 17,678 accessioned flowering plant specimens at STAR are imaged, databased in Specify, and available electronically on the herbarium's website. Students imaged the specimens at a mean rate of 145/h. We found differences in databasing rates between the graduate student leading the project (47/h) and undergraduate assistants (25/h). The majority of specimens at STAR were collected within the counties neighboring the institution. With this workflow, we estimate that one person can digitize a 20,000-specimen collection in less than 2.5 yr by working only 10 h/wk. Because STAR is a small herbarium with limited resources, the application of the workflow described should assist curators of similar-sized collections as they contemplate and undertake the digitization process.

  18. The Federal University of Minas Gerais into context of open access to scientific information: identification of their information systems

    Directory of Open Access Journals (Sweden)

    Ráisa Mendes Fernandes de Souza

    2014-12-01

    Full Text Available The obstacles faced by the scientific community in the dissemination and access to their own productions, contextualized in need of open access to scientific information, boosted the creation of Institutional Repositories. These are technologies adopted by educational institutions and research that aim to manage and provide scientific production site. Objective: to characterize Federal University of Minas Gerais’ information systems, analyzing the perception of the actors responsible for their existence / maintenance within the context of open access to scientific information. Methodology: This is a descriptive and qualitative research, which is engaged in the information systems developed within the university to support the teaching, research and extension, which contain the records of productions or productions, scientific or not, the local academic community. Results: we found some awareness of the actors interviewed in relation to the existence of other systems, although such awareness is not ideal. Generally, UFMG has Managers and depositors competent enough to manage a repository greater, as is the case of an RI. Conclusion: It is necessary an information policy that is born of consolidated scientific sectors hierarchically superior and inferior sectors to be transferred to, well, be possible to articulate the entire university community in support of a common cause. The study emphasizes the community of UFMG and optimization of open access to publications of the University as the scope for possible future studies.

  19. Provenance for Runtime Workflow Steering and Validation in Computational Seismology

    Science.gov (United States)

    Spinuso, A.; Krischer, L.; Krause, A.; Filgueira, R.; Magnoni, F.; Muraleedharan, V.; David, M.

    2014-12-01

    Provenance systems may be offered by modern workflow engines to collect metadata about the data transformations at runtime. If combined with effective visualisation and monitoring interfaces, these provenance recordings can speed up the validation process of an experiment, suggesting interactive or automated interventions with immediate effects on the lifecycle of a workflow run. For instance, in the field of computational seismology, if we consider research applications performing long lasting cross correlation analysis and high resolution simulations, the immediate notification of logical errors and the rapid access to intermediate results, can produce reactions which foster a more efficient progress of the research. These applications are often executed in secured and sophisticated HPC and HTC infrastructures, highlighting the need for a comprehensive framework that facilitates the extraction of fine grained provenance and the development of provenance aware components, leveraging the scalability characteristics of the adopted workflow engines, whose enactment can be mapped to different technologies (MPI, Storm clusters, etc). This work looks at the adoption of W3C-PROV concepts and data model within a user driven processing and validation framework for seismic data, supporting also computational and data management steering. Validation needs to balance automation with user intervention, considering the scientist as part of the archiving process. Therefore, the provenance data is enriched with community-specific metadata vocabularies and control messages, making an experiment reproducible and its description consistent with the community understandings. Moreover, it can contain user defined terms and annotations. The current implementation of the system is supported by the EU-Funded VERCE (http://verce.eu). It provides, as well as the provenance generation mechanisms, a prototypal browser-based user interface and a web API built on top of a NoSQL storage

  20. Snakemake-a scalable bioinformatics workflow engine

    NARCIS (Netherlands)

    J. Köster (Johannes); S. Rahmann (Sven)

    2012-01-01

    textabstractSnakemake is a workflow engine that provides a readable Python-based workflow definition language and a powerful execution environment that scales from single-core workstations to compute clusters without modifying the workflow. It is the first system to support the use of automatically

  1. Pegasus Workflow Management System: Helping Applications From Earth and Space

    Science.gov (United States)

    Mehta, G.; Deelman, E.; Vahi, K.; Silva, F.

    2010-12-01

    Pegasus WMS is a Workflow Management System that can manage large-scale scientific workflows across Grid, local and Cloud resources simultaneously. Pegasus WMS provides a means for representing the workflow of an application in an abstract XML form, agnostic of the resources available to run it and the location of data and executables. It then compiles these workflows into concrete plans by querying catalogs and farming computations across local and distributed computing resources, as well as emerging commercial and community cloud environments in an easy and reliable manner. Pegasus WMS optimizes the execution as well as data movement by leveraging existing Grid and cloud technologies via a flexible pluggable interface and provides advanced features like reusing existing data, automatic cleanup of generated data, and recursive workflows with deferred planning. It also captures all the provenance of the workflow from the planning stage to the execution of the generated data, helping scientists to accurately measure performance metrics of their workflow as well as data reproducibility issues. Pegasus WMS was initially developed as part of the GriPhyN project to support large-scale high-energy physics and astrophysics experiments. Direct funding from the NSF enabled support for a wide variety of applications from diverse domains including earthquake simulation, bacterial RNA studies, helioseismology and ocean modeling. Earthquake Simulation: Pegasus WMS was recently used in a large scale production run in 2009 by the Southern California Earthquake Centre to run 192 million loosely coupled tasks and about 2000 tightly coupled MPI style tasks on National Cyber infrastructure for generating a probabilistic seismic hazard map of the Southern California region. SCEC ran 223 workflows over a period of eight weeks, using on average 4,420 cores, with a peak of 14,540 cores. A total of 192 million files were produced totaling about 165TB out of which 11TB of data was saved

  2. Multidetector-row CT: economics and workflow

    International Nuclear Information System (INIS)

    Pottala, K.M.; Kalra, M.K.; Saini, S.; Ouellette, K.; Sahani, D.; Thrall, J.H.

    2005-01-01

    With rapid evolution of multidetector-row CT (MDCT) technology and applications, several factors such ad technology upgrade and turf battles for sharing cost and profitability affect MDCT workflow and economics. MDCT workflow optimization can enhance productivity and reduce unit costs as well as increase profitability, in spite of decrease in reimbursement rates. Strategies for workflow management include standardization, automation, and constant assessment of various steps involved in MDCT operations. In this review article, we describe issues related to MDCT economics and workflow. (orig.)

  3. INSPIRE: A new scientific information system for HEP

    International Nuclear Information System (INIS)

    Ivanov, R; Raae, L

    2010-01-01

    The status of high-energy physics (HEP) information systems has been jointly analyzed by the libraries of CERN, DESY, Fermilab and SLAC. As a result, the four laboratories have started the INSPIRE project - a new platform built by moving the successful SPIRES features and content, curated at DESY, Fermilab and SLAC, into the open-source CDS Invenio digital library software that was developed at CERN. INSPIRE will integrate current acquisition workflows and databases to host the entire body of the HEP literature (about one million records), aiming to become the reference HEP scientific information platform worldwide. It will provide users with fast access to full text journal articles and preprints, but also material such as conference slides and multimedia. INSPIRE will empower scientists with new tools to discover and access the results most relevant to their research, enable novel text- and data-mining applications, and deploy new metrics to assess the impact of articles and authors. In addition, it will introduce the 'Web 2.0' paradigm of user-enriched content in the domain of sciences, with community-based approaches to scientific publishing. INSPIRE represents a natural evolution of scholarly communication built on successful community-based information systems, and it provides a vision for information management in other fields of science. Inspired by the needs of HEP, we hope that the INSPIRE project will be inspiring for other communities.

  4. Integration of services into workflow applications

    CERN Document Server

    Czarnul, Pawel

    2015-01-01

    Describing state-of-the-art solutions in distributed system architectures, Integration of Services into Workflow Applications presents a concise approach to the integration of loosely coupled services into workflow applications. It discusses key challenges related to the integration of distributed systems and proposes solutions, both in terms of theoretical aspects such as models and workflow scheduling algorithms, and technical solutions such as software tools and APIs.The book provides an in-depth look at workflow scheduling and proposes a way to integrate several different types of services

  5. A Formal Framework for Workflow Analysis

    Science.gov (United States)

    Cravo, Glória

    2010-09-01

    In this paper we provide a new formal framework to model and analyse workflows. A workflow is the formal definition of a business process that consists in the execution of tasks in order to achieve a certain objective. In our work we describe a workflow as a graph whose vertices represent tasks and the arcs are associated to workflow transitions. Each task has associated an input/output logic operator. This logic operator can be the logical AND (•), the OR (⊗), or the XOR -exclusive-or—(⊕). Moreover, we introduce algebraic concepts in order to completely describe completely the structure of workflows. We also introduce the concept of logical termination. Finally, we provide a necessary and sufficient condition for this property to hold.

  6. Ferret Workflow Anomaly Detection System

    National Research Council Canada - National Science Library

    Smith, Timothy J; Bryant, Stephany

    2005-01-01

    The Ferret workflow anomaly detection system project 2003-2004 has provided validation and anomaly detection in accredited workflows in secure knowledge management systems through the use of continuous, automated audits...

  7. Scientific Utopia: An agenda for improving scientific communication (Invited)

    Science.gov (United States)

    Nosek, B.

    2013-12-01

    The scientist's primary incentive is publication. In the present culture, open practices do not increase chances of publication, and they often require additional work. Practicing the abstract scientific values of openness and reproducibility thus requires behaviors in addition to those relevant for the primary, concrete rewards. When in conflict, concrete rewards are likely to dominate over abstract ones. As a consequence, the reward structure for scientists does not encourage openness and reproducibility. This can be changed by nudging incentives to align scientific practices with scientific values. Science will benefit by creating and connecting technologies that nudge incentives while supporting and improving the scientific workflow. For example, it should be as easy to search the research literature for my topic as it is to search the Internet to find hilarious videos of cats falling off of furniture. I will introduce the Center for Open Science (http://centerforopenscience.org/) and efforts to improve openness and reproducibility such as http://openscienceframework.org/. There will be no cats.

  8. Open access to scientific publications - an analysis of the barriers to change

    Directory of Open Access Journals (Sweden)

    Professor Bo-Christer Björk

    2004-01-01

    Full Text Available One of the effects of the Internet is that the dissemination of scientific publications in a few years has migrated to electronic formats. The basic business practices between libraries and publishers for selling and buying the content have, however, not changed much. Scientists have in protest against the high subscription prices of mainstream publishers started Open Access (OA journals and e-print repositories, which distribute scientific information freely. Despite widespread agreement among academics that OA would be the optimal distribution mode for publicly financed research results, OA channels still constitute only a marginal phenomenon in the global scholarly communication system. This paper discusses, in view of the experiences of the last ten years, the many barriers hindering a rapid proliferation of Open Access. The discussion is structured according to the main OA channels; peer-reviewed journals for primary publishing, subject-specific and institutional repositories for secondary parallel publishing. It also discusses the types of barriers, which can be classified as: legal framework, IT-infrastructure, business models, indexing services and standards, academic reward system, marketing and critical mass.

  9. Intangible Capital: Four years of growth as an open-access scientific publication

    Directory of Open Access Journals (Sweden)

    Pep Simo

    2008-01-01

    Full Text Available This issue opens the fourth volume of the Intangible Capital journal, which makes its way towards the fifth year of publication. As usually, we start this volume by evaluating the previous one and tracing new directions. Among the main contributions during the year 2007, we consider important to highlight the following aspects: the renewal of the scientific indexation agreements, the platform change to OJS, the appointment of a new editor, new members included in the editorial board, the board of reviewers, the change towards a bilingual model, the new financing obtained and, the last but not the least, the work undertaken together with many scientific editors of open access Spanish journals for obtaining the positive evaluation of the CNEAI (National Commission for the Evaluation of the Research Activity and thus, being a proof of scientific excellence.

  10. SciELO, Scientific Electronic Library Online, a Database of Open Access Journals

    Science.gov (United States)

    Meneghini, Rogerio

    2013-01-01

    This essay discusses SciELO, a scientific journal database operating in 14 countries. It covers over 1000 journals providing open access to full text and table sets of scientometrics data. In Brazil it is responsible for a collection of nearly 300 journals, selected along 15 years as the best Brazilian periodicals in natural and social sciences.…

  11. A workflow learning model to improve geovisual analytics utility.

    Science.gov (United States)

    Roth, Robert E; Maceachren, Alan M; McCabe, Craig A

    2009-01-01

    the concept of scientific workflows. Second, we implemented an interface in the G-EX Portal Learn Module to demonstrate the workflow learning model. The workflow interface allows users to drag learning artifacts uploaded to the G-EX Portal onto a central whiteboard and then annotate the workflow using text and drawing tools. Once completed, users can visit the assembled workflow to get an idea of the kind, number, and scale of analysis steps, view individual learning artifacts associated with each node in the workflow, and ask questions about the overall workflow or individual learning artifacts through the associated forums. An example learning workflow in the domain of epidemiology is provided to demonstrate the effectiveness of the approach. RESULTS/CONCLUSIONS: In the context of geovisual analytics, GIScientists are not only responsible for developing software to facilitate visually-mediated reasoning about large and complex spatiotemporal information, but also for ensuring that this software works. The workflow learning model discussed in this paper and demonstrated in the G-EX Portal Learn Module is one approach to improving the utility of geovisual analytics software. While development of the G-EX Portal Learn Module is ongoing, we expect to release the G-EX Portal Learn Module by Summer 2009.

  12. Radiology information system: a workflow-based approach

    International Nuclear Information System (INIS)

    Zhang, Jinyan; Lu, Xudong; Nie, Hongchao; Huang, Zhengxing; Aalst, W.M.P. van der

    2009-01-01

    Introducing workflow management technology in healthcare seems to be prospective in dealing with the problem that the current healthcare Information Systems cannot provide sufficient support for the process management, although several challenges still exist. The purpose of this paper is to study the method of developing workflow-based information system in radiology department as a use case. First, a workflow model of typical radiology process was established. Second, based on the model, the system could be designed and implemented as a group of loosely coupled components. Each component corresponded to one task in the process and could be assembled by the workflow management system. The legacy systems could be taken as special components, which also corresponded to the tasks and were integrated through transferring non-work- flow-aware interfaces to the standard ones. Finally, a workflow dashboard was designed and implemented to provide an integral view of radiology processes. The workflow-based Radiology Information System was deployed in the radiology department of Zhejiang Chinese Medicine Hospital in China. The results showed that it could be adjusted flexibly in response to the needs of changing process, and enhance the process management in the department. It can also provide a more workflow-aware integration method, comparing with other methods such as IHE-based ones. The workflow-based approach is a new method of developing radiology information system with more flexibility, more functionalities of process management and more workflow-aware integration. The work of this paper is an initial endeavor for introducing workflow management technology in healthcare. (orig.)

  13. Plan to increase public access to the results of Federally-funded scientific research results.

    Science.gov (United States)

    2015-12-16

    This plan is issued in response to the February 22, 2013 Office of Science and Technology Policy (OSTP) Memorandum for the Heads of Executive Departments and Agencies entitled Increasing Access to the Results of Federally Funded Scientific Researc...

  14. ScienceCentral: open access full-text archive of scientific journals based on Journal Article Tag Suite regardless of their languages.

    Science.gov (United States)

    Huh, Sun

    2013-01-01

    ScienceCentral, a free or open access, full-text archive of scientific journal literature at the Korean Federation of Science and Technology Societies, was under test in September 2013. Since it is a Journal Article Tag Suite-based full text database, extensible markup language files of all languages can be presented, according to Unicode Transformation Format 8-bit encoding. It is comparable to PubMed Central: however, there are two distinct differences. First, its scope comprises all science fields; second, it accepts all language journals. Launching ScienceCentral is the first step for free access or open access academic scientific journals of all languages to leap to the world, including scientific journals from Croatia.

  15. Polymers – A New Open Access Scientific Journal on Polymer Science

    Directory of Open Access Journals (Sweden)

    Shu-Kun Lin

    2009-12-01

    Full Text Available Polymers is a new interdisciplinary, Open Access scientific journal on polymer science, published by Molecular Diversity Preservation International (MDPI. This journal welcomes manuscript submissions on polymer chemistry, macromolecular chemistry, polymer physics, polymer characterization and all related topics. Both synthetic polymers and natural polymers, including biopolymers, are considered. Manuscripts will be thoroughly peer-reviewed in a timely fashion, and papers will be published, if accepted, within 6 to 8 weeks after submission. [...

  16. A reliable computational workflow for the selection of optimal screening libraries.

    Science.gov (United States)

    Gilad, Yocheved; Nadassy, Katalin; Senderowitz, Hanoch

    2015-01-01

    The experimental screening of compound collections is a common starting point in many drug discovery projects. Successes of such screening campaigns critically depend on the quality of the screened library. Many libraries are currently available from different vendors yet the selection of the optimal screening library for a specific project is challenging. We have devised a novel workflow for the rational selection of project-specific screening libraries. The workflow accepts as input a set of virtual candidate libraries and applies the following steps to each library: (1) data curation; (2) assessment of ADME/T profile; (3) assessment of the number of promiscuous binders/frequent HTS hitters; (4) assessment of internal diversity; (5) assessment of similarity to known active compound(s) (optional); (6) assessment of similarity to in-house or otherwise accessible compound collections (optional). For ADME/T profiling, Lipinski's and Veber's rule-based filters were implemented and a new blood brain barrier permeation model was developed and validated (85 and 74 % success rate for training set and test set, respectively). Diversity and similarity descriptors which demonstrated best performances in terms of their ability to select either diverse or focused sets of compounds from three databases (Drug Bank, CMC and CHEMBL) were identified and used for diversity and similarity assessments. The workflow was used to analyze nine common screening libraries available from six vendors. The results of this analysis are reported for each library providing an assessment of its quality. Furthermore, a consensus approach was developed to combine the results of these analyses into a single score for selecting the optimal library under different scenarios. We have devised and tested a new workflow for the rational selection of screening libraries under different scenarios. The current workflow was implemented using the Pipeline Pilot software yet due to the usage of generic

  17. Swabs to genomes: a comprehensive workflow

    Directory of Open Access Journals (Sweden)

    Madison I. Dunitz

    2015-05-01

    Full Text Available The sequencing, assembly, and basic analysis of microbial genomes, once a painstaking and expensive undertaking, has become much easier for research labs with access to standard molecular biology and computational tools. However, there are a confusing variety of options available for DNA library preparation and sequencing, and inexperience with bioinformatics can pose a significant barrier to entry for many who may be interested in microbial genomics. The objective of the present study was to design, test, troubleshoot, and publish a simple, comprehensive workflow from the collection of an environmental sample (a swab to a published microbial genome; empowering even a lab or classroom with limited resources and bioinformatics experience to perform it.

  18. A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows.

    Science.gov (United States)

    Blattner, Timothy; Keyrouz, Walid; Bhattacharyya, Shuvra S; Halem, Milton; Brady, Mary

    2017-12-01

    Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) improves programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for such systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both of the HTGS-based implementations show good performance. In image stitching the HTGS implementation achieves similar performance to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k size matrices, respectively.

  19. Systematisation of spatial uncertainties for comparison between a MR and a CT-based radiotherapy workflow for prostate treatments

    International Nuclear Information System (INIS)

    Nyholm, Tufve; Nyberg, Morgan; Karlsson, Magnus G; Karlsson, Mikael

    2009-01-01

    In the present work we compared the spatial uncertainties associated with a MR-based workflow for external radiotherapy of prostate cancer to a standard CT-based workflow. The MR-based workflow relies on target definition and patient positioning based on MR imaging. A solution for patient transport between the MR scanner and the treatment units has been developed. For the CT-based workflow, the target is defined on a MR series but then transferred to a CT study through image registration before treatment planning, and a patient positioning using portal imaging and fiducial markers. An 'open bore' 1.5T MRI scanner, Siemens Espree, has been installed in the radiotherapy department in near proximity to a treatment unit to enable patient transport between the two installations, and hence use the MRI for patient positioning. The spatial uncertainty caused by the transport was added to the uncertainty originating from the target definition process, estimated through a review of the scientific literature. The uncertainty in the CT-based workflow was estimated through a literature review. The systematic uncertainties, affecting all treatment fractions, are reduced from 3-4 mm (1Sd) with a CT based workflow to 2-3 mm with a MR based workflow. The main contributing factor to this improvement is the exclusion of registration between MR and CT in the planning phase of the treatment. Treatment planning directly on MR images reduce the spatial uncertainty for prostate treatments

  20. Embracing the Importance of FAIR Research Products - Findable, Accessible, Interoperable, and Reusable

    Science.gov (United States)

    Stall, S.

    2017-12-01

    Integrity and transparency within research is solidified by a complete set of research products that are findable, accessible, interoperable, and reusable. In other words, they follow the FAIR Guidelines developed by FORCE11.org. Your datasets, images, video, software, scripts, models, physical samples, and other tools and technology are an integral part of the narrative you tell about your research. These research products increasingly are being captured through workflow tools and preserved and connected through persistent identifiers across multiple repositories that keep them safe. They help secure, with your publications, the supporting evidence and integrity of the scientific record. This is the direction that Earth and space science as well as other disciplines is moving. Within our community, some science domains are further along, and others are taking more measured steps. AGU as a publisher is working to support the full scientific record with peer reviewed publications. Working with our community and all the Earth and space science journals, AGU is developing new policies to encourage researchers to plan for proper data preservation and provide data citations along with their research submission and to encourage adoption of best practices throughout the research workflow and data life cycle. Providing incentives, community standards, and easy-to-use tools are some important factors for helping researchers embrace the FAIR Guidelines and support transparency and integrity.

  1. Open Access to the Scientific Journal Literature: Situation 2009

    Science.gov (United States)

    Björk, Bo-Christer; Welling, Patrik; Laakso, Mikael; Majlender, Peter; Hedlund, Turid; Guðnason, Guðni

    2010-01-01

    Background The Internet has recently made possible the free global availability of scientific journal articles. Open Access (OA) can occur either via OA scientific journals, or via authors posting manuscripts of articles published in subscription journals in open web repositories. So far there have been few systematic studies showing how big the extent of OA is, in particular studies covering all fields of science. Methodology/Principal Findings The proportion of peer reviewed scholarly journal articles, which are available openly in full text on the web, was studied using a random sample of 1837 titles and a web search engine. Of articles published in 2008, 8,5% were freely available at the publishers' sites. For an additional 11,9% free manuscript versions could be found using search engines, making the overall OA percentage 20,4%. Chemistry (13%) had the lowest overall share of OA, Earth Sciences (33%) the highest. In medicine, biochemistry and chemistry publishing in OA journals was more common. In all other fields author-posted manuscript copies dominated the picture. Conclusions/Significance The results show that OA already has a significant positive impact on the availability of the scientific journal literature and that there are big differences between scientific disciplines in the uptake. Due to the lack of awareness of OA-publishing among scientists in most fields outside physics, the results should be of general interest to all scholars. The results should also interest academic publishers, who need to take into account OA in their business strategies and copyright policies, as well as research funders, who like the NIH are starting to require OA availability of results from research projects they fund. The method and search tools developed also offer a good basis for more in-depth studies as well as longitudinal studies. PMID:20585653

  2. Open access to the scientific journal literature: situation 2009.

    Directory of Open Access Journals (Sweden)

    Bo-Christer Björk

    Full Text Available BACKGROUND: The Internet has recently made possible the free global availability of scientific journal articles. Open Access (OA can occur either via OA scientific journals, or via authors posting manuscripts of articles published in subscription journals in open web repositories. So far there have been few systematic studies showing how big the extent of OA is, in particular studies covering all fields of science. METHODOLOGY/PRINCIPAL FINDINGS: The proportion of peer reviewed scholarly journal articles, which are available openly in full text on the web, was studied using a random sample of 1837 titles and a web search engine. Of articles published in 2008, 8.5% were freely available at the publishers' sites. For an additional 11.9% free manuscript versions could be found using search engines, making the overall OA percentage 20.4%. Chemistry (13% had the lowest overall share of OA, Earth Sciences (33% the highest. In medicine, biochemistry and chemistry publishing in OA journals was more common. In all other fields author-posted manuscript copies dominated the picture. CONCLUSIONS/SIGNIFICANCE: The results show that OA already has a significant positive impact on the availability of the scientific journal literature and that there are big differences between scientific disciplines in the uptake. Due to the lack of awareness of OA-publishing among scientists in most fields outside physics, the results should be of general interest to all scholars. The results should also interest academic publishers, who need to take into account OA in their business strategies and copyright policies, as well as research funders, who like the NIH are starting to require OA availability of results from research projects they fund. The method and search tools developed also offer a good basis for more in-depth studies as well as longitudinal studies.

  3. Workflow Support for Advanced Grid-Enabled Computing

    OpenAIRE

    Xu, Fenglian; Eres, M.H.; Tao, Feng; Cox, Simon J.

    2004-01-01

    The Geodise project brings computer scientists and engineer's skills together to build up a service-oriented computing environmnet for engineers to perform complicated computations in a distributed system. The workflow tool is a front GUI to provide a full life cycle of workflow functions for Grid-enabled computing. The full life cycle of workflow functions have been enhanced based our initial research and development. The life cycle starts with a composition of a workflow, followed by an ins...

  4. ATLAS Grid Workflow Performance Optimization

    CERN Document Server

    Elmsheuser, Johannes; The ATLAS collaboration

    2018-01-01

    The CERN ATLAS experiment grid workflow system manages routinely 250 to 500 thousand concurrently running production and analysis jobs to process simulation and detector data. In total more than 300 PB of data is distributed over more than 150 sites in the WLCG. At this scale small improvements in the software and computing performance and workflows can lead to significant resource usage gains. ATLAS is reviewing together with CERN IT experts several typical simulation and data processing workloads for potential performance improvements in terms of memory and CPU usage, disk and network I/O. All ATLAS production and analysis grid jobs are instrumented to collect many performance metrics for detailed statistical studies using modern data analytics tools like ElasticSearch and Kibana. This presentation will review and explain the performance gains of several ATLAS simulation and data processing workflows and present analytics studies of the ATLAS grid workflows.

  5. Behavioral technique for workflow abstraction and matching

    NARCIS (Netherlands)

    Klai, K.; Ould Ahmed M'bareck, N.; Tata, S.; Dustdar, S.; Fiadeiro, J.L.; Sheth, A.

    2006-01-01

    This work is in line with the CoopFlow approach dedicated for workflow advertisement, interconnection, and cooperation in virtual organizations. In order to advertise workflows into a registry, we present in this paper a novel method to abstract behaviors of workflows into symbolic observation

  6. A performance study of grid workflow engines

    NARCIS (Netherlands)

    Stratan, C.; Iosup, A.; Epema, D.H.J.

    2008-01-01

    To benefit from grids, scientists require grid workflow engines that automatically manage the execution of inter-related jobs on the grid infrastructure. So far, the workflows community has focused on scheduling algorithms and on interface tools. Thus, while several grid workflow engines have been

  7. Experiences and lessons learned from creating a generalized workflow for data publication of field campaign datasets

    Science.gov (United States)

    Santhana Vannan, S. K.; Ramachandran, R.; Deb, D.; Beaty, T.; Wright, D.

    2017-12-01

    This paper summarizes the workflow challenges of curating and publishing data produced from disparate data sources and provides a generalized workflow solution to efficiently archive data generated by researchers. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) for biogeochemical dynamics and the Global Hydrology Resource Center (GHRC) DAAC have been collaborating on the development of a generalized workflow solution to efficiently manage the data publication process. The generalized workflow presented here are built on lessons learned from implementations of the workflow system. Data publication consists of the following steps: Accepting the data package from the data providers, ensuring the full integrity of the data files. Identifying and addressing data quality issues Assembling standardized, detailed metadata and documentation, including file level details, processing methodology, and characteristics of data files Setting up data access mechanisms Setup of the data in data tools and services for improved data dissemination and user experience Registering the dataset in online search and discovery catalogues Preserving the data location through Digital Object Identifiers (DOI) We will describe the steps taken to automate, and realize efficiencies to the above process. The goals of the workflow system are to reduce the time taken to publish a dataset, to increase the quality of documentation and metadata, and to track individual datasets through the data curation process. Utilities developed to achieve these goal will be described. We will also share metrics driven value of the workflow system and discuss the future steps towards creation of a common software framework.

  8. Workflow Management in CLARIN-DK

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2013-01-01

    The CLARIN-DK infrastructure is not only a repository of resources, but also a place where users can analyse, annotate, reformat and potentially even translate resources, using tools that are integrated in the infrastructure as web services. In many cases a single tool does not produce the desired...... with the features that describe her goal, because the workflow manager not only executes chains of tools in a workflow, but also takes care of autonomously devising workflows that serve the user’s intention, given the tools that currently are integrated in the infrastructure as web services. To do this...

  9. SWS: accessing SRS sites contents through Web Services.

    Science.gov (United States)

    Romano, Paolo; Marra, Domenico

    2008-03-26

    Web Services and Workflow Management Systems can support creation and deployment of network systems, able to automate data analysis and retrieval processes in biomedical research. Web Services have been implemented at bioinformatics centres and workflow systems have been proposed for biological data analysis. New databanks are often developed by taking into account these technologies, but many existing databases do not allow a programmatic access. Only a fraction of available databanks can thus be queried through programmatic interfaces. SRS is a well know indexing and search engine for biomedical databanks offering public access to many databanks and analysis tools. Unfortunately, these data are not easily and efficiently accessible through Web Services. We have developed 'SRS by WS' (SWS), a tool that makes information available in SRS sites accessible through Web Services. Information on known sites is maintained in a database, srsdb. SWS consists in a suite of WS that can query both srsdb, for information on sites and databases, and SRS sites. SWS returns results in a text-only format and can be accessed through a WSDL compliant client. SWS enables interoperability between workflow systems and SRS implementations, by also managing access to alternative sites, in order to cope with network and maintenance problems, and selecting the most up-to-date among available systems. Development and implementation of Web Services, allowing to make a programmatic access to an exhaustive set of biomedical databases can significantly improve automation of in-silico analysis. SWS supports this activity by making biological databanks that are managed in public SRS sites available through a programmatic interface.

  10. Multilevel Workflow System in the ATLAS Experiment

    CERN Document Server

    Borodin, M; The ATLAS collaboration; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A

    2015-01-01

    The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system comprised of many layers. In Big Data processing ATLAS deals with datasets, not individual files. Similarly a task (comprised of many jobs) has become a unit of the ATLAS workflow in distributed computing, with about 0.8M tasks processed per year. In order to manage the diversity of LHC physics (exceeding 35K physics samples per year), the individual data processing tasks are organized into workflows. For example, the Monte Carlo workflow is composed of many steps: generate or configure hard-processes, hadronize signal and minimum-bias (pileup) events, simulate energy deposition in the ATLAS detector, digitize electronics response, simulate triggers, reconstruct data, convert the reconstructed data into ROOT ntuples for physics analysis, etc. Outputs are merged and/or filtered as necessary to optimize the chain. The bi-level workflow manager - ProdSys2 - generates actual workflow tasks and their jobs...

  11. Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data.

    Science.gov (United States)

    Davidson, Robert L; Weber, Ralf J M; Liu, Haoyu; Sharma-Oates, Archana; Viant, Mark R

    2016-01-01

    Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility, standardization and reproducibility of computational metabolomics tools must be improved significantly. Here we present our end-to-end mass spectrometry metabolomics workflow in the widely used platform, Galaxy. Named Galaxy-M, our workflow has been developed for both direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) metabolomics. The range of tools presented spans from processing of raw data, e.g. peak picking and alignment, through data cleansing, e.g. missing value imputation, to preparation for statistical analysis, e.g. normalization and scaling, and principal components analysis (PCA) with associated statistical evaluation. We demonstrate the ease of using these Galaxy workflows via the analysis of DIMS and LC-MS datasets, and provide PCA scores and associated statistics to help other users to ensure that they can accurately repeat the processing and analysis of these two datasets. Galaxy and data are all provided pre-installed in a virtual machine (VM) that can be downloaded from the GigaDB repository. Additionally, source code, executables and installation instructions are available from GitHub. The Galaxy platform has enabled us to produce an easily accessible and reproducible computational metabolomics workflow. More tools could be added by the community to expand its functionality. We recommend that Galaxy-M workflow files are included within the supplementary information of publications, enabling metabolomics studies to achieve greater reproducibility.

  12. From Requirements via Colored Workflow Nets to an Implementation in Several Workflow Systems

    DEFF Research Database (Denmark)

    Mans, Ronny S.; van der Aalst, Willibrordus Martinus Pancratius; Molemann, A.J.

    2007-01-01

    Care organizations, such as hospitals, need to support complex and dynamic workflows. More- over, many disciplines are involved. This makes it important to avoid the typical disconnect between requirements and the actual implementation of the system. This paper proposes an approach where an Execu......Care organizations, such as hospitals, need to support complex and dynamic workflows. More- over, many disciplines are involved. This makes it important to avoid the typical disconnect between requirements and the actual implementation of the system. This paper proposes an approach where...... an Executable Use Case (EUC) and Colored Care organizations, such as hospitals, need to support complex and dynamic workflows. Moreover, many disciplines are involved. This makes it important to avoid the typical disconnect between requirements and the actual implementation of the system. This paper proposes...

  13. Digitizing specimens in a small herbarium: A viable workflow for collections working with limited resources1

    Science.gov (United States)

    Harris, Kari M.; Marsico, Travis D.

    2017-01-01

    Premise of the study: Small herbaria represent a significant portion of herbaria in the United States, but many are not digitizing their collections. Methods: At the Arkansas State University Herbarium (STAR), we have created a viable workflow to help small herbaria begin the digitization process, including suggestions for publishing data on the Internet. We calculated hourly rates of each phase of the digitization process. We also mapped accessions at the county level to determine geographic strengths in the collection. Results: All 17,678 accessioned flowering plant specimens at STAR are imaged, databased in Specify, and available electronically on the herbarium’s website. Students imaged the specimens at a mean rate of 145/h. We found differences in databasing rates between the graduate student leading the project (47/h) and undergraduate assistants (25/h). The majority of specimens at STAR were collected within the counties neighboring the institution. Discussion: With this workflow, we estimate that one person can digitize a 20,000-specimen collection in less than 2.5 yr by working only 10 h/wk. Because STAR is a small herbarium with limited resources, the application of the workflow described should assist curators of similar-sized collections as they contemplate and undertake the digitization process. PMID:28439474

  14. Digital workflows in contemporary orthodontics

    Directory of Open Access Journals (Sweden)

    Lars R Christensen

    2017-01-01

    Full Text Available Digital workflows are now increasingly possible in orthodontic practice. Workflows designed to improve the customization of orthodontic appliances are now available through laboratories and orthodontic manufacturing facilities in many parts of the world. These now have the potential to improve certain aspects of patient care.

  15. The Digital Editions of Revista Estudos Feministas (and the Democratization of the Access to the Scientific and Scademic Production

    Directory of Open Access Journals (Sweden)

    Sônia Weidner Maluf

    2008-01-01

    Full Text Available In this article we describe the electronic editions of Revista Estudos Feministas and our contribution towards a politics of democratization of the access to scientific and academic production. After a brief description of the three spaces of electronic editions of REF (the Electronic Library Scielo, the Scielo Social Science Portal and the Feminist Portal, we discuss the contribution of gender and feminist studies for the democratization of the access to the scientific and academic production.

  16. Web-based execution of graphical work-flows: a modular platform for multifunctional scientific process automation

    International Nuclear Information System (INIS)

    De Ley, E.; Jacobs, D.; Ounsy, M.

    2012-01-01

    The Passerelle process automation suite offers a fundamentally modular solution platform, based on a layered integration of several best-of-breed technologies. It has been successfully applied by Synchrotron Soleil as the sequencer for data acquisition and control processes on its beamlines, integrated with TANGO as a control bus and GlobalScreen TM ) as the SCADA package. Since last year it is being used as the graphical work-flow component for the development of an eclipse-based Data Analysis Work Bench, at ESRF. The top layer of Passerelle exposes an actor-based development paradigm, based on the Ptolemy framework (UC Berkeley). Actors provide explicit reusability and strong decoupling, combined with an inherently concurrent execution model. Actor libraries exist for TANGO integration, web-services, database operations, flow control, rules-based analysis, mathematical calculations, launching external scripts etc. Passerelle's internal architecture is based on OSGi, the major Java framework for modular service-based applications. A large set of modules exist that can be recombined as desired to obtain different features and deployment models. Besides desktop versions of the Passerelle work-flow workbench, there is also the Passerelle Manager. It is a secured web application including a graphical editor, for centralized design, execution, management and monitoring of process flows, integrating standard Java Enterprise services with OSGi. We will present the internal technical architecture, some interesting application cases and the lessons learnt. (authors)

  17. Enhanced reproducibility of SADI web service workflows with Galaxy and Docker.

    Science.gov (United States)

    Aranguren, Mikel Egaña; Wilkinson, Mark D

    2015-01-01

    Semantic Web technologies have been widely applied in the life sciences, for example by data providers such as OpenLifeData and through web services frameworks such as SADI. The recently reported OpenLifeData2SADI project offers access to the vast OpenLifeData data store through SADI services. This article describes how to merge data retrieved from OpenLifeData2SADI with other SADI services using the Galaxy bioinformatics analysis platform, thus making this semantic data more amenable to complex analyses. This is demonstrated using a working example, which is made distributable and reproducible through a Docker image that includes SADI tools, along with the data and workflows that constitute the demonstration. The combination of Galaxy and Docker offers a solution for faithfully reproducing and sharing complex data retrieval and analysis workflows based on the SADI Semantic web service design patterns.

  18. The equivalency between logic Petri workflow nets and workflow nets.

    Science.gov (United States)

    Wang, Jing; Yu, ShuXia; Du, YuYue

    2015-01-01

    Logic Petri nets (LPNs) can describe and analyze batch processing functions and passing value indeterminacy in cooperative systems. Logic Petri workflow nets (LPWNs) are proposed based on LPNs in this paper. Process mining is regarded as an important bridge between modeling and analysis of data mining and business process. Workflow nets (WF-nets) are the extension to Petri nets (PNs), and have successfully been used to process mining. Some shortcomings cannot be avoided in process mining, such as duplicate tasks, invisible tasks, and the noise of logs. The online shop in electronic commerce in this paper is modeled to prove the equivalence between LPWNs and WF-nets, and advantages of LPWNs are presented.

  19. The Equivalency between Logic Petri Workflow Nets and Workflow Nets

    Science.gov (United States)

    Wang, Jing; Yu, ShuXia; Du, YuYue

    2015-01-01

    Logic Petri nets (LPNs) can describe and analyze batch processing functions and passing value indeterminacy in cooperative systems. Logic Petri workflow nets (LPWNs) are proposed based on LPNs in this paper. Process mining is regarded as an important bridge between modeling and analysis of data mining and business process. Workflow nets (WF-nets) are the extension to Petri nets (PNs), and have successfully been used to process mining. Some shortcomings cannot be avoided in process mining, such as duplicate tasks, invisible tasks, and the noise of logs. The online shop in electronic commerce in this paper is modeled to prove the equivalence between LPWNs and WF-nets, and advantages of LPWNs are presented. PMID:25821845

  20. Responsive web design workflow

    OpenAIRE

    LAAK, TIMO

    2013-01-01

    Responsive Web Design Workflow is a literature review about Responsive Web Design, a web standards based modern web design paradigm. The goals of this research were to define what responsive web design is, determine its importance in building modern websites and describe a workflow for responsive web design projects. Responsive web design is a paradigm to create adaptive websites, which respond to the properties of the media that is used to render them. The three key elements of responsi...

  1. Pro WF Windows Workflow in NET 40

    CERN Document Server

    Bukovics, Bruce

    2010-01-01

    Windows Workflow Foundation (WF) is a revolutionary part of the .NET 4 Framework that allows you to orchestrate human and system interactions as a series of workflows that can be easily mapped, analyzed, adjusted, and implemented. As business problems become more complex, the need for workflow-based solutions has never been more evident. WF provides a simple and consistent way to model and implement complex problems. As a developer, you focus on developing the business logic for individual workflow tasks. The runtime handles the execution of those tasks after they have been composed into a wor

  2. Patient-centered care requires a patient-oriented workflow model.

    Science.gov (United States)

    Ozkaynak, Mustafa; Brennan, Patricia Flatley; Hanauer, David A; Johnson, Sharon; Aarts, Jos; Zheng, Kai; Haque, Saira N

    2013-06-01

    Effective design of health information technology (HIT) for patient-centered care requires consideration of workflow from the patient's perspective, termed 'patient-oriented workflow.' This approach organizes the building blocks of work around the patients who are moving through the care system. Patient-oriented workflow complements the more familiar clinician-oriented workflow approaches, and offers several advantages, including the ability to capture simultaneous, cooperative work, which is essential in care delivery. Patient-oriented workflow models can also provide an understanding of healthcare work taking place in various formal and informal health settings in an integrated manner. We present two cases demonstrating the potential value of patient-oriented workflow models. Significant theoretical, methodological, and practical challenges must be met to ensure adoption of patient-oriented workflow models. Patient-oriented workflow models define meaningful system boundaries and can lead to HIT implementations that are more consistent with cooperative work and its emergent features.

  3. Using Mobile Agents to Implement Workflow System

    Institute of Scientific and Technical Information of China (English)

    LI Jie; LIU Xian-xing; GUO Zheng-wei

    2004-01-01

    Current workflow management systems usually adopt the existing technologies such as TCP/IP-based Web technologies and CORBA as well to fulfill the bottom communications.Very often it has been considered only from a theoretical point of view, mainly for the lack of concrete possibilities to execute with elasticity.MAT (Mobile Agent Technology) represents a very attractive approach to the distributed control of computer networks and a valid alternative to the implementation of strategies for workflow system.This paper mainly focuses on improving the performance of workflow system by using MAT.Firstly, the performances of workflow systems based on both CORBA and mobile agent are summarized and analyzed; Secondly, the performance contrast is presented by introducing the mathematic model of each kind of data interaction process respectively.Last, a mobile agent-based workflow system named MAWMS is presented and described in detail.

  4. SWS: accessing SRS sites contents through Web Services

    OpenAIRE

    Romano, Paolo; Marra, Domenico

    2008-01-01

    Background Web Services and Workflow Management Systems can support creation and deployment of network systems, able to automate data analysis and retrieval processes in biomedical research. Web Services have been implemented at bioinformatics centres and workflow systems have been proposed for biological data analysis. New databanks are often developed by taking into account these technologies, but many existing databases do not allow a programmatic access. Only a fraction of available datab...

  5. Some challenges and issues in managing, and preserving access to, long-lived collections of digital scientific and technical data

    Directory of Open Access Journals (Sweden)

    W L Anderson

    2006-01-01

    Full Text Available One goal of the Committee on Data for Science and Technology is to solicit information about, promote discussion of, and support action on the many issues related to scientific and technical data preservation, archiving, and access. This brief paper describes four broad categories of issues that help to organize discussion, learning, and action regarding the work needed to support the long-term preservation of, and access to, scientific and technical data. In each category, some specific issues and areas of concern are described.

  6. ADVANCED APPROACH TO PRODUCTION WORKFLOW COMPOSITION ON ENGINEERING KNOWLEDGE PORTALS

    OpenAIRE

    Novogrudska, Rina; Kot, Tatyana; Globa, Larisa; Schill, Alexander

    2016-01-01

    Background. In the environment of engineering knowledge portals great amount of partial workflows is concentrated. Such workflows are composed into general workflow aiming to perform real complex production task. Characteristics of partial workflows and general workflow structure are not studied enough, that affects the impossibility of general production workflowdynamic composition.Objective. Creating an approach to the general production workflow dynamic composition based on the partial wor...

  7. Ontology-Driven Discovery of Scientific Computational Entities

    Science.gov (United States)

    Brazier, Pearl W.

    2010-01-01

    Many geoscientists use modern computational resources, such as software applications, Web services, scientific workflows and datasets that are readily available on the Internet, to support their research and many common tasks. These resources are often shared via human contact and sometimes stored in data portals; however, they are not necessarily…

  8. Wildfire: distributed, Grid-enabled workflow construction and execution

    Directory of Open Access Journals (Sweden)

    Issac Praveen

    2005-03-01

    Full Text Available Abstract Background We observe two trends in bioinformatics: (i analyses are increasing in complexity, often requiring several applications to be run as a workflow; and (ii multiple CPU clusters and Grids are available to more scientists. The traditional solution to the problem of running workflows across multiple CPUs required programming, often in a scripting language such as perl. Programming places such solutions beyond the reach of many bioinformatics consumers. Results We present Wildfire, a graphical user interface for constructing and running workflows. Wildfire borrows user interface features from Jemboss and adds a drag-and-drop interface allowing the user to compose EMBOSS (and other programs into workflows. For execution, Wildfire uses GEL, the underlying workflow execution engine, which can exploit available parallelism on multiple CPU machines including Beowulf-class clusters and Grids. Conclusion Wildfire simplifies the tasks of constructing and executing bioinformatics workflows.

  9. Database Support for Workflow Management: The WIDE Project

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Pernici, B; Sánchez, G.; Unknown, [Unknown

    1999-01-01

    Database Support for Workflow Management: The WIDE Project presents the results of the ESPRIT WIDE project on advanced database support for workflow management. The book discusses the state of the art in combining database management and workflow management technology, especially in the areas of

  10. Scientific Data Management Center for Enabling Technologies

    Energy Technology Data Exchange (ETDEWEB)

    Vouk, Mladen A.

    2013-01-15

    data management technologies to DOE application scientists in astrophysics, climate, fusion, and biology. Equally important, it established collaborations with these scientists to better understand their science as well as their forthcoming data management and data analytics challenges. Building on our early successes, we have greatly enhanced, robustified, and deployed our technology to these communities. In some cases, we identified new needs that have been addressed in order to simplify the use of our technology by scientists. This report summarizes our work so far in SciDAC-2. Our approach is to employ an evolutionary development and deployment process: from research through prototypes to deployment and infrastructure. Accordingly, we have organized our activities in three layers that abstract the end-to-end data flow described above. We labeled the layers (from bottom to top): a) Storage Efficient Access (SEA), b) Data Mining and Analysis (DMA), c) Scientific Process Automation (SPA). The SEA layer is immediately on top of hardware, operating systems, file systems, and mass storage systems, and provides parallel data access technology, and transparent access to archival storage. The DMA layer, which builds on the functionality of the SEA layer, consists of indexing, feature identification, and parallel statistical analysis technology. The SPA layer, which is on top of the DMA layer, provides the ability to compose scientific workflows from the components in the DMA layer as well as application specific modules. NCSU work performed under this contract was primarily at the SPA layer.

  11. Contracts for Cross-Organizational Workflow Management

    NARCIS (Netherlands)

    Koetsier, M.J.; Grefen, P.W.P.J.; Vonk, J.

    1999-01-01

    Nowadays, many organizations form dynamic partnerships to deal effectively with market requirements. As companies use automated workflow systems to control their processes, a way of linking workflow processes in different organizations is useful in turning the co-operating companies into a seamless

  12. Workflow Patterns for Business Process Modeling

    NARCIS (Netherlands)

    Thom, Lucineia Heloisa; Lochpe, Cirano; Reichert, M.U.

    For its reuse advantages, workflow patterns (e.g., control flow patterns, data patterns, resource patterns) are increasingly attracting the interest of both researchers and vendors. Frequently, business process or workflow models can be assembeled out of a set of recurrent process fragments (or

  13. Implementing Workflow Reconfiguration in WS-BPEL

    DEFF Research Database (Denmark)

    Mazzara, Manuel; Dragoni, Nicola; Zhou, Mu

    2012-01-01

    This paper investigates the problem of dynamic reconfiguration by means of a workflow-based case study used for discussion. We state the requirements on a system implementing the workflow and its reconfiguration, and we describe the system’s design in BPMN. WS-BPEL, a language that would not natu......This paper investigates the problem of dynamic reconfiguration by means of a workflow-based case study used for discussion. We state the requirements on a system implementing the workflow and its reconfiguration, and we describe the system’s design in BPMN. WS-BPEL, a language that would...... not naturally support dynamic change, is used as a target for implementation. The WS-BPEL recovery framework is here exploited to implement the reconfiguration using principles derived from previous research in process algebra and two mappings from BPMN to WS-BPEL are presented, one automatic and only mostly...

  14. Integrated workflows for spiking neuronal network simulations

    Directory of Open Access Journals (Sweden)

    Ján eAntolík

    2013-12-01

    Full Text Available The increasing availability of computational resources is enabling more detailed, realistic modelling in computational neuroscience, resulting in a shift towards more heterogeneous models of neuronal circuits, and employment of complex experimental protocols. This poses a challenge for existing tool chains, as the set of tools involved in a typical modeller's workflow is expanding concomitantly, with growing complexity in the metadata flowing between them. For many parts of the workflow, a range of tools is available; however, numerous areas lack dedicated tools, while integration of existing tools is limited. This forces modellers to either handle the workflow manually, leading to errors, or to write substantial amounts of code to automate parts of the workflow, in both cases reducing their productivity.To address these issues, we have developed Mozaik: a workflow system for spiking neuronal network simulations written in Python. Mozaik integrates model, experiment and stimulation specification, simulation execution, data storage, data analysis and visualisation into a single automated workflow, ensuring that all relevant metadata are available to all workflow components. It is based on several existing tools, including PyNN, Neo and Matplotlib. It offers a declarative way to specify models and recording configurations using hierarchically organised configuration files. Mozaik automatically records all data together with all relevant metadata about the experimental context, allowing automation of the analysis and visualisation stages. Mozaik has a modular architecture, and the existing modules are designed to be extensible with minimal programming effort. Mozaik increases the productivity of running virtual experiments on highly structured neuronal networks by automating the entire experimental cycle, while increasing the reliability of modelling studies by relieving the user from manual handling of the flow of metadata between the individual

  15. ESO Reflex: a graphical workflow engine for data reduction

    Science.gov (United States)

    Hook, Richard; Ullgrén, Marko; Romaniello, Martino; Maisala, Sami; Oittinen, Tero; Solin, Otto; Savolainen, Ville; Järveläinen, Pekka; Tyynelä, Jani; Péron, Michèle; Ballester, Pascal; Gabasch, Armin; Izzo, Carlo

    ESO Reflex is a prototype software tool that provides a novel approach to astronomical data reduction by integrating a modern graphical workflow system (Taverna) with existing legacy data reduction algorithms. Most of the raw data produced by instruments at the ESO Very Large Telescope (VLT) in Chile are reduced using recipes. These are compiled C applications following an ESO standard and utilising routines provided by the Common Pipeline Library (CPL). Currently these are run in batch mode as part of the data flow system to generate the input to the ESO/VLT quality control process and are also exported for use offline. ESO Reflex can invoke CPL-based recipes in a flexible way through a general purpose graphical interface. ESO Reflex is based on the Taverna system that was originally developed within the UK life-sciences community. Workflows have been created so far for three VLT/VLTI instruments, and the GUI allows the user to make changes to these or create workflows of their own. Python scripts or IDL procedures can be easily brought into workflows and a variety of visualisation and display options, including custom product inspection and validation steps, are available. Taverna is intended for use with web services and experiments using ESO Reflex to access Virtual Observatory web services have been successfully performed. ESO Reflex is the main product developed by Sampo, a project led by ESO and conducted by a software development team from Finland as an in-kind contribution to joining ESO. The goal was to look into the needs of the ESO community in the area of data reduction environments and to create pilot software products that illustrate critical steps along the road to a new system. Sampo concluded early in 2008. This contribution will describe ESO Reflex and show several examples of its use both locally and using Virtual Observatory remote web services. ESO Reflex is expected to be released to the community in early 2009.

  16. Evaluation of Workflow Management Systems - A Meta Model Approach

    Directory of Open Access Journals (Sweden)

    Michael Rosemann

    1998-11-01

    Full Text Available The automated enactment of processes through the use of workflow management systems enables the outsourcing of the control flow from application systems. By now a large number of systems, that follow different workflow paradigms, are available. This leads to the problem of selecting the appropriate workflow management system for a given situation. In this paper we outline the benefits of a meta model approach for the evaluation and comparison of different workflow management systems. After a general introduction on the topic of meta modeling the meta models of the workflow management systems WorkParty (Siemens Nixdorf and FlowMark (IBM are compared as an example. These product specific meta models can be generalized to meta reference models, which helps to specify a workflow methodology. Exemplary, an organisational reference meta model is presented, which helps users in specifying their requirements for a workflow management system.

  17. The complete digital workflow in fixed prosthodontics: a systematic review.

    Science.gov (United States)

    Joda, Tim; Zarone, Fernando; Ferrari, Marco

    2017-09-19

    generation, allocation concealment, blinding, completeness of outcome data, selective reporting, and other bias using the Cochrane Collaboration tool. A judgment of risk of bias was assigned if one or more key domains had a high or unclear risk of bias. An official registration of the systematic review was not performed. The systematic search identified 67 titles, 32 abstracts thereof were screened, and subsequently, three full-texts included for data extraction. Analysed RCTs were heterogeneous without follow-up. One study demonstrated that fully digitally produced dental crowns revealed the feasibility of the process itself; however, the marginal precision was lower for lithium disilicate (LS2) restorations (113.8 μm) compared to conventional metal-ceramic (92.4 μm) and zirconium dioxide (ZrO2) crowns (68.5 μm) (p digital workflow was more than twofold faster (75.3 min) in comparison to the mixed analog-digital workflow (156.6 min) (p digital workflows in fixed prosthodontics is low. Scientifically proven recommendations for clinical routine cannot be given at this time. Research with high-quality trials seems to be slower than the industrial progress of available digital applications. Future research with well-designed RCTs including follow-up observation is compellingly necessary in the field of complete digital processing.

  18. Role of library's subscription licenses in promoting open access to scientific research

    KAUST Repository

    Buck, Stephen

    2018-01-01

    This presentation, based on KAUST’’s experience to date, will attempt to explain the different ways of bringing Open Access models to scientific Publisher’s licenses. Our dual approach with offset pricing is to redirect subscription money to publishing money and embed green open access deposition terms in understandable language in our license agreements. Resolving the inherent complexities in open access publishing, repository depositions and offsetting models will save libraries money and also time wasted on tedious and unnecessary administration work. Researchers will also save their time with overall clarity and transparency. This will enable trust and, where mistakes are made, and there inevitably will be with untried models, we can learn from these mistakes and make better, more robust services with auto deposition of our articles to our repository fed by Publishers’ themselves. The plan is to cover all Publishers with OA license terms for KAUST author’s right while continuing our subscription to them. There are marketing campaigns, awareness sessions are planned, in addition to establishing Libguides to help researchers, in addition to manage offset pricing models.

  19. Role of library's subscription licenses in promoting open access to scientific research

    KAUST Repository

    Buck, Stephen

    2018-04-30

    This presentation, based on KAUST’’s experience to date, will attempt to explain the different ways of bringing Open Access models to scientific Publisher’s licenses. Our dual approach with offset pricing is to redirect subscription money to publishing money and embed green open access deposition terms in understandable language in our license agreements. Resolving the inherent complexities in open access publishing, repository depositions and offsetting models will save libraries money and also time wasted on tedious and unnecessary administration work. Researchers will also save their time with overall clarity and transparency. This will enable trust and, where mistakes are made, and there inevitably will be with untried models, we can learn from these mistakes and make better, more robust services with auto deposition of our articles to our repository fed by Publishers’ themselves. The plan is to cover all Publishers with OA license terms for KAUST author’s right while continuing our subscription to them. There are marketing campaigns, awareness sessions are planned, in addition to establishing Libguides to help researchers, in addition to manage offset pricing models.

  20. Workflow of the Grover algorithm simulation incorporating CUDA and GPGPU

    Science.gov (United States)

    Lu, Xiangwen; Yuan, Jiabin; Zhang, Weiwei

    2013-09-01

    The Grover quantum search algorithm, one of only a few representative quantum algorithms, can speed up many classical algorithms that use search heuristics. No true quantum computer has yet been developed. For the present, simulation is one effective means of verifying the search algorithm. In this work, we focus on the simulation workflow using a compute unified device architecture (CUDA). Two simulation workflow schemes are proposed. These schemes combine the characteristics of the Grover algorithm and the parallelism of general-purpose computing on graphics processing units (GPGPU). We also analyzed the optimization of memory space and memory access from this perspective. We implemented four programs on CUDA to evaluate the performance of schemes and optimization. Through experimentation, we analyzed the organization of threads suited to Grover algorithm simulations, compared the storage costs of the four programs, and validated the effectiveness of optimization. Experimental results also showed that the distinguished program on CUDA outperformed the serial program of libquantum on a CPU with a speedup of up to 23 times (12 times on average), depending on the scale of the simulation.

  1. Multilevel Workflow System in the ATLAS Experiment

    International Nuclear Information System (INIS)

    Borodin, M; De, K; Navarro, J Garcia; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A

    2015-01-01

    The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system comprised of many layers. In Big Data processing ATLAS deals with datasets, not individual files. Similarly a task (comprised of many jobs) has become a unit of the ATLAS workflow in distributed computing, with about 0.8M tasks processed per year. In order to manage the diversity of LHC physics (exceeding 35K physics samples per year), the individual data processing tasks are organized into workflows. For example, the Monte Carlo workflow is composed of many steps: generate or configure hard-processes, hadronize signal and minimum-bias (pileup) events, simulate energy deposition in the ATLAS detector, digitize electronics response, simulate triggers, reconstruct data, convert the reconstructed data into ROOT ntuples for physics analysis, etc. Outputs are merged and/or filtered as necessary to optimize the chain. The bi-level workflow manager - ProdSys2 - generates actual workflow tasks and their jobs are executed across more than a hundred distributed computing sites by PanDA - the ATLAS job-level workload management system. On the outer level, the Database Engine for Tasks (DEfT) empowers production managers with templated workflow definitions. On the next level, the Job Execution and Definition Interface (JEDI) is integrated with PanDA to provide dynamic job definition tailored to the sites capabilities. We report on scaling up the production system to accommodate a growing number of requirements from main ATLAS areas: Trigger, Physics and Data Preparation. (paper)

  2. Workflow User Interfaces Patterns

    Directory of Open Access Journals (Sweden)

    Jean Vanderdonckt

    2012-03-01

    Full Text Available Este trabajo presenta una colección de patrones de diseño de interfaces de usuario para sistemas de información para el flujo de trabajo; la colección incluye cuarenta y tres patrones clasificados en siete categorías identificados a partir de la lógica del ciclo de vida de la tarea sobre la base de la oferta y la asignación de tareas a los responsables de realizarlas (i. e. recursos humanos durante el flujo de trabajo. Cada patrón de la interfaz de usuario de flujo de trabajo (WUIP, por sus siglas en inglés se caracteriza por las propiedades expresadas en el lenguaje PLML para expresar patrones y complementado por otros atributos y modelos que se adjuntan a dicho modelo: la interfaz de usuario abstracta y el modelo de tareas correspondiente. Estos modelos se especifican en un lenguaje de descripción de interfaces de usuario. Todos los WUIPs se almacenan en una biblioteca y se pueden recuperar a través de un editor de flujo de trabajo que vincula a cada patrón de asignación de trabajo a su WUIP correspondiente.A collection of user interface design patterns for workflow information systems is presented that contains forty three resource patterns classified in seven categories. These categories and their corresponding patterns have been logically identified from the task life cycle based on offering and allocation operations. Each Workflow User Interface Pattern (WUIP is characterized by properties expressed in the PLML markup language for expressing patterns and augmented by additional attributes and models attached to the pattern: the abstract user interface and the corresponding task model. These models are specified in a User Interface Description Language. All WUIPs are stored in a library and can be retrieved within a workflow editor that links each workflow pattern to its corresponding WUIP, thus giving rise to a user interface for each workflow pattern.

  3. Enhancing Scientific Collaboration, Transparency, and Public Access: Utilizing the Second Life Platform to Convene a Scientific Conference in 3-D Virtual Space

    Science.gov (United States)

    McGee, B. W.

    2006-12-01

    Recent studies reveal a general mistrust of science as well as a distorted perception of the scientific method by the public at-large. Concurrently, the number of science undergraduate and graduate students is in decline. By taking advantage of emergent technologies not only for direct public outreach but also to enhance public accessibility to the science process, it may be possible to both begin a reversal of popular scientific misconceptions and to engage a new generation of scientists. The Second Life platform is a 3-D virtual world produced and operated by Linden Research, Inc., a privately owned company instituted to develop new forms of immersive entertainment. Free and downloadable to the public, Second Life offers an imbedded physics engine, streaming audio and video capability, and unlike other "multiplayer" software, the objects and inhabitants of Second Life are entirely designed and created by its users, providing an open-ended experience without the structure of a traditional video game. Already, educational institutions, virtual museums, and real-world businesses are utilizing Second Life for teleconferencing, pre-visualization, and distance education, as well as to conduct traditional business. However, the untapped potential of Second Life lies in its versatility, where the limitations of traditional scientific meeting venues do not exist, and attendees need not be restricted by prohibitive travel costs. It will be shown that the Second Life system enables scientific authors and presenters at a "virtual conference" to display figures and images at full resolution, employ audio-visual content typically not available to conference organizers, and to perform demonstrations or premier three-dimensional renderings of objects, processes, or information. An enhanced presentation like those possible with Second Life would be more engaging to non- scientists, and such an event would be accessible to the general users of Second Life, who could have an

  4. Optimal resource assignment in workflows for maximizing cooperation

    NARCIS (Netherlands)

    Kumar, Akhil; Dijkman, R.M.; Song, Minseok; Daniel, Fl.; Wang, J.; Weber, B.

    2013-01-01

    A workflow is a team process since many actors work on various tasks to complete an instance. Resource management in such workflows deals with assignment of tasks to workers or actors. In team formation, it is necessary to ensure that members of a team are compatible with each other. When a workflow

  5. Performing Workflows in Pervasive Environments Based on Context Specifications

    OpenAIRE

    Xiping Liu; Jianxin Chen

    2010-01-01

    The workflow performance consists of the performance of activities and transitions between activities. Along with the fast development of varied computing devices, activities in workflows and transitions between activities could be performed in pervasive ways, whichcauses that the workflow performance need to migrate from traditional computing environments to pervasive environments. To perform workflows in pervasive environments needs to take account of the context information which affects b...

  6. Unified Access Architecture for Large-Scale Scientific Datasets

    Science.gov (United States)

    Karna, Risav

    2014-05-01

    Data-intensive sciences have to deploy diverse large scale database technologies for data analytics as scientists have now been dealing with much larger volume than ever before. While array databases have bridged many gaps between the needs of data-intensive research fields and DBMS technologies (Zhang 2011), invocation of other big data tools accompanying these databases is still manual and separate the database management's interface. We identify this as an architectural challenge that will increasingly complicate the user's work flow owing to the growing number of useful but isolated and niche database tools. Such use of data analysis tools in effect leaves the burden on the user's end to synchronize the results from other data manipulation analysis tools with the database management system. To this end, we propose a unified access interface for using big data tools within large scale scientific array database using the database queries themselves to embed foreign routines belonging to the big data tools. Such an invocation of foreign data manipulation routines inside a query into a database can be made possible through a user-defined function (UDF). UDFs that allow such levels of freedom as to call modules from another language and interface back and forth between the query body and the side-loaded functions would be needed for this purpose. For the purpose of this research we attempt coupling of four widely used tools Hadoop (hadoop1), Matlab (matlab1), R (r1) and ScaLAPACK (scalapack1) with UDF feature of rasdaman (Baumann 98), an array-based data manager, for investigating this concept. The native array data model used by an array-based data manager provides compact data storage and high performance operations on ordered data such as spatial data, temporal data, and matrix-based data for linear algebra operations (scidbusr1). Performances issues arising due to coupling of tools with different paradigms, niche functionalities, separate processes and output

  7. Verifying generalized soundness for workflow nets

    NARCIS (Netherlands)

    Hee, van K.M.; Oanea, O.I.; Sidorova, N.; Voorhoeve, M.; Virbitskaite, I.; Voronkov, A.

    2007-01-01

    We improve the decision procedure from [10] for the problem of generalized soundness of workflow nets. A workflow net is generalized sound iff every marking reachable from an initial marking with k tokens on the initial place terminates properly, i.e. it can reach a marking with k tokens on the

  8. Worklist handling in workflow-enabled radiological application systems

    Science.gov (United States)

    Wendler, Thomas; Meetz, Kirsten; Schmidt, Joachim; von Berg, Jens

    2000-05-01

    For the next generation integrated information systems for health care applications, more emphasis has to be put on systems which, by design, support the reduction of cost, the increase inefficiency and the improvement of the quality of services. A substantial contribution to this will be the modeling. optimization, automation and enactment of processes in health care institutions. One of the perceived key success factors for the system integration of processes will be the application of workflow management, with workflow management systems as key technology components. In this paper we address workflow management in radiology. We focus on an important aspect of workflow management, the generation and handling of worklists, which provide workflow participants automatically with work items that reflect tasks to be performed. The display of worklists and the functions associated with work items are the visible part for the end-users of an information system using a workflow management approach. Appropriate worklist design and implementation will influence user friendliness of a system and will largely influence work efficiency. Technically, in current imaging department information system environments (modality-PACS-RIS installations), a data-driven approach has been taken: Worklist -- if present at all -- are generated from filtered views on application data bases. In a future workflow-based approach, worklists will be generated by autonomous workflow services based on explicit process models and organizational models. This process-oriented approach will provide us with an integral view of entire health care processes or sub- processes. The paper describes the basic mechanisms of this approach and summarizes its benefits.

  9. A virtual radiation therapy workflow training simulation

    International Nuclear Information System (INIS)

    Bridge, P.; Crowe, S.B.; Gibson, G.; Ellemor, N.J.; Hargrave, C.; Carmichael, M.

    2016-01-01

    Aim: Simulation forms an increasingly vital component of clinical skills development in a wide range of professional disciplines. Simulation of clinical techniques and equipment is designed to better prepare students for placement by providing an opportunity to learn technical skills in a “safe” academic environment. In radiotherapy training over the last decade or so this has predominantly comprised treatment planning software and small ancillary equipment such as mould room apparatus. Recent virtual reality developments have dramatically changed this approach. Innovative new simulation applications and file processing and interrogation software have helped to fill in the gaps to provide a streamlined virtual workflow solution. This paper outlines the innovations that have enabled this, along with an evaluation of the impact on students and educators. Method: Virtual reality software and workflow applications have been developed to enable the following steps of radiation therapy to be simulated in an academic environment: CT scanning using a 3D virtual CT scanner simulation; batch CT duplication; treatment planning; 3D plan evaluation using a virtual linear accelerator; quantitative plan assessment, patient setup with lasers; and image guided radiotherapy software. Results: Evaluation of the impact of the virtual reality workflow system highlighted substantial time saving for academic staff as well as positive feedback from students relating to preparation for clinical placements. Students valued practice in the “safe” environment and the opportunity to understand the clinical workflow ahead of clinical department experience. Conclusion: Simulation of most of the radiation therapy workflow and tasks is feasible using a raft of virtual reality simulation applications and supporting software. Benefits of this approach include time-saving, embedding of a case-study based approach, increased student confidence, and optimal use of the clinical environment

  10. Design, Modelling and Analysis of a Workflow Reconfiguration

    DEFF Research Database (Denmark)

    Mazzara, Manuel; Abouzaid, Faisal; Dragoni, Nicola

    2011-01-01

    This paper describes a case study involving the reconfiguration of an office workflow. We state the requirements on a system implementing the workflow and its reconfiguration, and describe the system’s design in BPMN. We then use an asynchronous pi-calculus and Web.1 to model the design and to ve......This paper describes a case study involving the reconfiguration of an office workflow. We state the requirements on a system implementing the workflow and its reconfiguration, and describe the system’s design in BPMN. We then use an asynchronous pi-calculus and Web.1 to model the design...

  11. REDLetr: Workflow and tools to support the migration of legacy clinical data capture systems to REDCap.

    Science.gov (United States)

    Dunn, William D; Cobb, Jake; Levey, Allan I; Gutman, David A

    2016-09-01

    A memory clinic at an academic medical center has relied on several ad hoc data capture systems including Microsoft Access and Excel for cognitive assessments over the last several years. However these solutions are challenging to maintain and limit the potential of hypothesis-driven or longitudinal research. REDCap, a secure web application based on PHP and MySQL, is a practical solution for improving data capture and organization. Here, we present a workflow and toolset to facilitate legacy data migration and real-time clinical research data collection into REDCap as well as challenges encountered. Legacy data consisted of neuropsychological tests stored in over 4000 Excel workbooks. Functions for data extraction, norm scoring, converting to REDCap-compatible formats, accessing the REDCap API, and clinical report generation were developed and executed in Python. Over 400 unique data points for each workbook were migrated and integrated into our REDCap database. Moving forward, our REDCap-based system replaces the Excel-based data collection method as well as eases the integration into the standard clinical research workflow and Electronic Health Record. In the age of growing data, efficient organization and storage of clinical and research data is critical for advancing research and providing efficient patient care. We believe that the workflow and tools described in this work to promote legacy data integration as well as real time data collection into REDCap ultimately facilitate these goals. Published by Elsevier Ireland Ltd.

  12. Scientific data management challenges, technology and deployment

    CERN Document Server

    Rotem, Doron

    2010-01-01

    Dealing with the volume, complexity, and diversity of data currently being generated by scientific experiments and simulations often causes scientists to waste productive time. Scientific Data Management: Challenges, Technology, and Deployment describes cutting-edge technologies and solutions for managing and analyzing vast amounts of data, helping scientists focus on their scientific goals. The book begins with coverage of efficient storage systems, discussing how to write and read large volumes of data without slowing the simulation, analysis, or visualization processes. It then focuses on the efficient data movement and management of storage spaces and explores emerging database systems for scientific data. The book also addresses how to best organize data for analysis purposes, how to effectively conduct searches over large datasets, how to successfully automate multistep scientific process workflows, and how to automatically collect metadata and lineage information. This book provides a comprehensive u...

  13. Quantitative analysis of probabilistic BPMN workflows

    DEFF Research Database (Denmark)

    Herbert, Luke Thomas; Sharp, Robin

    2012-01-01

    We present a framework for modelling and analysis of realworld business workflows. We present a formalised core subset of the Business Process Modelling and Notation (BPMN) and then proceed to extend this language with probabilistic nondeterministic branching and general-purpose reward annotations...... of events, reward-based properties and best- and worst- case scenarios. We develop a simple example of medical workflow and demonstrate the utility of this analysis in accurate provisioning of drug stocks. Finally, we suggest a path to building upon these techniques to cover the entire BPMN language, allow...... for more complex annotations and ultimately to automatically synthesise workflows by composing predefined sub-processes, in order to achieve a configuration that is optimal for parameters of interest....

  14. Resilient workflows for computational mechanics platforms

    International Nuclear Information System (INIS)

    Nguyen, Toan; Trifan, Laurentiu; Desideri, Jean-Antoine

    2010-01-01

    Workflow management systems have recently been the focus of much interest and many research and deployment for scientific applications worldwide. Their ability to abstract the applications by wrapping application codes have also stressed the usefulness of such systems for multidiscipline applications. When complex applications need to provide seamless interfaces hiding the technicalities of the computing infrastructures, their high-level modeling, monitoring and execution functionalities help giving production teams seamless and effective facilities. Software integration infrastructures based on programming paradigms such as Python, Mathlab and Scilab have also provided evidence of the usefulness of such approaches for the tight coupling of multidisciplne application codes. Also high-performance computing based on multi-core multi-cluster infrastructures open new opportunities for more accurate, more extensive and effective robust multi-discipline simulations for the decades to come. This supports the goal of full flight dynamics simulation for 3D aircraft models within the next decade, opening the way to virtual flight-tests and certification of aircraft in the future.

  15. LabelFlow Framework for Annotating Workflow Provenance

    Directory of Open Access Journals (Sweden)

    Pinar Alper

    2018-02-01

    Full Text Available Scientists routinely analyse and share data for others to use. Successful data (reuse relies on having metadata describing the context of analysis of data. In many disciplines the creation of contextual metadata is referred to as reporting. One method of implementing analyses is with workflows. A stand-out feature of workflows is their ability to record provenance from executions. Provenance is useful when analyses are executed with changing parameters (changing contexts and results need to be traced to respective parameters. In this paper we investigate whether provenance can be exploited to support reporting. Specifically; we outline a case-study based on a real-world workflow and set of reporting queries. We observe that provenance, as collected from workflow executions, is of limited use for reporting, as it supports queries partially. We identify that this is due to the generic nature of provenance, its lack of domain-specific contextual metadata. We observe that the required information is available in implicit form, embedded in data. We describe LabelFlow, a framework comprised of four Labelling Operators for decorating provenance with domain-specific Labels. LabelFlow can be instantiated for a domain by plugging it with domain-specific metadata extractors. We provide a tool that takes as input a workflow, and produces as output a Labelling Pipeline for that workflow, comprised of Labelling Operators. We revisit the case-study and show how Labels provide a more complete implementation of reporting queries.

  16. Provenance-based refresh in data-oriented workflows

    KAUST Repository

    Ikeda, Robert; Salihoglu, Semih; Widom, Jennifer

    2011-01-01

    We consider a general workflow setting in which input data sets are processed by a graph of transformations to produce output results. Our goal is to perform efficient selective refresh of elements in the output data, i.e., compute the latest values of specific output elements when the input data may have changed. We explore how data provenance can be used to enable efficient refresh. Our approach is based on capturing one-level data provenance at each transformation when the workflow is run initially. Then at refresh time provenance is used to determine (transitively) which input elements are responsible for given output elements, and the workflow is rerun only on that portion of the data needed for refresh. Our contributions are to formalize the problem setting and the problem itself, to specify properties of transformations and provenance that are required for efficient refresh, and to provide algorithms that apply to a wide class of transformations and workflows. We have built a prototype system supporting the features and algorithms presented in the paper. We report preliminary experimental results on the overhead of provenance capture, and on the crossover point between selective refresh and full workflow recomputation. © 2011 ACM.

  17. COSMOS: Python library for massively parallel workflows.

    Science.gov (United States)

    Gafni, Erik; Luquette, Lovelace J; Lancaster, Alex K; Hawkins, Jared B; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P; Tonellato, Peter J

    2014-10-15

    Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  18. Modelling and analysis of workflow for lean supply chains

    Science.gov (United States)

    Ma, Jinping; Wang, Kanliang; Xu, Lida

    2011-11-01

    Cross-organisational workflow systems are a component of enterprise information systems which support collaborative business process among organisations in supply chain. Currently, the majority of workflow systems is developed in perspectives of information modelling without considering actual requirements of supply chain management. In this article, we focus on the modelling and analysis of the cross-organisational workflow systems in the context of lean supply chain (LSC) using Petri nets. First, the article describes the assumed conditions of cross-organisation workflow net according to the idea of LSC and then discusses the standardisation of collaborating business process between organisations in the context of LSC. Second, the concept of labelled time Petri nets (LTPNs) is defined through combining labelled Petri nets with time Petri nets, and the concept of labelled time workflow nets (LTWNs) is also defined based on LTPNs. Cross-organisational labelled time workflow nets (CLTWNs) is then defined based on LTWNs. Third, the article proposes the notion of OR-silent CLTWNS and a verifying approach to the soundness of LTWNs and CLTWNs. Finally, this article illustrates how to use the proposed method by a simple example. The purpose of this research is to establish a formal method of modelling and analysis of workflow systems for LSC. This study initiates a new perspective of research on cross-organisational workflow management and promotes operation management of LSC in real world settings.

  19. Design Tools and Workflows for Braided Structures

    DEFF Research Database (Denmark)

    Vestartas, Petras; Heinrich, Mary Katherine; Zwierzycki, Mateusz

    2017-01-01

    and merits of our method, demonstrated though four example design and analysis workflows. The workflows frame specific aspects of enquiry for the ongoing research project flora robotica. These include modelling target geometries, automatically producing instructions for fabrication, conducting structural...

  20. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Science.gov (United States)

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  1. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Directory of Open Access Journals (Sweden)

    David K Brown

    Full Text Available Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS, a workflow management system and web interface for high performance computing (HPC. JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  2. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    Science.gov (United States)

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  3. InvertNet: a new paradigm for digital access to invertebrate collections

    Directory of Open Access Journals (Sweden)

    Chris Dietrich

    2012-07-01

    Full Text Available InvertNet, one of the three Thematic Collection Networks (TCNs funded in the first round of the U.S. National Science Foundation’s Advancing Digitization of Biological Collections (ADBC program, is tasked with providing digital access to ~60 million specimens housed in 22 arthropod (primarily insect collections at institutions distributed throughout the upper midwestern USA. The traditional workflow for insect collection digitization involves manually keying information from specimen labels into a database and attaching a unique identifier label to each specimen. This remains the dominant paradigm, despite some recent attempts to automate various steps in the process using more advanced technologies. InvertNet aims to develop improved semi-automated, high-throughput workflows for digitizing and providing access to invertebrate collections that balance the need for speed and cost-effectiveness with long-term preservation of specimens and accuracy of data capture. The proposed workflows build on recent methods for digitizing and providing access to high-quality images of multiple specimens (e.g., entire drawers of pinned insects simultaneously. Limitations of previous approaches are discussed and possible solutions are proposed that incorporate advanced imaging and 3-D reconstruction technologies. InvertNet couples efficient digitization workflows with a highly robust network infrastructure capable of managing massive amounts of image data and related metadata and delivering high-quality images, including interactive 3-D reconstructions in real time via the Internet.

  4. An ontology model for execution records of Grid scientific applications

    NARCIS (Netherlands)

    Baliś, B.; Bubak, M.

    2008-01-01

    Records of past application executions are particularly important in the case of loosely-coupled, workflow driven scientific applications which are used to conduct in silico experiments, often on top of Grid infrastructures. In this paper, we propose an ontology-based model for storing and querying

  5. A Model of Workflow Composition for Emergency Management

    Science.gov (United States)

    Xin, Chen; Bin-ge, Cui; Feng, Zhang; Xue-hui, Xu; Shan-shan, Fu

    The common-used workflow technology is not flexible enough in dealing with concurrent emergency situations. The paper proposes a novel model for defining emergency plans, in which workflow segments appear as a constituent part. A formal abstraction, which contains four operations, is defined to compose workflow segments under constraint rule. The software system of the business process resources construction and composition is implemented and integrated into Emergency Plan Management Application System.

  6. Workflow Fault Tree Generation Through Model Checking

    DEFF Research Database (Denmark)

    Herbert, Luke Thomas; Sharp, Robin

    2014-01-01

    We present a framework for the automated generation of fault trees from models of realworld process workflows, expressed in a formalised subset of the popular Business Process Modelling and Notation (BPMN) language. To capture uncertainty and unreliability in workflows, we extend this formalism...

  7. Social Networking Adapted for Distributed Scientific Collaboration

    Science.gov (United States)

    Karimabadi, Homa

    2012-01-01

    Share is a social networking site with novel, specially designed feature sets to enable simultaneous remote collaboration and sharing of large data sets among scientists. The site will include not only the standard features found on popular consumer-oriented social networking sites such as Facebook and Myspace, but also a number of powerful tools to extend its functionality to a science collaboration site. A Virtual Observatory is a promising technology for making data accessible from various missions and instruments through a Web browser. Sci-Share augments services provided by Virtual Observatories by enabling distributed collaboration and sharing of downloaded and/or processed data among scientists. This will, in turn, increase science returns from NASA missions. Sci-Share also enables better utilization of NASA s high-performance computing resources by providing an easy and central mechanism to access and share large files on users space or those saved on mass storage. The most common means of remote scientific collaboration today remains the trio of e-mail for electronic communication, FTP for file sharing, and personalized Web sites for dissemination of papers and research results. Each of these tools has well-known limitations. Sci-Share transforms the social networking paradigm into a scientific collaboration environment by offering powerful tools for cooperative discourse and digital content sharing. Sci-Share differentiates itself by serving as an online repository for users digital content with the following unique features: a) Sharing of any file type, any size, from anywhere; b) Creation of projects and groups for controlled sharing; c) Module for sharing files on HPC (High Performance Computing) sites; d) Universal accessibility of staged files as embedded links on other sites (e.g. Facebook) and tools (e.g. e-mail); e) Drag-and-drop transfer of large files, replacing awkward e-mail attachments (and file size limitations); f) Enterprise-level data and

  8. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform

    Science.gov (United States)

    Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.

    2016-01-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692

  9. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform.

    Science.gov (United States)

    Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N

    2017-03-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.

  10. Rapid development of scalable scientific software using a process oriented approach

    DEFF Research Database (Denmark)

    Friborg, Rune Møllegaard; Vinter, Brian

    2011-01-01

    Scientific applications are often not written with multiprocessing, cluster computing or grid computing in mind. This paper suggests using Python and PyCSP to structure scientific software through Communicating Sequential Processes. Three scientific applications are used to demonstrate the features...... of PyCSP and how networks of processes may easily be mapped into a visual representation for better understanding of the process workflow. We show that for many sequential solutions, the difficulty in implementing a parallel application is removed. The use of standard multi-threading mechanisms...

  11. Search, access and dissemination of scientific information from scientists, social scientists and humanists

    Directory of Open Access Journals (Sweden)

    Fernando César Lima Leite

    2015-05-01

    Full Text Available This paper presents results of study on the characteristics of search activities, access to and use of information, and dissemination habits of researchers from scientific research institutes. From the methodological point of view, it is a mixed methods study which adopted the concurrent triangulation strategy. Data were collected through questionnaires, interviews and checklist, and then submitted to statistical and text analysis. The research sphere was consisted of researchers linked to the research units of the Ministry of Science, Technology and Innovation, and the sample basis were the researchers of the Brazilian Centre for Physics Research (CBPF and Museum of Astronomy and Related Sciences (MAST. Among other aspects, the findings shows that the safeguarded their disciplinary differences, search, access and communication activities, regardless of the knowledge area, occurring mainly in the digital environment; communication habits are stimulated by motives common to scientists and social scientists and humanists, share knowledge and visibility are the main reasons for the dissemination of research results, physicists are naturally within the open access context.

  12. Conventions and workflows for using Situs

    International Nuclear Information System (INIS)

    Wriggers, Willy

    2012-01-01

    Recent developments of the Situs software suite for multi-scale modeling are reviewed. Typical workflows and conventions encountered during processing of biophysical data from electron microscopy, tomography or small-angle X-ray scattering are described. Situs is a modular program package for the multi-scale modeling of atomic resolution structures and low-resolution biophysical data from electron microscopy, tomography or small-angle X-ray scattering. This article provides an overview of recent developments in the Situs package, with an emphasis on workflows and conventions that are important for practical applications. The modular design of the programs facilitates scripting in the bash shell that allows specific programs to be combined in creative ways that go beyond the original intent of the developers. Several scripting-enabled functionalities, such as flexible transformations of data type, the use of symmetry constraints or the creation of two-dimensional projection images, are described. The processing of low-resolution biophysical maps in such workflows follows not only first principles but often relies on implicit conventions. Situs conventions related to map formats, resolution, correlation functions and feature detection are reviewed and summarized. The compatibility of the Situs workflow with CCP4 conventions and programs is discussed

  13. Building and documenting workflows with python-based snakemake

    OpenAIRE

    Köster, Johannes; Rahmann, Sven

    2012-01-01

    textabstractSnakemake is a novel workflow engine with a simple Python-derived workflow definition language and an optimizing execution environment. It is the first system that supports multiple named wildcards (or variables) in input and output filenames of each rule definition. It also allows to write human-readable workflows that document themselves. We have found Snakemake especially useful for building high-throughput sequencing data analysis pipelines and present examples from this area....

  14. Concurrency & Asynchrony in Declarative Workflows

    DEFF Research Database (Denmark)

    Debois, Søren; Hildebrandt, Thomas; Slaats, Tijs

    2015-01-01

    of concurrency in DCR Graphs admits asynchronous execution of declarative workflows both conceptually and by reporting on a prototype implementation of a distributed declarative workflow engine. Both the theoretical development and the implementation is supported by an extended example; moreover, the theoretical....... In this paper, we pro- pose a notion of concurrency for declarative process models, formulated in the context of Dynamic Condition Response (DCR) graphs, and exploiting the so-called “true concurrency” semantics of Labelled Asynchronous Transition Systems. We demonstrate how this semantic underpinning...

  15. Improving adherence to the Epic Beacon ambulatory workflow.

    Science.gov (United States)

    Chackunkal, Ellen; Dhanapal Vogel, Vishnuprabha; Grycki, Meredith; Kostoff, Diana

    2017-06-01

    Computerized physician order entry has been shown to significantly improve chemotherapy safety by reducing the number of prescribing errors. Epic's Beacon Oncology Information System of computerized physician order entry and electronic medication administration was implemented in Henry Ford Health System's ambulatory oncology infusion centers on 9 November 2013. Since that time, compliance to the infusion workflow had not been assessed. The objective of this study was to optimize the current workflow and improve the compliance to this workflow in the ambulatory oncology setting. This study was a retrospective, quasi-experimental study which analyzed the composite workflow compliance rate of patient encounters from 9 to 23 November 2014. Based on this analysis, an intervention was identified and implemented in February 2015 to improve workflow compliance. The primary endpoint was to compare the composite compliance rate to the Beacon workflow before and after a pharmacy-initiated intervention. The intervention, which was education of infusion center staff, was initiated by ambulatory-based, oncology pharmacists and implemented by a multi-disciplinary team of pharmacists and nurses. The composite compliance rate was then reassessed for patient encounters from 2 to 13 March 2015 in order to analyze the effects of the determined intervention on compliance. The initial analysis in November 2014 revealed a composite compliance rate of 38%, and data analysis after the intervention revealed a statistically significant increase in the composite compliance rate to 83% ( p < 0.001). This study supports a pharmacist-initiated educational intervention can improve compliance to an ambulatory, oncology infusion workflow.

  16. The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track

    OpenAIRE

    Madan, Sumit; Hodapp, Sven; Senger, Philipp; Ansari, Sam; Szostak, Justyna; Hoeng, Julia; Peitsch, Manuel; Fluck, Juliane

    2016-01-01

    Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboa...

  17. The standard-based open workflow system in GeoBrain (Invited)

    Science.gov (United States)

    Di, L.; Yu, G.; Zhao, P.; Deng, M.

    2013-12-01

    GeoBrain is an Earth science Web-service system developed and operated by the Center for Spatial Information Science and Systems, George Mason University. In GeoBrain, a standard-based open workflow system has been implemented to accommodate the automated processing of geospatial data through a set of complex geo-processing functions for advanced production generation. The GeoBrain models the complex geoprocessing at two levels, the conceptual and concrete. At the conceptual level, the workflows exist in the form of data and service types defined by ontologies. The workflows at conceptual level are called geo-processing models and cataloged in GeoBrain as virtual product types. A conceptual workflow is instantiated into a concrete, executable workflow when a user requests a product that matches a virtual product type. Both conceptual and concrete workflows are encoded in Business Process Execution Language (BPEL). A BPEL workflow engine, called BPELPower, has been implemented to execute the workflow for the product generation. A provenance capturing service has been implemented to generate the ISO 19115-compliant complete product provenance metadata before and after the workflow execution. The generation of provenance metadata before the workflow execution allows users to examine the usability of the final product before the lengthy and expensive execution takes place. The three modes of workflow executions defined in the ISO 19119, transparent, translucent, and opaque, are available in GeoBrain. A geoprocessing modeling portal has been developed to allow domain experts to develop geoprocessing models at the type level with the support of both data and service/processing ontologies. The geoprocessing models capture the knowledge of the domain experts and are become the operational offering of the products after a proper peer review of models is conducted. An automated workflow composition has been experimented successfully based on ontologies and artificial

  18. A Web application for the management of clinical workflow in image-guided and adaptive proton therapy for prostate cancer treatments.

    Science.gov (United States)

    Yeung, Daniel; Boes, Peter; Ho, Meng Wei; Li, Zuofeng

    2015-05-08

    Image-guided radiotherapy (IGRT), based on radiopaque markers placed in the prostate gland, was used for proton therapy of prostate patients. Orthogonal X-rays and the IBA Digital Image Positioning System (DIPS) were used for setup correction prior to treatment and were repeated after treatment delivery. Following a rationale for margin estimates similar to that of van Herk,(1) the daily post-treatment DIPS data were analyzed to determine if an adaptive radiotherapy plan was necessary. A Web application using ASP.NET MVC5, Entity Framework, and an SQL database was designed to automate this process. The designed features included state-of-the-art Web technologies, a domain model closely matching the workflow, a database-supporting concurrency and data mining, access to the DIPS database, secured user access and roles management, and graphing and analysis tools. The Model-View-Controller (MVC) paradigm allowed clean domain logic, unit testing, and extensibility. Client-side technologies, such as jQuery, jQuery Plug-ins, and Ajax, were adopted to achieve a rich user environment and fast response. Data models included patients, staff, treatment fields and records, correction vectors, DIPS images, and association logics. Data entry, analysis, workflow logics, and notifications were implemented. The system effectively modeled the clinical workflow and IGRT process.

  19. GO2OGS 1.0: a versatile workflow to integrate complex geological information with fault data into numerical simulation models

    Science.gov (United States)

    Fischer, T.; Naumov, D.; Sattler, S.; Kolditz, O.; Walther, M.

    2015-11-01

    We offer a versatile workflow to convert geological models built with the ParadigmTM GOCAD© (Geological Object Computer Aided Design) software into the open-source VTU (Visualization Toolkit unstructured grid) format for usage in numerical simulation models. Tackling relevant scientific questions or engineering tasks often involves multidisciplinary approaches. Conversion workflows are needed as a way of communication between the diverse tools of the various disciplines. Our approach offers an open-source, platform-independent, robust, and comprehensible method that is potentially useful for a multitude of environmental studies. With two application examples in the Thuringian Syncline, we show how a heterogeneous geological GOCAD model including multiple layers and faults can be used for numerical groundwater flow modeling, in our case employing the OpenGeoSys open-source numerical toolbox for groundwater flow simulations. The presented workflow offers the chance to incorporate increasingly detailed data, utilizing the growing availability of computational power to simulate numerical models.

  20. Best Practices for Making Scientific Data Discoverable and Accessible through Integrated, Standards-Based Data Portals

    Science.gov (United States)

    Lucido, J. M.

    2013-12-01

    Scientists in the fields of hydrology, geophysics, and climatology are increasingly using the vast quantity of publicly-available data to address broadly-scoped scientific questions. For example, researchers studying contamination of nearshore waters could use a combination of radar indicated precipitation, modeled water currents, and various sources of in-situ monitoring data to predict water quality near a beach. In discovering, gathering, visualizing and analyzing potentially useful data sets, data portals have become invaluable tools. The most effective data portals often aggregate distributed data sets seamlessly and allow multiple avenues for accessing the underlying data, facilitated by the use of open standards. Additionally, adequate metadata are necessary for attribution, documentation of provenance and relating data sets to one another. Metadata also enable thematic, geospatial and temporal indexing of data sets and entities. Furthermore, effective portals make use of common vocabularies for scientific methods, units of measure, geologic features, chemical, and biological constituents as they allow investigators to correctly interpret and utilize data from external sources. One application that employs these principles is the National Ground Water Monitoring Network (NGWMN) Data Portal (http://cida.usgs.gov/ngwmn), which makes groundwater data from distributed data providers available through a single, publicly accessible web application by mediating and aggregating native data exposed via web services on-the-fly into Open Geospatial Consortium (OGC) compliant service output. That output may be accessed either through the map-based user interface or through the aforementioned OGC web services. Furthermore, the Geo Data Portal (http://cida.usgs.gov/climate/gdp/), which is a system that provides users with data access, subsetting and geospatial processing of large and complex climate and land use data, exemplifies the application of International Standards

  1. A standard-enabled workflow for synthetic biology

    KAUST Repository

    Myers, Chris J.

    2017-06-15

    A synthetic biology workflow is composed of data repositories that provide information about genetic parts, sequence-level design tools to compose these parts into circuits, visualization tools to depict these designs, genetic design tools to select parts to create systems, and modeling and simulation tools to evaluate alternative design choices. Data standards enable the ready exchange of information within such a workflow, allowing repositories and tools to be connected from a diversity of sources. The present paper describes one such workflow that utilizes, among others, the Synthetic Biology Open Language (SBOL) to describe genetic designs, the Systems Biology Markup Language to model these designs, and SBOL Visual to visualize these designs. We describe how a standard-enabled workflow can be used to produce types of design information, including multiple repositories and software tools exchanging information using a variety of data standards. Recently, the ACS Synthetic Biology journal has recommended the use of SBOL in their publications.

  2. Agreement Workflow Tool (AWT)

    Data.gov (United States)

    Social Security Administration — The Agreement Workflow Tool (AWT) is a role-based Intranet application used for processing SSA's Reimbursable Agreements according to SSA's standards. AWT provides...

  3. Kronos: a workflow assembler for genome analytics and informatics

    Science.gov (United States)

    Taghiyar, M. Jafar; Rosner, Jamie; Grewal, Diljot; Grande, Bruno M.; Aniba, Radhouane; Grewal, Jasleen; Boutros, Paul C.; Morin, Ryan D.

    2017-01-01

    Abstract Background: The field of next-generation sequencing informatics has matured to a point where algorithmic advances in sequence alignment and individual feature detection methods have stabilized. Practical and robust implementation of complex analytical workflows (where such tools are structured into “best practices” for automated analysis of next-generation sequencing datasets) still requires significant programming investment and expertise. Results: We present Kronos, a software platform for facilitating the development and execution of modular, auditable, and distributable bioinformatics workflows. Kronos obviates the need for explicit coding of workflows by compiling a text configuration file into executable Python applications. Making analysis modules would still require programming. The framework of each workflow includes a run manager to execute the encoded workflows locally (or on a cluster or cloud), parallelize tasks, and log all runtime events. The resulting workflows are highly modular and configurable by construction, facilitating flexible and extensible meta-applications that can be modified easily through configuration file editing. The workflows are fully encoded for ease of distribution and can be instantiated on external systems, a step toward reproducible research and comparative analyses. We introduce a framework for building Kronos components that function as shareable, modular nodes in Kronos workflows. Conclusions: The Kronos platform provides a standard framework for developers to implement custom tools, reuse existing tools, and contribute to the community at large. Kronos is shipped with both Docker and Amazon Web Services Machine Images. It is free, open source, and available through the Python Package Index and at https://github.com/jtaghiyar/kronos. PMID:28655203

  4. A Tool Supporting Collaborative Data Analytics Workflow Design and Management

    Science.gov (United States)

    Zhang, J.; Bao, Q.; Lee, T. J.

    2016-12-01

    Collaborative experiment design could significantly enhance the sharing and adoption of the data analytics algorithms and models emerged in Earth science. Existing data-oriented workflow tools, however, are not suitable to support collaborative design of such a workflow, to name a few, to support real-time co-design; to track how a workflow evolves over time based on changing designs contributed by multiple Earth scientists; and to capture and retrieve collaboration knowledge on workflow design (discussions that lead to a design). To address the aforementioned challenges, we have designed and developed a technique supporting collaborative data-oriented workflow composition and management, as a key component toward supporting big data collaboration through the Internet. Reproducibility and scalability are two major targets demanding fundamental infrastructural support. One outcome of the project os a software tool, supporting an elastic number of groups of Earth scientists to collaboratively design and compose data analytics workflows through the Internet. Instead of recreating the wheel, we have extended an existing workflow tool VisTrails into an online collaborative environment as a proof of concept.

  5. Declarative Modelling and Safe Distribution of Healthcare Workflows

    DEFF Research Database (Denmark)

    Hildebrandt, Thomas; Mukkamala, Raghava Rao; Slaats, Tijs

    2012-01-01

    We present a formal technique for safe distribution of workflow processes described declaratively as Nested Condition Response (NCR) Graphs and apply the technique to a distributed healthcare workflow. Concretely, we provide a method to synthesize from a NCR Graph and any distribution of its events......-organizational case management. The contributions of this paper is to adapt the technique to allow for nested processes and milestones and to apply it to a healthcare workflow identified in a previous field study at danish hospitals....

  6. Integrating configuration workflows with project management system

    International Nuclear Information System (INIS)

    Nilsen, Dimitri; Weber, Pavel

    2014-01-01

    The complexity of the heterogeneous computing resources, services and recurring infrastructure changes at the GridKa WLCG Tier-1 computing center require a structured approach to configuration management and optimization of interplay between functional components of the whole system. A set of tools deployed at GridKa, including Puppet, Redmine, Foreman, SVN and Icinga, provides the administrative environment giving the possibility to define and develop configuration workflows, reduce the administrative effort and improve sustainable operation of the whole computing center. In this presentation we discuss the developed configuration scenarios implemented at GridKa, which we use for host installation, service deployment, change management procedures, service retirement etc. The integration of Puppet with a project management tool like Redmine provides us with the opportunity to track problem issues, organize tasks and automate these workflows. The interaction between Puppet and Redmine results in automatic updates of the issues related to the executed workflow performed by different system components. The extensive configuration workflows require collaboration and interaction between different departments like network, security, production etc. at GridKa. Redmine plugins developed at GridKa and integrated in its administrative environment provide an effective way of collaboration within the GridKa team. We present the structural overview of the software components, their connections, communication protocols and show a few working examples of the workflows and their automation.

  7. A new public policy to ensure access to scientific information resources: the case of Chile

    Directory of Open Access Journals (Sweden)

    María Soledad Bravo-Marchant

    2012-11-01

    Full Text Available The design and implementation of public policies to grant access to scientific information is now a marked trend among numerous countries of Latin America. The creation of specific instruments, the allocation of an ongoing budget and the accumulation of experience in negotiation and contracting of national licences have all been clear signs of the achievements resulting from recent initiatives in these countries. This article reviews the experience of the Consorcio para el Acceso a la Información Cientifíca Electrónica (CINCEL Corporation, a Chilean consortium created in 2002, the public policy that made it possible and the evaluation experience of its main programme, the Electronic Library of Scientific Information (BEIC.

  8. Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms

    Science.gov (United States)

    Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel

    2017-01-01

    With more and more workflow systems adopting cloud as their execution environment, it becomes increasingly challenging on how to efficiently manage various workflows, virtual machines (VMs) and workflow execution on VM instances. To make the system scalable and easy-to-extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost and the price/performance ratio via experimental studies. PMID:29399237

  9. Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms.

    Science.gov (United States)

    Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel

    2014-01-01

    With more and more workflow systems adopting cloud as their execution environment, it becomes increasingly challenging on how to efficiently manage various workflows, virtual machines (VMs) and workflow execution on VM instances. To make the system scalable and easy-to-extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost and the price/performance ratio via experimental studies.

  10. Workflow Dynamics and the Imaging Value Chain: Quantifying the Effect of Designating a Nonimage-Interpretive Task Workflow.

    Science.gov (United States)

    Lee, Matthew H; Schemmel, Andrew J; Pooler, B Dustin; Hanley, Taylor; Kennedy, Tabassum A; Field, Aaron S; Wiegmann, Douglas; Yu, John-Paul J

    To assess the impact of separate non-image interpretive task and image-interpretive task workflows in an academic neuroradiology practice. A prospective, randomized, observational investigation of a centralized academic neuroradiology reading room was performed. The primary reading room fellow was observed over a one-month period using a time-and-motion methodology, recording frequency and duration of tasks performed. Tasks were categorized into separate image interpretive and non-image interpretive workflows. Post-intervention observation of the primary fellow was repeated following the implementation of a consult assistant responsible for non-image interpretive tasks. Pre- and post-intervention data were compared. Following separation of image-interpretive and non-image interpretive workflows, time spent on image-interpretive tasks by the primary fellow increased from 53.8% to 73.2% while non-image interpretive tasks decreased from 20.4% to 4.4%. Mean time duration of image interpretation nearly doubled, from 05:44 to 11:01 (p = 0.002). Decreases in specific non-image interpretive tasks, including phone calls/paging (2.86/hr versus 0.80/hr), in-room consultations (1.36/hr versus 0.80/hr), and protocoling (0.99/hr versus 0.10/hr), were observed. The consult assistant experienced 29.4 task switching events per hour. Rates of specific non-image interpretive tasks for the CA were 6.41/hr for phone calls/paging, 3.60/hr for in-room consultations, and 3.83/hr for protocoling. Separating responsibilities into NIT and IIT workflows substantially increased image interpretation time and decreased TSEs for the primary fellow. Consolidation of NITs into a separate workflow may allow for more efficient task completion. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. The myth of standardized workflow in primary care.

    Science.gov (United States)

    Holman, G Talley; Beasley, John W; Karsh, Ben-Tzion; Stone, Jamie A; Smith, Paul D; Wetterneck, Tosha B

    2016-01-01

    Primary care efficiency and quality are essential for the nation's health. The demands on primary care physicians (PCPs) are increasing as healthcare becomes more complex. A more complete understanding of PCP workflow variation is needed to guide future healthcare redesigns. This analysis evaluates workflow variation in terms of the sequence of tasks performed during patient visits. Two patient visits from 10 PCPs from 10 different United States Midwestern primary care clinics were analyzed to determine physician workflow. Tasks and the progressive sequence of those tasks were observed, documented, and coded by task category using a PCP task list. Variations in the sequence and prevalence of tasks at each stage of the primary care visit were assessed considering the physician, the patient, the visit's progression, and the presence of an electronic health record (EHR) at the clinic. PCP workflow during patient visits varies significantly, even for an individual physician, with no single or even common workflow pattern being present. The prevalence of specific tasks shifts significantly as primary care visits progress to their conclusion but, notably, PCPs collect patient information throughout the visit. PCP workflows were unpredictable during face-to-face patient visits. Workflow emerges as the result of a "dance" between physician and patient as their separate agendas are addressed, a side effect of patient-centered practice. Future healthcare redesigns should support a wide variety of task sequences to deliver high-quality primary care. The development of tools such as electronic health records must be based on the realities of primary care visits if they are to successfully support a PCP's mental and physical work, resulting in effective, safe, and efficient primary care. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. A standard-enabled workflow for synthetic biology.

    Science.gov (United States)

    Myers, Chris J; Beal, Jacob; Gorochowski, Thomas E; Kuwahara, Hiroyuki; Madsen, Curtis; McLaughlin, James Alastair; Mısırlı, Göksel; Nguyen, Tramy; Oberortner, Ernst; Samineni, Meher; Wipat, Anil; Zhang, Michael; Zundel, Zach

    2017-06-15

    A synthetic biology workflow is composed of data repositories that provide information about genetic parts, sequence-level design tools to compose these parts into circuits, visualization tools to depict these designs, genetic design tools to select parts to create systems, and modeling and simulation tools to evaluate alternative design choices. Data standards enable the ready exchange of information within such a workflow, allowing repositories and tools to be connected from a diversity of sources. The present paper describes one such workflow that utilizes, among others, the Synthetic Biology Open Language (SBOL) to describe genetic designs, the Systems Biology Markup Language to model these designs, and SBOL Visual to visualize these designs. We describe how a standard-enabled workflow can be used to produce types of design information, including multiple repositories and software tools exchanging information using a variety of data standards. Recently, the ACS Synthetic Biology journal has recommended the use of SBOL in their publications. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.

  13. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud.

    Science.gov (United States)

    Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P

    2016-04-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.

  14. Examining daily activity routines of older adults using workflow.

    Science.gov (United States)

    Chung, Jane; Ozkaynak, Mustafa; Demiris, George

    2017-07-01

    We evaluated the value of workflow analysis supported by a novel visualization technique to better understand the daily routines of older adults and highlight their patterns of daily activities and normal variability in physical functions. We used a self-reported activity diary to obtain data from six community-dwelling older adults for 14 consecutive days. Workflow for daily routine was analyzed using the EventFlow tool, which aggregates workflow information to highlight patterns and variabilities. A total of 1453 events were included in the data analysis. To demonstrate the patterns and variability of each individual's daily activities, participant activity workflows were visualized and compared. The workflow analysis revealed great variability in activity types, regularity, frequency, duration, and timing of performing certain activities across individuals. Also, when workflow approach was applied to spatial information of activities, the analysis revealed the ability to provide meaningful data on individuals' mobility in different levels of life spaces from home to community. Results suggest that using workflows to characterize the daily activities of older adults will be helpful for clinicians and researchers in understanding their daily routines and preparing education and prevention strategies tailored to each individual's activity level. This tool also has the potential to be integrated into consumer informatics technologies, such as patient portals or personal health records, so that consumers may be encouraged to become actively involved in monitoring and managing their health. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. A three-level atomicity model for decentralized workflow management systems

    Science.gov (United States)

    Ben-Shaul, Israel Z.; Heineman, George T.

    1996-12-01

    A workflow management system (WFMS) employs a workflow manager (WM) to execute and automate the various activities within a workflow. To protect the consistency of data, the WM encapsulates each activity with a transaction; a transaction manager (TM) then guarantees the atomicity of activities. Since workflows often group several activities together, the TM is responsible for guaranteeing the atomicity of these units. There are scalability issues, however, with centralized WFMSs. Decentralized WFMSs provide an architecture for multiple autonomous WFMSs to interoperate, thus accommodating multiple workflows and geographically-dispersed teams. When atomic units are composed of activities spread across multiple WFMSs, however, there is a conflict between global atomicity and local autonomy of each WFMS. This paper describes a decentralized atomicity model that enables workflow administrators to specify the scope of multi-site atomicity based upon the desired semantics of multi-site tasks in the decentralized WFMS. We describe an architecture that realizes our model and execution paradigm.

  16. Distributed Workflow Service Composition Based on CTR Technology

    Science.gov (United States)

    Feng, Zhilin; Ye, Yanming

    Recently, WS-BPEL has gradually become the basis of a standard for web service description and composition. However, WS-BPEL cannot efficiently describe distributed workflow services for lacking of special expressive power and formal semantics. This paper presents a novel method for modeling distributed workflow service composition with Concurrent TRansaction logic (CTR). The syntactic structure of WS-BPEL and CTR are analyzed, and new rules of mapping WS-BPEL into CTR are given. A case study is put forward to show that the proposed method is appropriate for modeling workflow business services under distributed environments.

  17. Design and evaluation of a multimedia electronic patient record "oncoflow" with clinical workflow assistance for head and neck tumor therapy.

    Science.gov (United States)

    Meier, Jens; Boehm, Andreas; Kielhorn, Anne; Dietz, Andreas; Bohn, Stefan; Neumuth, Thomas

    2014-11-01

    The management of patient-specific information is a challenging task for surgeons and physicians because existing clinical information systems are insufficiently integrated into daily clinical routine and contained information entities are distributed across different proprietary databases. Thus, existing information is hardly usable for further electronic processing, workflow support or clinical studies. A Web-based clinical information system has been developed that automatically imports patient-specific information from different information systems. The system is tailored to the existing workflow for the treatment of patients with head and neck cancer. In this paper, the clinical assistance functions and a quantitative as well as a qualitative system evaluation are presented. The information system has been deployed at a clinical site and is in use in daily clinical routine. Two evaluation studies show that the information integration, the structured information presentation in the Web browser and the assistance functions improve the physician's workflow. The studies also show that the usage of the new information system does not impair the time physicians need for a process step compared with the usage of the existing information system. Information integration is crucial for efficient workflow support in the clinic. The central access to information within a modern and structured user interface saves valuable time for the physician. The comprehensive database allows an instant usage of the existing information clinical workflow support or the conduction of trial studies.

  18. Resilient workflows for computational mechanics platforms

    Science.gov (United States)

    Nguyên, Toàn; Trifan, Laurentiu; Désidéri, Jean-Antoine

    2010-06-01

    Workflow management systems have recently been the focus of much interest and many research and deployment for scientific applications worldwide [26, 27]. Their ability to abstract the applications by wrapping application codes have also stressed the usefulness of such systems for multidiscipline applications [23, 24]. When complex applications need to provide seamless interfaces hiding the technicalities of the computing infrastructures, their high-level modeling, monitoring and execution functionalities help giving production teams seamless and effective facilities [25, 31, 33]. Software integration infrastructures based on programming paradigms such as Python, Mathlab and Scilab have also provided evidence of the usefulness of such approaches for the tight coupling of multidisciplne application codes [22, 24]. Also high-performance computing based on multi-core multi-cluster infrastructures open new opportunities for more accurate, more extensive and effective robust multi-discipline simulations for the decades to come [28]. This supports the goal of full flight dynamics simulation for 3D aircraft models within the next decade, opening the way to virtual flight-tests and certification of aircraft in the future [23, 24, 29].

  19. Two-Layer Transaction Management for Workflow Management Applications

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Vonk, J.; Boertjes, E.M.; Apers, Peter M.G.

    Workflow management applications require advanced transaction management that is not offered by traditional database systems. For this reason, a number of extended transaction models has been proposed in the past. None of these models seems completely adequate, though, because workflow management

  20. [Does open access publishing increase the impact of scientific articles? An empirical study in the field of intensive care medicine].

    Science.gov (United States)

    Riera, M; Aibar, E

    2013-05-01

    Some studies suggest that open access articles are more often cited than non-open access articles. However, the relationship between open access and citations count in a discipline such as intensive care medicine has not been studied to date. The present article analyzes the effect of open access publishing of scientific articles in intensive care medicine journals in terms of citations count. We evaluated a total of 161 articles (76% being non-open access articles) published in Intensive Care Medicine in the year 2008. Citation data were compared between the two groups up until April 30, 2011. Potentially confounding variables for citation counts were adjusted for in a linear multiple regression model. The median number (interquartile range) of citations of non-open access articles was 8 (4-12) versus 9 (6-18) in the case of open access articles (p=0.084). In the highest citation range (>8), the citation count was 13 (10-16) and 18 (13-21) (p=0.008), respectively. The mean follow-up was 37.5 ± 3 months in both groups. In the 30-35 months after publication, the average number (mean ± standard deviation) of citations per article per month of non-open access articles was 0.28 ± 0.6 versus 0.38 ± 0.7 in the case of open access articles (p=0.043). Independent factors for citation advantage were the Hirsch index of the first signing author (β=0.207; p=0.015) and open access status (β=3.618; p=0.006). Open access publishing and the Hirsch index of the first signing author increase the impact of scientific articles. The open access advantage is greater for the more highly cited articles, and appears in the 30-35 months after publication. Copyright © 2012 Elsevier España, S.L. and SEMICYUC. All rights reserved.

  1. SciELO, Scientific Electronic Library Online, a Database of Open Access Journals

    Directory of Open Access Journals (Sweden)

    Rogerio Meneghini

    2013-09-01

    Full Text Available   This essay discusses SciELO, a scientific journal database operating in 14 countries. It covers over 1000 journals providing open access to full text and table sets of scientometrics data. In Brazil it is responsible for a collection of nearly 300 journals, selected along 15 years as the best Brazilian periodicals in natural and social sciences. Nonetheless, they still are national journal in the sense that over 80% of the articles are published by Brazilian scientists. Important initiatives focused on professionalization and internationalization are considered to bring these journals to a higher level of quality and visibility. DOI: 10.18870/hlrc.v3i3.153

  2. CO2 Storage Feasibility: A Workflow for Site Characterisation

    Directory of Open Access Journals (Sweden)

    Nepveu Manuel

    2015-04-01

    Full Text Available In this paper, we present an overview of the SiteChar workflow model for site characterisation and assessment for CO2 storage. Site characterisation and assessment is required when permits are requested from the legal authorities in the process of starting a CO2 storage process at a given site. The goal is to assess whether a proposed CO2 storage site can indeed be used for permanent storage while meeting the safety requirements demanded by the European Commission (EC Storage Directive (9, Storage Directive 2009/31/EC. Many issues have to be scrutinised, and the workflow presented here is put forward to help efficiently organise this complex task. Three issues are highlighted: communication within the working team and with the authorities; interdependencies in the workflow and feedback loops; and the risk-based character of the workflow. A general overview (helicopter view of the workflow is given; the issues involved in communication and the risk assessment process are described in more detail. The workflow as described has been tested within the SiteChar project on five potential storage sites throughout Europe. This resulted in a list of key aspects of site characterisation which can help prepare and focus new site characterisation studies.

  3. Reasoning about repairability of workflows at design time

    NARCIS (Netherlands)

    Tagni, Gaston; Ten Teije, Annette; Van Harmelen, Frank

    2009-01-01

    This paper describes an approach for reasoning about the repairability of workflows at design time. We propose a heuristic-based analysis of a workflow that aims at evaluating its definition, considering different design aspects and characteristics that affect its repairability (called repairability

  4. Distributed Global Transaction Support for Workflow Management Applications

    NARCIS (Netherlands)

    Vonk, J.; Grefen, P.W.P.J.; Boertjes, E.M.; Apers, Peter M.G.

    Workflow management systems require advanced transaction support to cope with their inherently long-running processes. The recent trend to distribute workflow executions requires an even more advanced transaction support system that is able to handle distribution. This paper presents a model as well

  5. Flexible End2End Workflow Automation of Hit-Discovery Research.

    Science.gov (United States)

    Holzmüller-Laue, Silke; Göde, Bernd; Thurow, Kerstin

    2014-08-01

    The article considers a new approach of more complex laboratory automation at the workflow layer. The authors purpose the automation of end2end workflows. The combination of all relevant subprocesses-whether automated or manually performed, independently, and in which organizational unit-results in end2end processes that include all result dependencies. The end2end approach focuses on not only the classical experiments in synthesis or screening, but also on auxiliary processes such as the production and storage of chemicals, cell culturing, and maintenance as well as preparatory activities and analyses of experiments. Furthermore, the connection of control flow and data flow in the same process model leads to reducing of effort of the data transfer between the involved systems, including the necessary data transformations. This end2end laboratory automation can be realized effectively with the modern methods of business process management (BPM). This approach is based on a new standardization of the process-modeling notation Business Process Model and Notation 2.0. In drug discovery, several scientific disciplines act together with manifold modern methods, technologies, and a wide range of automated instruments for the discovery and design of target-based drugs. The article discusses the novel BPM-based automation concept with an implemented example of a high-throughput screening of previously synthesized compound libraries. © 2014 Society for Laboratory Automation and Screening.

  6. Decaf: Decoupled Dataflows for In Situ High-Performance Workflows

    Energy Technology Data Exchange (ETDEWEB)

    Dreher, M.; Peterka, T.

    2017-07-31

    Decaf is a dataflow system for the parallel communication of coupled tasks in an HPC workflow. The dataflow can perform arbitrary data transformations ranging from simply forwarding data to complex data redistribution. Decaf does this by allowing the user to allocate resources and execute custom code in the dataflow. All communication through the dataflow is efficient parallel message passing over MPI. The runtime for calling tasks is entirely message-driven; Decaf executes a task when all messages for the task have been received. Such a messagedriven runtime allows cyclic task dependencies in the workflow graph, for example, to enact computational steering based on the result of downstream tasks. Decaf includes a simple Python API for describing the workflow graph. This allows Decaf to stand alone as a complete workflow system, but Decaf can also be used as the dataflow layer by one or more other workflow systems to form a heterogeneous task-based computing environment. In one experiment, we couple a molecular dynamics code with a visualization tool using the FlowVR and Damaris workflow systems and Decaf for the dataflow. In another experiment, we test the coupling of a cosmology code with Voronoi tessellation and density estimation codes using MPI for the simulation, the DIY programming model for the two analysis codes, and Decaf for the dataflow. Such workflows consisting of heterogeneous software infrastructures exist because components are developed separately with different programming models and runtimes, and this is the first time that such heterogeneous coupling of diverse components was demonstrated in situ on HPC systems.

  7. Schedule-Aware Workflow Management Systems

    Science.gov (United States)

    Mans, Ronny S.; Russell, Nick C.; van der Aalst, Wil M. P.; Moleman, Arnold J.; Bakker, Piet J. M.

    Contemporary workflow management systems offer work-items to users through specific work-lists. Users select the work-items they will perform without having a specific schedule in mind. However, in many environments work needs to be scheduled and performed at particular times. For example, in hospitals many work-items are linked to appointments, e.g., a doctor cannot perform surgery without reserving an operating theater and making sure that the patient is present. One of the problems when applying workflow technology in such domains is the lack of calendar-based scheduling support. In this paper, we present an approach that supports the seamless integration of unscheduled (flow) and scheduled (schedule) tasks. Using CPN Tools we have developed a specification and simulation model for schedule-aware workflow management systems. Based on this a system has been realized that uses YAWL, Microsoft Exchange Server 2007, Outlook, and a dedicated scheduling service. The approach is illustrated using a real-life case study at the AMC hospital in the Netherlands. In addition, we elaborate on the experiences obtained when developing and implementing a system of this scale using formal techniques.

  8. Building and documenting workflows with python-based snakemake

    NARCIS (Netherlands)

    J. Köster (Johannes); S. Rahmann (Sven)

    2012-01-01

    textabstractSnakemake is a novel workflow engine with a simple Python-derived workflow definition language and an optimizing execution environment. It is the first system that supports multiple named wildcards (or variables) in input and output filenames of each rule definition. It also allows to

  9. Logical provenance in data-oriented workflows?

    KAUST Repository

    Ikeda, R.

    2013-04-01

    We consider the problem of defining, generating, and tracing provenance in data-oriented workflows, in which input data sets are processed by a graph of transformations to produce output results. We first give a new general definition of provenance for general transformations, introducing the notions of correctness, precision, and minimality. We then determine when properties such as correctness and minimality carry over from the individual transformations\\' provenance to the workflow provenance. We describe a simple logical-provenance specification language consisting of attribute mappings and filters. We provide an algorithm for provenance tracing in workflows where logical provenance for each transformation is specified using our language. We consider logical provenance in the relational setting, observing that for a class of Select-Project-Join (SPJ) transformations, logical provenance specifications encode minimal provenance. We have built a prototype system supporting the features and algorithms presented in the paper, and we report a few preliminary experimental results. © 2013 IEEE.

  10. Impact of CGNS on CFD Workflow

    Science.gov (United States)

    Poinot, M.; Rumsey, C. L.; Mani, M.

    2004-01-01

    CFD tools are an integral part of industrial and research processes, for which the amount of data is increasing at a high rate. These data are used in a multi-disciplinary fluid dynamics environment, including structural, thermal, chemical or even electrical topics. We show that the data specification is an important challenge that must be tackled to achieve an efficient workflow for use in this environment. We compare the process with other software techniques, such as network or database type, where past experiences showed how difficult it was to bridge the gap between completely general specifications and dedicated specific applications. We show two aspects of the use of CFD General Notation System (CGNS) that impact CFD workflow: as a data specification framework and as a data storage means. Then, we give examples of projects involving CFD workflows where the use of the CGNS standard leads to a useful method either for data specification, exchange, or storage.

  11. Detecting dissonance in clinical and research workflow for translational psychiatric registries.

    Science.gov (United States)

    Cofiel, Luciana; Bassi, Débora U; Ray, Ryan Kumar; Pietrobon, Ricardo; Brentani, Helena

    2013-01-01

    The interplay between the workflow for clinical tasks and research data collection is often overlooked, ultimately making it ineffective. To the best of our knowledge, no previous studies have developed standards that allow for the comparison of workflow models derived from clinical and research tasks toward the improvement of data collection processes. In this study we used the term dissonance for the occurrences where there was a discord between clinical and research workflows. We developed workflow models for a translational research study in psychiatry and the clinic where its data collection was carried out. After identifying points of dissonance between clinical and research models we derived a corresponding classification system that ultimately enabled us to re-engineer the data collection workflow. We considered (1) the number of patients approached for enrollment and (2) the number of patients enrolled in the study as indicators of efficiency in research workflow. We also recorded the number of dissonances before and after the workflow modification. We identified 22 episodes of dissonance across 6 dissonance categories: actor, communication, information, artifact, time, and space. We were able to eliminate 18 episodes of dissonance and increase the number of patients approached and enrolled in research study trough workflow modification. The classification developed in this study is useful for guiding the identification of dissonances and reveal modifications required to align the workflow of data collection and the clinical setting. The methodology described in this study can be used by researchers to standardize data collection process.

  12. Automating Registration of Digital Preservation Copies: The Place of Registries in the Digitization Workflow

    Directory of Open Access Journals (Sweden)

    William Carney

    2008-03-01

    Full Text Available I would like to thank LIBER and EBLIDA for inviting me to present this paper on the role of registries in the digitization workflow. During the past 18 months, OCLC has been working on a project to synchronize WorldCat with mass digitization projects, which we will begin to pilot shortly. The concept is to educate WorldCat about the millions of new digital manifestations for print items being produced. During the past 35 years, librarians have built a comprehensive representation of print materials and holdings in WorldCat item-by-item. However, as we move into a more digital world through the production of born-digital materials and the high-volume reformatting of our print heritage, it is impractical to catalog these new manifestations via traditional workflows. The OCLC eContent Synchronization program is one example of how OCLC is moving to address the need to ingest metadata representing digital works on an industrial scale. Through strategic alliances with key digital content producers and automated processing, new digital surrogate records will be created to increase the visibility of and access to content at the point of need. While the eContent Synchronization program is an important initiative for visibility and access, of equal importance is the process of registering the existence of the preservation copies of these digital items.

  13. Integrating prediction, provenance, and optimization into high energy workflows

    Energy Technology Data Exchange (ETDEWEB)

    Schram, M.; Bansal, V.; Friese, R. D.; Tallent, N. R.; Yin, J.; Barker, K. J.; Stephan, E.; Halappanavar, M.; Kerbyson, D. J.

    2017-10-01

    We propose a novel approach for efficient execution of workflows on distributed resources. The key components of this framework include: performance modeling to quantitatively predict workflow component behavior; optimization-based scheduling such as choosing an optimal subset of resources to meet demand and assignment of tasks to resources; distributed I/O optimizations such as prefetching; and provenance methods for collecting performance data. In preliminary results, these techniques improve throughput on a small Belle II workflow by 20%.

  14. The P2P approach to interorganizational workflows

    NARCIS (Netherlands)

    Aalst, van der W.M.P.; Weske, M.H.; Dittrich, K.R.; Geppert, A.; Norrie, M.C.

    2001-01-01

    This paper describes in an informal way the Public-To-Private (P2P) approach to interorganizational workflows, which is based on a notion of inheritance. The approach consists of three steps: (1) create a common understanding of the interorganizational workflow by specifying a shared public

  15. Open source workflow : a viable direction for BPM?

    NARCIS (Netherlands)

    Wohed, P.; Russell, N.C.; Hofstede, ter A.H.M.; Andersson, B.; Aalst, van der W.M.P.; Bellahsène, Z.; Léonard, M.

    2008-01-01

    With the growing interest in open source software in general and business process management and workflow systems in particular, it is worthwhile investigating the state of open source workflow management. The plethora of these offerings (recent surveys such as [4,6], each contain more than 30 such

  16. Privacy-aware workflow management

    NARCIS (Netherlands)

    Alhaqbani, B.; Adams, M.; Fidge, C.J.; Hofstede, ter A.H.M.; Glykas, M.

    2013-01-01

    Information security policies play an important role in achieving information security. Confidentiality, Integrity, and Availability are classic information security goals attained by enforcing appropriate security policies. Workflow Management Systems (WfMSs) also benefit from inclusion of these

  17. DeepGreen - Entwicklung eines rechtssicheren Workflows zur effizienten Umsetzung der Open-Access-Komponente in den Allianz-Lizenzen für die Wissenschaft

    Directory of Open Access Journals (Sweden)

    Markus Putnings

    2016-12-01

    the alliance licences contracted since 2011 has shown that entitled authors make almost no use of their open access rights. Thus, a tremendous amount of scientific literature has yet to be uncovered from publishers’ closed access systems. The project DeepGreen, approved by DFG (based on the 2014 initiative „Open Access Transformation“, seeks to establish a convenient and automated technical infrastructure to support existing open access agreements. The intended scenario is that publishers will be required to deliver periodically all publications eligible for open access through defined interfaces, instead of authors (or their respective libraries having to upload these items manually into corresponding open access repositories. To this end, the members of the project consortium (University Libraries of Friedrich-Alexander University Erlangen-Nürnberg (FAU and TU Berlin, Helmholtz Open Science Office at the German GeoResearch Centre, Bavarian State Library, and two Librarian Network Organizations, BVB and KOBV will build the platform DeepGreen, essentially a dark archive, into which publications and metadata are fed periodically according to the publishers’ contracted alliance licences. In turn, the platform Deep-Green will deliver these publications to the legitimate repositories automatically. Two publishers, Karger Publishers and SAGE Publications, have agreed to pilot the project as associated partners. This paper presents the project and the current status of the work.

  18. Verification of Timed Healthcare Workflows Using Component Timed-Arc Petri Nets

    DEFF Research Database (Denmark)

    Bertolini, Cristiano; Liu, Zhiming; Srba, Jiri

    2013-01-01

    Workflows in modern healthcare systems are becoming increasingly complex and their execution involves concurrency and sharing of resources. The definition, analysis and management of collaborative healthcare workflows requires abstract model notations with a precisely defined semantics and a supp......Workflows in modern healthcare systems are becoming increasingly complex and their execution involves concurrency and sharing of resources. The definition, analysis and management of collaborative healthcare workflows requires abstract model notations with a precisely defined semantics...

  19. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows.

    Science.gov (United States)

    Sztromwasser, Pawel; Puntervoll, Pål; Petersen, Kjell

    2011-07-26

    Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently however a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade the Web Services have spread wide in bioinformatics, and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services, that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in consequence allows to execute in-silico experiments on genome-scale, using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.

  20. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows

    Directory of Open Access Journals (Sweden)

    Sztromwasser Paweł

    2011-06-01

    Full Text Available Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently however a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade the Web Services have spread wide in bioinformatics, and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services, that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in consequence allows to execute in-silico experiments on genome-scale, using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.

  1. Building transparent data access for ocean observatories: Coordination of U.S. IOOS DMAC with NSF's OOI Cyberinfrastructure

    Science.gov (United States)

    Arrott, M.; Alexander, Corrine; Graybeal, J.; Mueller, C.; Signell, R.; de La Beaujardière, J.; Taylor, A.; Wilkin, J.; Powell, B.; Orcutt, J.

    2011-01-01

    The NOAA-led U.S. Integrated Ocean Observing System (IOOS) and the National Science Foundation's Ocean Observatories Initiative (OOI) have been collaborating since 2007 on advanced tools and technologies that ensure open access to ocean observations and models. Initial collaboration focused on serving ocean data via cloud computing-a key component of the OOI cyberinfrastructure (CI) architecture. As the OOI transitioned from planning to execution in the Fall of 2009, an OOI/IOOS team developed a customer-based "use case" to align more closely with the emerging objectives of OOI-CI team's first software release scheduled for Summer 2011 and provide a quantitative capacity for stress-testing these tools and protocols. A requirements process was initiated with coastal modelers, focusing on improved workflows to deliver ocean observation data. Accomplishments to date include the documentation and assessment of scientific workflows for two "early adopter" modeling teams from IOOS Regional partners (Rutgers-the State University of New Jersey and University of Hawaii's School of Ocean and Earth Science and Technology) to enable full understanding of data sources and needs; generation of all-inclusive lists of the data sets required and those obtainable through IOOS; a more complete understanding of areas where IOOS can expand data access capabilities to better serve the needs of the modeling community; and development of "data set agents" (software) to facilitate data acquisition from numerous data providers and conversions of the data format to the OOI-CI canonical form. ?? 2011 MTS.

  2. Data intensive ATLAS workflows in the Cloud

    CERN Document Server

    Rzehorz, Gerhard Ferdinand; The ATLAS collaboration

    2016-01-01

    This contribution reports on the feasibility of executing data intensive workflows on Cloud infrastructures. In order to assess this, the metric ETC = Events/Time/Cost is formed, which quantifies the different workflow and infrastructure configurations that are tested against each other. In these tests ATLAS reconstruction Jobs are run, examining the effects of overcommitting (more parallel processes running than CPU cores available), scheduling (staggered execution) and scaling (number of cores). The desirability of commissioning storage in the cloud is evaluated, in conjunction with a simple analytical model of the system, and correlated with questions about the network bandwidth, caches and what kind of storage to utilise. In the end a cost/benefit evaluation of different infrastructure configurations and workflows is undertaken, with the goal to find the maximum of the ETC value

  3. Data intensive ATLAS workflows in the Cloud

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00396985; The ATLAS collaboration; Keeble, Oliver; Quadt, Arnulf; Kawamura, Gen

    2017-01-01

    This contribution reports on the feasibility of executing data intensive workflows on Cloud infrastructures. In order to assess this, the metric ETC = Events/Time/Cost is formed, which quantifies the different workflow and infrastructure configurations that are tested against each other. In these tests ATLAS reconstruction Jobs are run, examining the effects of overcommitting (more parallel processes running than CPU cores available), scheduling (staggered execution) and scaling (number of cores). The desirability of commissioning storage in the Cloud is evaluated, in conjunction with a simple analytical model of the system, and correlated with questions about the network bandwidth, caches and what kind of storage to utilise. In the end a cost/benefit evaluation of different infrastructure configurations and workflows is undertaken, with the goal to find the maximum of the ETC value.

  4. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud[OPEN

    Science.gov (United States)

    Merchant, Nirav

    2016-01-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today’s pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant’s Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. PMID:27020957

  5. Quantitative workflow based on NN for weighting criteria in landfill suitability mapping

    Science.gov (United States)

    Abujayyab, Sohaib K. M.; Ahamad, Mohd Sanusi S.; Yahya, Ahmad Shukri; Ahmad, Siti Zubaidah; Alkhasawneh, Mutasem Sh.; Aziz, Hamidi Abdul

    2017-10-01

    Our study aims to introduce a new quantitative workflow that integrates neural networks (NNs) and multi criteria decision analysis (MCDA). Existing MCDA workflows reveal a number of drawbacks, because of the reliance on human knowledge in the weighting stage. Thus, new workflow presented to form suitability maps at the regional scale for solid waste planning based on NNs. A feed-forward neural network employed in the workflow. A total of 34 criteria were pre-processed to establish the input dataset for NN modelling. The final learned network used to acquire the weights of the criteria. Accuracies of 95.2% and 93.2% achieved for the training dataset and testing dataset, respectively. The workflow was found to be capable of reducing human interference to generate highly reliable maps. The proposed workflow reveals the applicability of NN in generating landfill suitability maps and the feasibility of integrating them with existing MCDA workflows.

  6. Quantitative analysis of probabilistic BPMN workflows

    DEFF Research Database (Denmark)

    Herbert, Luke Thomas; Sharp, Robin

    2012-01-01

    We present a framework for modelling and analysis of realworld business workflows. We present a formalised core subset of the Business Process Modelling and Notation (BPMN) and then proceed to extend this language with probabilistic nondeterministic branching and general-purpose reward annotations...... of events, reward-based properties and best- and worst- case scenarios. We develop a simple example of medical workflow and demonstrate the utility of this analysis in accurate provisioning of drug stocks. Finally, we suggest a path to building upon these techniques to cover the entire BPMN language, allow...

  7. Using CyberShake Workflows to Manage Big Seismic Hazard Data on Large-Scale Open-Science HPC Resources

    Science.gov (United States)

    Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

    2015-12-01

    The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and

  8. Dynamic file-access characteristics of a production parallel scientific workload

    Science.gov (United States)

    Kotz, David; Nieuwejaar, Nils

    1994-01-01

    Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.

  9. Contract-Based Transaction Management in Cross-Organizational Workflow Management

    NARCIS (Netherlands)

    Grefen, P.W.P.J.

    Cross-organizational workflow management is an essential ingredient for process integration in virtual enterprises. To obtain cross-organizational workflow processes with robust semantics, these processes should be supported by highlevel cross-organizational transaction management. In this context,

  10. Exploring Dental Providers' Workflow in an Electronic Dental Record Environment.

    Science.gov (United States)

    Schwei, Kelsey M; Cooper, Ryan; Mahnke, Andrea N; Ye, Zhan; Acharya, Amit

    2016-01-01

    A workflow is defined as a predefined set of work steps and partial ordering of these steps in any environment to achieve the expected outcome. Few studies have investigated the workflow of providers in a dental office. It is important to understand the interaction of dental providers with the existing technologies at point of care to assess breakdown in the workflow which could contribute to better technology designs. The study objective was to assess electronic dental record (EDR) workflows using time and motion methodology in order to identify breakdowns and opportunities for process improvement. A time and motion methodology was used to study the human-computer interaction and workflow of dental providers with an EDR in four dental centers at a large healthcare organization. A data collection tool was developed to capture the workflow of dental providers and staff while they interacted with an EDR during initial, planned, and emergency patient visits, and at the front desk. Qualitative and quantitative analysis was conducted on the observational data. Breakdowns in workflow were identified while posting charges, viewing radiographs, e-prescribing, and interacting with patient scheduler. EDR interaction time was significantly different between dentists and dental assistants (6:20 min vs. 10:57 min, p = 0.013) and between dentists and dental hygienists (6:20 min vs. 9:36 min, p = 0.003). On average, a dentist spent far less time than dental assistants and dental hygienists in data recording within the EDR.

  11. Web-video-mining-supported workflow modeling for laparoscopic surgeries.

    Science.gov (United States)

    Liu, Rui; Zhang, Xiaoli; Zhang, Hao

    2016-11-01

    As quality assurance is of strong concern in advanced surgeries, intelligent surgical systems are expected to have knowledge such as the knowledge of the surgical workflow model (SWM) to support their intuitive cooperation with surgeons. For generating a robust and reliable SWM, a large amount of training data is required. However, training data collected by physically recording surgery operations is often limited and data collection is time-consuming and labor-intensive, severely influencing knowledge scalability of the surgical systems. The objective of this research is to solve the knowledge scalability problem in surgical workflow modeling with a low cost and labor efficient way. A novel web-video-mining-supported surgical workflow modeling (webSWM) method is developed. A novel video quality analysis method based on topic analysis and sentiment analysis techniques is developed to select high-quality videos from abundant and noisy web videos. A statistical learning method is then used to build the workflow model based on the selected videos. To test the effectiveness of the webSWM method, 250 web videos were mined to generate a surgical workflow for the robotic cholecystectomy surgery. The generated workflow was evaluated by 4 web-retrieved videos and 4 operation-room-recorded videos, respectively. The evaluation results (video selection consistency n-index ≥0.60; surgical workflow matching degree ≥0.84) proved the effectiveness of the webSWM method in generating robust and reliable SWM knowledge by mining web videos. With the webSWM method, abundant web videos were selected and a reliable SWM was modeled in a short time with low labor cost. Satisfied performances in mining web videos and learning surgery-related knowledge show that the webSWM method is promising in scaling knowledge for intelligent surgical systems. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Open Access Centre at the Nature Research Centre: a facility for enhancement of scientific research, education and public outreach in Lithuania

    Science.gov (United States)

    Šerpenskienė, Silvija; Skridlaitė, Gražina

    2014-05-01

    Open Access Centre (OAC) was established in Vilnius, Lithuania in 2013 as a subdivision of the Nature Research Centre (NRC) operating on the principle of open access for both internal and external users. The OAC consists of 15 units, i.e. 15 NRC laboratories or their branches. Forty four sets of research equipment were purchased. The OAC cooperates with Lithuanian science and studies institutions, business sector and other governmental and public institutions. Investigations can be carried in the Geosciences, Biotaxonomy, Ecology and Molecular Research, and Ecotoxicology fields. Environmental radioactivity, radioecology, nuclear geophysics, microscopic and chemical composition of natural compounds (minerals, rocks etc.), paleomagnetic, magnetic and environmental investigations, as well as ground and water contamination by oil products and other organic environment polluting compounds, identification of fossils, rocks and minerals can be studied in the Georesearch field. Ecosystems and identification of plants, animals and microorganisms are main subjects of the Biotaxonomy, Ecology and Molecular Research field. The Ecotoxicologal Research deals with toxic and genotoxic effects of toxic substances and other sources of pollution on macro- and microorganisms and cell cultures. Open access is guaranteed by: (1) providing scientific research and experimental development services; (2) implementing joint business and science projects; (3) using facilities for the training of specialists of the highest qualifications; (4) providing properly qualified and technically trained users with opportunities to carry out their scientific research and/or experiments in the OAC laboratories by themselves. Services provided in the Open Access Centre can be received by both internal and external users: persons undertaking innovative economic activities, students of other educational institutions, interns, external teams of researchers engaged in scientific research activities, teachers

  13. The impact of electronic medical record systems on outpatient workflows: a longitudinal evaluation of its workflow effects.

    Science.gov (United States)

    Vishwanath, Arun; Singh, Sandeep Rajan; Winkelstein, Peter

    2010-11-01

    The promise of the electronic medical record (EMR) lies in its ability to reduce the costs of health care delivery and improve the overall quality of care--a promise that is realized through major changes in workflows within the health care organization. Yet little systematic information exists about the workflow effects of EMRs. Moreover, some of the research to-date points to reduced satisfaction among physicians after implementation of the EMR and increased time, i.e., negative workflow effects. A better understanding of the impact of the EMR on workflows is, hence, vital to understanding what the technology really does offer that is new and unique. (i) To empirically develop a physician centric conceptual model of the workflow effects of EMRs; (ii) To use the model to understand the antecedents to the physicians' workflow expectation from the new EMR; (iii) To track physicians' satisfaction overtime, 3 months and 20 months after implementation of the EMR; (iv) To explore the impact of technology learning curves on physicians' reported satisfaction levels. The current research uses the mixed-method technique of concept mapping to empirically develop the conceptual model of an EMR's workflow effects. The model is then used within a controlled study to track physician expectations from a new EMR system as well as their assessments of the EMR's performance 3 months and 20 months after implementation. The research tracks the actual implementation of a new EMR within the outpatient clinics of a large northeastern research hospital. The pre-implementation survey netted 20 physician responses; post-implementation Time 1 survey netted 22 responses, and Time 2 survey netted 26 physician responses. The implementation of the actual EMR served as the intervention. Since the study was conducted within the same setting and tracked a homogenous group of respondents, the overall study design ensured against extraneous influences on the results. Outcome measures were derived

  14. Modeling, Design, and Implementation of a Cloud Workflow Engine Based on Aneka

    OpenAIRE

    Zhou, Jiantao; Sun, Chaoxin; Fu, Weina; Liu, Jing; Jia, Lei; Tan, Hongyan

    2014-01-01

    This paper presents a Petri net-based model for cloud workflow which plays a key role in industry. Three kinds of parallelisms in cloud workflow are characterized and modeled. Based on the analysis of the modeling, a cloud workflow engine is designed and implemented in Aneka cloud environment. The experimental results validate the effectiveness of our approach of modeling, design, and implementation of cloud workflow.

  15. From Design to Production Control Through the Integration of Engineering Data Management and Workflow Management Systems

    CERN Document Server

    Le Goff, J M; Bityukov, S; Estrella, F; Kovács, Z; Le Flour, T; Lieunard, S; McClatchey, R; Murray, S; Organtini, G; Vialle, J P; Bazan, A; Chevenier, G

    1997-01-01

    At a time when many companies are under pressure to reduce "times-to-market" the management of product information from the early stages of design through assembly to manufacture and production has become increasingly important. Similarly in the construction of high energy physics devices the collection of ( often evolving) engineering data is central to the subsequent physics analysis. Traditionally in industry design engineers have employed Engineering Data Management Systems ( also called Product Data Management Systems) to coordinate and control access to documented versions of product designs. However, these systems provide control only at the collaborative design level and are seldom used beyond design. Workflow management systems, on the other hand, are employed in industry to coordinate and support the more complex and repeatable work processes of the production environment. Commer cial workflow products cannot support the highly dynamic activities found both in the design stages of product developmen...

  16. A Workflow to Improve the Alignment of Prostate Imaging with Whole-mount Histopathology.

    Science.gov (United States)

    Yamamoto, Hidekazu; Nir, Dror; Vyas, Lona; Chang, Richard T; Popert, Rick; Cahill, Declan; Challacombe, Ben; Dasgupta, Prokar; Chandra, Ashish

    2014-08-01

    Evaluation of prostate imaging tests against whole-mount histology specimens requires accurate alignment between radiologic and histologic data sets. Misalignment results in false-positive and -negative zones as assessed by imaging. We describe a workflow for three-dimensional alignment of prostate imaging data against whole-mount prostatectomy reference specimens and assess its performance against a standard workflow. Ethical approval was granted. Patients underwent motorized transrectal ultrasound (Prostate Histoscanning) to generate a three-dimensional image of the prostate before radical prostatectomy. The test workflow incorporated steps for axial alignment between imaging and histology, size adjustments following formalin fixation, and use of custom-made parallel cutters and digital caliper instruments. The control workflow comprised freehand cutting and assumed homogeneous block thicknesses at the same relative angles between pathology and imaging sections. Thirty radical prostatectomy specimens were histologically and radiologically processed, either by an alignment-optimized workflow (n = 20) or a control workflow (n = 10). The optimized workflow generated tissue blocks of heterogeneous thicknesses but with no significant drifting in the cutting plane. The control workflow resulted in significantly nonparallel blocks, accurately matching only one out of four histology blocks to their respective imaging data. The image-to-histology alignment accuracy was 20% greater in the optimized workflow (P alignment was observed in the optimized workflow. Evaluation of prostate imaging biomarkers using whole-mount histology references should include a test-to-reference spatial alignment workflow. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.

  17. Akuna: An Open Source User Environment for Managing Subsurface Simulation Workflows

    Science.gov (United States)

    Freedman, V. L.; Agarwal, D.; Bensema, K.; Finsterle, S.; Gable, C. W.; Keating, E. H.; Krishnan, H.; Lansing, C.; Moeglein, W.; Pau, G. S. H.; Porter, E.; Scheibe, T. D.

    2014-12-01

    The U.S. Department of Energy (DOE) is investing in development of a numerical modeling toolset called ASCEM (Advanced Simulation Capability for Environmental Management) to support modeling analyses at legacy waste sites. ASCEM is an open source and modular computing framework that incorporates new advances and tools for predicting contaminant fate and transport in natural and engineered systems. The ASCEM toolset includes both a Platform with Integrated Toolsets (called Akuna) and a High-Performance Computing multi-process simulator (called Amanzi). The focus of this presentation is on Akuna, an open-source user environment that manages subsurface simulation workflows and associated data and metadata. In this presentation, key elements of Akuna are demonstrated, which includes toolsets for model setup, database management, sensitivity analysis, parameter estimation, uncertainty quantification, and visualization of both model setup and simulation results. A key component of the workflow is in the automated job launching and monitoring capabilities, which allow a user to submit and monitor simulation runs on high-performance, parallel computers. Visualization of large outputs can also be performed without moving data back to local resources. These capabilities make high-performance computing accessible to the users who might not be familiar with batch queue systems and usage protocols on different supercomputers and clusters.

  18. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    André SANTOS

    2012-07-01

    Full Text Available Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, prone-to-errors and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

  19. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    Anália LOURENÇO

    2013-07-01

    Full Text Available Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, prone-to-errors and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

  20. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  1. Workflow management: an overview

    NARCIS (Netherlands)

    Ouyang, C.; Adams, M.; Wynn, M.T.; Hofstede, ter A.H.M.; Brocke, vom J.; Rosemann, M.

    2010-01-01

    Workflow management has its origin in the office automation systems of the seventies, but it is not until fairly recently that conceptual and technological breakthroughs have led to its widespread adoption. In fact, nowadays, processawareness has become an accepted and integral part of various types

  2. A Workflow for Automated Satellite Image Processing: from Raw VHSR Data to Object-Based Spectral Information for Smallholder Agriculture

    Directory of Open Access Journals (Sweden)

    Dimitris Stratoulias

    2017-10-01

    Full Text Available Earth Observation has become a progressively important source of information for land use and land cover services over the past decades. At the same time, an increasing number of reconnaissance satellites have been set in orbit with ever increasing spatial, temporal, spectral, and radiometric resolutions. The available bulk of data, fostered by open access policies adopted by several agencies, is setting a new landscape in remote sensing in which timeliness and efficiency are important aspects of data processing. This study presents a fully automated workflow able to process a large collection of very high spatial resolution satellite images to produce actionable information in the application framework of smallholder farming. The workflow applies sequential image processing, extracts meaningful statistical information from agricultural parcels, and stores them in a crop spectrotemporal signature library. An important objective is to follow crop development through the season by analyzing multi-temporal and multi-sensor images. The workflow is based on free and open-source software, namely R, Python, Linux shell scripts, the Geospatial Data Abstraction Library, custom FORTRAN, C++, and the GNU Make utilities. We tested and applied this workflow on a multi-sensor image archive of over 270 VHSR WorldView-2, -3, QuickBird, GeoEye, and RapidEye images acquired over five different study areas where smallholder agriculture prevails.

  3. A history-tracing XML-based provenance framework for workflows

    NARCIS (Netherlands)

    Gerhards, M; Belloum, A.; Berretz, F.; Sander, V.; Skorupa, S.

    2010-01-01

    The importance of validating and reproducing the outcome of computational processes is fundamental to many application domains. Assuring the provenance of workflows will likely become even more important with respect to the incorporation of human tasks to standard workflows by emerging standards

  4. Conceptual framework and architecture for service mediating workflow management

    NARCIS (Netherlands)

    Hu, Jinmin; Grefen, P.W.P.J.

    2003-01-01

    This paper proposes a three-layer workflow concept framework to realize workflow enactment flexibility by dynamically binding activities to their implementations at run time. A service mediating layer is added to bridge business process definition and its implementation. Based on this framework, we

  5. Exploring Dental Providers’ Workflow in an Electronic Dental Record Environment

    Science.gov (United States)

    Schwei, Kelsey M; Cooper, Ryan; Mahnke, Andrea N.; Ye, Zhan

    2016-01-01

    Summary Background A workflow is defined as a predefined set of work steps and partial ordering of these steps in any environment to achieve the expected outcome. Few studies have investigated the workflow of providers in a dental office. It is important to understand the interaction of dental providers with the existing technologies at point of care to assess breakdown in the workflow which could contribute to better technology designs. Objective The study objective was to assess electronic dental record (EDR) workflows using time and motion methodology in order to identify breakdowns and opportunities for process improvement. Methods A time and motion methodology was used to study the human-computer interaction and workflow of dental providers with an EDR in four dental centers at a large healthcare organization. A data collection tool was developed to capture the workflow of dental providers and staff while they interacted with an EDR during initial, planned, and emergency patient visits, and at the front desk. Qualitative and quantitative analysis was conducted on the observational data. Results Breakdowns in workflow were identified while posting charges, viewing radiographs, e-prescribing, and interacting with patient scheduler. EDR interaction time was significantly different between dentists and dental assistants (6:20 min vs. 10:57 min, p = 0.013) and between dentists and dental hygienists (6:20 min vs. 9:36 min, p = 0.003). Conclusions On average, a dentist spent far less time than dental assistants and dental hygienists in data recording within the EDR. PMID:27437058

  6. Special issue on enabling open and interoperable access to Planetary Science and Heliophysics databases and tools

    Science.gov (United States)

    2018-01-01

    The large amount of data generated by modern space missions calls for a change of organization of data distribution and access procedures. Although long term archives exist for telescopic and space-borne observations, high-level functions need to be developed on top of these repositories to make Planetary Science and Heliophysics data more accessible and to favor interoperability. Results of simulations and reference laboratory data also need to be integrated to support and interpret the observations. Interoperable software and interfaces have recently been developed in many scientific domains. The Virtual Observatory (VO) interoperable standards developed for Astronomy by the International Virtual Observatory Alliance (IVOA) can be adapted to Planetary Sciences, as demonstrated by the VESPA (Virtual European Solar and Planetary Access) team within the Europlanet-H2020-RI project. Other communities have developed their own standards: GIS (Geographic Information System) for Earth and planetary surfaces tools, SPASE (Space Physics Archive Search and Extract) for space plasma, PDS4 (NASA Planetary Data System, version 4) and IPDA (International Planetary Data Alliance) for planetary mission archives, etc, and an effort to make them interoperable altogether is starting, including automated workflows to process related data from different sources.

  7. Workflow automation based on OSI job transfer and manipulation

    NARCIS (Netherlands)

    van Sinderen, Marten J.; Joosten, Stef M.M.; Guareis de farias, Cléver

    1999-01-01

    This paper shows that Workflow Management Systems (WFMS) and a data communication standard called Job Transfer and Manipulation (JTM) are built on the same concepts, even though different words are used. The paper analyses the correspondence of workflow concepts and JTM concepts. Besides, the

  8. Dynamic Service Selection in Workflows Using Performance Data

    Directory of Open Access Journals (Sweden)

    David W. Walker

    2007-01-01

    Full Text Available An approach to dynamic workflow management and optimisation using near-realtime performance data is presented. Strategies are discussed for choosing an optimal service (based on user-specified criteria from several semantically equivalent Web services. Such an approach may involve finding "similar" services, by first pruning the set of discovered services based on service metadata, and subsequently selecting an optimal service based on performance data. The current implementation of the prototype workflow framework is described, and demonstrated with a simple workflow. Performance results are presented that show the performance benefits of dynamic service selection. A statistical analysis based on the first order statistic is used to investigate the likely improvement in service response time arising from dynamic service selection.

  9. On Secure Workflow Decentralisation on the Internet

    Directory of Open Access Journals (Sweden)

    Petteri Kaskenpalo

    2010-06-01

    Full Text Available Decentralised workflow management systems are a new research area, where most work to-date has focused on the system's overall architecture. As little attention has been given to the security aspects in such systems, we follow a security driven approach, and consider, from the perspective of available security building blocks, how security can be implemented and what new opportunities are presented when empowering the decentralised environment with modern distributed security protocols. Our research is motivated by a more general question of how to combine the positive enablers that email exchange enjoys, with the general benefits of workflow systems, and more specifically with the benefits that can be introduced in a decentralised environment. This aims to equip email users with a set of tools to manage the semantics of a message exchange, contents, participants and their roles in the exchange in an environment that provides inherent assurances of security and privacy. This work is based on a survey of contemporary distributed security protocols, and considers how these protocols could be used in implementing a distributed workflow management system with decentralised control . We review a set of these protocols, focusing on the required message sequences in reviewing the protocols, and discuss how these security protocols provide the foundations for implementing core control-flow, data, and resource patterns in a distributed workflow environment.

  10. High Energy Physics Exascale Requirements Review. An Office of Science review sponsored jointly by Advanced Scientific Computing Research and High Energy Physics, June 10-12, 2015, Bethesda, Maryland

    Energy Technology Data Exchange (ETDEWEB)

    Habib, Salman [Argonne National Lab. (ANL), Argonne, IL (United States); Roser, Robert [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Antypas, Katie [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Dart, Eli [Esnet, Berkeley, CA (United States); Dosanjh, Sudip [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Hack, James [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Monga, Inder [Esnet, Berkeley, CA (United States); Papka, Michael E. [Argonne National Lab. (ANL), Argonne, IL (United States); Riley, Katherine [Argonne National Lab. (ANL), Argonne, IL (United States); Rotman, Lauren [Esnet, Berkeley, CA (United States); Straatsma, Tjerk [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Wells, Jack [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Williams, Tim [Argonne National Lab. (ANL), Argonne, IL (United States); Almgren, A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Amundson, J. [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Bailey, Stephen [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bard, Deborah [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bloom, Ken [Univ. of Nebraska, Lincoln, NE (United States); Bockelman, Brian [Univ. of Nebraska, Lincoln, NE (United States); Borgland, Anders [SLAC National Accelerator Lab., Menlo Park, CA (United States); Borrill, Julian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Boughezal, Radja [Argonne National Lab. (ANL), Argonne, IL (United States); Brower, Richard [Boston Univ., MA (United States); Cowan, Benjamin [SLAC National Accelerator Lab., Menlo Park, CA (United States); Finkel, Hal [Argonne National Lab. (ANL), Argonne, IL (United States); Frontiere, Nicholas [Argonne National Lab. (ANL), Argonne, IL (United States); Fuess, Stuart [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Ge, Lixin [SLAC National Accelerator Lab., Menlo Park, CA (United States); Gnedin, Nick [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Gottlieb, Steven [Indiana Univ., Bloomington, IN (United States); Gutsche, Oliver [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Han, T. [Indiana Univ., Bloomington, IN (United States); Heitmann, Katrin [Argonne National Lab. (ANL), Argonne, IL (United States); Hoeche, Stefan [SLAC National Accelerator Lab., Menlo Park, CA (United States); Ko, Kwok [SLAC National Accelerator Lab., Menlo Park, CA (United States); Kononenko, Oleksiy [SLAC National Accelerator Lab., Menlo Park, CA (United States); LeCompte, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Li, Zheng [SLAC National Accelerator Lab., Menlo Park, CA (United States); Lukic, Zarija [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Mori, Warren [Univ. of California, Los Angeles, CA (United States); Ng, Cho-Kuen [SLAC National Accelerator Lab., Menlo Park, CA (United States); Nugent, Peter [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Oleynik, Gene [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); O’Shea, Brian [Michigan State Univ., East Lansing, MI (United States); Padmanabhan, Nikhil [Yale Univ., New Haven, CT (United States); Petravick, Donald [Univ. of Illinois, Urbana, IL (United States). National Center for Supercomputing Applications; Petriello, Frank J. [Argonne National Lab. (ANL), Argonne, IL (United States); Pope, Adrian [Argonne National Lab. (ANL), Argonne, IL (United States); Power, John [Argonne National Lab. (ANL), Argonne, IL (United States); Qiang, Ji [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Reina, Laura [Florida State Univ., Tallahassee, FL (United States); Rizzo, Thomas Gerard [SLAC National Accelerator Lab., Menlo Park, CA (United States); Ryne, Robert [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Schram, Malachi [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Spentzouris, P. [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Toussaint, Doug [Univ. of Arizona, Tucson, AZ (United States); Vay, Jean Luc [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Viren, B. [Brookhaven National Lab. (BNL), Upton, NY (United States); Wuerthwein, Frank [Univ. of California, San Diego, CA (United States); Xiao, Liling [SLAC National Accelerator Lab., Menlo Park, CA (United States); Coffey, Richard [Argonne National Lab. (ANL), Argonne, IL (United States)

    2016-11-29

    The U.S. Department of Energy (DOE) Office of Science (SC) Offices of High Energy Physics (HEP) and Advanced Scientific Computing Research (ASCR) convened a programmatic Exascale Requirements Review on June 10–12, 2015, in Bethesda, Maryland. This report summarizes the findings, results, and recommendations derived from that meeting. The high-level findings and observations are as follows. Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude — and in some cases greater — than that available currently. The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. Data rates and volumes from experimental facilities are also straining the current HEP infrastructure in its ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. A close integration of high-performance computing (HPC) simulation and data analysis will greatly aid in interpreting the results of HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. Long-range planning between HEP and ASCR will be required to meet HEP’s research needs. To best use ASCR HPC resources, the experimental HEP program needs (1) an established, long-term plan for access to ASCR computational and data resources, (2) the ability to map workflows to HPC resources, (3) the ability for ASCR facilities to accommodate workflows run by collaborations potentially comprising thousands of individual members, (4) to transition codes to the next-generation HPC platforms that will be available at ASCR

  11. "Intelligent" tools for workflow process redesign : a research agenda

    NARCIS (Netherlands)

    Netjes, M.; Vanderfeesten, I.T.P.; Reijers, H.A.; Bussler, C.; Haller, A.

    2006-01-01

    Although much attention is being paid to business processes during the past decades, the design of business processes and particularly workflow processes is still more art than science. In this workshop paper, we present our view on modeling methods for workflow processes and introduce our research

  12. Oceans of Data: In what ways can learning research inform the development of electronic interfaces and tools for use by students accessing large scientific databases?

    Science.gov (United States)

    Krumhansl, R. A.; Foster, J.; Peach, C. L.; Busey, A.; Baker, I.

    2012-12-01

    The practice of science and engineering is being revolutionized by the development of cyberinfrastructure for accessing near real-time and archived observatory data. Large cyberinfrastructure projects have the potential to transform the way science is taught in high school classrooms, making enormous quantities of scientific data available, giving students opportunities to analyze and draw conclusions from many kinds of complex data, and providing students with experiences using state-of-the-art resources and techniques for scientific investigations. However, online interfaces to scientific data are built by scientists for scientists, and their design can significantly impede broad use by novices. Knowledge relevant to the design of student interfaces to complex scientific databases is broadly dispersed among disciplines ranging from cognitive science to computer science and cartography and is not easily accessible to designers of educational interfaces. To inform efforts at bridging scientific cyberinfrastructure to the high school classroom, Education Development Center, Inc. and the Scripps Institution of Oceanography conducted an NSF-funded 2-year interdisciplinary review of literature and expert opinion pertinent to making interfaces to large scientific databases accessible to and usable by precollege learners and their teachers. Project findings are grounded in the fundamentals of Cognitive Load Theory, Visual Perception, Schemata formation and Universal Design for Learning. The Knowledge Status Report (KSR) presents cross-cutting and visualization-specific guidelines that highlight how interface design features can address/ ameliorate challenges novice high school students face as they navigate complex databases to find data, and construct and look for patterns in maps, graphs, animations and other data visualizations. The guidelines present ways to make scientific databases more broadly accessible by: 1) adjusting the cognitive load imposed by the user

  13. When Workflow Management Systems and Logging Systems Meet: Analyzing Large-Scale Execution Traces

    Energy Technology Data Exchange (ETDEWEB)

    Gunter, Daniel

    2008-07-31

    This poster shows the benefits of integrating a workflow management system with logging and log mining capabilities. By combing two existing, mature technologies: Pegasus-WMS and Netlogger, we are able to efficiently process execution logs of earthquake science workflows consisting of hundreds of thousands to one million tasks. In particular we show results of processing logs of CyberShake, a workflow application running on the TeraGrid. Client-side tools allow scientists to quickly gather statistics about a workflow run and find out which tasks executed, where they were executed, what was their runtime, etc. These statistics can be used to understand the performance characteristics of a workflow and help tune the execution parameters of the workflow management system. This poster shows the scalability of the system presenting results of uploading task execution records into the system and by showing results of querying the system for overall workflow performance information.

  14. mORCA: ubiquitous access to life science web services.

    Science.gov (United States)

    Diaz-Del-Pino, Sergio; Trelles, Oswaldo; Falgueras, Juan

    2018-01-16

    Technical advances in mobile devices such as smartphones and tablets have produced an extraordinary increase in their use around the world and have become part of our daily lives. The possibility of carrying these devices in a pocket, particularly mobile phones, has enabled ubiquitous access to Internet resources. Furthermore, in the life sciences world there has been a vast proliferation of data types and services that finish as Web Services. This suggests the need for research into mobile clients to deal with life sciences applications for effective usage and exploitation. Analysing the current features in existing bioinformatics applications managing Web Services, we have devised, implemented, and deployed an easy-to-use web-based lightweight mobile client. This client is able to browse, select, compose parameters, invoke, and monitor the execution of Web Services stored in catalogues or central repositories. The client is also able to deal with huge amounts of data between external storage mounts. In addition, we also present a validation use case, which illustrates the usage of the application while executing, monitoring, and exploring the results of a registered workflow. The software its available in the Apple Store and Android Market and the source code is publicly available in Github. Mobile devices are becoming increasingly important in the scientific world due to their strong potential impact on scientific applications. Bioinformatics should not fall behind this trend. We present an original software client that deals with the intrinsic limitations of such devices and propose different guidelines to provide location-independent access to computational resources in bioinformatics and biomedicine. Its modular design makes it easily expandable with the inclusion of new repositories, tools, types of visualization, etc.

  15. Scientific collaboratories in higher education

    DEFF Research Database (Denmark)

    Sonnenwald, Diane H.; Li, Bin

    2003-01-01

    Scientific collaboratories hold the promise of providing students access to specialized scientific instruments, data and experts, enabling learning opportunities perhaps otherwise not available. However, evaluation of scientific collaboratories in higher education has lagged behind...

  16. Data intensive ATLAS workflows in the Cloud

    CERN Document Server

    Rzehorz, Gerhard Ferdinand; The ATLAS collaboration

    2018-01-01

    From 2025 onwards, the ATLAS collaboration at the Large Hadron Collider (LHC) at CERN will experience a massive increase in data quantity as well as complexity. Including mitigating factors, the prevalent computing power by that time will only fulfil one tenth of the requirement. This contribution will focus on Cloud computing as an approach to help overcome this challenge by providing flexible hardware that can be configured to the specific needs of a workflow. Experience with Cloud computing exists, but there is a large uncertainty if and to which degree it can be able to reduce the burden by 2025. In order to understand and quantify the benefits of Cloud computing, the "Workflow and Infrastructure Model" was created. It estimates the viability of Cloud computing by combining different inputs from the workflow side with infrastructure specifications. The model delivers metrics that enable the comparison of different Cloud configurations as well as different Cloud offerings with each other. A wide range of r...

  17. FAO's role in facilitating access to the scientific and technical literature in Agriculture in developing countries

    OpenAIRE

    Katz, Stephen

    2007-01-01

    Research generated in developing and emerging countries is currently “missing” from the international knowledge bases because of financial consequences affecting its publication and distribution. Much of the scientific research output from Africa for example, is in form of grey literature and hardly accessible. FAO is the specialized United Nation agency that leads international efforts to defeat hunger and emphasizes on knowledge management for food and agriculture. FAO more than an extensiv...

  18. Declarative Event-Based Workflow as Distributed Dynamic Condition Response Graphs

    DEFF Research Database (Denmark)

    Hildebrandt, Thomas; Mukkamala, Raghava Rao

    2010-01-01

    We present Dynamic Condition Response Graphs (DCR Graphs) as a declarative, event-based process model inspired by the workflow language employed by our industrial partner and conservatively generalizing prime event structures. A dynamic condition response graph is a directed graph with nodes repr...... exemplify the use of distributed DCR Graphs on a simple workflow taken from a field study at a Danish hospital, pointing out their flexibility compared to imperative workflow models. Finally we provide a mapping from DCR Graphs to Buchi-automata....

  19. Design decisions in workflow management and quality of work.

    NARCIS (Netherlands)

    Waal, B.M.E. de; Batenburg, R.

    2009-01-01

    In this paper, the design and implementation of a workflow management (WFM) system in a large Dutch social insurance organisation is described. The effect of workflow design decisions on the quality of work is explored theoretically and empirically, using the model of Zur Mühlen as a frame of

  20. Open access and beyond

    Directory of Open Access Journals (Sweden)

    Das Chhaya

    2006-09-01

    Full Text Available Abstract Uncensored exchange of scientific results hastens progress. Open Access does not stop at the removal of price and permission barriers; still, censorship and reading disabilities, to name a few, hamper access to information. Here, we invite the scientific community and the public to discuss new methods to distribute, store and manage literature in order to achieve unfettered access to literature.

  1. Scientific information: technology, will and access

    Directory of Open Access Journals (Sweden)

    Ana M. B. Pavani

    2007-12-01

    Full Text Available This article addresses the access to information from a point of view that relates the evolution of technology to the methods of treating information and to the desire for knowledge. The first part introduces some important events in ancient times, the end of the middle ages/beginning of the modern age and the XIX century. Then, it presents an overview of the current situation of traditional libraries and compares some characteristics with the corresponding ones for digital libraries. It ends by mentioning the international efforts towards open archives and open access to information; it shows examples of positive results.

  2. Editorial and scientific quality in the parameters for inclusion of journals commercial and open access databases

    Directory of Open Access Journals (Sweden)

    Cecilia Rozemblum

    2015-04-01

    Full Text Available In this article, the parameters used by RedALyC, Catalogo Latindex, SciELO, Scopus and Web of Science for the incorporation of scientific journals in their collections are analyzed with the goal of proving their relation with the objectives of each database in addition of debating the valuation that the scientific society is giving to those systems as decisive of "scientific quality". The used indicators are classified in: 1 Editorial quality (formal aspects or editorial management. 2 Content quality (peer review or originality and 3 Visibility (prestige of editors and editorial use and impact, accessibility and indexing It is revealed that: a between 9 and 16% of the indicators are related to the quality of content; b Lack specificity in their definition and determination of measure systems, and c match the goals of each base, although a marked trend towards formal aspects related and visibility is observed. Thus makes it clear that these systems pursuing their own objectives, making a core of journals of “quality” for its readership. We conclude, therefore, that the presence or absence of a journal in these collections is not sufficient to determine the quality of scientific magazine and its contents parameter.

  3. Outside The Box: Building a Digital Asset Management Ecosystem for Preservation and Access

    Directory of Open Access Journals (Sweden)

    Andrew Weidner

    2017-04-01

    Full Text Available The University of Houston (UH Libraries made an institutional commitment in late 2015 to migrate the data for its digitized cultural heritage collections to open source systems for preservation and access: Hydra-in-a-Box, Archivematica, and ArchivesSpace. This article describes the work that the UH Libraries implementation team has completed to date, including open source tools for streamlining digital curation workflows, minting and resolving identifiers, and managing SKOS vocabularies. These systems, workflows, and tools, collectively known as the Bayou City Digital Asset Management System (BCDAMS, represent a novel effort to solve common issues in the digital curation lifecycle and may serve as a model for other institutions seeking to implement flexible and comprehensive systems for digital preservation and access.

  4. Automated evolutionary restructuring of workflows to minimise errors via stochastic model checking

    DEFF Research Database (Denmark)

    Herbert, Luke Thomas; Hansen, Zaza Nadja Lee; Jacobsen, Peter

    2014-01-01

    This paper presents a framework for the automated restructuring of workflows that allows one to minimise the impact of errors on a production workflow. The framework allows for the modelling of workflows by means of a formalised subset of the Business Process Modelling and Notation (BPMN) language...

  5. Making USGS Science Data more Open, Accessible, and Usable: Leveraging ScienceBase for Success

    Science.gov (United States)

    Chang, M.; Ignizio, D.; Langseth, M. L.; Norkin, T.

    2016-12-01

    In 2013, the White House released initiatives requiring federally funded research to be made publicly available and machine readable. In response, the U.S. Geological Survey (USGS) has been developing a unified approach to make USGS data available and open. This effort has involved the establishment of internal policies and the release of a Public Access Plan, which outlines a strategy for the USGS to move forward into the modern era in scientific data management. Originally designed as a catalog and collaborative data management platform, ScienceBase (www.sciencebase.gov) is being leveraged to serve as a robust data hosting solution for USGS researchers to make scientific data accessible. With the goal of maintaining persistent access to formal data products and developing a management approach to facilitate stable data citation, the ScienceBase Data Release Team was established to ensure the quality, consistency, and meaningful organization of USGS data through standardized workflows and best practices. These practices include the creation and maintenance of persistent identifiers for data, improving the use of open data formats, establishing permissions for read/write access, validating the quality of standards compliant metadata, verifying that data have been reviewed and approved prior to release, and connecting to external search catalogs such as the USGS Science Data Catalog (data.usgs.gov) and data.gov. The ScienceBase team is actively building features to support this effort by automating steps to streamline the process, building metrics to track site visits and downloads, and connecting published digital resources in line with USGS and Federal policy. By utilizing ScienceBase to achieve stewardship quality and employing a dedicated team to help USGS scientists improve the quality of their data, the USGS is helping to meet today's data quality management challenges and ensure that reliable USGS data are available to and reusable for the public.

  6. Interoperability Using Lightweight Metadata Standards: Service & Data Casting, OpenSearch, OPM Provenance, and Shared SciFlo Workflows

    Science.gov (United States)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.

    2011-12-01

    Under several NASA grants, we are generating multi-sensor merged atmospheric datasets to enable the detection of instrument biases and studies of climate trends over decades of data. For example, under a NASA MEASURES grant we are producing a water vapor climatology from the A-Train instruments, stratified by the Cloudsat cloud classification for each geophysical scene. The generation and proper use of such multi-sensor climate data records (CDR's) requires a high level of openness, transparency, and traceability. To make the datasets self-documenting and provide access to full metadata and traceability, we have implemented a set of capabilities and services using known, interoperable protocols. These protocols include OpenSearch, OPeNDAP, Open Provenance Model, service & data casting technologies using Atom feeds, and REST-callable analysis workflows implemented as SciFlo (XML) documents. We advocate that our approach can serve as a blueprint for how to openly "document and serve" complex, multi-sensor CDR's with full traceability. The capabilities and services provided include: - Discovery of the collections by keyword search, exposed using OpenSearch protocol; - Space/time query across the CDR's granules and all of the input datasets via OpenSearch; - User-level configuration of the production workflows so that scientists can select additional physical variables from the A-Train to add to the next iteration of the merged datasets; - Efficient data merging using on-the-fly OPeNDAP variable slicing & spatial subsetting of data out of input netCDF and HDF files (without moving the entire files); - Self-documenting CDR's published in a highly usable netCDF4 format with groups used to organize the variables, CF-style attributes for each variable, numeric array compression, & links to OPM provenance; - Recording of processing provenance and data lineage into a query-able provenance trail in Open Provenance Model (OPM) format, auto-captured by the workflow engine

  7. Incorporating Workflow Interference in Facility Layout Design: The Quartic Assignment Problem

    OpenAIRE

    Wen-Chyuan Chiang; Panagiotis Kouvelis; Timothy L. Urban

    2002-01-01

    Although many authors have noted the importance of minimizing workflow interference in facility layout design, traditional layout research tends to focus on minimizing the distance-based transportation cost. This paper formalizes the concept of workflow interference from a facility layout perspective. A model, formulated as a quartic assignment problem, is developed that explicitly considers the interference of workflow. Optimal and heuristic solution methodologies are developed and evaluated.

  8. Provenance-Based Debugging and Drill-Down in Data-Oriented Workflows

    KAUST Repository

    Ikeda, Robert; Cho, Junsang; Fang, Charlie; Salihoglu, Semih; Torikai, Satoshi; Widom, Jennifer

    2012-01-01

    Panda (for Provenance and Data) is a system that supports the creation and execution of data-oriented workflows, with automatic provenance generation and built-in provenance tracing operations. Workflows in Panda are arbitrary a cyclic graphs

  9. Semantic Document Library: A Virtual Research Environment for Documents, Data and Workflows Sharing

    Science.gov (United States)

    Kotwani, K.; Liu, Y.; Myers, J.; Futrelle, J.

    2008-12-01

    The Semantic Document Library (SDL) was driven by use cases from the environmental observatory communities and is designed to provide conventional document repository features of uploading, downloading, editing and versioning of documents as well as value adding features of tagging, querying, sharing, annotating, ranking, provenance, social networking and geo-spatial mapping services. It allows users to organize a catalogue of watershed observation data, model output, workflows, as well publications and documents related to the same watershed study through the tagging capability. Users can tag all relevant materials using the same watershed name and find all of them easily later using this tag. The underpinning semantic content repository can store materials from other cyberenvironments such as workflow or simulation tools and SDL provides an effective interface to query and organize materials from various sources. Advanced features of the SDL allow users to visualize the provenance of the materials such as the source and how the output data is derived. Other novel features include visualizing all geo-referenced materials on a geospatial map. SDL as a component of a cyberenvironment portal (the NCSA Cybercollaboratory) has goal of efficient management of information and relationships between published artifacts (Validated models, vetted data, workflows, annotations, best practices, reviews and papers) produced from raw research artifacts (data, notes, plans etc.) through agents (people, sensors etc.). Tremendous scientific potential of artifacts is achieved through mechanisms of sharing, reuse and collaboration - empowering scientists to spread their knowledge and protocols and to benefit from the knowledge of others. SDL successfully implements web 2.0 technologies and design patterns along with semantic content management approach that enables use of multiple ontologies and dynamic evolution (e.g. folksonomies) of terminology. Scientific documents involved with

  10. Security aspects in teleradiology workflow

    Science.gov (United States)

    Soegner, Peter I.; Helweg, Gernot; Holzer, Heimo; zur Nedden, Dieter

    2000-05-01

    The medicolegal necessity of privacy, security and confidentiality was the aim of the attempt to develop a secure teleradiology workflow between the telepartners -- radiologist and the referring physician. To avoid the lack of dataprotection and datasecurity we introduced biometric fingerprint scanners in combination with smart cards to identify the teleradiology partners and communicated over an encrypted TCP/IP satellite link between Innsbruck and Reutte. We used an asymmetric kryptography method to guarantee authentification, integrity of the data-packages and confidentiality of the medical data. It was necessary to use a biometric feature to avoid a case of mistaken identity of persons, who wanted access to the system. Only an invariable electronical identification allowed a legal liability to the final report and only a secure dataconnection allowed the exchange of sensible medical data between different partners of Health Care Networks. In our study we selected the user friendly combination of a smart card and a biometric fingerprint technique, called SkymedTM Double Guard Secure Keyboard (Agfa-Gevaert) to confirm identities and log into the imaging workstations and the electronic patient record. We examined the interoperability of the used software with the existing platforms. Only the WIN-XX operating systems could be protected at the time of our study.

  11. SDM center technologies for accelerating scientific discoveries

    International Nuclear Information System (INIS)

    Shoshani, Arie; Altintas, Ilkay; Choudhary, Alok; Critchlow, Terence; Kamath, Chandrika; Ludaescher, Bertram; Nieplocha, Jarek; Parker, Steve; Ross, Rob; Samatova, Nagiza; Vouk, Mladen

    2007-01-01

    With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive, end-to-end data management solutions ranging from initial data acquisition to final analysis and visualization. The SciDAC-1 Scientific Data Management (SDM) Center succeeded in bringing an initial set of advanced data management technologies to DOE application scientists in astrophysics, climate, fusion, and biology. Equally important, it established collaborations with these scientists to better understand their science as well as their forthcoming data management and data analytics challenges. Our future focus is on improving the SDM framework to address the needs of ultra-scale science during SciDAC-2. Specifically, we are enhancing and extending our existing tools to allow for more interactivity and fault tolerance when managing scientists' workflows, for better parallelism and feature extraction capabilities in their data analytics operations, and for greater efficiency and functionality in users' interactions with local parallel file systems, active storage, and access to remote storage. These improvements are necessary for the scalability and complexity challenges presented by hardware and applications at ultra scale, and are complemented by continued efforts to work with application scientists in various domains

  12. A Collaborative Workflow for the Digitization of Unique Materials

    Science.gov (United States)

    Gueguen, Gretchen; Hanlon, Ann M.

    2009-01-01

    This paper examines the experience of one institution, the University of Maryland Libraries, as it made organizational efforts to harness existing workflows and to capture digitization done in the course of responding to patron requests. By examining the way this organization adjusted its existing workflows to put in place more systematic methods…

  13. Development of the workflow kine systems for support on KAIZEN.

    Science.gov (United States)

    Mizuno, Yuki; Ito, Toshihiko; Yoshikawa, Toru; Yomogida, Satoshi; Morio, Koji; Sakai, Kazuhiro

    2012-01-01

    In this paper, we introduce the new workflow line system consisted of the location and image recording, which led to the acquisition of workflow information and the analysis display. From the results of workflow line investigation, we considered the anticipated effects and the problems on KAIZEN. Workflow line information included the location information and action contents information. These technologies suggest the viewpoints to help improvement, for example, exclusion of useless movement, the redesign of layout and the review of work procedure. Manufacturing factory, it was clear that there was much movement from the standard operation place and accumulation residence time. The following was shown as a result of this investigation, to be concrete, the efficient layout was suggested by this system. In the case of the hospital, similarly, it is pointed out that the workflow has the problem of layout and setup operations based on the effective movement pattern of the experts. This system could adapt to routine work, including as well as non-routine work. By the development of this system which can fit and adapt to industrial diversification, more effective "visual management" (visualization of work) is expected in the future.

  14. Restructuring of workflows to minimise errors via stochastic model checking: An automated evolutionary approach

    International Nuclear Information System (INIS)

    Herbert, L.T.; Hansen, Z.N.L.

    2016-01-01

    This paper presents a framework for the automated restructuring of stochastic workflows to reduce the impact of faults. The framework allows for the modelling of workflows by means of a formalised subset of the BPMN workflow language. We extend this modelling formalism to describe faults and incorporate an intention preserving stochastic semantics able to model both probabilistic- and non-deterministic behaviour. Stochastic model checking techniques are employed to generate the state-space of a given workflow. Possible improvements obtained by restructuring are measured by employing the framework's capacity for tracking real-valued quantities associated with states and transitions of the workflow. The space of possible restructurings of a workflow is explored by means of an evolutionary algorithm, where the goals for improvement are defined in terms of optimising quantities, typically employed to model resources, associated with a workflow. The approach is fully automated and only the modelling of the production workflows, potential faults and the expression of the goals require manual input. We present the design of a software tool implementing this framework and explore the practical utility of this approach through an industrial case study in which the risk of production failures and their impact are reduced by restructuring the workflow. - Highlights: • We present a framework which allows for the automated restructuring of workflows. • This framework seeks to minimise the impact of errors on the workflow. • We illustrate a scalable software implementation of this framework. • We explore the practical utility of this approach through an industry case. • The impact of errors can be substantially reduced by restructuring the workflow.

  15. A Prudent Approach to Fair Use Workflow

    Directory of Open Access Journals (Sweden)

    Karey Patterson

    2018-02-01

    Full Text Available This poster will outline a new highly efficient workflow for the management of copyright materials that is prudent and accommodates generally and legally accepted Fair Use limits. The workflow allows library or copyright staff an easy means to keep on top of their copyright obligations, manage licenses and review and adjust schedules but is still a highly efficient means to cope with large numbers of requests to use materials. The poster details speed and efficiency gains for professors and library staff while reducing legal exposure.

  16. Scrambled eggs: A highly sensitive molecular diagnostic workflow for Fasciola species specific detection from faecal samples.

    Directory of Open Access Journals (Sweden)

    Nichola Eliza Davies Calvani

    2017-09-01

    Full Text Available Fasciolosis, due to Fasciola hepatica and Fasciola gigantica, is a re-emerging zoonotic parasitic disease of worldwide importance. Human and animal infections are commonly diagnosed by the traditional sedimentation and faecal egg-counting technique. However, this technique is time-consuming and prone to sensitivity errors when a large number of samples must be processed or if the operator lacks sufficient experience. Additionally, diagnosis can only be made once the 12-week pre-patent period has passed. Recently, a commercially available coprological antigen ELISA has enabled detection of F. hepatica prior to the completion of the pre-patent period, providing earlier diagnosis and increased throughput, although species differentiation is not possible in areas of parasite sympatry. Real-time PCR offers the combined benefits of highly sensitive species differentiation for medium to large sample sizes. However, no molecular diagnostic workflow currently exists for the identification of Fasciola spp. in faecal samples.A new molecular diagnostic workflow for the highly-sensitive detection and quantification of Fasciola spp. in faecal samples was developed. The technique involves sedimenting and pelleting the samples prior to DNA isolation in order to concentrate the eggs, followed by disruption by bead-beating in a benchtop homogeniser to ensure access to DNA. Although both the new molecular workflow and the traditional sedimentation technique were sensitive and specific, the new molecular workflow enabled faster sample throughput in medium to large epidemiological studies, and provided the additional benefit of speciation. Further, good correlation (R2 = 0.74-0.76 was observed between the real-time PCR values and the faecal egg count (FEC using the new molecular workflow for all herds and sampling periods. Finally, no effect of storage in 70% ethanol was detected on sedimentation and DNA isolation outcomes; enabling transport of samples from endemic

  17. Scrambled eggs: A highly sensitive molecular diagnostic workflow for Fasciola species specific detection from faecal samples

    Science.gov (United States)

    Calvani, Nichola Eliza Davies; Windsor, Peter Andrew; Bush, Russell David

    2017-01-01

    Background Fasciolosis, due to Fasciola hepatica and Fasciola gigantica, is a re-emerging zoonotic parasitic disease of worldwide importance. Human and animal infections are commonly diagnosed by the traditional sedimentation and faecal egg-counting technique. However, this technique is time-consuming and prone to sensitivity errors when a large number of samples must be processed or if the operator lacks sufficient experience. Additionally, diagnosis can only be made once the 12-week pre-patent period has passed. Recently, a commercially available coprological antigen ELISA has enabled detection of F. hepatica prior to the completion of the pre-patent period, providing earlier diagnosis and increased throughput, although species differentiation is not possible in areas of parasite sympatry. Real-time PCR offers the combined benefits of highly sensitive species differentiation for medium to large sample sizes. However, no molecular diagnostic workflow currently exists for the identification of Fasciola spp. in faecal samples. Methodology/Principal findings A new molecular diagnostic workflow for the highly-sensitive detection and quantification of Fasciola spp. in faecal samples was developed. The technique involves sedimenting and pelleting the samples prior to DNA isolation in order to concentrate the eggs, followed by disruption by bead-beating in a benchtop homogeniser to ensure access to DNA. Although both the new molecular workflow and the traditional sedimentation technique were sensitive and specific, the new molecular workflow enabled faster sample throughput in medium to large epidemiological studies, and provided the additional benefit of speciation. Further, good correlation (R2 = 0.74–0.76) was observed between the real-time PCR values and the faecal egg count (FEC) using the new molecular workflow for all herds and sampling periods. Finally, no effect of storage in 70% ethanol was detected on sedimentation and DNA isolation outcomes; enabling

  18. Scrambled eggs: A highly sensitive molecular diagnostic workflow for Fasciola species specific detection from faecal samples.

    Science.gov (United States)

    Calvani, Nichola Eliza Davies; Windsor, Peter Andrew; Bush, Russell David; Šlapeta, Jan

    2017-09-01

    Fasciolosis, due to Fasciola hepatica and Fasciola gigantica, is a re-emerging zoonotic parasitic disease of worldwide importance. Human and animal infections are commonly diagnosed by the traditional sedimentation and faecal egg-counting technique. However, this technique is time-consuming and prone to sensitivity errors when a large number of samples must be processed or if the operator lacks sufficient experience. Additionally, diagnosis can only be made once the 12-week pre-patent period has passed. Recently, a commercially available coprological antigen ELISA has enabled detection of F. hepatica prior to the completion of the pre-patent period, providing earlier diagnosis and increased throughput, although species differentiation is not possible in areas of parasite sympatry. Real-time PCR offers the combined benefits of highly sensitive species differentiation for medium to large sample sizes. However, no molecular diagnostic workflow currently exists for the identification of Fasciola spp. in faecal samples. A new molecular diagnostic workflow for the highly-sensitive detection and quantification of Fasciola spp. in faecal samples was developed. The technique involves sedimenting and pelleting the samples prior to DNA isolation in order to concentrate the eggs, followed by disruption by bead-beating in a benchtop homogeniser to ensure access to DNA. Although both the new molecular workflow and the traditional sedimentation technique were sensitive and specific, the new molecular workflow enabled faster sample throughput in medium to large epidemiological studies, and provided the additional benefit of speciation. Further, good correlation (R2 = 0.74-0.76) was observed between the real-time PCR values and the faecal egg count (FEC) using the new molecular workflow for all herds and sampling periods. Finally, no effect of storage in 70% ethanol was detected on sedimentation and DNA isolation outcomes; enabling transport of samples from endemic to non

  19. SPECT/CT workflow and imaging protocols

    Energy Technology Data Exchange (ETDEWEB)

    Beckers, Catherine [University Hospital of Liege, Division of Nuclear Medicine and Oncological Imaging, Department of Medical Physics, Liege (Belgium); Hustinx, Roland [University Hospital of Liege, Division of Nuclear Medicine and Oncological Imaging, Department of Medical Physics, Liege (Belgium); Domaine Universitaire du Sart Tilman, Service de Medecine Nucleaire et Imagerie Oncologique, CHU de Liege, Liege (Belgium)

    2014-05-15

    Introducing a hybrid imaging method such as single photon emission computed tomography (SPECT)/CT greatly alters the routine in the nuclear medicine department. It requires designing new workflow processes and the revision of original scheduling process and imaging protocols. In addition, the imaging protocol should be adapted for each individual patient, so that performing CT is fully justified and the CT procedure is fully tailored to address the clinical issue. Such refinements often occur before the procedure is started but may be required at some intermediate stage of the procedure. Furthermore, SPECT/CT leads in many instances to a new partnership with the radiology department. This article presents practical advice and highlights the key clinical elements which need to be considered to help understand the workflow process of SPECT/CT and optimise imaging protocols. The workflow process using SPECT/CT is complex in particular because of its bimodal character, the large spectrum of stakeholders, the multiplicity of their activities at various time points and the need for real-time decision-making. With help from analytical tools developed for quality assessment, the workflow process using SPECT/CT may be separated into related, but independent steps, each with its specific human and material resources to use as inputs or outputs. This helps identify factors that could contribute to failure in routine clinical practice. At each step of the process, practical aspects to optimise imaging procedure and protocols are developed. A decision-making algorithm for justifying each CT indication as well as the appropriateness of each CT protocol is the cornerstone of routine clinical practice using SPECT/CT. In conclusion, implementing hybrid SPECT/CT imaging requires new ways of working. It is highly rewarding from a clinical perspective, but it also proves to be a daily challenge in terms of management. (orig.)

  20. SPECT/CT workflow and imaging protocols

    International Nuclear Information System (INIS)

    Beckers, Catherine; Hustinx, Roland

    2014-01-01

    Introducing a hybrid imaging method such as single photon emission computed tomography (SPECT)/CT greatly alters the routine in the nuclear medicine department. It requires designing new workflow processes and the revision of original scheduling process and imaging protocols. In addition, the imaging protocol should be adapted for each individual patient, so that performing CT is fully justified and the CT procedure is fully tailored to address the clinical issue. Such refinements often occur before the procedure is started but may be required at some intermediate stage of the procedure. Furthermore, SPECT/CT leads in many instances to a new partnership with the radiology department. This article presents practical advice and highlights the key clinical elements which need to be considered to help understand the workflow process of SPECT/CT and optimise imaging protocols. The workflow process using SPECT/CT is complex in particular because of its bimodal character, the large spectrum of stakeholders, the multiplicity of their activities at various time points and the need for real-time decision-making. With help from analytical tools developed for quality assessment, the workflow process using SPECT/CT may be separated into related, but independent steps, each with its specific human and material resources to use as inputs or outputs. This helps identify factors that could contribute to failure in routine clinical practice. At each step of the process, practical aspects to optimise imaging procedure and protocols are developed. A decision-making algorithm for justifying each CT indication as well as the appropriateness of each CT protocol is the cornerstone of routine clinical practice using SPECT/CT. In conclusion, implementing hybrid SPECT/CT imaging requires new ways of working. It is highly rewarding from a clinical perspective, but it also proves to be a daily challenge in terms of management. (orig.)

  1. Enhancing and Customizing Laboratory Information Systems to Improve/Enhance Pathologist Workflow.

    Science.gov (United States)

    Hartman, Douglas J

    2015-06-01

    Optimizing pathologist workflow can be difficult because it is affected by many variables. Surgical pathologists must complete many tasks that culminate in a final pathology report. Several software systems can be used to enhance/improve pathologist workflow. These include voice recognition software, pre-sign-out quality assurance, image utilization, and computerized provider order entry. Recent changes in the diagnostic coding and the more prominent role of centralized electronic health records represent potential areas for increased ways to enhance/improve the workflow for surgical pathologists. Additional unforeseen changes to the pathologist workflow may accompany the introduction of whole-slide imaging technology to the routine diagnostic work. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Analyzing the Gap between Workflows and their Natural Language Descriptions

    NARCIS (Netherlands)

    Groth, P.T.; Gil, Y

    2009-01-01

    Scientists increasingly use workflows to represent and share their computational experiments. Because of their declarative nature, focus on pre-existing component composition and the availability of visual editors, workflows provide a valuable start for creating user-friendly environments for end

  3. Assessment of the Nurse Medication Administration Workflow Process

    Directory of Open Access Journals (Sweden)

    Nathan Huynh

    2016-01-01

    Full Text Available This paper presents findings of an observational study of the Registered Nurse (RN Medication Administration Process (MAP conducted on two comparable medical units in a large urban tertiary care medical center in Columbia, South Carolina. A total of 305 individual MAP observations were recorded over a 6-week period with an average of 5 MAP observations per RN participant for both clinical units. A key MAP variation was identified in terms of unbundled versus bundled MAP performance. In the unbundled workflow, an RN engages in the MAP by performing only MAP tasks during a care episode. In the bundled workflow, an RN completes medication administration along with other patient care responsibilities during the care episode. Using a discrete-event simulation model, this paper addresses the difference between unbundled and bundled workflow and their effects on simulated redesign interventions.

  4. Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach.

    Science.gov (United States)

    Haston, Elspeth; Cubey, Robert; Pullan, Martin; Atkins, Hannah; Harris, David J

    2012-01-01

    Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.

  5. Enabling the Use of Authentic Scientific Data in the Classroom--Lessons Learned from the AccessData and Data Services Workshops

    Science.gov (United States)

    Lynds, S. E.; Buhr, S. M.; Ledley, T. S.

    2007-12-01

    Enabling the Use of Authentic Scientific Data in the Classroom--Lessons Learned from the AccessData and Data Services Workshops Since 2004, the annual AccessData and DLESE Data Services workshops have gathered scientists, data managers, technology specialists, teachers, and curriculum developers to work together creating classroom- ready scientific data modules. Teams of five (one participant from each of the five professions) develop topic- specific online educational units of the Earth Exploration Toolbook (serc.carleton.edu/eet/). Educators from middle schools through undergraduate colleges have been represented, as have scientific data professionals from many organizations across the United States. Extensive evaluation has been included in the design of each workshop. The evaluation results have been used each year to improve subsequent workshops. In addition to refining the format and process of the workshop itself, evaluation data collected reveal attendees' experiences using scientific data for educational purposes. Workshop attendees greatly value the opportunity to network with those of other professional roles in developing a real-world education project using scientific data. Educators appreciate the opportunity to work directly with scientists and technology specialists, while researchers and those in technical fields value the classroom expertise of the educators. Attendees' data use experiences are explored every year. Although bandwidth and connectivity were problems for data use in 2004, that has become much less common over time. The most common barriers to data use cited now are discoverability, data format problems, incomplete data sets, and poor documentation. Most attendees agree that the most useful types of online documentation and user support for scientific data are step-by-step instructions, examples, tutorials, and reference manuals. Satellite imagery and weather data were the most commonly used types of data, and these were often

  6. The Nasa-Isro SAR Mission Science Data Products and Processing Workflows

    Science.gov (United States)

    Rosen, P. A.; Agram, P. S.; Lavalle, M.; Cohen, J.; Buckley, S.; Kumar, R.; Misra-Ray, A.; Ramanujam, V.; Agarwal, K. M.

    2017-12-01

    The NASA-ISRO SAR (NISAR) Mission is currently in the development phase and in the process of specifying its suite of data products and algorithmic workflows, responding to inputs from the NISAR Science and Applications Team. NISAR will provide raw data (Level 0), full-resolution complex imagery (Level 1), and interferometric and polarimetric image products (Level 2) for the entire data set, in both natural radar and geocoded coordinates. NASA and ISRO are coordinating the formats, meta-data layers, and algorithms for these products, for both the NASA-provided L-band radar and the ISRO-provided S-band radar. Higher level products will be also be generated for the purpose of calibration and validation, over large areas of Earth, including tectonic plate boundaries, ice sheets and sea-ice, and areas of ecosystem disturbance and change. This level of comprehensive product generation has been unprecedented for SAR missions in the past, and leads to storage processing challenges for the production system and the archive center. Further, recognizing the potential to support applications that require low latency product generation and delivery, the NISAR team is optimizing the entire end-to-end ground data system for such response, including exploring the advantages of cloud-based processing, algorithmic acceleration using GPUs, and on-demand processing schemes that minimize computational and transport costs, but allow rapid delivery to science and applications users. This paper will review the current products, workflows, and discuss the scientific and operational trade-space of mission capabilities.

  7. EDMS based workflow for Printing Industry

    Directory of Open Access Journals (Sweden)

    Prathap Nayak

    2013-04-01

    Full Text Available Information is indispensable factor of any enterprise. It can be a record or a document generated for every transaction that is made, which is either a paper based or in electronic format for future reference. A Printing Industry is one such industry in which managing information of various formats, with latest workflows and technologies, could be a nightmare and a challenge for any operator or an user when each process from the least bit of information to a printed product are always dependendent on each other. Hence the information has to be harmonized artistically in order to avoid production downtime or employees pointing fingers at each other. This paper analyses how the implementation of Electronic Document Management System (EDMS could contribute to the Printing Industry for immediate access to stored documents within and across departments irrespective of geographical boundaries. The paper outlines initially with a brief history, contemporary EDMS system and some illustrated examples with a study done by choosing Library as a pilot area for evaluating EDMS. The paper ends with an imitative proposal that maps several document management based activities for implementation of EDMS for a Printing Industry.

  8. Summer Student Report - AV Workflow

    CERN Document Server

    Abramson, Jessie

    2014-01-01

    The AV Workflow is web application which allows cern users to publish, update and delete videos from cds. During my summer internship I implemented the backend of the new version of the AV Worklow in python using the django framework.

  9. Parametric Room Acoustic workflows with real-time acoustic simulation

    DEFF Research Database (Denmark)

    Parigi, Dario

    2017-01-01

    The paper investigates and assesses the opportunities that real-time acoustic simulation offer to engage in parametric acoustics workflow and to influence architectural designs from early design stages......The paper investigates and assesses the opportunities that real-time acoustic simulation offer to engage in parametric acoustics workflow and to influence architectural designs from early design stages...

  10. Exploring the impact of an automated prescription-filling device on community pharmacy technician workflow

    Science.gov (United States)

    Walsh, Kristin E.; Chui, Michelle Anne; Kieser, Mara A.; Williams, Staci M.; Sutter, Susan L.; Sutter, John G.

    2012-01-01

    Objective To explore community pharmacy technician workflow change after implementation of an automated robotic prescription-filling device. Methods At an independent community pharmacy in rural Mayville, WI, pharmacy technicians were observed before and 3 months after installation of an automated robotic prescription-filling device. The main outcome measures were sequences and timing of technician workflow steps, workflow interruptions, automation surprises, and workarounds. Results Of the 77 and 80 observations made before and 3 months after robot installation, respectively, 17 different workflow sequences were observed before installation and 38 after installation. Average prescription filling time was reduced by 40 seconds per prescription with use of the robot. Workflow interruptions per observation increased from 1.49 to 1.79 (P = 0.11), and workarounds increased from 10% to 36% after robot use. Conclusion Although automated prescription-filling devices can increase efficiency, workflow interruptions and workarounds may negate that efficiency. Assessing changes in workflow and sequencing of tasks that may result from the use of automation can help uncover opportunities for workflow policy and procedure redesign. PMID:21896459

  11. From Paper Based Clinical Practice Guidelines to Declarative Workflow Management

    DEFF Research Database (Denmark)

    Lyng, Karen Marie; Hildebrandt, Thomas; Mukkamala, Raghava Rao

    2009-01-01

    a sub workflow can be described in a declarative workflow management system: the Resultmaker Online Consultant (ROC). The example demonstrates that declarative primitives allow to naturally extend the paper based flowchart to an executable model without introducing a complex cyclic control flow graph....

  12. Workflow interruptions, cognitive failure and near-accidents in health care.

    Science.gov (United States)

    Elfering, Achim; Grebner, Simone; Ebener, Corinne

    2015-01-01

    Errors are frequent in health care. A specific model was tested that affirms failure in cognitive action regulation to mediate the influence of nurses' workflow interruptions and safety conscientiousness on near-accidents in health care. One hundred and sixty-five nurses from seven Swiss hospitals participated in a questionnaire survey. Structural equation modelling confirmed the hypothesised mediation model. Cognitive failure in action regulation significantly mediated the influence of workflow interruptions on near-accidents (p accidents via cognitive failure in action regulation was also significant (p accidents; moreover, cognitive failure mediated the association between compliance and near-accidents (p < .05). Contrary to expectations, compliance with safety regulations was not related to workflow interruptions. Workflow interruptions caused by colleagues, patients and organisational constraints are likely to trigger errors in nursing. Work redesign is recommended to reduce cognitive failure and improve safety of nurses and patients.

  13. Distributing Workflows over a Ubiquitous P2P Network

    Directory of Open Access Journals (Sweden)

    Eddie Al-Shakarchi

    2007-01-01

    Full Text Available This paper discusses issues in the distribution of bundled workflows across ubiquitous peer-to-peer networks for the application of music information retrieval. The underlying motivation for this work is provided by the DART project, which aims to develop a novel music recommendation system by gathering statistical data using collaborative filtering techniques and the analysis of the audio itsel, in order to create a reliable and comprehensive database of the music that people own and which they listen to. To achieve this, the DART scientists creating the algorithms need the ability to distribute the Triana workflows they create, representing the analysis to be performed, across the network on a regular basis (perhaps even daily in order to update the network as a whole with new workflows to be executed for the analysis. DART uses a similar approach to BOINC but differs in that the workers receive input data in the form of a bundled Triana workflow, which is executed in order to process any MP3 files that they own on their machine. Once analysed, the results are returned to DART's distributed database that collects and aggregates the resulting information. DART employs the use of package repositories to decentralise the distribution of such workflow bundles and this approach is validated in this paper through simulations that show that suitable scalability is maintained through the system as the number of participants increases. The results clearly illustrate the effectiveness of the approach.

  14. Flexible Early Warning Systems with Workflows and Decision Tables

    Science.gov (United States)

    Riedel, F.; Chaves, F.; Zeiner, H.

    2012-04-01

    An essential part of early warning systems and systems for crisis management are decision support systems that facilitate communication and collaboration. Often official policies specify how different organizations collaborate and what information is communicated to whom. For early warning systems it is crucial that information is exchanged dynamically in a timely manner and all participants get exactly the information they need to fulfil their role in the crisis management process. Information technology obviously lends itself to automate parts of the process. We have experienced however that in current operational systems the information logistics processes are hard-coded, even though they are subject to change. In addition, systems are tailored to the policies and requirements of a certain organization and changes can require major software refactoring. We seek to develop a system that can be deployed and adapted to multiple organizations with different dynamic runtime policies. A major requirement for such a system is that changes can be applied locally without affecting larger parts of the system. In addition to the flexibility regarding changes in policies and processes, the system needs to be able to evolve; when new information sources become available, it should be possible to integrate and use these in the decision process. In general, this kind of flexibility comes with a significant increase in complexity. This implies that only IT professionals can maintain a system that can be reconfigured and adapted; end-users are unable to utilise the provided flexibility. In the business world similar problems arise and previous work suggested using business process management systems (BPMS) or workflow management systems (WfMS) to guide and automate early warning processes or crisis management plans. However, the usability and flexibility of current WfMS are limited, because current notations and user interfaces are still not suitable for end-users, and workflows

  15. Automation of Flexible Migration Workflows

    Directory of Open Access Journals (Sweden)

    Dirk von Suchodoletz

    2011-03-01

    Full Text Available Many digital preservation scenarios are based on the migration strategy, which itself is heavily tool-dependent. For popular, well-defined and often open file formats – e.g., digital images, such as PNG, GIF, JPEG – a wide range of tools exist. Migration workflows become more difficult with proprietary formats, as used by the several text processing applications becoming available in the last two decades. If a certain file format can not be rendered with actual software, emulation of the original environment remains a valid option. For instance, with the original Lotus AmiPro or Word Perfect, it is not a problem to save an object of this type in ASCII text or Rich Text Format. In specific environments, it is even possible to send the file to a virtual printer, thereby producing a PDF as a migration output. Such manual migration tasks typically involve human interaction, which may be feasible for a small number of objects, but not for larger batches of files.We propose a novel approach using a software-operated VNC abstraction layer in order to replace humans with machine interaction. Emulators or virtualization tools equipped with a VNC interface are very well suited for this approach. But screen, keyboard and mouse interaction is just part of the setup. Furthermore, digital objects need to be transferred into the original environment in order to be extracted after processing. Nevertheless, the complexity of the new generation of migration services is quickly rising; a preservation workflow is now comprised not only of the migration tool itself, but of a complete software and virtual hardware stack with recorded workflows linked to every supported migration scenario. Thus the requirements of OAIS management must include proper software archiving, emulator selection, system image and recording handling. The concept of view-paths could help either to automatically determine the proper pre-configured virtual environment or to set up system

  16. Considering Time in Orthophotography Production: from a General Workflow to a Shortened Workflow for a Faster Disaster Response

    Science.gov (United States)

    Lucas, G.

    2015-08-01

    This article overall deals with production time with orthophoto imagery with medium size digital frame camera. The workflow examination follows two main parts: data acquisition and post-processing. The objectives of the research are fourfold: 1/ gathering time references for the most important steps of orthophoto production (it turned out that literature is missing on this topic); these figures are used later for total production time estimation; 2/ identifying levers for reducing orthophoto production time; 3/ building a simplified production workflow for emergency response: less exigent with accuracy and faster; and compare it to a classical workflow; 4/ providing methodical elements for the estimation of production time with a custom project. In the data acquisition part a comprehensive review lists and describes all the factors that may affect the acquisition efficiency. Using a simulation with different variables (average line length, time of the turns, flight speed) their effect on acquisition efficiency is quantitatively examined. Regarding post-processing, the time references figures were collected from the processing of a 1000 frames case study with 15 cm GSD covering a rectangular area of 447 km2; the time required to achieve each step during the production is written down. When several technical options are possible, each one is tested and time documented so as all alternatives are available. Based on a technical choice with the workflow and using the compiled time reference of the elementary steps, a total time is calculated for the post-processing of the 1000 frames. Two scenarios are compared as regards to time and accuracy. The first one follows the "normal" practices, comprising triangulation, orthorectification and advanced mosaicking methods (feature detection, seam line editing and seam applicator); the second is simplified and make compromise over positional accuracy (using direct geo-referencing) and seamlines preparation in order to achieve

  17. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  18. Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach

    Directory of Open Access Journals (Sweden)

    Elspeth Haston

    2012-07-01

    Full Text Available Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.

  19. Data mining workflow templates for intelligent discovery assistance in RapidMiner

    OpenAIRE

    Kietz, J U; Serban, F; Bernstein, A; Fischer, S

    2010-01-01

    Knowledge Discovery in Databases (KDD) has evolved during the last years and reached a mature stage offering plenty of operators to solve complex tasks. User support for building workflows, in contrast, has not increased proportionally. The large number of operators available in current KDD systems make it difficult for users to successfully analyze data. Moreover, workflows easily contain a large number of operators and parts of the workflows are applied several times, thus it is hard for us...

  20. Requirements for Secure Logging of Decentralized Cross-Organizational Workflow Executions

    NARCIS (Netherlands)

    Wombacher, Andreas; Wieringa, Roelf J.; Jonker, Willem; Knezevic, P.; Pokraev, S.; meersman, R; Tari, Z; herrero, p; Méndez, G.; Cavedon, L.; Martin, D.; Hinze, A.; Buchanan, G.

    2005-01-01

    The control of actions performed by parties involved in a decentralized cross-organizational workflow is done by several independent workflow engines. Due to the lack of a centralized coordination control, an auditing is required which supports a reliable and secure detection of malicious actions

  1. CrossFlow: Cross-Organizational Workflow Management in Dynamic Virtual Enterprises

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Aberer, Karl; Hoffner, Yigal; Ludwig, Heiko

    In this report, we present the approach to cross-organizational workflow management of the CrossFlow project. CrossFlow is a European research project aiming at the support of cross-organizational workflows in dynamic virtual enterprises. The cooperation in these virtual enterprises is based on

  2. CrossFlow : cross-organizational workflow management in dynamic virtual enterprises

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Aberer, K.; Hoffner, Y.

    2000-01-01

    This paper gives a detailed overview of the approach to cross-organizational workflow management developed in the CrossFlow project. CrossFlow is a European research project aiming at the support of cross-organizational workflows in dynamic virtual enterprises. The cooperation in these virtual

  3. Research and Implementation of Key Technologies in Multi-Agent System to Support Distributed Workflow

    Science.gov (United States)

    Pan, Tianheng

    2018-01-01

    In recent years, the combination of workflow management system and Multi-agent technology is a hot research field. The problem of lack of flexibility in workflow management system can be improved by introducing multi-agent collaborative management. The workflow management system adopts distributed structure. It solves the problem that the traditional centralized workflow structure is fragile. In this paper, the agent of Distributed workflow management system is divided according to its function. The execution process of each type of agent is analyzed. The key technologies such as process execution and resource management are analyzed.

  4. Integration of the radiotherapy irradiation planning in the digital workflow

    International Nuclear Information System (INIS)

    Roehner, F.; Schmucker, M.; Henne, K.; Bruggmoser, G.; Grosu, A.L.; Frommhold, H.; Heinemann, F.E.; Momm, F.

    2013-01-01

    Background and purpose: At the Clinic of Radiotherapy at the University Hospital Freiburg, all relevant workflow is paperless. After implementing the Operating Schedule System (OSS) as a framework, all processes are being implemented into the departmental system MOSAIQ. Designing a digital workflow for radiotherapy irradiation planning is a large challenge, it requires interdisciplinary expertise and therefore the interfaces between the professions also have to be interdisciplinary. For every single step of radiotherapy irradiation planning, distinct responsibilities have to be defined and documented. All aspects of digital storage, backup and long-term availability of data were considered and have already been realized during the OSS project. Method: After an analysis of the complete workflow and the statutory requirements, a detailed project plan was designed. In an interdisciplinary workgroup, problems were discussed and a detailed flowchart was developed. The new functionalities were implemented in a testing environment by the Clinical and Administrative IT Department (CAI). After extensive tests they were integrated into the new modular department system. Results and conclusion: The Clinic of Radiotherapy succeeded in realizing a completely digital workflow for radiotherapy irradiation planning. During the testing phase, our digital workflow was examined and afterwards was approved by the responsible authority. (orig.)

  5. Measuring Semantic and Structural Information for Data Oriented Workflow Retrieval with Cost Constraints

    Directory of Open Access Journals (Sweden)

    Yinglong Ma

    2014-01-01

    Full Text Available The reuse of data oriented workflows (DOWs can reduce the cost of workflow system development and control the risk of project failure and therefore is crucial for accelerating the automation of business processes. Reusing workflows can be achieved by measuring the similarity among candidate workflows and selecting the workflow satisfying requirements of users from them. However, due to DOWs being often developed based on an open, distributed, and heterogeneous environment, different users often can impose diverse cost constraints on data oriented workflows. This makes the reuse of DOWs challenging. There is no clear solution for retrieving DOWs with cost constraints. In this paper, we present a novel graph based model of DOWs with cost constraints, called constrained data oriented workflow (CDW, which can express cost constraints that users are often concerned about. An approach is proposed for retrieving CDWs, which seamlessly combines semantic and structural information of CDWs. A distance measure based on matrix theory is adopted to seamlessly combine semantic and structural similarities of CDWs for selecting and reusing them. Finally, the related experiments are made to show the effectiveness and efficiency of our approach.

  6. A Multilevel Secure Workflow Management System

    National Research Council Canada - National Science Library

    Kang, Myong H; Froscher, Judith N; Sheth, Amit P; Kochut, Krys J; Miller, John A

    1999-01-01

    The Department of Defense (DoD) needs multilevel secure (MLS) workflow management systems to enable globally distributed users and applications to cooperate across classification levels to achieve mission critical goals...

  7. Enabling big geoscience data analytics with a cloud-based, MapReduce-enabled and service-oriented workflow framework.

    Directory of Open Access Journals (Sweden)

    Zhenlong Li

    Full Text Available Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data are essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework techniques are proposed by leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA. Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. And service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists.

  8. Enabling big geoscience data analytics with a cloud-based, MapReduce-enabled and service-oriented workflow framework.

    Science.gov (United States)

    Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew

    2015-01-01

    Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data are essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework techniques are proposed by leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. And service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists.

  9. Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework

    Science.gov (United States)

    Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew

    2015-01-01

    Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data are essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework techniques are proposed by leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. And service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists. PMID:25742012

  10. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    Science.gov (United States)

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  11. Design and implementation of a secure workflow system based on PKI/PMI

    Science.gov (United States)

    Yan, Kai; Jiang, Chao-hui

    2013-03-01

    As the traditional workflow system in privilege management has the following weaknesses: low privilege management efficiency, overburdened for administrator, lack of trust authority etc. A secure workflow model based on PKI/PMI is proposed after studying security requirements of the workflow systems in-depth. This model can achieve static and dynamic authorization after verifying user's ID through PKC and validating user's privilege information by using AC in workflow system. Practice shows that this system can meet the security requirements of WfMS. Moreover, it can not only improve system security, but also ensures integrity, confidentiality, availability and non-repudiation of the data in the system.

  12. Characterizing workflow for pediatric asthma patients in emergency departments using electronic health records.

    Science.gov (United States)

    Ozkaynak, Mustafa; Dziadkowiec, Oliwier; Mistry, Rakesh; Callahan, Tiffany; He, Ze; Deakyne, Sara; Tham, Eric

    2015-10-01

    The purpose of this study was to describe a workflow analysis approach and apply it in emergency departments (EDs) using data extracted from the electronic health record (EHR) system. We used data that were obtained during 2013 from the ED of a children's hospital and its four satellite EDs. Workflow-related data were extracted for all patient visits with either a primary or secondary diagnosis on discharge of asthma (ICD-9 code=493). For each patient visit, eight different a priori time-stamped events were identified. Data were also collected on mode of arrival, patient demographics, triage score (i.e. acuity level), and primary/secondary diagnosis. Comparison groups were by acuity levels 2 and 3 with 2 being more acute than 3, arrival mode (ambulance versus walk-in), and site. Data were analyzed using a visualization method and Markov Chains. To demonstrate the viability and benefit of the approach, patient care workflows were visually and quantitatively compared. The analysis of the EHR data allowed for exploration of workflow patterns and variation across groups. Results suggest that workflow was different for different arrival modes, settings and acuity levels. EHRs can be used to explore workflow with statistical and visual analytics techniques novel to the health care setting. The results generated by the proposed approach could be utilized to help institutions identify workflow issues, plan for varied workflows and ultimately improve efficiency in caring for diverse patient groups. EHR data and novel analytic techniques in health care can expand our understanding of workflow in both large and small ED units. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Authentication systems for securing clinical documentation workflows. A systematic literature review.

    Science.gov (United States)

    Schwartze, J; Haarbrandt, B; Fortmeier, D; Haux, R; Seidel, C

    2014-01-01

    Integration of electronic signatures embedded in health care processes in Germany challenges health care service and supply facilities. The suitability of the signature level of an eligible authentication procedure is confirmed for a large part of documents in clinical practice. However, the concrete design of such a procedure remains unclear. To create a summary of usable user authentication systems suitable for clinical workflows. A Systematic literature review based on nine online bibliographic databases. Search keywords included authentication, access control, information systems, information security and biometrics with terms user authentication, user identification and login in title or abstract. Searches were run between 7 and 12 September 2011. Relevant conference proceedings were searched manually in February 2013. Backward reference search of selected results was done. Only publications fully describing authentication systems used or usable were included. Algorithms or purely theoretical concepts were excluded. Three authors did selection independently. DATA EXTRACTION AND ASSESSMENT: Semi-structured extraction of system characteristics was done by the main author. Identified procedures were assessed for security and fulfillment of relevant laws and guidelines as well as for applicability. Suitability for clinical workflows was derived from the assessments using a weighted sum proposed by Bonneau. Of 7575 citations retrieved, 55 publications meet our inclusion criteria. They describe 48 different authentication systems; 39 were biometric and nine graphical password systems. Assessment of authentication systems showed high error rates above European CENELEC standards and a lack of applicability of biometric systems. Graphical passwords did not add overall value compared to conventional passwords. Continuous authentication can add an additional layer of safety. Only few systems are suitable partially or entirely for use in clinical processes. Suitability

  14. A novel approach to optimize workflow in grid-based teleradiology applications.

    Science.gov (United States)

    Yılmaz, Ayhan Ozan; Baykal, Nazife

    2016-01-01

    This study proposes an infrastructure with a reporting workflow optimization algorithm (RWOA) in order to interconnect facilities, reporting units and radiologists on a single access interface, to increase the efficiency of the reporting process by decreasing the medical report turnaround time and to increase the quality of medical reports by determining the optimum match between the inspection and radiologist in terms of subspecialty, workload and response time. Workflow centric network architecture with an enhanced caching, querying and retrieving mechanism is implemented by seamlessly integrating Grid Agent and Grid Manager to conventional digital radiology systems. The inspection and radiologist attributes are modelled using a hierarchical ontology structure. Attribute preferences rated by radiologists and technical experts are formed into reciprocal matrixes and weights for entities are calculated utilizing Analytic Hierarchy Process (AHP). The assignment alternatives are processed by relation-based semantic matching (RBSM) and Integer Linear Programming (ILP). The results are evaluated based on both real case applications and simulated process data in terms of subspecialty, response time and workload success rates. Results obtained using simulated data are compared with the outcomes obtained by applying Round Robin, Shortest Queue and Random distribution policies. The proposed algorithm is also applied to a real case teleradiology application process data where medical reporting workflow was performed based on manual assignments by the chief radiologist for 6225 inspections. RBSM gives the highest subspecialty success rate and integrating ILP with RBSM ratings as RWOA provides a better response time and workload distribution success rate. RWOA based image delivery also prevents bandwidth, storage or hardware related stuck and latencies. When compared with a real case teleradiology application where inspection assignments were performed manually, the proposed

  15. A Workflow-Oriented Approach To Propagation Models In Heliophysics

    Directory of Open Access Journals (Sweden)

    Gabriele Pierantoni

    2014-01-01

    Full Text Available The Sun is responsible for the eruption of billions of tons of plasma andthe generation of near light-speed particles that propagate throughout the solarsystem and beyond. If directed towards Earth, these events can be damaging toour tecnological infrastructure. Hence there is an effort to understand the causeof the eruptive events and how they propagate from Sun to Earth. However, thephysics governing their propagation is not well understood, so there is a need todevelop a theoretical description of their propagation, known as a PropagationModel, in order to predict when they may impact Earth. It is often difficultto define a single propagation model that correctly describes the physics ofsolar eruptive events, and even more difficult to implement models capable ofcatering for all these complexities and to validate them using real observational data.In this paper, we envisage that workflows offer both a theoretical andpractical framerwork for a novel approach to propagation models. We definea mathematical framework that aims at encompassing the different modalitieswith which workflows can be used, and provide a set of generic building blockswritten in the TAVERNA workflow language that users can use to build theirown propagation models. Finally we test both the theoretical model and thecomposite building blocks of the workflow with a real Science Use Case that wasdiscussed during the 4th CDAW (Coordinated Data Analysis Workshop eventheld by the HELIO project. We show that generic workflow building blocks canbe used to construct a propagation model that succesfully describes the transitof solar eruptive events toward Earth and predict a correct Earth-impact time

  16. Analog to digital workflow improvement: a quantitative study.

    Science.gov (United States)

    Wideman, Catherine; Gallet, Jacqueline

    2006-01-01

    This study tracked a radiology department's conversion from utilization of a Kodak Amber analog system to a Kodak DirectView DR 5100 digital system. Through the use of ProModel Optimization Suite, a workflow simulation software package, significant quantitative information was derived from workflow process data measured before and after the change to a digital system. Once the digital room was fully operational and the radiology staff comfortable with the new system, average patient examination time was reduced from 9.24 to 5.28 min, indicating that a higher patient throughput could be achieved. Compared to the analog system, chest examination time for modality specific activities was reduced by 43%. The percentage of repeat examinations experienced with the digital system also decreased to 8% vs. the level of 9.5% experienced with the analog system. The study indicated that it is possible to quantitatively study clinical workflow and productivity by using commercially available software.

  17. AnalyzeThis: An Analysis Workflow-Aware Storage System

    Energy Technology Data Exchange (ETDEWEB)

    Sim, Hyogi [ORNL; Kim, Youngjae [ORNL; Vazhkudai, Sudharshan S [ORNL; Tiwari, Devesh [ORNL; Anwar, Ali [Virginia Tech, Blacksburg, VA; Butt, Ali R [Virginia Tech, Blacksburg, VA; Ramakrishnan, Lavanya [Lawrence Berkeley National Laboratory (LBNL)

    2015-01-01

    The need for novel data analysis is urgent in the face of a data deluge from modern applications. Traditional approaches to data analysis incur significant data movement costs, moving data back and forth between the storage system and the processor. Emerging Active Flash devices enable processing on the flash, where the data already resides. An array of such Active Flash devices allows us to revisit how analysis workflows interact with storage systems. By seamlessly blending together the flash storage and data analysis, we create an analysis workflow-aware storage system, AnalyzeThis. Our guiding principle is that analysis-awareness be deeply ingrained in each and every layer of the storage, elevating data analyses as first-class citizens, and transforming AnalyzeThis into a potent analytics-aware appliance. We implement the AnalyzeThis storage system atop an emulation platform of the Active Flash array. Our results indicate that AnalyzeThis is viable, expediting workflow execution and minimizing data movement.

  18. An integrated billing application to streamline clinician workflow.

    Science.gov (United States)

    Vawdrey, David K; Walsh, Colin; Stetson, Peter D

    2014-01-01

    Between 2008 and 2010, our academic medical center transitioned to electronic provider documentation using a commercial electronic health record system. For attending physicians, one of the most frustrating aspects of this experience was the system's failure to support their existing electronic billing workflow. Because of poor system integration, it was difficult to verify the supporting documentation for each bill and impractical to track whether billable notes had corresponding charges. We developed and deployed in 2011 an integrated billing application called "iCharge" that streamlines clinicians' documentation and billing workflow, and simultaneously populates the inpatient problem list using billing diagnosis codes. Each month, over 550 physicians use iCharge to submit approximately 23,000 professional service charges for over 4,200 patients. On average, about 2.5 new problems are added to each patient's problem list. This paper describes the challenges and benefits of workflow integration across disparate applications and presents an example of innovative software development within a commercial EHR framework.

  19. Demystifying Open Access

    International Nuclear Information System (INIS)

    Mele, Salvatore

    2007-01-01

    The tenets of Open Access are to grant anyone, anywhere and anytime free access to the results of scientific research. HEP spearheaded the Open Access dissemination of scientific results with the mass mailing of preprints in the pre-WWW era and with the launch of the arXiv preprint system at the dawn of the '90s. The HEP community is now ready for a further push to Open Access while retaining all the advantages of the peer-review system and, at the same time, bring the spiralling cost of journal subscriptions under control. I will present a possible plan for the conversion to Open Access of HEP peer-reviewed journals, through a consortium of HEP funding agencies, laboratories and libraries: SCOAP3 (Sponsoring Consortium for Open Access Publishing in Particle Physics). SCOAP3 will engage with scientific publishers towards building a sustainable model for Open Access publishing, which is as transparent as possible for HEP authors. The current system in which journals income comes from subscription fees is replaced with a scheme where SCOAP3 compensates publishers for the costs incurred to organise the peer-review service and give Open Access to the final version of articles. SCOAP3 will be funded by all countries active in HEP under a 'fair share' scenario, according to their production of HEP articles. In this talk I will present a short overview of the history of Open Access in HEP, the details of the SCOAP3 model and the outlook for its implementation.

  20. CMS Alignement and Calibration workflows: lesson learned and future plans

    CERN Document Server

    AUTHOR|(CDS)2069172

    2014-01-01

    We review the online and offline workflows designed to align and calibrate the CMS detector. Starting from the gained experience during the first LHC run, we discuss the expected developments for Run II. In particular, we describe the envisioned different stages, from the alignment using cosmic rays data to the detector alignment and calibration using the first proton-proton collisions data ( O(100 pb-1) ) and a larger dataset ( O(1 fb-1) ) to reach the target precision. The automatisation of the workflow and the integration in the online and offline activity (dedicated triggers and datasets, data skims, workflows to compute the calibration and alignment constants) are discussed.

  1. New Interactions with Workflow Systems

    NARCIS (Netherlands)

    Wassink, I.; van der Vet, P.E.; van der Veer, Gerrit C.; Roos, M.; van Dijk, Elisabeth M.A.G.; Norros, L.; Koskinen, H.; Salo, L.; Savioja, P.

    2009-01-01

    This paper describes the evaluation of our early design ideas of an ad-hoc of workflow system. Using the teach-back technique, we have performed a hermeneutic analysis of the mockup implementation named NIWS to get corrective and creative feedback at the functional, dialogue and representation level

  2. Open-Access Publishing

    Directory of Open Access Journals (Sweden)

    Nedjeljko Frančula

    2013-06-01

    Full Text Available Nature, one of the most prominent scientific journals dedicated one of its issues to recent changes in scientific publishing (Vol. 495, Issue 7442, 27 March 2013. Its editors stressed that words technology and revolution are closely related when it comes to scientific publishing. In addition, the transformation of research publishing is not as much a revolution than an attrition war in which all sides are buried. The most important change they refer to is the open-access model in which an author or an institution pays in advance for publishing a paper in a journal, and the paper is then available to users on the Internet free of charge.According to preliminary results of a survey conducted among 23 000 scientists by the publisher of Nature, 45% of them believes all papers should be published in open access, but at the same time 22% of them would not allow the use of papers for commercial purposes. Attitudes toward open access vary according to scientific disciplines, leading the editors to conclude the revolution still does not suit everyone.

  3. Logical provenance in data-oriented workflows?

    KAUST Repository

    Ikeda, R.; Das Sarma, Akash; Widom, J.

    2013-01-01

    for general transformations, introducing the notions of correctness, precision, and minimality. We then determine when properties such as correctness and minimality carry over from the individual transformations' provenance to the workflow provenance. We

  4. Workflow in clinical trial sites & its association with near miss events for data quality: ethnographic, workflow & systems simulation.

    Science.gov (United States)

    de Carvalho, Elias Cesar Araujo; Batilana, Adelia Portero; Claudino, Wederson; Reis, Luiz Fernando Lima; Schmerling, Rafael A; Shah, Jatin; Pietrobon, Ricardo

    2012-01-01

    With the exponential expansion of clinical trials conducted in (Brazil, Russia, India, and China) and VISTA (Vietnam, Indonesia, South Africa, Turkey, and Argentina) countries, corresponding gains in cost and enrolment efficiency quickly outpace the consonant metrics in traditional countries in North America and European Union. However, questions still remain regarding the quality of data being collected in these countries. We used ethnographic, mapping and computer simulation studies to identify/address areas of threat to near miss events for data quality in two cancer trial sites in Brazil. Two sites in Sao Paolo and Rio Janeiro were evaluated using ethnographic observations of workflow during subject enrolment and data collection. Emerging themes related to threats to near miss events for data quality were derived from observations. They were then transformed into workflows using UML-AD and modeled using System Dynamics. 139 tasks were observed and mapped through the ethnographic study. The UML-AD detected four major activities in the workflow evaluation of potential research subjects prior to signature of informed consent, visit to obtain subject́s informed consent, regular data collection sessions following study protocol and closure of study protocol for a given project. Field observations pointed to three major emerging themes: (a) lack of standardized process for data registration at source document, (b) multiplicity of data repositories and (c) scarcity of decision support systems at the point of research intervention. Simulation with policy model demonstrates a reduction of the rework problem. Patterns of threats to data quality at the two sites were similar to the threats reported in the literature for American sites. The clinical trial site managers need to reorganize staff workflow by using information technology more efficiently, establish new standard procedures and manage professionals to reduce near miss events and save time/cost. Clinical trial

  5. Automating bibliometric analyses using Taverna scientific workflows : A tutorial on integrating Web Services

    NARCIS (Netherlands)

    Guler, A.T.; Waaijer, C.J.F.; Mohammed, Y.; Palmblad, M.

    2016-01-01

    Quantitative analysis of the scientific literature is a frequent task in bibliometrics. Several large online resources collect and disseminate bibliographic information, paving the way for broad analyses and statistics. The Europe PubMed Central (PMC) and its Web Services is one of these resources,

  6. Developing a Collection of Composable Data Translation Software Units to Improve Efficiency and Reproducibility in Ecohydrologic Modeling Workflows

    Science.gov (United States)

    Olschanowsky, C.; Flores, A. N.; FitzGerald, K.; Masarik, M. T.; Rudisill, W. J.; Aguayo, M.

    2017-12-01

    Dynamic models of the spatiotemporal evolution of water, energy, and nutrient cycling are important tools to assess impacts of climate and other environmental changes on ecohydrologic systems. These models require spatiotemporally varying environmental forcings like precipitation, temperature, humidity, windspeed, and solar radiation. These input data originate from a variety of sources, including global and regional weather and climate models, global and regional reanalysis products, and geostatistically interpolated surface observations. Data translation measures, often subsetting in space and/or time and transforming and converting variable units, represent a seemingly mundane, but critical step in the application workflows. Translation steps can introduce errors, misrepresentations of data, slow execution time, and interrupt data provenance. We leverage a workflow that subsets a large regional dataset derived from the Weather Research and Forecasting (WRF) model and prepares inputs to the Parflow integrated hydrologic model to demonstrate the impact translation tool software quality on scientific workflow results and performance. We propose that such workflows will benefit from a community approved collection of data transformation components. The components should be self-contained composable units of code. This design pattern enables automated parallelization and software verification, improving performance and reliability. Ensuring that individual translation components are self-contained and target minute tasks increases reliability. The small code size of each component enables effective unit and regression testing. The components can be automatically composed for efficient execution. An efficient data translation framework should be written to minimize data movement. Composing components within a single streaming process reduces data movement. Each component will typically have a low arithmetic intensity, meaning that it requires about the same number of

  7. Extension of specification language for soundness and completeness of service workflow

    Science.gov (United States)

    Viriyasitavat, Wattana; Xu, Li Da; Bi, Zhuming; Sapsomboon, Assadaporn

    2018-05-01

    A Service Workflow is an aggregation of distributed services to fulfill specific functionalities. With ever increasing available services, the methodologies for the selections of the services against the given requirements become main research subjects in multiple disciplines. A few of researchers have contributed to the formal specification languages and the methods for model checking; however, existing methods have the difficulties to tackle with the complexity of workflow compositions. In this paper, we propose to formalize the specification language to reduce the complexity of the workflow composition. To this end, we extend a specification language with the consideration of formal logic, so that some effective theorems can be derived for the verification of syntax, semantics, and inference rules in the workflow composition. The logic-based approach automates compliance checking effectively. The Service Workflow Specification (SWSpec) has been extended and formulated, and the soundness, completeness, and consistency of SWSpec applications have been verified; note that a logic-based SWSpec is mandatory for the development of model checking. The application of the proposed SWSpec has been demonstrated by the examples with the addressed soundness, completeness, and consistency.

  8. The use of workflows in the design and implementation of complex experiments in macromolecular crystallography

    International Nuclear Information System (INIS)

    Brockhauser, Sandor; Svensson, Olof; Bowler, Matthew W.; Nanao, Max; Gordon, Elspeth; Leal, Ricardo M. F.; Popov, Alexander; Gerring, Matthew; McCarthy, Andrew A.; Gotz, Andy

    2012-01-01

    A powerful and easy-to-use workflow environment has been developed at the ESRF for combining experiment control with online data analysis on synchrotron beamlines. This tool provides the possibility of automating complex experiments without the need for expertise in instrumentation control and programming, but rather by accessing defined beamline services. The automation of beam delivery, sample handling and data analysis, together with increasing photon flux, diminishing focal spot size and the appearance of fast-readout detectors on synchrotron beamlines, have changed the way that many macromolecular crystallography experiments are planned and executed. Screening for the best diffracting crystal, or even the best diffracting part of a selected crystal, has been enabled by the development of microfocus beams, precise goniometers and fast-readout detectors that all require rapid feedback from the initial processing of images in order to be effective. All of these advances require the coupling of data feedback to the experimental control system and depend on immediate online data-analysis results during the experiment. To facilitate this, a Data Analysis WorkBench (DAWB) for the flexible creation of complex automated protocols has been developed. Here, example workflows designed and implemented using DAWB are presented for enhanced multi-step crystal characterizations, experiments involving crystal reorientation with kappa goniometers, crystal-burning experiments for empirically determining the radiation sensitivity of a crystal system and the application of mesh scans to find the best location of a crystal to obtain the highest diffraction quality. Beamline users interact with the prepared workflows through a specific brick within the beamline-control GUI MXCuBE

  9. Access to scientific information. A national survey of the Italian Society of Clinical Biochemistry and Laboratory Medicine (SIBioC).

    Science.gov (United States)

    Lippi, Giuseppe; Ciaccio, Marcello; Giavarina, Davide

    2016-09-01

    Digital libraries are typically used for retrieving and accessing articles in academic journals and repositories. Previous studies have been published about the performance of various biomedical research platforms, but no information is available about access preferences. A six-question survey was designed by the Italian Society of Clinical Biochemistry and Laboratory Medicine (SIBioC) using the platform Google Drive, and made available for 1 month to the members of the society. The information about the survey was published on the website of SIBioC and also disseminated by two sequential newsletters. Overall, 165 replies were collected throughout the 1-month survey availability. The largest number of replies were provided by laboratory professionals working in the national healthcare system (44.2%), followed by those working in private facilities (13.9%), university professors (12.7%) and specialization training staff (12.7%). The majority of responders published zero to one articles per year (55.2%), followed by two to five articles per year (37.6%), whereas only 7.3% published more than five articles per year. A total of 34.5% of the responders consulted biomedical research platforms on weekly basis, followed by 33.9% who did so on daily basis. PubMed/Medline was the most accessed scientific database, followed by Scopus, ISI Web of Science and Google Scholar. The impact factor was the leading reason when selecting which journal to publish in. The most consulted journals in the field of laboratory medicine were Clinical Chemistry and Laboratory Medicine and Biochimica Clinica. This survey provides useful indications about the personal inclination towards access to scientific information in our country.

  10. Workflow optimization beyond RIS and PACS

    International Nuclear Information System (INIS)

    Treitl, M.; Wirth, S.; Lucke, A.; Nissen-Meyer, S.; Trumm, C.; Rieger, J.; Pfeifer, K.-J.; Reiser, M.; Villain, S.

    2005-01-01

    Technological progress and the rising cost pressure on the healthcare system have led to a drastic change in the work environment of radiologists today. The pervasive demand for workflow optimization and increased efficiency of its activities raises the question of whether by employment of electronic systems, such as RIS and PACS, the potentials of digital technology are sufficiently used to fulfil this demand. This report describes the tasks and structures in radiology departments, which so far are only insufficiently supported by commercially available electronic systems but are nevertheless substantial. We developed and employed a web-based, integrated workplace system, which simplifies many daily tasks of departmental organization and administration apart from well-established tasks of documentation. Furthermore, we analyzed the effects exerted on departmental workflow by employment of this system for 3 years. (orig.) [de

  11. Embedding Scientific Integrity and Ethics into the Scientific Process and Research Data Lifecycle

    Science.gov (United States)

    Gundersen, L. C.

    2016-12-01

    Predicting climate change, developing resources sustainably, and mitigating natural hazard risk are complex interdisciplinary challenges in the geosciences that require the integration of data and knowledge from disparate disciplines and scales. This kind of interdisciplinary science can only thrive if scientific communities work together and adhere to common standards of scientific integrity, ethics, data management, curation, and sharing. Science and data without integrity and ethics can erode the very fabric of the scientific enterprise and potentially harm society and the planet. Inaccurate risk analyses of natural hazards can lead to poor choices in construction, insurance, and emergency response. Incorrect assessment of mineral resources can bankrupt a company, destroy a local economy, and contaminate an ecosystem. This paper presents key ethics and integrity questions paired with the major components of the research data life cycle. The questions can be used by the researcher during the scientific process to help ensure the integrity and ethics of their research and adherence to sound data management practice. Questions include considerations for open, collaborative science, which is fundamentally changing the responsibility of scientists regarding data sharing and reproducibility. The publication of primary data, methods, models, software, and workflows must become a norm of science. There are also questions that prompt the scientist to think about the benefit of their work to society; ensuring equity, respect, and fairness in working with others; and always striving for honesty, excellence, and transparency.

  12. Access to the scientific literature

    Science.gov (United States)

    Albarède, Francis

    The Public Library of Science Open Letter (http://www.publiclibraryofscience.org) is a very generous initiative, but, as most similar initiatives since the advent of electronic publishing, it misses the critical aspects of electronic publishing.Ten years ago, a Publisher would be in charge of running a system called a “scientific journal.” In such a system, the presence of an Editor and peer Reviewers secures the strength of the science and the rigor of writing; the Publisher guarantees the professional quality of printing, efficient dissemination, and long-term archiving. Publishing used to be in everyone's best interest, or nearly everyone. The Publisher, because he/she is financially motivated, ensures widespread dissemination of the journal amongst libraries and individual subscribers. The interest of the Author is that the system guarantees a broad potential readership. The interest of the Reader is that a line is drawn between professionally edited literature, presumably of better quality, and gray literature or home publishing, so that he/she does not waste time going through ‘low yield’ ungraded information. The Publisher could either be a private company, an academic institution, or a scholarly society. My experience is that, when page charges and subscription rates are compounded, journals published by scholarly societies are not necessarily cheaper. The difference between these cases is not the cost of running an office with rents, wages, printing, postage, advertisement, and archiving, but that a private Publisher pays shareholders. Shareholders have the bad habit of minding their own business and, therefore, they may interfere negatively with scientific publishing. Nevertheless, while the stranglehold imposed by private Publishers on our libraries over the last 10 years by increasing subscription rates may in part be due to shareholders' greed, this is true only in part. The increases are also a consequence of the booming number of pages being

  13. The CMS tracker calibration workflow: Experience with cosmic ray data

    International Nuclear Information System (INIS)

    Frosali, Simone

    2010-01-01

    During the second part of 2008 a CMS commissioning was performed with the acquisition of cosmic events in global runs. Cosmic rays detected in the muon chambers were used to trigger the readout of all CMS subdetectors in the general data acquisition system. A total of about 300M of tracks were collected by the CMS Muon Chambers with a 3.8T magnetic field produced by the CMS superconducting solenoid, 6M of which pointing to the tracker region and reconstructed by the Si-Strip Tracker (SST) detectors. Other 1M of cosmic tracks were collected with the magnetic field off. Using the cosmic data available it was possible to validate the performances of the CMS tracker calibration workflows. In this paper the adopted calibration workflow is described. In particular, the three main calibration workflows requested for the low level reconstruction of the SST, i.e. gain calibration, Lorentz angle calibration and bad components identification, are described. The results obtained using cosmic tracks for these three calibration workflows are also presented.

  14. Lowering the Barrier to Cross-Disciplinary Scientific Data Access via a Brokering Service Built Around a Unified Data Model

    Science.gov (United States)

    Lindholm, D. M.; Wilson, A.

    2012-12-01

    The steps many scientific data users go through to use data (after discovering it) can be rather tedious, even when dealing with datasets within their own discipline. Accessing data across domains often seems intractable. We present here, LaTiS, an Open Source brokering solution that bridges the gap between the source data and the user's code by defining a unified data model plus a plugin framework for "adapters" to read data from their native source, "filters" to perform server side data processing, and "writers" to output any number of desired formats or streaming protocols. A great deal of work is being done in the informatics community to promote multi-disciplinary science with a focus on search and discovery based on metadata - information about the data. The goal of LaTiS is to go that last step to provide a uniform interface to read the dataset into computer programs and other applications once it has been identified. The LaTiS solution for integrating a wide variety of data models is to return to mathematical fundamentals. The LaTiS data model emphasizes functional relationships between variables. For example, a time series of temperature measurements can be thought of as a function that maps a time to a temperature. With just three constructs: "Scalar" for a single variable, "Tuple" for a collection of variables, and "Function" to represent a set of independent and dependent variables, the LaTiS data model can represent most scientific datasets at a low level that enables uniform data access. Higher level abstractions can be built on top of the basic model to add more meaningful semantics for specific user communities. LaTiS defines its data model in terms of the Unified Modeling Language (UML). It also defines a very thin Java Interface that can be implemented by numerous existing data interfaces (e.g. NetCDF-Java) such that client code can access any dataset via the Java API, independent of the underlying data access mechanism. LaTiS also provides a

  15. INA-Rxiv: The Missing Puzzle in Indonesia’s Scientific Publishing Workflow

    Science.gov (United States)

    Rahim, R.; Irawan, D. E.; Zulfikar, A.; Hardi, R.; Arliman S, L.; Gultom, E. R.; Ginting, G.; Wahyuni, S. S.; Mesran, M.; Mahjudin, M.; Saputra, I.; Waruwu, F. T.; Suginam, S.; Buulolo, E.; Abraham, J.

    2018-04-01

    INA-Rxiv is the first Indonesia preprint server marking the new development initiated by the open science community. This study aimed at describing the development of INA-Rxiv and its conversations. It usedanalyzer of Inarxiv.id, WhatsApp Group Analyzer, and Twitter Analytics as the tools for data analysis complemented with observation.The results showed that INA-Rxiv users are growing because of the numerous discussions in social media, e.g.WhatsApp,as well as some other positive response of writers who have been using INA- Rxiv. The perspective of growth mindset and the implication of INA-Rxiv movement for filling up the gap in accelerating scientific dissemination process are presented at the end of this article.

  16. Watchdog - a workflow management system for the distributed analysis of large-scale experimental data.

    Science.gov (United States)

    Kluge, Michael; Friedel, Caroline C

    2018-03-13

    The development of high-throughput experimental technologies, such as next-generation sequencing, have led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dependent steps applied to numerous samples for multiple conditions and replicates. To support these analyses, a number of workflow management systems (WMSs) have been developed to allow automated execution of corresponding analysis workflows. Major advantages of WMSs are the easy reproducibility of results as well as the reusability of workflows or their components. In this article, we present Watchdog, a WMS for the automated analysis of large-scale experimental data. Main features include straightforward processing of replicate data, support for distributed computer systems, customizable error detection and manual intervention into workflow execution. Watchdog is implemented in Java and thus platform-independent and allows easy sharing of workflows and corresponding program modules. It provides a graphical user interface (GUI) for workflow construction using pre-defined modules as well as a helper script for creating new module definitions. Execution of workflows is possible using either the GUI or a command-line interface and a web-interface is provided for monitoring the execution status and intervening in case of errors. To illustrate its potentials on a real-life example, a comprehensive workflow and modules for the analysis of RNA-seq experiments were implemented and are provided with the software in addition to simple test examples. Watchdog is a powerful and flexible WMS for the analysis of large-scale high-throughput experiments. We believe it will greatly benefit both users with and without programming skills who want to develop and apply bioinformatical workflows with reasonable overhead. The software, example workflows and a comprehensive documentation are freely

  17. Exformatics Declarative Case Management Workflows as DCR Graphs

    DEFF Research Database (Denmark)

    Slaats, Tijs; Mukkamala, Raghava Rao; Hildebrandt, Thomas

    2013-01-01

    Declarative workflow languages have been a growing research subject over the past ten years, but applications of the declarative approach in industry are still uncommon. Over the past two years Exformatics A/S, a Danish provider of Electronic Case Management systems, has been cooperating...... with researchers at IT University of Copenhagen (ITU) to create tools for the declarative workflow language Dynamic Condition Response Graphs (DCR Graphs) and incorporate them into their products and in teaching at ITU. In this paper we give a status report over the work. We start with an informal introduction...

  18. A framework for service enterprise workflow simulation with multi-agents cooperation

    Science.gov (United States)

    Tan, Wenan; Xu, Wei; Yang, Fujun; Xu, Lida; Jiang, Chuanqun

    2013-11-01

    Process dynamic modelling for service business is the key technique for Service-Oriented information systems and service business management, and the workflow model of business processes is the core part of service systems. Service business workflow simulation is the prevalent approach to be used for analysis of service business process dynamically. Generic method for service business workflow simulation is based on the discrete event queuing theory, which is lack of flexibility and scalability. In this paper, we propose a service workflow-oriented framework for the process simulation of service businesses using multi-agent cooperation to address the above issues. Social rationality of agent is introduced into the proposed framework. Adopting rationality as one social factor for decision-making strategies, a flexible scheduling for activity instances has been implemented. A system prototype has been developed to validate the proposed simulation framework through a business case study.

  19. A-Posteriori Detection of Sensor Infrastructure Errors in Correlated Sensor Data and Business Workflows

    NARCIS (Netherlands)

    Wombacher, Andreas; Rinderle-Ma, Stefanie; Toumani, Farouk; Wolf, Karsten

    Some physical objects are influenced by business workflows and are observed by sensors. Since both sensor infrastructures and business workflows must deal with imprecise information, the correlation of sensor data and business workflow data related to physical objects might be used a-posteriori to

  20. Restructuring of workflows to minimise errors via stochastic model checking: An automated evolutionary approach

    DEFF Research Database (Denmark)

    Herbert, Luke Thomas; Hansen, Zaza Nadja Lee

    2016-01-01

    This article presents a framework for the automated restructuring of stochastic workflows to reduce the impact of faults. The framework allows for the modelling of workflows by means of a formalised subset of the BPMN workflow language. We extend this modelling formalism to describe faults...

  1. Coverage, universal access and equity in health: a characterization of scientific production in nursing.

    Science.gov (United States)

    Mendoza-Parra, Sara

    2016-01-01

    to characterize the scientific contribution nursing has made regarding coverage, universal access and equity in health, and to understand this production in terms of subjects and objects of study. this was cross-sectional, documentary research; the units of analysis were 97 journals and 410 documents, retrieved from the Web of Science in the category, "nursing". Descriptors associated to coverage, access and equity in health, and the Mesh thesaurus, were applied. We used bibliometric laws and indicators, and analyzed the most important articles according to amount of citations and collaboration. the document retrieval allowed for 25 years of observation of production, an institutional and an international collaboration of 31% and 7%, respectively. The mean number of coauthors per article was 3.5, with a transience rate of 93%. The visibility index was 67.7%, and 24.6% of production was concentrated in four core journals. A review from the nursing category with 286 citations, and a Brazilian author who was the most productive, are issues worth highlighting. the nursing collective should strengthen future research on the subject, defining lines and sub-lines of research, increasing internationalization and building it with the joint participation of the academy and nursing community.

  2. Interoperable and scalable metabolomics data analysis with microservices

    OpenAIRE

    Van Vliet, Michael; Ruttkies, Christoph; Steinbeck, Christoph; Burman, Joachim; Bergmann, Sven; Moreno, Pablo; Khoonsari, Payam; Rueedi, Rico; Roger, Pierrick; Rocca-Serra, Philippe; Pireddu, Luca; Salek, Reza; Sansone, Susanna-Assunta; Schober, Daniel; Peters, Kristian

    2017-01-01

    Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed in parallel using the Kubernetes container orchestrator. The access point is a virtual research environment which can be lau...

  3. Implementing an Open Access Policy – Modeling KAUST in the Region

    KAUST Repository

    Baessa, Mohamed A.; Vijayakumar, J.K.

    2014-01-01

    The presentation will discuss different open access approaches, and what can well-fit academic and governmental institutions. As a case study of KAUST, presenters will discuss how it can be initiated in a university set-up, how to get academic stakeholder engaged with support, and how the final stage is reached. Details about the KAUST Open Access Policy for research articles, theses and dissertations and the required tools and workflow to implement the policies will be highlighted.

  4. Implementing an Open Access Policy – Modeling KAUST in the Region

    KAUST Repository

    Baessa, Mohamed A.

    2014-11-12

    The presentation will discuss different open access approaches, and what can well-fit academic and governmental institutions. As a case study of KAUST, presenters will discuss how it can be initiated in a university set-up, how to get academic stakeholder engaged with support, and how the final stage is reached. Details about the KAUST Open Access Policy for research articles, theses and dissertations and the required tools and workflow to implement the policies will be highlighted.

  5. Grid workflow job execution service 'Pilot'

    Science.gov (United States)

    Shamardin, Lev; Kryukov, Alexander; Demichev, Andrey; Ilyin, Vyacheslav

    2011-12-01

    'Pilot' is a grid job execution service for workflow jobs. The main goal for the service is to automate computations with multiple stages since they can be expressed as simple workflows. Each job is a directed acyclic graph of tasks and each task is an execution of something on a grid resource (or 'computing element'). Tasks may be submitted to any WS-GRAM (Globus Toolkit 4) service. The target resources for the tasks execution are selected by the Pilot service from the set of available resources which match the specific requirements from the task and/or job definition. Some simple conditional execution logic is also provided. The 'Pilot' service is built on the REST concepts and provides a simple API through authenticated HTTPS. This service is deployed and used in production in a Russian national grid project GridNNN.

  6. Grid workflow job execution service 'Pilot'

    International Nuclear Information System (INIS)

    Shamardin, Lev; Kryukov, Alexander; Demichev, Andrey; Ilyin, Vyacheslav

    2011-01-01

    'Pilot' is a grid job execution service for workflow jobs. The main goal for the service is to automate computations with multiple stages since they can be expressed as simple workflows. Each job is a directed acyclic graph of tasks and each task is an execution of something on a grid resource (or 'computing element'). Tasks may be submitted to any WS-GRAM (Globus Toolkit 4) service. The target resources for the tasks execution are selected by the Pilot service from the set of available resources which match the specific requirements from the task and/or job definition. Some simple conditional execution logic is also provided. The 'Pilot' service is built on the REST concepts and provides a simple API through authenticated HTTPS. This service is deployed and used in production in a Russian national grid project GridNNN.

  7. Workflows zur Bereitstellung von Zeitschriftenartikeln auf Open-Access-Repositorien - Herausforderungen und Lösungsansätze

    Directory of Open Access Journals (Sweden)

    Paul Vierkant

    2017-04-01

    werden zentrale Herausforderungen beschrieben und Lösungsansätze für die Zugänglichmachung von Open-Access-Zeitschriftenartikeln auf Repositorien zusammengestellt. Open access is provided for a growing number of journal articles from German research institutions. Free availability can be achieved in different ways, based on diverse business and financing models. But how can research organisations ensure that their gold open access publications are also made available in a permanent and standardized way in an open access repository? In order to achieve this, what should a model publication process look like? This paper addresses the main challenges and describes possible solutions for making open access articles available in repositories.

  8. Ergatis: a web interface and scalable software system for bioinformatics workflows

    Science.gov (United States)

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  9. Thermal Remote Sensing with Uav-Based Workflows

    Science.gov (United States)

    Boesch, R.

    2017-08-01

    Climate change will have a significant influence on vegetation health and growth. Predictions of higher mean summer temperatures and prolonged summer draughts may pose a threat to agriculture areas and forest canopies. Rising canopy temperatures can be an indicator of plant stress because of the closure of stomata and a decrease in the transpiration rate. Thermal cameras are available for decades, but still often used for single image analysis, only in oblique view manner or with visual evaluations of video sequences. Therefore remote sensing using a thermal camera can be an important data source to understand transpiration processes. Photogrammetric workflows allow to process thermal images similar to RGB data. But low spatial resolution of thermal cameras, significant optical distortion and typically low contrast require an adapted workflow. Temperature distribution in forest canopies is typically completely unknown and less distinct than for urban or industrial areas, where metal constructions and surfaces yield high contrast and sharp edge information. The aim of this paper is to investigate the influence of interior camera orientation, tie point matching and ground control points on the resulting accuracy of bundle adjustment and dense cloud generation with a typically used photogrammetric workflow for UAVbased thermal imagery in natural environments.

  10. Parallel workflow tools to facilitate human brain MRI post-processing

    Directory of Open Access Journals (Sweden)

    Zaixu eCui

    2015-05-01

    Full Text Available Multi-modal magnetic resonance imaging (MRI techniques are widely applied in human brain studies. To obtain specific brain measures of interest from MRI datasets, a number of complex image post-processing steps are typically required. Parallel workflow tools have recently been developed, concatenating individual processing steps and enabling fully automated processing of raw MRI data to obtain the final results. These workflow tools are also designed to make optimal use of available computational resources and to support the parallel processing of different subjects or of independent processing steps for a single subject. Automated, parallel MRI post-processing tools can greatly facilitate relevant brain investigations and are being increasingly applied. In this review, we briefly summarize these parallel workflow tools and discuss relevant issues.

  11. Barriers to critical thinking: workflow interruptions and task switching among nurses.

    Science.gov (United States)

    Cornell, Paul; Riordan, Monica; Townsend-Gervis, Mary; Mobley, Robin

    2011-10-01

    Nurses are increasingly called upon to engage in critical thinking. However, current workflow inhibits this goal with frequent task switching and unpredictable demands. To assess workflow's cognitive impact, nurses were observed at 2 hospitals with different patient loads and acuity levels. Workflow on a medical/surgical and pediatric oncology unit was observed, recording tasks, tools, collaborators, and locations. Nineteen nurses were observed for a total of 85.2 hours. Tasks were short with a mean duration of 62.4 and 81.6 seconds on the 2 units. More than 50% of the recorded tasks were less than 30 seconds in length. An analysis of task sequence revealed few patterns and little pairwise repetition. Performance on specific tasks differed between the 2 units, but the character of the workflow was highly similar. The nonrepetitive flow and high amount of switching indicate nurses experience a heavy cognitive load with little uninterrupted time. This implies that nurses rarely have the conditions necessary for critical thinking.

  12. Adaptive workflow simulation of emergency response

    NARCIS (Netherlands)

    Bruinsma, Guido Wybe Jan

    2010-01-01

    Recent incidents and major training exercises in and outside the Netherlands have persistently shown that not having or not sharing information during emergency response are major sources of emergency response inefficiency and error, and affect incident mitigation outcomes through workflow planning

  13. Workflow for high-content, individual cell quantification of fluorescent markers from universal microscope data, supported by open source software.

    Science.gov (United States)

    Stockwell, Simon R; Mittnacht, Sibylle

    2014-12-16

    Advances in understanding the control mechanisms governing the behavior of cells in adherent mammalian tissue culture models are becoming increasingly dependent on modes of single-cell analysis. Methods which deliver composite data reflecting the mean values of biomarkers from cell populations risk losing subpopulation dynamics that reflect the heterogeneity of the studied biological system. In keeping with this, traditional approaches are being replaced by, or supported with, more sophisticated forms of cellular assay developed to allow assessment by high-content microscopy. These assays potentially generate large numbers of images of fluorescent biomarkers, which enabled by accompanying proprietary software packages, allows for multi-parametric measurements per cell. However, the relatively high capital costs and overspecialization of many of these devices have prevented their accessibility to many investigators. Described here is a universally applicable workflow for the quantification of multiple fluorescent marker intensities from specific subcellular regions of individual cells suitable for use with images from most fluorescent microscopes. Key to this workflow is the implementation of the freely available Cell Profiler software(1) to distinguish individual cells in these images, segment them into defined subcellular regions and deliver fluorescence marker intensity values specific to these regions. The extraction of individual cell intensity values from image data is the central purpose of this workflow and will be illustrated with the analysis of control data from a siRNA screen for G1 checkpoint regulators in adherent human cells. However, the workflow presented here can be applied to analysis of data from other means of cell perturbation (e.g., compound screens) and other forms of fluorescence based cellular markers and thus should be useful for a wide range of laboratories.

  14. Earth System Model Development and Analysis using FRE-Curator and Live Access Servers: On-demand analysis of climate model output with data provenance.

    Science.gov (United States)

    Radhakrishnan, A.; Balaji, V.; Schweitzer, R.; Nikonov, S.; O'Brien, K.; Vahlenkamp, H.; Burger, E. F.

    2016-12-01

    There are distinct phases in the development cycle of an Earth system model. During the model development phase, scientists make changes to code and parameters and require rapid access to results for evaluation. During the production phase, scientists may make an ensemble of runs with different settings, and produce large quantities of output, that must be further analyzed and quality controlled for scientific papers and submission to international projects such as the Climate Model Intercomparison Project (CMIP). During this phase, provenance is a key concern:being able to track back from outputs to inputs. We will discuss one of the paths taken at GFDL in delivering tools across this lifecycle, offering on-demand analysis of data by integrating the use of GFDL's in-house FRE-Curator, Unidata's THREDDS and NOAA PMEL's Live Access Servers (LAS).Experience over this lifecycle suggests that a major difficulty in developing analysis capabilities is only partially the scientific content, but often devoted to answering the questions "where is the data?" and "how do I get to it?". "FRE-Curator" is the name of a database-centric paradigm used at NOAA GFDL to ingest information about the model runs into an RDBMS (Curator database). The components of FRE-Curator are integrated into Flexible Runtime Environment workflow and can be invoked during climate model simulation. The front end to FRE-Curator, known as the Model Development Database Interface (MDBI) provides an in-house web-based access to GFDL experiments: metadata, analysis output and more. In order to provide on-demand visualization, MDBI uses Live Access Servers which is a highly configurable web server designed to provide flexible access to geo-referenced scientific data, that makes use of OPeNDAP. Model output saved in GFDL's tape archive, the size of the database and experiments, continuous model development initiatives with more dynamic configurations add complexity and challenges in providing an on

  15. submitter Performance studies of CMS workflows using Big Data technologies

    CERN Document Server

    Ambroz, Luca; Grandi, Claudio

    At the Large Hadron Collider (LHC), more than 30 petabytes of data are produced from particle collisions every year of data taking. The data processing requires large volumes of simulated events through Monte Carlo techniques. Furthermore, physics analysis implies daily access to derived data formats by hundreds of users. The Worldwide LHC Computing Grid (WLCG) - an international collaboration involving personnel and computing centers worldwide - is successfully coping with these challenges, enabling the LHC physics program. With the continuation of LHC data taking and the approval of ambitious projects such as the High-Luminosity LHC, such challenges will reach the edge of current computing capacity and performance. One of the keys to success in the next decades - also under severe financial resource constraints - is to optimize the efficiency in exploiting the computing resources. This thesis focuses on performance studies of CMS workflows, namely centrallyscheduled production activities and unpredictable d...

  16. Optimizing perioperative decision making: improved information for clinical workflow planning.

    Science.gov (United States)

    Doebbeling, Bradley N; Burton, Matthew M; Wiebke, Eric A; Miller, Spencer; Baxter, Laurence; Miller, Donald; Alvarez, Jorge; Pekny, Joseph

    2012-01-01

    Perioperative care is complex and involves multiple interconnected subsystems. Delayed starts, prolonged cases and overtime are common. Surgical procedures account for 40-70% of hospital revenues and 30-40% of total costs. Most planning and scheduling in healthcare is done without modern planning tools, which have potential for improving access by assisting in operations planning support. We identified key planning scenarios of interest to perioperative leaders, in order to examine the feasibility of applying combinatorial optimization software solving some of those planning issues in the operative setting. Perioperative leaders desire a broad range of tools for planning and assessing alternate solutions. Our modeled solutions generated feasible solutions that varied as expected, based on resource and policy assumptions and found better utilization of scarce resources. Combinatorial optimization modeling can effectively evaluate alternatives to support key decisions for planning clinical workflow and improving care efficiency and satisfaction.

  17. e-BioFlow: improving practical use of workflow systems in bioinformatics

    NARCIS (Netherlands)

    Wassink, I.; Ooms, M.; Neerincx, P.; Rauwerda, H.; Leunissen, J.A.M.; Breit, T.M.; Nijholt, A.; Vet, van der P.

    2010-01-01

    Workflow management systems (WfMSs) are useful tools for bioinformaticians. As experiences with using WfMSs accumulate, shortcomings of current systems become apparent. In this paper, we focus on practical issues that hinder WfMS users and that arise in the design and execution of workflows, and in

  18. Lessons in scientific data interoperability: XML and the eMinerals project.

    Science.gov (United States)

    White, T O H; Bruin, R P; Chiang, G-T; Dove, M T; Tyer, R P; Walker, A M

    2009-03-13

    A collaborative environmental eScience project produces a broad range of data, notable as much for its diversity, in source and format, as its quantity. We find that extensible markup language (XML) and associated technologies are invaluable in managing this deluge of data. We describe Fo X, a toolkit for allowing Fortran codes to read and write XML, thus allowing existing scientific tools to be easily re-used in an XML-centric workflow.

  19. DNAseq Workflow in a Diagnostic Context and an Example of a User Friendly Implementation

    Directory of Open Access Journals (Sweden)

    Beat Wolf

    2015-01-01

    Full Text Available Over recent years next generation sequencing (NGS technologies evolved from costly tools used by very few, to a much more accessible and economically viable technology. Through this recently gained popularity, its use-cases expanded from research environments into clinical settings. But the technical know-how and infrastructure required to analyze the data remain an obstacle for a wider adoption of this technology, especially in smaller laboratories. We present GensearchNGS, a commercial DNAseq software suite distributed by Phenosystems SA. The focus of GensearchNGS is the optimal usage of already existing infrastructure, while keeping its use simple. This is achieved through the integration of existing tools in a comprehensive software environment, as well as custom algorithms developed with the restrictions of limited infrastructures in mind. This includes the possibility to connect multiple computers to speed up computing intensive parts of the analysis such as sequence alignments. We present a typical DNAseq workflow for NGS data analysis and the approach GensearchNGS takes to implement it. The presented workflow goes from raw data quality control to the final variant report. This includes features such as gene panels and the integration of online databases, like Ensembl for annotations or Cafe Variome for variant sharing.

  20. Integration Of Externalized Decision Models In The Definition Of Workflows For Digital Pathology

    Directory of Open Access Journals (Sweden)

    J. van Leeuwen

    2016-06-01

    We proposed a workflow solution enabling the representation of decision models as externalized executable tasks in the process definition. Our approach separates the task implementations from the workflow model, ensuring scalability and allowing for the inclusion of complex decision logic in the workflow execution. In we depict a simplified model of a pathology diagnosis workflow (starting with the digitization of the slides, represented according to the BPMN modeling conventions. The example shows a workflow sequence that automatically orders a HER2 FISH when IHC is borderline according to defined customizable thresholds. The process model integrates an image analysis algorithm that scores images. Based on the score and the thresholds the decision model evaluates the condition and recommends the pre-ordering of an additional test when the score falls between the two thresholds. This leads to faster diagnosis and allows balancing the costs of an additional test versus the overhead of the pathologist by choosing the values of the thresholds.  

  1. Translating Unstructured Workflow Processes to Readable BPEL: Theory and Implementation

    DEFF Research Database (Denmark)

    van der Aalst, Willibrordus Martinus Pancratius; Lassen, Kristian Bisgaard

    2008-01-01

    and not easy to use by end-users. Therefore, we provide a mapping from Workflow Nets (WF-nets) to BPEL. This mapping builds on the rich theory of Petri nets and can also be used to map other languages (e.g., UML, EPC, BPMN, etc.) onto BPEL. In addition to this we have implemented the algorithm in a tool called...... WorkflowNet2BPEL4WS....

  2. Workflow Based Software Development Environment, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The goal of this proposed research is to investigate and develop a workflow based tool, the Software Developers Assistant, to facilitate the collaboration between...

  3. Workflow Based Software Development Environment, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — The goal of this proposed research is to investigate and develop a workflow based tool, the Software Developers Assistant, to facilitate the collaboration between...

  4. Integration of EGA secure data access into Galaxy.

    Science.gov (United States)

    Hoogstrate, Youri; Zhang, Chao; Senf, Alexander; Bijlard, Jochem; Hiltemann, Saskia; van Enckevort, David; Repo, Susanna; Heringa, Jaap; Jenster, Guido; J A Fijneman, Remond; Boiten, Jan-Willem; A Meijer, Gerrit; Stubbs, Andrew; Rambla, Jordi; Spalding, Dylan; Abeln, Sanne

    2016-01-01

    High-throughput molecular profiling techniques are routinely generating vast amounts of data for translational medicine studies. Secure access controlled systems are needed to manage, store, transfer and distribute these data due to its personally identifiable nature. The European Genome-phenome Archive (EGA) was created to facilitate access and management to long-term archival of bio-molecular data. Each data provider is responsible for ensuring a Data Access Committee is in place to grant access to data stored in the EGA. Moreover, the transfer of data during upload and download is encrypted. ELIXIR, a European research infrastructure for life-science data, initiated a project (2016 Human Data Implementation Study) to understand and document the ELIXIR requirements for secure management of controlled-access data. As part of this project, a full ecosystem was designed to connect archived raw experimental molecular profiling data with interpreted data and the computational workflows, using the CTMM Translational Research IT (CTMM-TraIT) infrastructure http://www.ctmm-trait.nl as an example. Here we present the first outcomes of this project, a framework to enable the download of EGA data to a Galaxy server in a secure way. Galaxy provides an intuitive user interface for molecular biologists and bioinformaticians to run and design data analysis workflows. More specifically, we developed a tool -- ega_download_streamer - that can download data securely from EGA into a Galaxy server, which can subsequently be further processed. This tool will allow a user within the browser to run an entire analysis containing sensitive data from EGA, and to make this analysis available for other researchers in a reproducible manner, as shown with a proof of concept study.  The tool ega_download_streamer is available in the Galaxy tool shed: https://toolshed.g2.bx.psu.edu/view/yhoogstrate/ega_download_streamer.

  5. BIFI: a Taverna plugin for a simplified and user-friendly workflow platform.

    Science.gov (United States)

    Yildiz, Ahmet; Dilaveroglu, Erkan; Visne, Ilhami; Günay, Bilal; Sefer, Emrah; Weinhausel, Andreas; Rattay, Frank; Goble, Carole A; Pandey, Ram Vinay; Kriegner, Albert

    2014-10-20

    Heterogeneity in the features, input-output behaviour and user interface for available bioinformatics tools and services is still a bottleneck for both expert and non-expert users. Advancement in providing common interfaces over such tools and services are gaining interest among researchers. However, the lack of (meta-) information about input-output data and parameter prevents to provide automated and standardized solutions, which can assist users in setting the appropriate parameters. These limitations must be resolved especially in the workflow-based solution in order to ease the integration of software. We report a Taverna Workbench plugin: the XworX BIFI (Beautiful Interfaces for Inputs) implemented as a solution for the aforementioned issues. BIFI provides a Graphical User Interface (GUI) definition language used to layout the user interface and to define parameter options for Taverna workflows. BIFI is also able to submit GUI Definition Files (GDF) directly or discover appropriate instances from a configured repository. In the absence of a GDF, BIFI generates a default interface. The Taverna Workbench is an open source software providing the ability to combine various services within a workflow. Nevertheless, users can supply input data to the workflow via a simple user interface providing only a text area to enter the input in text form. The workflow may contain meta-information in human readable form such as description text for the port and an example value. However, not all workflow ports are documented so well or have all the required information.BIFI uses custom user interface components for ports which give users feedback on the parameter data type or structure to be used for service execution and enables client-side data validations. Moreover, BIFI offers user interfaces that allow users to interactively construct workflow views and share them with the community, thus significantly increasing usability of heterogeneous, distributed service

  6. Provenance-Based Debugging and Drill-Down in Data-Oriented Workflows

    KAUST Repository

    Ikeda, Robert

    2012-04-01

    Panda (for Provenance and Data) is a system that supports the creation and execution of data-oriented workflows, with automatic provenance generation and built-in provenance tracing operations. Workflows in Panda are arbitrary a cyclic graphs containing both relational (SQL) processing nodes and opaque processing nodes programmed in Python. For both types of nodes, Panda generates logical provenance - provenance information stored at the processing-node level - and uses the generated provenance to support record-level backward tracing and forward tracing operations. In our demonstration we use Panda to integrate, process, and analyze actual education data from multiple sources. We specifically demonstrate how Panda\\'s provenance generation and tracing capabilities can be very useful for workflow debugging, and for drilling down on specific results of interest. © 2012 IEEE.

  7. Nexus: A modular workflow management system for quantum simulation codes

    Science.gov (United States)

    Krogel, Jaron T.

    2016-01-01

    The management of simulation workflows represents a significant task for the individual computational researcher. Automation of the required tasks involved in simulation work can decrease the overall time to solution and reduce sources of human error. A new simulation workflow management system, Nexus, is presented to address these issues. Nexus is capable of automated job management on workstations and resources at several major supercomputing centers. Its modular design allows many quantum simulation codes to be supported within the same framework. Current support includes quantum Monte Carlo calculations with QMCPACK, density functional theory calculations with Quantum Espresso or VASP, and quantum chemical calculations with GAMESS. Users can compose workflows through a transparent, text-based interface, resembling the input file of a typical simulation code. A usage example is provided to illustrate the process.

  8. Text mining for the biocuration workflow

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A. P. C; Krallinger, Martin; Arighi, Cecilia; Cohen, K. Bretonnel; Valencia, Alfonso; Wu, Cathy H.; Chatr-Aryamontri, Andrew; Dowell, Karen G.; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on ‘Text Mining for the BioCuration Workflow’ at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community. PMID:22513129

  9. Validation of the Applied Biosystems RapidFinder Shiga Toxin-Producing E. coli (STEC) Detection Workflow.

    Science.gov (United States)

    Cloke, Jonathan; Matheny, Sharon; Swimley, Michelle; Tebbs, Robert; Burrell, Angelia; Flannery, Jonathan; Bastin, Benjamin; Bird, Patrick; Benzinger, M Joseph; Crowley, Erin; Agin, James; Goins, David; Salfinger, Yvonne; Brodsky, Michael; Fernandez, Maria Cristina

    2016-11-01

    The Applied Biosystems™ RapidFinder™ STEC Detection Workflow (Thermo Fisher Scientific) is a complete protocol for the rapid qualitative detection of Escherichia coli (E. coli) O157:H7 and the "Big 6" non-O157 Shiga-like toxin-producing E. coli (STEC) serotypes (defined as serogroups: O26, O45, O103, O111, O121, and O145). The RapidFinder STEC Detection Workflow makes use of either the automated preparation of PCR-ready DNA using the Applied Biosystems PrepSEQ™ Nucleic Acid Extraction Kit in conjunction with the Applied Biosystems MagMAX™ Express 96-well magnetic particle processor or the Applied Biosystems PrepSEQ Rapid Spin kit for manual preparation of PCR-ready DNA. Two separate assays comprise the RapidFinder STEC Detection Workflow, the Applied Biosystems RapidFinder STEC Screening Assay and the Applied Biosystems RapidFinder STEC Confirmation Assay. The RapidFinder STEC Screening Assay includes primers and probes to detect the presence of stx1 (Shiga toxin 1), stx2 (Shiga toxin 2), eae (intimin), and E. coli O157 gene targets. The RapidFinder STEC Confirmation Assay includes primers and probes for the "Big 6" non-O157 STEC and E. coli O157:H7. The use of these two assays in tandem allows a user to detect accurately the presence of the "Big 6" STECs and E. coli O157:H7. The performance of the RapidFinder STEC Detection Workflow was evaluated in a method comparison study, in inclusivity and exclusivity studies, and in a robustness evaluation. The assays were compared to the U.S. Department of Agriculture (USDA), Food Safety and Inspection Service (FSIS) Microbiology Laboratory Guidebook (MLG) 5.09: Detection, Isolation and Identification of Escherichia coli O157:H7 from Meat Products and Carcass and Environmental Sponges for raw ground beef (73% lean) and USDA/FSIS-MLG 5B.05: Detection, Isolation and Identification of Escherichia coli non-O157:H7 from Meat Products and Carcass and Environmental Sponges for raw beef trim. No statistically significant

  10. The universal access handbook

    CERN Document Server

    Stephanidis, Constantine

    2009-01-01

    In recent years, the field of Universal Access has made significant progress in consolidating theoretical approaches, scientific methods and technologies, as well as in exploring new application domains. Increasingly, professionals in this rapidly maturing area require a comprehensive and multidisciplinary resource that addresses current principles, methods, and tools. Written by leading international authorities from academic, research, and industrial organizations and nonmarket institutions, The Universal Access Handbook covers the unfolding scientific, methodological, technological, and pol

  11. Workflow Automation: A Collective Case Study

    Science.gov (United States)

    Harlan, Jennifer

    2013-01-01

    Knowledge management has proven to be a sustainable competitive advantage for many organizations. Knowledge management systems are abundant, with multiple functionalities. The literature reinforces the use of workflow automation with knowledge management systems to benefit organizations; however, it was not known if process automation yielded…

  12. Create, run, share, publish, and reference your LC-MS, FIA-MS, GC-MS, and NMR data analysis workflows with the Workflow4Metabolomics 3.0 Galaxy online infrastructure for metabolomics.

    Science.gov (United States)

    Guitton, Yann; Tremblay-Franco, Marie; Le Corguillé, Gildas; Martin, Jean-François; Pétéra, Mélanie; Roger-Mele, Pierrick; Delabrière, Alexis; Goulitquer, Sophie; Monsoor, Misharl; Duperier, Christophe; Canlet, Cécile; Servien, Rémi; Tardivel, Patrick; Caron, Christophe; Giacomoni, Franck; Thévenot, Etienne A

    2017-12-01

    Metabolomics is a key approach in modern functional genomics and systems biology. Due to the complexity of metabolomics data, the variety of experimental designs, and the multiplicity of bioinformatics tools, providing experimenters with a simple and efficient resource to conduct comprehensive and rigorous analysis of their data is of utmost importance. In 2014, we launched the Workflow4Metabolomics (W4M; http://workflow4metabolomics.org) online infrastructure for metabolomics built on the Galaxy environment, which offers user-friendly features to build and run data analysis workflows including preprocessing, statistical analysis, and annotation steps. Here we present the new W4M 3.0 release, which contains twice as many tools as the first version, and provides two features which are, to our knowledge, unique among online resources. First, data from the four major metabolomics technologies (i.e., LC-MS, FIA-MS, GC-MS, and NMR) can be analyzed on a single platform. By using three studies in human physiology, alga evolution, and animal toxicology, we demonstrate how the 40 available tools can be easily combined to address biological issues. Second, the full analysis (including the workflow, the parameter values, the input data and output results) can be referenced with a permanent digital object identifier (DOI). Publication of data analyses is of major importance for robust and reproducible science. Furthermore, the publicly shared workflows are of high-value for e-learning and training. The Workflow4Metabolomics 3.0 e-infrastructure thus not only offers a unique online environment for analysis of data from the main metabolomics technologies, but it is also the first reference repository for metabolomics workflows. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Promoting access to and use of seismic data in a large scientific community

    Directory of Open Access Journals (Sweden)

    Michel Eric

    2017-01-01

    Full Text Available The growing amount of seismic data available from space missions (SOHO, CoRoT, Kepler, SDO,… but also from ground-based facilities (GONG, BiSON, ground-based large programmes…, stellar modelling and numerical simulations, creates new scientific perspectives such as characterizing stellar populations in our Galaxy or planetary systems by providing model-independent global properties of stars such as mass, radius, and surface gravity within several percent accuracy, as well as constraints on the age. These applications address a broad scientific community beyond the solar and stellar one and require combining indices elaborated with data from different databases (e.g. seismic archives and ground-based spectroscopic surveys. It is thus a basic requirement to develop a simple and effcient access to these various data resources and dedicated tools. In the framework of the European project SpaceInn (FP7, several data sources have been developed or upgraded. The Seismic Plus Portal has been developed, where synthetic descriptions of the most relevant existing data sources can be found, as well as tools allowing to localize existing data for given objects or period and helping the data query. This project has been developed within the Virtual Observatory (VO framework. In this paper, we give a review of the various facilities and tools developed within this programme. The SpaceInn project (Exploitation of Space Data for Innovative Helio- and Asteroseismology has been initiated by the European Helio- and Asteroseismology Network (HELAS.

  14. Formalizing an integrative, multidisciplinary cancer therapy discovery workflow

    Science.gov (United States)

    McGuire, Mary F.; Enderling, Heiko; Wallace, Dorothy I.; Batra, Jaspreet; Jordan, Marie; Kumar, Sushil; Panetta, John C.; Pasquier, Eddy

    2014-01-01

    Although many clinicians and researchers work to understand cancer, there has been limited success to effectively combine forces and collaborate over time, distance, data and budget constraints. Here we present a workflow template for multidisciplinary cancer therapy that was developed during the 2nd Annual Workshop on Cancer Systems Biology sponsored by Tufts University, Boston, MA in July 2012. The template was applied to the development of a metronomic therapy backbone for neuroblastoma. Three primary groups were identified: clinicians, biologists, and scientists (mathematicians, computer scientists, physicists and engineers). The workflow described their integrative interactions; parallel or sequential processes; data sources and computational tools at different stages as well as the iterative nature of therapeutic development from clinical observations to in vitro, in vivo, and clinical trials. We found that theoreticians in dialog with experimentalists could develop calibrated and parameterized predictive models that inform and formalize sets of testable hypotheses, thus speeding up discovery and validation while reducing laboratory resources and costs. The developed template outlines an interdisciplinary collaboration workflow designed to systematically investigate the mechanistic underpinnings of a new therapy and validate that therapy to advance development and clinical acceptance. PMID:23955390

  15. Improved compliance by BPM-driven workflow automation.

    Science.gov (United States)

    Holzmüller-Laue, Silke; Göde, Bernd; Fleischer, Heidi; Thurow, Kerstin

    2014-12-01

    Using methods and technologies of business process management (BPM) for the laboratory automation has important benefits (i.e., the agility of high-level automation processes, rapid interdisciplinary prototyping and implementation of laboratory tasks and procedures, and efficient real-time process documentation). A principal goal of the model-driven development is the improved transparency of processes and the alignment of process diagrams and technical code. First experiences of using the business process model and notation (BPMN) show that easy-to-read graphical process models can achieve and provide standardization of laboratory workflows. The model-based development allows one to change processes quickly and an easy adaption to changing requirements. The process models are able to host work procedures and their scheduling in compliance with predefined guidelines and policies. Finally, the process-controlled documentation of complex workflow results addresses modern laboratory needs of quality assurance. BPMN 2.0 as an automation language to control every kind of activity or subprocess is directed to complete workflows in end-to-end relationships. BPMN is applicable as a system-independent and cross-disciplinary graphical language to document all methods in laboratories (i.e., screening procedures or analytical processes). That means, with the BPM standard, a communication method of sharing process knowledge of laboratories is also available. © 2014 Society for Laboratory Automation and Screening.

  16. MysiRNA-designer: a workflow for efficient siRNA design.

    Directory of Open Access Journals (Sweden)

    Mohamed Mysara

    Full Text Available The design of small interfering RNA (siRNA is a multi factorial problem that has gained the attention of many researchers in the area of therapeutic and functional genomics. MysiRNA score was previously introduced that improves the correlation of siRNA activity prediction considering state of the art algorithms. In this paper, a new program, MysiRNA-Designer, is described which integrates several factors in an automated work-flow considering mRNA transcripts variations, siRNA and mRNA target accessibility, and both near-perfect and partial off-target matches. It also features the MysiRNA score, a highly ranked correlated siRNA efficacy prediction score for ranking the designed siRNAs, in addition to top scoring models Biopredsi, DISR, Thermocomposition21 and i-Score, and integrates them in a unique siRNA score-filtration technique. This multi-score filtration layer filters siRNA that passes the 90% thresholds calculated from experimental dataset features. MysiRNA-Designer takes an accession, finds conserved regions among its transcript space, finds accessible regions within the mRNA, designs all possible siRNAs for these regions, filters them based on multi-scores thresholds, and then performs SNP and off-target filtration. These strict selection criteria were tested against human genes in which at least one active siRNA was designed from 95.7% of total genes. In addition, when tested against an experimental dataset, MysiRNA-Designer was found capable of rejecting 98% of the false positive siRNAs, showing superiority over three state of the art siRNA design programs. MysiRNA is a freely accessible (Microsoft Windows based desktop application that can be used to design siRNA with a high accuracy and specificity. We believe that MysiRNA-Designer has the potential to play an important role in this area.

  17. Successful Completion of FY18/Q1 ASC L2 Milestone 6355: Electrical Analysis Calibration Workflow Capability Demonstration.

    Energy Technology Data Exchange (ETDEWEB)

    Copps, Kevin D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-12-01

    The Sandia Analysis Workbench (SAW) project has developed and deployed a production capability for SIERRA computational mechanics analysis workflows. However, the electrical analysis workflow capability requirements have only been demonstrated in early prototype states, with no real capability deployed for analysts’ use. This milestone aims to improve the electrical analysis workflow capability (via SAW and related tools) and deploy it for ongoing use. We propose to focus on a QASPR electrical analysis calibration workflow use case. We will include a number of new capabilities (versus today’s SAW), such as: 1) support for the XYCE code workflow component, 2) data management coupled to electrical workflow, 3) human-in-theloop workflow capability, and 4) electrical analysis workflow capability deployed on the restricted (and possibly classified) network at Sandia. While far from the complete set of capabilities required for electrical analysis workflow over the long term, this is a substantial first step toward full production support for the electrical analysts.

  18. Inter-observer reliability assessments in time motion studies: the foundation for meaningful clinical workflow analysis.

    Science.gov (United States)

    Lopetegui, Marcelo A; Bai, Shasha; Yen, Po-Yin; Lai, Albert; Embi, Peter; Payne, Philip R O

    2013-01-01

    Understanding clinical workflow is critical for researchers and healthcare decision makers. Current workflow studies tend to oversimplify and underrepresent the complexity of clinical workflow. Continuous observation time motion studies (TMS) could enhance clinical workflow studies by providing rich quantitative data required for in-depth workflow analyses. However, methodological inconsistencies have been reported in continuous observation TMS, potentially reducing the validity of TMS' data and limiting their contribution to the general state of knowledge. We believe that a cornerstone in standardizing TMS is to ensure the reliability of the human observers. In this manuscript we review the approaches for inter-observer reliability assessment (IORA) in a representative sample of TMS focusing on clinical workflow. We found that IORA is an uncommon practice, inconsistently reported, and often uses methods that provide partial and overestimated measures of agreement. Since a comprehensive approach to IORA is yet to be proposed and validated, we provide initial recommendations for IORA reporting in continuous observation TMS.

  19. BIM Workflow for Mechanical Ventilation Design : Object-Based Modeling with Autodesk Revit®

    OpenAIRE

    Bonduel, Mathias

    2016-01-01

    This study is conducted for the Belgian engineering firm CENERGIE, whose main business activities are within the fields of building systems and sustainable buildings. The company wants to change their current design workflows to adapt the use of Building Information Modeling (BIM) with Autodesk Revit The research focused on the development of a BIM workflow where no models are exchanged between building partners. The aim of this study was to develop such a Revit BIM workflow for the desig...

  20. Soundness of Timed-Arc Workflow Nets

    DEFF Research Database (Denmark)

    Mateo, Jose Antonio; Srba, Jiri; Sørensen, Mathias Grund

    2014-01-01

    , we demonstrate the usability of our theory on the case studies of a Brake System Control Unit used in aircraft certification, the MPEG2 encoding algorithm, and a blood transfusion workflow. The implementation of the algorithms is freely available as a part of the model checker TAPAAL....