WorldWideScience

Sample records for science data-management challenge

  1. The Office of Science Data-Management Challenge

    Energy Technology Data Exchange (ETDEWEB)

    Mount, Richard P.; /SLAC

    2005-10-10

    Science--like business, national security, and even everyday life--is becoming more and more data-intensive. In some sciences the data-management challenge already exceeds the compute-power challenge in its needed resources. Leadership in applying computing to science will necessarily require both world-class computing and world-class data management. The Office of Science program needs a leadership-class capability in scientific data management. Currently two-thirds of Office of Science research and development in data management is left to the individual scientific programs. About $18M/year is spent by the programs on data-management research and development targeted at their most urgent needs. This is to be compared with the $9M/year spent on data management by DOE computer science. This highly mission-directed approach has been effective, but only in meeting the highest-priority needs of individual programs. A coherent, leadership-class program of data management is clearly warranted by the scale and nature of the Office of Science programs. More directly, much of the Office of Science portfolio is in desperate need of such a program; without it, data management could easily become the primary bottleneck to scientific progress within the next five years. When grouped into simulation-intensive science, experiment/observation-intensive science, and information-intensive science, the Office of Science programs show striking commonalities in their data-management needs. Not just research and development but also packaging and hardening, as well as maintenance and support, are required. Meeting these needs is a medium- to long-term effort requiring a well-planned program of evolving investment. We propose an Office of Science Data-Management Program at an initial scale of $32M/year of new funding. The program should be managed by a Director charged with creating and maintaining a forward-looking approach to multiscience data-management challenges. The program

  2. Data management challenges in analysis and synthesis in the ecosystem sciences.

    Science.gov (United States)

    Specht, A; Guru, S; Houghton, L; Keniger, L; Driver, P; Ritchie, E G; Lai, K; Treloar, A

    2015-11-15

    Open data has created an unprecedented opportunity, with new challenges, for ecosystem scientists. Skills in data management are essential to acquire, manage, publish, access and re-use data. These skills span many disciplines and require trans-disciplinary collaboration. Science synthesis centres support analysis and synthesis through collaborative 'Working Groups' in which domain specialists work together to synthesise existing information and provide insight into critical problems. The Australian Centre for Ecological Analysis and Synthesis (ACEAS) served a wide range of stakeholders, from scientists to policy-makers to managers. This paper investigates the level of sophistication in data management in the ecosystem science community through the lens of the ACEAS experience, and identifies the important factors required to enable us to benefit from this new data world and produce innovative science. ACEAS promoted the analysis and synthesis of data to solve transdisciplinary questions, and promoted the publication of the synthesised data. To do so, it provided support in many of the key skillsets required. Analysis and synthesis in multi-disciplinary and multi-organisational teams, and publishing data, were new to most participants. Data were difficult to discover and access, and to make ready for analysis, largely due to a lack of metadata. Data use and publication were hampered by concerns about data ownership and a desire for data citation. A web portal was created to visualise geospatial datasets to maximise data interpretation. By the end of the experience there was a significant increase in appreciation of the importance of a Data Management Plan. It is extremely doubtful that the work would have occurred, or the data been delivered, without the support of the synthesis centre, as few of the participants had the necessary networks or skills. It is argued that participation in the Centre provided an important learning opportunity, and has resulted in improved knowledge and understanding

  3. Challenges and Successes Managing Airborne Science Data for CARVE

    Science.gov (United States)

    Hardman, S. H.; Dinardo, S. J.; Lee, E. C.

    2014-12-01

    The Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) mission collects detailed measurements of important greenhouse gases on local to regional scales in the Alaskan Arctic and demonstrates new remote sensing and improved modeling capabilities to quantify Arctic carbon fluxes and carbon cycle-climate processes. Airborne missions present a number of challenges when it comes to collecting and processing the science data, and CARVE is no different. The biggest challenge relates to the flexibility of the instrument payload. Within the life of the mission, instruments may be removed from or added to the payload, or even reconfigured, on a yearly, monthly or daily basis. Although modification of the instrument payload gives airborne missions a distinct advantage over spaceborne missions, it does tend to wreak havoc on the underlying data system when changes to existing data inputs, or entirely new data inputs, require modifications to the data-processing pipeline. In addition to payload flexibility, it is not uncommon to find unsupported files in the field data submission. In the case of CARVE, these include video files, photographs taken during the flight, and screen shots from terminal displays. These need to be captured, saved, and somehow integrated into the data system. The CARVE data system was built on a multi-mission data system infrastructure for airborne instruments called the Airborne Cloud Computing Environment (ACCE). ACCE encompasses the end-to-end lifecycle covering planning, provisioning of data system capabilities, and support for scientific analysis in order to improve the quality, cost effectiveness, and capabilities to enable new scientific discovery and research in earth observation. This well-tested and proven infrastructure allows the CARVE data system to be easily adapted to handle the challenges posed by the CARVE mission and to successfully process, manage and distribute the mission's science data. This
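
    The ingest problem described above (known instrument products flow through the processing pipeline, while unsupported field files such as videos, photographs, and screen shots must still be captured and archived) can be sketched as a small dispatch table. This is an illustrative sketch only; the file types, function names, and registry are invented for the example and are not part of the actual ACCE/CARVE data system.

```python
from pathlib import Path

# Hypothetical sketch: route known instrument products to a processor,
# and archive anything else so unsupported field files are still captured.
PROCESSORS = {}

def register(*extensions):
    """Register a processor function for one or more file extensions."""
    def wrap(func):
        for ext in extensions:
            PROCESSORS[ext.lower()] = func
        return func
    return wrap

@register(".nc", ".h5")
def process_instrument_data(path):
    # Stand-in for the real per-instrument processing pipeline.
    return ("processed", path.name)

def route(path):
    """Dispatch a file to its processor, or archive it as-is."""
    handler = PROCESSORS.get(path.suffix.lower())
    if handler is None:
        # e.g. .mp4 flight video, .jpg photographs, .png screen shots
        return ("archived", path.name)
    return handler(path)
```

    A registry like this localizes the change when the payload is reconfigured: adding an instrument means registering one new handler rather than rewriting the pipeline.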

  4. Challenges in data science: a complex systems perspective

    International Nuclear Information System (INIS)

    Carbone, Anna; Jensen, Meiko; Sato, Aki-Hiro

    2016-01-01

    The ability to process and manage large data volumes has proven insufficient to tackle the current challenges presented by “Big Data”. Deep insight is required for understanding interactions among connected systems and space- and time-dependent heterogeneous data structures. Emergence of global properties from locally interacting data entities and clustering phenomena demand suitable approaches and methodologies recently developed in the foundational area of Data Science by taking a Complex Systems standpoint. Here, we deal with challenges that can be summarized by the question: “What can Complex Systems Science contribute to Big Data?”. Such a question can be reversed and brought to a higher level of abstraction by asking “What knowledge can be drawn from Big Data?” These aspects constitute the main motivation behind this article, which introduces a volume containing a collection of papers presenting interdisciplinary advances in the Big Data area through methodologies and approaches typical of Complex Systems Science, Nonlinear Systems Science and Statistical Physics.

  5. Challenges for Data Archival Centers in Evolving Environmental Sciences

    Science.gov (United States)

    Wei, Y.; Cook, R. B.; Gu, L.; Santhana Vannan, S. K.; Beaty, T.

    2015-12-01

    Environmental science has entered a big data era as enormous data about the Earth environment are continuously collected through field and airborne missions, remote sensing observations, model simulations, sensor networks, etc. An open-access and open-management data infrastructure for data-intensive science is a major grand challenge in global environmental research (BERAC, 2010). Such an infrastructure, as exemplified by EOSDIS, GEOSS, and NSF EarthCube, will support the complete lifecycle of environmental data and ensure that data flow smoothly among the different phases of collection, preservation, integration, and analysis. Data archival centers, as the data integration units closest to data providers, serve as the source power to compile and integrate heterogeneous environmental data into this global infrastructure. This presentation discusses the interoperability challenges and practices of the geosciences from the perspective of data archival centers, based on the operational experiences of the NASA-sponsored Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) and related environmental data management activities. Specifically, we will discuss the challenges to 1) encourage and help scientists to more actively share data with the broader scientific community, so that valuable environmental data, especially the dark data collected by individual scientists in small independent projects, can be shared and integrated into the infrastructure to tackle big science questions; 2) curate heterogeneous multi-disciplinary data, focusing on the key aspects of identification, format, metadata, data quality, and semantics, to make them ready to be plugged into a global data infrastructure (we will highlight data curation practices at the ORNL DAAC for global campaigns such as BOREAS, LBA, and SAFARI 2000); and 3) enhance the capabilities to more effectively and efficiently expose and deliver "big" environmental data to a broad range of users and systems

  6. Challenges in data science

    DEFF Research Database (Denmark)

    Carbone, Anna; Jensen, M.; Sato, Aki-Hiro

    2016-01-01

    of global properties from locally interacting data entities and clustering phenomena demand suitable approaches and methodologies recently developed in the foundational area of Data Science by taking a Complex Systems standpoint. Here, we deal with challenges that can be summarized by the question: "What… can Complex Systems Science contribute to Big Data?". Such a question can be reversed and brought to a higher level of abstraction by asking "What knowledge can be drawn from Big Data?" These aspects constitute the main motivation behind this article to introduce a volume containing a collection… of papers presenting interdisciplinary advances in the Big Data area by methodologies and approaches typical of Complex Systems Science, Nonlinear Systems Science and Statistical Physics. © 2016 Elsevier Ltd. All rights reserved.…

  7. New challenges for Life Sciences flight project management

    Science.gov (United States)

    Huntoon, C. L.

    1999-01-01

    Scientists have conducted studies involving human spaceflight crews for over three decades. These studies have progressed from simple observations before and after each flight to sophisticated experiments during flights of several weeks up to several months. The findings from these experiments are available in the scientific literature. Management of these flight experiments has grown into a system fashioned from the Apollo Program style, focusing on budgeting, scheduling and allocation of human and material resources. While these areas remain important to the future, the International Space Station (ISS) requires that the Life Sciences spaceflight experiments expand the existing project management methodology. The use of telescience with state-of-the-art information technology, and the multi-national crews and investigators, challenges the former management processes. Actually conducting experiments on board the ISS will be an enormous undertaking, and International Agreements and Working Groups will be essential in giving guidance to the flight project management. Teams forged in this matrix environment must be competent to make decisions and qualified to work with the array of engineers, scientists, and spaceflight crews. In order to undertake this complex task, data systems not previously used for these purposes must be adapted so that the investigators and the project management personnel can all share important information as soon as it is available. The utilization of telescience and distributed experiment operations will allow investigators to remain involved in their experiments as well as to understand the numerous issues faced by other elements of the program. The complexity of forming and managing project teams will be a new kind of challenge for international science programs. Meeting that challenge is essential to assure the success of the International Space Station as a laboratory in space.

  8. Data Management in Metagenomics: A Risk Management Approach

    Directory of Open Access Journals (Sweden)

    Filipe Ferreira

    2014-07-01

    In eScience, where vast data collections are processed in scientific workflows, new risks and challenges are emerging. Those challenges are changing the eScience paradigm, mainly regarding digital preservation and scientific workflows. To address specific concerns with data management in these scenarios, the concept of the Data Management Plan was established, serving as a tool for enabling digital preservation in eScience research projects. We argue that risk management can be used jointly with a Data Management Plan, so that new risks and challenges can be readily tackled. Therefore, we propose an analysis process for eScience projects that uses a Data Management Plan and ISO 31000 to create a Risk Management Plan complementing the Data Management Plan. The motivation, requirements and validation of this proposal are explored in the MetaGen-FRAME project, focused on Metagenomics.
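
    The pairing of a Data Management Plan with ISO 31000-style risk management can be pictured as a small risk register, where each data-management risk is identified, scored, and evaluated against a treatment threshold. The sketch below is purely illustrative: the risk items, the 1-5 scoring scales, and the threshold are invented for the example and are not taken from the paper or from ISO 31000 itself.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    """One entry in a hypothetical data-management risk register."""
    description: str
    likelihood: int  # 1 (rare) .. 5 (almost certain)
    impact: int      # 1 (negligible) .. 5 (severe)

    @property
    def level(self):
        # Simple likelihood x impact matrix, a common risk-analysis heuristic.
        return self.likelihood * self.impact

def evaluate(register, threshold=10):
    """Return risks at or above the treatment threshold, worst first."""
    return sorted((r for r in register if r.level >= threshold),
                  key=lambda r: r.level, reverse=True)

# Invented example entries for a metagenomics-style workflow.
register = [
    Risk("Loss of raw sequencing data", 2, 5),
    Risk("Metadata schema drift across workflow steps", 4, 3),
    Risk("Broken link to external reference database", 3, 2),
]
```

    Risks surfaced this way would then feed the Risk Management Plan that complements the Data Management Plan.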

  9. Challenges of archiving science data from long duration missions: the Rosetta case

    Science.gov (United States)

    Heather, David

    2016-07-01

    Rosetta is the first mission designed to orbit and land on a comet. It consists of an orbiter, carrying 11 science experiments, and a lander, called 'Philae', carrying 10 additional instruments. Rosetta was launched on 2 March 2004, and arrived at the comet 67P/Churyumov-Gerasimenko on 6 August 2014. During its long journey, Rosetta completed flybys of the Earth and Mars, and made two excursions to the main asteroid belt to observe (2867) Steins and (21) Lutetia. On 12 November 2014, the Philae probe soft-landed on comet 67P/Churyumov-Gerasimenko, the first time in history that such an extraordinary feat has been achieved. After the landing, the Rosetta orbiter followed the comet through its perihelion in August 2015, and will continue to accompany 67P/Churyumov-Gerasimenko as it recedes from the Sun until the end of the mission. There are significant challenges in managing the science archive of a mission such as Rosetta. The first data were returned from Rosetta more than 10 years ago, and there have been flybys of several planetary bodies, including two asteroids from which significant science data were returned by many of the instruments. The scientific applications for these flyby data can be very different from those of data taken during the main science phase at the comet, but there are severe limitations on the changes that can be applied to the data pipelines managed by the various science teams, as resources are scarce. The priority is clearly on maximising the potential science from the comet phase, so data formats and pipelines have been designed with that in mind, and changes are limited to managing issues found during reviews by the official archiving authority and independent scientists. In addition, in the time that Rosetta has been operating, the archiving standards themselves have evolved. All Rosetta data are archived following version 3 of NASA's Planetary Data System (PDS) Standards. Currently, new and upcoming planetary science missions are delivering data

  10. Opportunities and challenges of big data for the social sciences: The case of genomic data.

    Science.gov (United States)

    Liu, Hexuan; Guo, Guang

    2016-09-01

    In this paper, we draw attention to one unique and valuable source of big data, genomic data, by demonstrating the opportunities they provide to social scientists. We discuss different types of large-scale genomic data and recent advances in statistical methods and computational infrastructure used to address challenges in managing and analyzing such data. We highlight how these data and methods can be used to benefit social science research. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Challenges in Managing Trustworthy Large-scale Digital Science

    Science.gov (United States)

    Evans, B. J. K.

    2017-12-01

    The increased use of large-scale international digital science has opened a number of challenges for managing, handling, using and preserving scientific information. The large volumes of information are driven by three main categories: model outputs, including coupled models and ensembles; data products that have been processed to a level of usability; and increasingly heuristically driven data analysis. These data products are increasingly the ones that are usable by broad communities, and far exceed the raw instrument outputs in volume. The data, software and workflows are then shared and replicated to allow broad use at an international scale, which places further demands on infrastructure to manage the information reliably across distributed resources. Users necessarily rely on these underlying "black boxes" in order to be productive and produce new scientific outcomes. The software for these systems depends on computational infrastructure, interconnected software systems, and information capture systems. This ranges from the fundamental reliability of the compute hardware, through system software stacks and libraries, to the model software itself. Due to the complexity and capacity of the infrastructure, there is an increased emphasis on transparency of approach and robustness of methods over full reproducibility. Furthermore, with large-volume data management, it is increasingly difficult to store historical versions of all model and derived data. Instead, the emphasis is on the ability to access updated products, and on confidence that previous outcomes remain relevant and can be updated with new information. We will discuss these challenges and some of the approaches underway to address these issues.

  12. Meeting global health challenges through operational research and management science.

    Science.gov (United States)

    Royston, Geoff

    2011-09-01

    This paper considers how operational research and management science can improve the design of health systems and the delivery of health care, particularly in low-resource settings. It identifies some gaps in the way operational research is typically used in global health and proposes steps to bridge them. It then outlines some analytical tools of operational research and management science and illustrates how their use can inform some typical design and delivery challenges in global health. The paper concludes by considering factors that will increase and improve the contribution of operational research and management science to global health.

  13. Exploring Best Practices for Research Data Management in Earth Science through Collaborating with University Libraries

    Science.gov (United States)

    Wang, T.; Branch, B. D.

    2013-12-01

    Earth Science research data, their management, informatics processing and curation are valuable in allowing earth scientists to make new discoveries. But actively managing these research assets to ensure they remain safe, secure, accessible and reusable over the long term is a big challenge, and the current data deluge makes it even more difficult. To address the growing demand for managing earth science data, the Council on Library and Information Resources (CLIR) partners with the Library and Technology Services (LTS) of Lehigh University and Purdue University Libraries (PUL) on hosting postdoctoral fellows in data curation activity. This inter-disciplinary fellowship program, funded by the Sloan Foundation, innovatively connects university libraries and earth science departments, and gives earth science Ph.D.s opportunities to use their research experience in earth science, together with the data curation training received during their fellowship, to explore best practices for research data management in earth science. In the process of exploring best practices for data curation in earth science, the CLIR Data Curation Fellows have accumulated rich experience and insights into the data management behaviors and needs of earth scientists. Specifically, Ting Wang, the postdoctoral fellow at Lehigh University, has worked together with the LTS support team for the College of Arts and Sciences, Web Specialists and the High Performance Computing Team to assess and meet the data management needs of researchers at the Department of Earth and Environmental Sciences (EES). By interviewing the faculty members and graduate students at EES, the fellow has identified a variety of data-related challenges in different research fields of earth science, such as climate, ecology, geochemistry, and geomorphology. The investigation findings of the fellow also support the LTS in developing campus infrastructure for long-term data management in the sciences. Likewise

  14. Information management challenges of the EOS Data and Information System

    Science.gov (United States)

    Mcdonald, Kenneth R.; Blake, Deborah J.

    1991-01-01

    An overview of the current information management concepts that are embodied in the plans for the Earth Observing System Data and Information System (EOSDIS) is presented, and some of the technology development and application areas that are envisioned to be particularly challenging are introduced. The Information Management System (IMS) is the EOSDIS element that provides the primary interface between the science users and the data products and services of EOSDIS. The goals of IMS are to define a clear and complete set of functional requirements and to apply innovative methods and technologies to satisfy them. The information management functions are described in detail, and some applicable technologies are discussed. Some of the general issues affecting the successful development and operation of the information management element are addressed.

  15. Social Water Science Data: Dimensions, Data Management, and Visualization

    Science.gov (United States)

    Jones, A. S.; Horsburgh, J. S.; Flint, C.; Jackson-Smith, D.

    2016-12-01

    Water systems are increasingly conceptualized as coupled human-natural systems, with growing emphasis on representing the human element in hydrology. However, social science data and associated considerations may be unfamiliar and intimidating to many hydrologic researchers. Monitoring social aspects of water systems involves expanding the range of data types typically used in hydrology and appreciating nuances in datasets that are well known to social scientists, but less understood by hydrologists. We define social water science data as any information representing the human aspects of a water system. We present a scheme for classifying these data, highlight an array of data types, and illustrate data management considerations and challenges unique to social science data. This classification scheme was applied to datasets generated as part of iUTAH (innovative Urban Transitions and Arid region Hydro-sustainability), an interdisciplinary water research project based in Utah, USA that seeks to integrate and share social and biophysical water science data. As the project deployed cyberinfrastructure for baseline biophysical data, cyberinfrastructure for analogous social science data was necessary. As a particular case of social water science data, we focus in this presentation on social science survey data. These data are often interpreted through the lens of the original researcher and are typically presented to interested parties in static figures or reports. To provide more exploratory and dynamic communication of these data beyond the individual or team who collected the data, we developed a web-based, interactive viewer to visualize social science survey responses. This interface is applicable for examining survey results that show human motivations and actions related to environmental systems and as a useful tool for participatory decision-making. It also serves as an example of how new data sharing and visualization tools can be developed once the

  16. Sedimentary Geology Context and Challenges for Cyberinfrastructure Data Management

    Science.gov (United States)

    Chan, M. A.; Budd, D. A.

    2014-12-01

    A cyberinfrastructure data management system for sedimentary geology is crucial to multiple facets of interdisciplinary Earth science research, as sedimentary systems form the deep-time framework for many geoscience communities. The breadth and depth of the sedimentary field spans research on the processes that form, shape and affect the Earth's sedimentary crust and distribute resources such as hydrocarbons, coal, and water. The sedimentary record is used by Earth scientists to explore questions such as the evolution of the continental crust, the dynamics of Earth's past climates and oceans, the evolution of the biosphere, and the human interface with Earth surface processes. Major challenges to a data management system for sedimentary geology are the volume and diversity of field, analytical, and experimental data, along with many types of physical objects. Objects include rock samples, biological specimens, cores, and photographs. Field data run the gamut from discrete location and spatial orientation to vertical records of bed thickness, textures, color, sedimentary structures, and grain types. Ex situ information can include geochemical, mineralogical, petrophysical, chronologic, and paleobiologic data. All data types cover multiple order-of-magnitude scales, often requiring correlation across scales with varying degrees of resolution. The stratigraphic framework needs dimensional context, with locality, time, space, and depth relationships. A significant challenge is that physical objects represent discrete values at specific points, whereas measured stratigraphic sections are continuous. In many cases, field data are not easily quantified, and determining uncertainty can be difficult. Despite these hurdles, the sedimentary community is eager to embrace geoinformatic resources that provide better tools to integrate the many data types, create better search capabilities, and equip our communities to conduct high-impact science at unprecedented levels.
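
    One of the challenges named above (discrete physical objects must be placed within continuous measured sections) can be sketched as a tiny data model that ties each sample to a position along its section. This is a hypothetical illustration only; the class and field names are invented and do not describe any actual sedimentary-geology cyberinfrastructure.

```python
from dataclasses import dataclass, field

@dataclass
class Sample:
    """A discrete physical object tied to a point in a section."""
    sample_id: str
    kind: str        # e.g. "rock sample", "core", "photograph"
    height_m: float  # position (metres) along the measured section

@dataclass
class MeasuredSection:
    """A continuous measured stratigraphic section holding discrete samples."""
    section_id: str
    locality: str
    total_thickness_m: float
    samples: list = field(default_factory=list)

    def add(self, sample):
        # Enforce the discrete-within-continuous relationship: every
        # sample must fall inside the section's measured interval.
        if not 0 <= sample.height_m <= self.total_thickness_m:
            raise ValueError("sample lies outside the measured section")
        self.samples.append(sample)
```

    Even a minimal model like this makes the locality/depth context explicit, which is the dimensional context the abstract says the stratigraphic framework needs.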

  17. Data Provenance and Data Management in eScience

    CERN Document Server

    Bai, Quan; Giugni, Stephen; Williamson, Darrell; Taylor, John

    2013-01-01

    eScience allows scientific research to be carried out in highly distributed environments. The complex nature of the interactions in an eScience infrastructure, which often involves a range of instruments, data, models, applications, people and computational facilities, suggests there is a need for data provenance and data management (DPDM). The W3C Provenance Working Group defines the provenance of a resource as a “record that describes entities and processes involved in producing and delivering or otherwise influencing that resource”. It has been widely recognised that provenance is a critical issue in enabling the sharing, trust, authentication and reproducibility of eScience processes.   Data Provenance and Data Management in eScience identifies the gaps between DPDM foundations and their practice within eScience domains including clinical trials, bioinformatics and radio astronomy. The book covers important aspects of fundamental research in DPDM, including provenance representation and querying. It also expl...
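
    The W3C definition quoted above can be made concrete with a minimal, hand-rolled provenance record shaped loosely after the W3C PROV model (entities, activities, agents, and the relations between them). The identifiers and the bioinformatics example below are invented for illustration; this is a sketch of the idea, not the PROV serialization format itself.

```python
# A toy provenance record: one activity (an alignment run) used one
# entity (raw reads), generated another (aligned reads), and was
# associated with one agent. Identifiers are illustrative only.
provenance = {
    "entity": {
        "ex:raw_reads": {"type": "dataset"},
        "ex:aligned_reads": {"type": "dataset"},
    },
    "activity": {
        "ex:alignment_run_42": {"startTime": "2013-01-15T09:00:00"},
    },
    "agent": {
        "ex:alice": {"type": "Person"},
    },
    "wasGeneratedBy": [
        {"entity": "ex:aligned_reads", "activity": "ex:alignment_run_42"},
    ],
    "used": [
        {"entity": "ex:raw_reads", "activity": "ex:alignment_run_42"},
    ],
    "wasAssociatedWith": [
        {"agent": "ex:alice", "activity": "ex:alignment_run_42"},
    ],
}

def derived_from(prov, entity):
    """Trace the entities an entity was directly derived from."""
    activities = [g["activity"] for g in prov["wasGeneratedBy"]
                  if g["entity"] == entity]
    return [u["entity"] for u in prov["used"]
            if u["activity"] in activities]
```

    Queries like `derived_from` are the simplest form of the provenance querying the book discusses: following generation and usage links backwards to establish how a result was produced.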

  18. Management as a science-based profession: a grand societal challenge

    NARCIS (Netherlands)

    Romme, A.G.L.

    2017-01-01

    The purpose of this paper is to explore how the quest for management as a science-based profession, conceived as a grand societal challenge, can be revitalized. A reflective approach is adopted by questioning some of the key assumptions made by management scholars, especially those that undermine

  19. Addressing big data challenges for scientific data infrastructure

    NARCIS (Netherlands)

    Demchenko, Y.; Zhao, Z.; Grosso, P.; Wibisono, A.; de Laat, C.

    2012-01-01

    This paper discusses the challenges that are imposed by Big Data Science on the modern and future Scientific Data Infrastructure (SDI). The paper refers to different scientific communities to define requirements on data management, access control and security. The paper introduces the Scientific

  20. A science data gateway for environmental management

    Energy Technology Data Exchange (ETDEWEB)

    Agarwal, Deborah A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Faybishenko, Boris [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Freedman, Vicky L. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Krishnan, Harinarayan [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Kushner, Gary [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Lansing, Carina [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Porter, Ellen [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Romosan, Alexandru [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Shoshani, Arie [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wainwright, Haruko [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Weidmer, Arthur [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wu, Kesheng [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-10-12

    Science data gateways are effective in providing complex science data collections to world-wide user communities. In this paper we describe a gateway for the Advanced Simulation Capability for Environmental Management (ASCEM) framework. Built on top of established web service technologies, the ASCEM data gateway is specifically designed for environmental modeling applications. Its key distinguishing features include: (1) handling of complex spatiotemporal data, (2) offering a variety of selective data access mechanisms, (3) providing state-of-the-art plotting and visualization of spatiotemporal data records, and (4) integrating seamlessly with a distributed workflow system using a RESTful interface. ASCEM project scientists have been using this data gateway since 2011.
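
    The selective, RESTful spatiotemporal access described above can be sketched as a query URL that selects variables within a bounding box and time range. The base URL, path, and parameter names below are hypothetical; they are not the actual ASCEM API, just an illustration of the kind of request such a gateway could serve.

```python
from urllib.parse import urlencode

# Hypothetical gateway endpoint; not the real ASCEM service.
BASE = "https://data.example.org/ascem/api/v1"

def build_query(dataset, bbox, start, end, variables):
    """Build a URL selecting variables inside a bounding box and time range."""
    params = {
        # bbox as min_lon,min_lat,max_lon,max_lat
        "bbox": ",".join(f"{c:g}" for c in bbox),
        "start": start,
        "end": end,
        "vars": ",".join(variables),
    }
    return f"{BASE}/datasets/{dataset}/records?{urlencode(params)}"

url = build_query("groundwater-plume", (-84.32, 35.93, -84.25, 35.98),
                  "2011-01-01", "2011-12-31", ["tritium", "head"])
```

    Exposing selection in the URL like this is what lets a gateway serve only the requested subset of a large spatiotemporal collection instead of whole files.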

  1. Big Data and Data Science: Opportunities and Challenges of iSchools

    Directory of Open Access Journals (Sweden)

    Il-Yeol Song

    2017-08-01

    Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools’ opportunities and offers suggestions for data science education. We argue that iSchools should empower their students with “information computing” disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest three foci: user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) whose building blocks include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from a big-picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.

  2. Research challenges for energy data management (panel)

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Lehner, Wolfgang

    2013-01-01

    This panel paper aims at initiating discussion at the Second International Workshop on Energy Data Management (EnDM 2013) about the important research challenges within Energy Data Management. The authors are the panel organizers, extra panelists will be recruited before the workshop...

  3. Data, Data Management, and the Ethos of Science

    Science.gov (United States)

    Duerr, R.; Barry, R.; Parsons, M. A.

    2006-12-01

    Since the beginnings of the scientific era, data - the record of the observations made to elucidate the inner workings of the universe - have been a fundamental component of the scientific method, a cornerstone of the edifice that is science. Historically it has been the norm for scientists to publish these data so that others may verify the claims made or extend the field further, for example by using the data as input to models. Entire journals owe their very existence to the need for mechanisms for making data available, for recording the observations of science for posterity. As such, data and the publication of data are fundamental to the integrity of science, to a scientist's ability to trust in the work of other scientists, and to upholding the trust that the public and policy makers place in science as an enterprise worthy of support. In the past, the data-related mechanisms for maintaining this trust were well understood. A scientist need simply record the observations they made as part of a journal article. With the advent of the digital era and ever-increasing volumes of data, these old methods have become insufficient to the task. The focus of this talk is on the complex and changing ways that digital data and digital data management are impacting science and the way the external world perceives science. We will discuss many aspects of the issue - from the responsibilities of scientists with regard to making data available, to the elements of sound data management, to the need to explain events visible in the data (e.g., sea ice minima) to the public.

  4. Enabling a new Paradigm to Address Big Data and Open Science Challenges

    Science.gov (United States)

    Ramamurthy, Mohan; Fisher, Ward

    2017-04-01

    Data are not only the lifeblood of the geosciences but have become the currency of the modern world in science and society. Rapid advances in computing, communications, and observational technologies — along with concomitant advances in high-resolution modeling, ensemble and coupled-systems predictions of the Earth system — are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyper-spectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change. And NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data bring challenges along with opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources, extracting knowledge and useful information in ways that were previously impossible, to enable discoveries and gain new insights. At the same time, there is a growing focus on "Reproducibility or Replicability in Science" that has implications for Open Science. The advent of cloud computing has opened new avenues for addressing both big data and Open Science challenges and for accelerating scientific discoveries. However, to successfully leverage the enormous potential of cloud technologies, data providers and the scientific communities will need to develop new paradigms to enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition. Data providers

  5. The European HST Science Data Archive. [and Data Management Facility (DMF)

    Science.gov (United States)

    Pasian, F.; Pirenne, B.; Albrecht, R.; Russo, G.

    1993-01-01

    The paper describes the European HST Science Data Archive. Particular attention is given to the flow from the HST spacecraft to the Science Data Archive at the Space Telescope European Coordinating Facility (ST-ECF); the archiving system at the ST-ECF, including the hardware and software system structure; the operations at the ST-ECF and differences with the Data Management Facility; and the current developments. A diagram of the logical structure and data flow of the system managing the European HST Science Data Archive is included.

  6. Data Management and Preservation Planning for Big Science

    Directory of Open Access Journals (Sweden)

    Juan Bicarregui

    2013-06-01

    Full Text Available ‘Big Science’ - that is, science involving large collaborations with dedicated facilities, large data volumes and multinational investments - is often seen as different when it comes to data management and preservation planning. Big Science handles its data differently from other disciplines and has data management problems that are qualitatively different. In part, these differences arise from the quantities of data involved, but possibly more importantly from the cultural, organisational and technical distinctiveness of these academic cultures. Consequently, the data management systems are typically and rationally bespoke, but this means that the planning for data management and preservation (DMP) must also be bespoke. These differences are such that ‘just read and implement the OAIS specification’ is reasonable DMP advice, but this bald prescription can and should be usefully supported by a methodological ‘toolkit’, including overviews, case studies and costing models, to provide guidance on developing best practice in DMP policy and infrastructure for these projects, as well as considering OAIS validation, audit and cost modelling. In this paper, we build on previous work with the LIGO collaboration to consider the role of DMP planning within these big science scenarios, and discuss how to apply current best practice. We discuss the results of the MaRDI-Gross project (Managing Research Data Infrastructures - Big Science), which has been developing a toolkit to provide guidelines on the application of best practice in DMP planning within big science projects. This is targeted primarily at projects’ engineering managers, but is also intended to help funders collaborate on DMP plans which satisfy the requirements imposed on them.

  7. Accelerating Science Impact through Big Data Workflow Management and Supercomputing

    Directory of Open Access Journals (Sweden)

    De K.

    2016-01-01

    Full Text Available The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. To manage the workflow for all data processing on hundreds of data centers, the PanDA (Production and Distributed Analysis) Workload Management System is used. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers, including OLCF’s Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.

  8. Increasing the Use of Earth Science Data and Models in Air Quality Management.

    Science.gov (United States)

    Milford, Jana B; Knight, Daniel

    2017-04-01

    In 2010, the U.S. National Aeronautics and Space Administration (NASA) initiated the Air Quality Applied Science Team (AQAST) as a 5-year, $17.5-million award with 19 principal investigators. AQAST aims to increase the use of Earth science products in air quality-related research and to help meet air quality managers' information needs. We conducted a Web-based survey and a limited number of follow-up interviews to investigate federal, state, tribal, and local air quality managers' perspectives on the usefulness of Earth science data and models, and on the impact AQAST has had. The air quality managers we surveyed identified meeting the National Ambient Air Quality Standards for ozone and particulate matter, emissions from mobile sources, and interstate air pollution transport as top challenges in need of improved information. Most survey respondents viewed inadequate coverage or frequency of satellite observations, data uncertainty, and lack of staff time or resources as barriers to increased use of satellite data by their organizations. Managers who have been involved with AQAST indicated that the program has helped build awareness of NASA Earth science products, and assisted their organizations with retrieval and interpretation of satellite data and with application of global chemistry and climate models. AQAST has also helped build a network between researchers and air quality managers with potential for further collaborations. NASA's Air Quality Applied Science Team (AQAST) aims to increase the use of satellite data and global chemistry and climate models for air quality management purposes, by supporting research and tool development projects of interest to both groups. Our survey and interviews of air quality managers indicate they found value in many AQAST projects and particularly appreciated the connections to the research community that the program facilitated. Managers expressed interest in receiving continued support for their organizations' use of

  9. Evolving NASA's Earth Science Data Systems

    Science.gov (United States)

    Walter, J.; Behnke, J.; Murphy, K. J.; Lowe, D. R.

    2013-12-01

    NASA's Earth Science Data and Information System Project (ESDIS) is charged with managing, maintaining, and evolving NASA's Earth Observing System Data and Information System (EOSDIS) and is responsible for processing, archiving, and distributing NASA Earth science data. The system supports a multitude of missions and serves diverse science research and other user communities. Keeping up with ever-changing information technology and figuring out how to leverage those changes across such a large system in order to continuously improve and meet the needs of a diverse user community is a significant challenge. Maintaining and evolving the system architecture and infrastructure is a continuous and multi-layered effort. It requires a balance between a "top down" management paradigm that provides a coherent system view and maintaining the managerial, technological, and functional independence of the individual system elements. This presentation will describe some of the key elements of the current system architecture, some of the strategies and processes we employ to meet these challenges, current and future challenges, and some ideas for meeting those challenges.

  10. Sustainable Materials Management (SMM) Electronics Challenge Data

    Science.gov (United States)

    On September 22, 2012, EPA launched the SMM Electronics Challenge. The Challenge encourages electronics manufacturers, brand owners and retailers to strive to send 100 percent of the used electronics they collect from the public, businesses and within their own organizations to third-party certified electronics refurbishers and recyclers. The Challenge's goals are to: 1). Ensure responsible recycling through the use of third-party certified recyclers, 2). Increase transparency and accountability through public posting of electronics collection and recycling data, and 3). Encourage outstanding performance through awards and recognition. By striving to send 100 percent of used electronics collected to certified recyclers and refurbishers, Challenge participants are ensuring that the used electronics they collect will be responsibly managed by recyclers that maximize reuse and recycling, minimize exposure to human health and the environment, ensure the safe management of materials by downstream handlers, and require destruction of all data on used electronics. Electronics Challenge participants are publicly recognized on EPA's website as a registrant, new participant, or active participant. Awards are offered in two categories - tier and champion. Tier awards are given in recognition of achieving all the requirements under a gold, silver or bronze tier. Champion awards are given in two categories - product and non-product. For champion awards, a product is an it

  11. 'Big data' in pharmaceutical science: challenges and opportunities.

    Science.gov (United States)

    Dossetter, Al G; Ecker, Gerhard; Laverty, Hugh; Overington, John

    2014-05-01

    Future Medicinal Chemistry invited a selection of experts to express their views on the current impact of big data in drug discovery and design, as well as speculate on future developments in the field. The topics discussed include the challenges of implementing big data technologies, maintaining the quality and privacy of data sets, and how the industry will need to adapt to welcome the big data era. Their enlightening responses provide a snapshot of the many and varied contributions being made by big data to the advancement of pharmaceutical science.

  12. Data Grid tools: enabling science on big distributed data

    Energy Technology Data Exchange (ETDEWEB)

    Allcock, Bill [Mathematics and Computer Science, Argonne National Laboratory, Argonne, IL 60439 (United States); Chervenak, Ann [Information Sciences Institute, University of Southern California, Marina del Rey, CA 90291 (United States); Foster, Ian [Mathematics and Computer Science, Argonne National Laboratory, Argonne, IL 60439 (United States); Department of Computer Science, University of Chicago, Chicago, IL 60615 (United States); Kesselman, Carl [Information Sciences Institute, University of Southern California, Marina del Rey, CA 90291 (United States); Livny, Miron [Department of Computer Science, University of Wisconsin, Madison, WI 53705 (United States)

    2005-01-01

    A particularly demanding and important challenge that we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to diverse application domains including experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging. Data management tools developed under the auspices of the SciDAC Data Grid Middleware project have become the de facto standard for data management in projects worldwide. Day in and day out, these tools provide the 'plumbing' that allows scientists to do more science on an unprecedented scale in production environments.

  13. Data Grid tools: enabling science on big distributed data

    International Nuclear Information System (INIS)

    Allcock, Bill; Chervenak, Ann; Foster, Ian; Kesselman, Carl; Livny, Miron

    2005-01-01

    A particularly demanding and important challenge that we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to diverse application domains including experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging. Data management tools developed under the auspices of the SciDAC Data Grid Middleware project have become the de facto standard for data management in projects worldwide. Day in and day out, these tools provide the 'plumbing' that allows scientists to do more science on an unprecedented scale in production environments

  14. Scientific data management challenges, technology and deployment

    CERN Document Server

    Rotem, Doron

    2010-01-01

    Dealing with the volume, complexity, and diversity of data currently being generated by scientific experiments and simulations often causes scientists to waste productive time. Scientific Data Management: Challenges, Technology, and Deployment describes cutting-edge technologies and solutions for managing and analyzing vast amounts of data, helping scientists focus on their scientific goals. The book begins with coverage of efficient storage systems, discussing how to write and read large volumes of data without slowing the simulation, analysis, or visualization processes. It then focuses on the efficient data movement and management of storage spaces and explores emerging database systems for scientific data. The book also addresses how to best organize data for analysis purposes, how to effectively conduct searches over large datasets, how to successfully automate multistep scientific process workflows, and how to automatically collect metadata and lineage information. This book provides a comprehensive u...

  15. App-lifying USGS Earth Science Data: Engaging the public through Challenge.gov

    Science.gov (United States)

    Frame, M. T.

    2013-12-01

    With the goal of promoting innovative use and applications of USGS data, USGS Core Science Analytics and Synthesis (CSAS) launched the first USGS Challenge: App-lifying USGS Earth Science Data. While initiated before the recent Office of Science and Technology Policy memorandum 'Increasing Access to the Results of Federally Funded Scientific Research', our challenge focused on one of the core tenets of the memorandum: expanding the discoverability, accessibility and usability of CSAS data. From January 9 to April 1, 2013, we invited developers, information scientists, biologists/ecologists, and scientific data visualization specialists to create applications for selected USGS datasets. Identifying new, innovative ways to represent, apply, and make these data available is a high priority for our leadership. To help boost innovation, our only constraint was that challengers must incorporate at least one of the identified datasets in their application. Winners were selected based on relevance to the USGS and CSAS missions, innovation in design, and overall ease of use of the application. The winner for Best Overall App was TaxaViewer by the rOpenSci group. TaxaViewer is a Web interface to a mashup of data from the USGS-sponsored interagency Integrated Taxonomic Information System (ITIS) and other data from the Phylotastic taxonomic name service, the Global Invasive Species Database, Phylomatic, and the Global Biodiversity Information Facility. The Popular Choice App award, selected through a public vote on the submissions, went to the Species Comparison Tool by Kimberly Sparks of Raleigh, N.C., which allows users to explore the USGS Gap Analysis Program habitat distribution and/or range of two species concurrently. The application also incorporates ITIS data and provides external links to NatureServe species information. Our results indicated that running a challenge was an effective method for promoting our data products and therefore improving

  16. Data Science Careers: A Sampling of Successful Strategies, Pitfalls, and Persistent Challenges

    Science.gov (United States)

    Stocks, K. I.; Duerr, R.; Wyborn, L. A.; Yarmey, L.

    2015-12-01

    Data Scientists do not have a single career trajectory or preparatory pathway. Successful data scientists have come from domain sciences, computer science, library science, and other diverse fields. They have worked up from entry-level staff positions, have started as academics with doctoral degrees, and have established themselves as management professionals. They hold positions in government, industry, academia, and NGOs, and their responsibilities range from highly specialized roles, to generalists, to high-level leadership. This presents a potentially confusing landscape for students interested in the field: how should they decide among the varied options to have the best chance at fulfilling employment? What are the mistakes to avoid? Many established data scientists, both old-timers and early-career professionals, expressed interest in presenting in this session but were unable to justify using their one AGU abstract for something other than their funded projects. As the session chairs, we interviewed them, plus our extended network of colleagues, to ask for their best advice on what was most critical to their success in their current position, what pitfalls to avoid, what ongoing challenges they see, and what advice they would give themselves if they could do it all over again starting now. Here we consolidate those interviews with our own perspectives to present some of the common themes and standout advice.

  17. The 1995 Science Information Management and Data Compression Workshop

    Science.gov (United States)

    Tilton, James C. (Editor)

    1995-01-01

    This document is the proceedings from the 'Science Information Management and Data Compression Workshop,' which was held on October 26-27, 1995, at the NASA Goddard Space Flight Center, Greenbelt, Maryland. The Workshop explored promising computational approaches for handling the collection, ingestion, archival, and retrieval of large quantities of data in future Earth and space science missions. It consisted of fourteen presentations covering a range of information management and data compression approaches that are being or have been integrated into actual or prototypical Earth or space science data information systems, or that hold promise for such an application. The Workshop was organized by James C. Tilton and Robert F. Cromp of the NASA Goddard Space Flight Center.

  18. Managing the Fukushima Challenge

    Science.gov (United States)

    Suzuki, Atsuyuki

    2014-01-01

    The Fukushima Daiichi accident raises a fundamental question: Can science and technology prevent the inevitability of serious accidents, especially those with low probabilities and high consequences? This question reminds us of a longstanding challenge with the trans-sciences, originally addressed by Alvin Weinberg well before the Three Mile Island and Chernobyl accidents. This article, revisiting Weinberg's issue, aims at gaining insights from the accident with a special emphasis on the sociotechnical or human behavioral aspects lying behind the accident's causes. In particular, an innovative method for managing the challenge is explored referring to behavioral science approaches to a decision-making process on risk management; such as managing human behavioral risks with information asymmetry, seeking a rational consensus with communicative action, and pursuing procedural rationality through interactions with the outer environment. In short, this article describes the emerging need for Japan to transform its national safety management institutions so that these might be based on interactive communication with parties inside and outside Japan. PMID:24954604

  19. Data science and symbolic AI: Synergies, challenges and opportunities

    KAUST Repository

    Hoehndorf, Robert

    2017-06-02

    Symbolic approaches to artificial intelligence represent things within a domain of knowledge through physical symbols, combine symbols into symbol expressions, and manipulate symbols and symbol expressions through inference processes. While a large part of Data Science relies on statistics and applies statistical approaches to artificial intelligence, there is an increasing potential for successfully applying symbolic approaches as well. Symbolic representations and symbolic inference are close to human cognitive representations and therefore comprehensible and interpretable; they are widely used to represent data and metadata, and their specific semantic content must be taken into account for analysis of such information; and human communication largely relies on symbols, making symbolic representations a crucial part in the analysis of natural language. Here we discuss the role symbolic representations and inference can play in Data Science, highlight the research challenges from the perspective of the data scientist, and argue that symbolic methods should become a crucial component of the data scientists’ toolbox.

  20. Data science and symbolic AI: Synergies, challenges and opportunities

    KAUST Repository

    Hoehndorf, Robert; Queralt-Rosinach, Núria

    2017-01-01

    Symbolic approaches to artificial intelligence represent things within a domain of knowledge through physical symbols, combine symbols into symbol expressions, and manipulate symbols and symbol expressions through inference processes. While a large part of Data Science relies on statistics and applies statistical approaches to artificial intelligence, there is an increasing potential for successfully applying symbolic approaches as well. Symbolic representations and symbolic inference are close to human cognitive representations and therefore comprehensible and interpretable; they are widely used to represent data and metadata, and their specific semantic content must be taken into account for analysis of such information; and human communication largely relies on symbols, making symbolic representations a crucial part in the analysis of natural language. Here we discuss the role symbolic representations and inference can play in Data Science, highlight the research challenges from the perspective of the data scientist, and argue that symbolic methods should become a crucial component of the data scientists’ toolbox.

  1. NASA's EOSDIS Cumulus: Ingesting, Archiving, Managing, and Distributing Earth Science Data from the Commercial Cloud

    Science.gov (United States)

    Baynes, Katie; Ramachandran, Rahul; Pilone, Dan; Quinn, Patrick; Gilman, Jason; Schuler, Ian; Jazayeri, Alireza

    2017-01-01

    NASA's Earth Observing System Data and Information System (EOSDIS) has been working towards a vision of a cloud-based, highly flexible ingest, archive, management, and distribution system for its ever-growing and evolving data holdings. This system, Cumulus, is emerging from its prototyping stages and is poised to make a huge impact on how NASA manages and disseminates its Earth science data. This talk will outline the motivation for this work, present the achievements and hurdles of the past 18 months, and chart a course for the future expansion of Cumulus. We will explore not just the technical but also the socio-technical challenges that we face in evolving a system of this magnitude into the cloud, and how we are rising to meet those challenges through open collaboration and intentional stakeholder engagement.

  2. An Overview of the Challenges with and Proposed Solutions for the Ingest and Distribution Processes For Airborne Data Management

    Science.gov (United States)

    Northup, E. A.; Beach, A. L., III; Early, A. B.; Kusterer, J.; Quam, B.; Wang, D.; Chen, G.

    2015-12-01

    The current data management practices for NASA airborne field projects have successfully served science team data needs over the past 30 years to achieve project science objectives; however, users have discovered a number of issues with data reporting and format. The ICARTT format, a NASA standard since 2010, is currently the most popular among the airborne measurement community. Although easy for humans to use, the format standard is not sufficiently rigorous to be machine-readable, and there is no standard variable naming convention among the many airborne measurement variables. This makes data use and management tedious and resource intensive, and also creates problems in Distributed Active Archive Center (DAAC) data ingest procedures and distribution. Further, most DAACs use metadata models that concentrate on satellite data observations, making them less prepared to deal with airborne data. There also exists a substantial amount of airborne data distributed by websites designed for science team use that are less friendly to users unfamiliar with the operations of airborne field studies. A number of efforts are underway to help overcome the issues with airborne data discovery and distribution. The ICARTT Refresh Earth Science Data Systems Working Group (ESDSWG) was established to provide a platform for atmospheric science data providers, users, and data managers to collaborate on developing new criteria for the file format in an effort to enhance airborne data usability. In addition, the NASA Langley Research Center Atmospheric Science Data Center (ASDC) has developed the Toolsets for Airborne Data (TAD) to provide web-based tools and centralized access to airborne in situ measurements of atmospheric composition. This presentation will discuss the aforementioned challenges and attempted solutions in an effort to demonstrate how airborne data management can be improved to streamline data ingest and improve discoverability for a broader user community.
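    The ICARTT format discussed above is a comma-separated text format whose first header line gives the header-line count and a file format index (commonly 1001 for simple time-series files). A minimal reader for that layout might look like the sketch below; the sample content and variable names are illustrative, and real files carry many more header records.

```python
def read_icartt(text):
    """Split an ICARTT-style text into header lines and numeric data rows.

    Assumes the common FFI 1001 layout: line 1 holds the number of
    header lines and the file format index, separated by a comma.
    """
    lines = text.splitlines()
    nlhead, ffi = (int(x) for x in lines[0].split(",")[:2])
    header = lines[:nlhead]
    data = [
        [float(v) for v in row.split(",")]
        for row in lines[nlhead:] if row.strip()
    ]
    return header, data

# Tiny illustrative sample; real headers include PI, mission,
# units, and flag conventions over many more lines.
sample = "\n".join([
    "3, 1001",
    "PI_NAME",
    "Time_Start, O3_ppbv",
    "0.0, 31.2",
    "30.0, 32.8",
])
header, data = read_icartt(sample)
```

    The weakness the abstract points to is visible even here: nothing in the format itself tells a machine what "O3_ppbv" means, so each archive must maintain its own variable-name mappings.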

  3. Ocean Science Video Challenge Aims to Improve Science Communication

    Science.gov (United States)

    Showstack, Randy

    2013-10-01

    Given today's enormous management and protection challenges related to the world's oceans, a new competition calls on ocean scientists to effectively communicate their research in videos that last up to 3 minutes. The Ocean 180 Video Challenge, named for the number of seconds in 3 minutes, aims to improve ocean science communication while providing high school and middle school teachers and students with new and interesting educational materials about current science topics.

  4. Persistent Identifiers in Earth science data management environments

    Science.gov (United States)

    Weigel, Tobias; Stockhause, Martina; Lautenschlager, Michael

    2014-05-01

    Globally resolvable Persistent Identifiers (PIDs) that carry additional context information (which can be any form of metadata) are increasingly used by data management infrastructures for fundamental tasks. The notion of a Persistent Identifier is originally an abstract concept that aims to provide identifiers that are quality-controlled and maintained beyond the lifetime of the original issuer, for example through the use of redirection mechanisms. Popular implementations of the PID concept include the Handle System and the DOI System based on it. These systems also move beyond the simple identification concept by providing facilities that can hold additional context information. In the Earth sciences and elsewhere, data managers are increasingly attracted to PIDs because of the opportunities these facilities provide; however, long-term viable principles and mechanisms for the efficient organization of PIDs and context information are not yet available or well established. In this respect, promising techniques are to type the information that is associated with PIDs and to construct actionable collections of PIDs. There are two main drivers for extended PID usage: Earth science data management middleware use cases and applications geared towards scientific end-users. Motivating scenarios from data management include hierarchical data and metadata management, consistent data tracking and improvements in the accountability of processes. If PIDs are consistently assigned to data objects, context information can be carried over to subsequent data life cycle stages much more easily. This can also ease data migration from one major curation domain to another, e.g. from early dissemination within research communities to formal publication and long-term archival stages, and it can help to document processes across technical and organizational boundaries. 
For scientific end users, application scenarios include, for example, more personalized data citation and improvements in the
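    The two techniques highlighted above, typing PID context information and building actionable PID collections, can be sketched with a toy in-memory registry. The identifier strings, field names, and API below are invented for illustration and are not the Handle System or DOI interface:

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class PIDRecord:
        """A persistent identifier plus typed key/value context information."""
        pid: str
        typed_values: dict = field(default_factory=dict)

    class Registry:
        """Minimal stand-in for a PID resolution service."""
        def __init__(self):
            self._records = {}

        def register(self, pid, **typed_values):
            self._records[pid] = PIDRecord(pid, typed_values)
            return pid

        def resolve(self, pid):
            return self._records[pid]

    reg = Registry()
    # A dataset PID carrying typed context (location, fixity) ...
    reg.register("hdl:21.T100/obs-2014-001",
                 url="https://example.org/ds1", checksum="abc123")
    # ... and a collection PID whose typed "members" entry makes the
    # collection itself actionable (it can be resolved and traversed).
    reg.register("hdl:21.T100/coll-climate",
                 members=["hdl:21.T100/obs-2014-001"])

    members = reg.resolve("hdl:21.T100/coll-climate").typed_values["members"]
    ```

    Because the member entries are themselves PIDs, a client can follow the collection across curation domains without caring where the underlying bytes currently live, which is the migration benefit described above.
    
    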

  5. Project management of life-science research projects: project characteristics, challenges and training needs.

    Science.gov (United States)

    Beukers, Margot W

    2011-02-01

    Thirty-four project managers of life-science research projects were interviewed to investigate the characteristics of their projects, the challenges they faced and their training requirements. A set of ten discriminating parameters was identified based on four project categories: contract research, development, discovery and call-based projects--projects set up to address research questions defined in a call for proposals. The major challenges these project managers face relate to project members, leadership without authority and a lack of commitment from the respective organization. Two-thirds of the project managers indicated that they would be interested in receiving additional training, mostly in people-oriented, soft skills. The training programs currently on offer, however, do not meet their needs. Copyright © 2010 Elsevier Ltd. All rights reserved.

  6. The Value of Metrics for Science Data Center Management

    Science.gov (United States)

    Moses, J.; Behnke, J.; Watts, T. H.; Lu, Y.

    2005-12-01

    The Earth Observing System Data and Information System (EOSDIS) has been collecting and analyzing records of science data archiving, processing and product distribution for more than 10 years. The types of information collected and the analyses performed have matured and progressed to become an integral and necessary part of the system management and planning functions. Science data center managers are realizing the importance that metrics can play in influencing and validating their business model. New efforts focus on better understanding of users and their methods. Examples include tracking user web site interactions and conducting user surveys such as the government-authorized American Customer Satisfaction Index survey. This paper discusses the metrics methodology, processes and applications that are growing in EOSDIS, the driving requirements and compelling events, and the future envisioned for metrics as an integral part of earth science data systems.

  7. Spatially explicit data: stewardship and ethical challenges in science.

    Science.gov (United States)

    Hartter, Joel; Ryan, Sadie J; Mackenzie, Catrina A; Parker, John N; Strasser, Carly A

    2013-09-01

    Scholarly communication is at an unprecedented turning point created in part by the increasing saliency of data stewardship and data sharing. Formal data management plans represent a new emphasis in research, enabling access to data at higher volumes and more quickly, and the potential for replication and augmentation of existing research. Data sharing has recently transformed the practice, scope, content, and applicability of research in several disciplines, in particular in relation to spatially specific data. This offers exciting potential, but the most effective ways to implement such changes, particularly for disciplines involving human subjects and other sensitive information, demand consideration. Data management plans, stewardship, and sharing impart distinctive technical, sociological, and ethical challenges that remain to be adequately identified and remedied. Here, we consider these challenges and propose potential solutions for their amelioration.

  8. Biomarkers as Common Data Elements for Symptom and Self-Management Science.

    Science.gov (United States)

    Page, Gayle G; Corwin, Elizabeth J; Dorsey, Susan G; Redeker, Nancy S; McCloskey, Donna Jo; Austin, Joan K; Guthrie, Barbara J; Moore, Shirley M; Barton, Debra; Kim, Miyong T; Docherty, Sharron L; Waldrop-Valverde, Drenna; Bailey, Donald E; Schiffman, Rachel F; Starkweather, Angela; Ward, Teresa M; Bakken, Suzanne; Hickey, Kathleen T; Renn, Cynthia L; Grady, Patricia

    2018-05-01

    Biomarkers as common data elements (CDEs) are important for the characterization of biobehavioral symptoms given that once a biologic moderator or mediator is identified, biologically based strategies can be investigated for treatment efforts. Just as a symptom inventory reflects a symptom experience, a biomarker is an indicator of the symptom, though not the symptom per se. The purposes of this position paper are to (a) identify a "minimum set" of biomarkers for consideration as CDEs in symptom and self-management science, specifically biochemical biomarkers; (b) evaluate the benefits and limitations of such a limited array of biomarkers with implications for symptom science; (c) propose a strategy for the collection of the endorsed minimum set of biologic samples to be employed as CDEs for symptom science; and (d) conceptualize this minimum set of biomarkers consistent with National Institute of Nursing Research (NINR) symptoms of fatigue, depression, cognition, pain, and sleep disturbance. From May 2016 through January 2017, a working group consisting of a subset of the Directors of the NINR Centers of Excellence funded by P20 or P30 mechanisms and NINR staff met bimonthly via telephone to develop this position paper suggesting the addition of biomarkers as CDEs. The full group of Directors reviewed drafts, provided critiques and suggestions, recommended the minimum set of biomarkers, and approved the completed document. Best practices for selecting, identifying, and using biological CDEs as well as challenges to the use of biological CDEs for symptom and self-management science are described. Current platforms for sample outcome sharing are presented. Finally, biological CDEs for symptom and self-management science are proposed along with implications for future research and use of CDEs in these areas. The recommended minimum set of biomarker CDEs include pro- and anti-inflammatory cytokines, a hypothalamic-pituitary-adrenal axis marker, cortisol, the

  9. Toward a Big Data Science: A challenge of "Science Cloud"

    Science.gov (United States)

    Murata, Ken T.; Watanabe, Hidenobu

    2013-04-01

    Over the past 50 years, along with the appearance and development of high-performance computers (and supercomputers), numerical simulation has come to be considered a third methodology for science, following the theoretical (first) and experimental/observational (second) approaches. The variety of data yielded by the second approach has kept growing, owing to progress in experimental and observational technologies, and the amount of data generated by the third methodology has kept growing as well, owing to the tremendous development of supercomputers and programming techniques. Most of the data files created by both experiments/observations and numerical simulations are saved in digital formats and analyzed on computers. Researchers (domain experts) are interested not only in how to perform experiments, observations, or numerical simulations, but in what information (new findings) can be extracted from the data. However, the data do not usually tell us anything about the science directly; the science is implicitly hidden in the data, and researchers have to extract information from the data files to find it. This is the basic concept of data-intensive (data-oriented) science for Big Data. As the scales of experiments, observations, and numerical simulations grow, new techniques and facilities are required to extract information from large numbers of data files. This technique, informatics, serves as a fourth methodology for new sciences. Every methodology must run on its own facility: in space science, for example, the space environment is observed via spacecraft and numerical simulations are performed on supercomputers. The facility for informatics, which deals with large-scale data, is a computational cloud system for science. This paper proposes a cloud system for informatics, which has been developed at NICT (National Institute of Information and Communications Technology), Japan. 
The NICT science

  10. Transboundary fisheries science: Meeting the challenges of inland fisheries management in the 21st century

    Science.gov (United States)

    Midway, Stephen R.; Wagner, Tyler; Zydlewski, Joseph D.; Irwin, Brian J.; Paukert, Craig P.

    2016-01-01

    Managing inland fisheries in the 21st century presents several obstacles, including the need to view fisheries from multiple spatial and temporal scales, which usually involves populations and resources spanning sociopolitical boundaries. Though collaboration is not new to fisheries science, inland aquatic systems have historically been managed at local scales and present different challenges than in marine or large freshwater systems like the Laurentian Great Lakes. Therefore, we outline a flexible strategy that highlights organization, cooperation, analytics, and implementation as building blocks toward effectively addressing transboundary fisheries issues. Additionally, we discuss the use of Bayesian hierarchical models (within the analytical stage), due to their flexibility in dealing with the variability present in data from multiple scales. With growing recognition of both ecological drivers that span spatial and temporal scales and the subsequent need for collaboration to effectively manage heterogeneous resources, we expect implementation of transboundary approaches to become increasingly critical for effective inland fisheries management.
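    The partial pooling that makes Bayesian hierarchical models attractive for multi-jurisdiction data can be illustrated with a toy normal-normal shrinkage calculation in plain Python. This is a simplified empirical-Bayes stand-in for a full hierarchical model, and the catch rates, effort counts, and variance values are invented:

    ```python
    from statistics import mean

    def partial_pool(group_means, group_n, sigma2_within, tau2_between):
        """Shrink each group's mean toward the grand mean; groups with fewer
        observations (larger sampling variance) are shrunk more strongly."""
        grand = mean(group_means)
        pooled = []
        for m, n in zip(group_means, group_n):
            v = sigma2_within / n                  # sampling variance of the group mean
            w = tau2_between / (tau2_between + v)  # weight on the group's own data
            pooled.append(w * m + (1 - w) * grand)
        return pooled

    # Hypothetical mean catch rates from three jurisdictions with unequal
    # sampling effort; the middle jurisdiction has far fewer observations.
    rates = [4.0, 9.0, 8.0]
    effort = [50, 5, 50]
    shrunk = partial_pool(rates, effort, sigma2_within=4.0, tau2_between=1.0)
    ```

    The sparsely sampled jurisdiction borrows the most strength from its neighbors, which is exactly the behavior that makes hierarchical approaches useful when data quality varies across sociopolitical boundaries.
    
    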

  11. Big data : opportunities and challenges in asset management : final report.

    Science.gov (United States)

    2016-08-01

    State Departments of Transportation and other transportation agencies collect vast quantities of data but managing, accessing and sharing data has been problematic and well documented. This project reviewed the similar challenges faced by other indus...

  12. Modeling and Analysis in Marine Big Data: Advances and Challenges

    Directory of Open Access Journals (Sweden)

    Dongmei Huang

    2015-01-01

    It is widely recognized that big data has gathered tremendous attention from academic research institutes, governments, and enterprises in all aspects of information sciences. With the development of diverse marine data acquisition techniques, marine data have grown exponentially in the last decade, forming marine big data. As an innovation, marine big data is a double-edged sword. On the one hand, many potential and highly useful values are hidden in the huge volume of marine data, which is widely used in marine-related fields, such as tsunami and red-tide warning, prevention, and forecasting, disaster inversion, and visualization modeling after disasters. There is no doubt that future competitions in marine sciences and technologies will converge on marine data exploration. On the other hand, marine big data also brings many new challenges in data management, such as difficulties in data capture, storage, analysis, and application, as well as data quality control and data security. To highlight theoretical methodologies and practical applications of marine big data, this paper presents a broad view of marine big data and its management, surveys key methods and models, introduces an engineering instance that demonstrates the management architecture, and discusses the existing challenges.

  13. Modeling and Management of Big Data: Challenges and opportunities

    OpenAIRE

    Gil, David; Song, Il-Yeol

    2016-01-01

    The term Big Data denotes huge-volume, complex, rapidly growing datasets with numerous, autonomous and independent sources. In these new circumstances Big Data brings many attractive opportunities; however, good opportunities are always accompanied by challenges, such as modelling, new paradigms and novel architectures that require original approaches to address data complexities. The purpose of this special issue on Modeling and Management of Big Data is to discuss research and experience in modellin...

  14. Citizen Science: Data Sharing For, By, and With the Public

    Science.gov (United States)

    Wiggins, A.

    2017-12-01

    Data sharing in citizen science is just as challenging as it is for any other type of science, except that there are more parties involved, with more diverse needs and interests. This talk provides an overview of the challenges and current efforts to advance data sharing in citizen science, and suggests refocusing data management activities on supporting the needs of multiple audiences. Early work on data sharing in citizen science advocated applying the standards and practices of academia, which can address the needs of only one of several audiences for citizen science data, and academics are not always the primary audience. Practitioners still need guidance on how to better share data with other key parties, such as participants and policymakers, and on which data management practices to prioritize for addressing the needs of multiple audiences. The benefits to a project of investing scarce resources in data products and dissemination strategies for each target audience remain variable, unclear, or unpredictable. And as projects mature and change, the importance of data sharing activities and audiences is likely to change as well. This combination of multiple diverse audiences, shifting priorities, limited resources, and unclear benefits creates a perfect storm of conditions to suppress data sharing. Nonetheless, many citizen science projects make the effort, with exemplars showing substantial returns on data stewardship investments, and international initiatives are underway to bolster the data sharing capacity of the field. To improve the state of data sharing in citizen science, strategic use of limited resources suggests prioritizing data management activities that support the needs of multiple audiences. These may include better transparency about data access and usage, and standardized reporting of broader impacts from secondary data users, to both reward projects and incentivize further data sharing.

  15. Scientific data management in the environmental molecular sciences laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Bernard, P.R.; Keller, T.L.

    1995-09-01

    The Environmental Molecular Sciences Laboratory (EMSL) is currently under construction at Pacific Northwest Laboratory (PNL) for the U.S. Department of Energy (DOE). This laboratory will be used for molecular and environmental sciences research to identify comprehensive solutions to DOE's environmental problems. Major facilities within the EMSL include the Molecular Sciences Computing Facility (MSCF), a laser-surface dynamics laboratory, a high-field nuclear magnetic resonance (NMR) laboratory, and a mass spectrometry laboratory. The EMSL is scheduled to open early in 1997 and will house about 260 resident and visiting scientists. It is anticipated that at least six (6) terabytes of data will be archived in the first year of operation. An object-oriented database management system (OODBMS) and a mass storage system will be integrated to provide an intelligent, automated mechanism to manage data. The resulting system, called the DataBase Computer System (DBCS), will provide total scientific data management capabilities to EMSL users. A prototype mass storage system based on the National Storage Laboratory's (NSL) UniTree has been procured and is in limited use. This system consists of two independent hierarchies of storage devices. One hierarchy of lower capacity, slower speed devices provides support for smaller files transferred over the Fiber Distributed Data Interface (FDDI) network. Also part of the system is a second hierarchy of higher capacity, higher speed devices that will be used to support high performance clients (e.g., a large scale parallel processor). The ObjectStore OODBMS will be used to manage metadata for archived datasets, maintain relationships between archived datasets, and hold small, duplicate subsets of archived datasets (i.e., derivative data). The interim system is called DBCS, Phase 0 (DBCS-0). The production system for the EMSL, DBCS Phase 1 (DBCS-1), will be procured and installed in the summer of 1996.
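    The two-hierarchy routing idea described above can be sketched as a toy tier-selection rule; the size threshold and tier labels below are invented for illustration and are not the actual DBCS policy:

    ```python
    def choose_tier(file_size_bytes, high_performance_client=False):
        """Route a storage request to one of two hierarchies, echoing the
        EMSL design: small files go to the slower FDDI-served hierarchy,
        while large files and high-performance clients (e.g., a parallel
        processor) go to the high-capacity, high-speed hierarchy."""
        if high_performance_client or file_size_bytes > 1 << 30:  # invented 1 GiB cutoff
            return "high-speed hierarchy"
        return "FDDI small-file hierarchy"
    ```

    In a real hierarchical storage manager the routing would also consider access frequency and migration state, but the core decision is this kind of policy function.
    
    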

  16. Managing science developing your research, leadership and management skills

    CERN Document Server

    Peach, Ken

    2017-01-01

    Managing science, which includes managing scientific research and, implicitly, managing scientists, has much in common with managing any enterprise, and most of these issues (e.g. annual budget planning and reporting) form the background. Equally, much scientific research is carried in universities ancient and modern, which have their own mores, ranging from professorial autocracy to democratic plurality, as well as national and international with their missions and styles. But science has issues that require a somewhat different approach if it is to prosper and succeed. Society now expects science, whether publicly or privately funded, to deliver benefits, yet the definition of science presumes no such benefit. Managing the expectations of the scientist with those of society is the challenge of the manager of science. The book addresses some issues around science and the organizations that do science. It then deals with leadership, management and communication, team building, recruitment, motivation, managin...

  17. The AGU Data Management Maturity Model Initiative

    Science.gov (United States)

    Bates, J. J.

    2015-12-01

    In September 2014, the AGU Board of Directors approved two initiatives to help the Earth and space sciences community address the growing challenges accompanying the increasing size and complexity of data. These initiatives are: 1) Data Science Credentialing: development of a continuing education and professional certification program to help scientists in their careers and to meet growing responsibilities and requirements around data science; and 2) Data Management Maturity (DMM) Model: development and implementation of a data management maturity model to assess process maturity against best practices, and to identify opportunities in organizational data management processes. Each of these has been organized within AGU as an Editorial Board and both Boards have held kick-off meetings. The DMM model Editorial Board will recommend strategies for adapting and deploying a DMM model to the Earth and space sciences, create guidance documents to assist in its implementation, and provide input on a pilot appraisal process. This presentation will provide an overview of progress to date in the DMM model Editorial Board and plans for work to be done over the upcoming year.

  18. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Science.gov (United States)

    Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

    2016-10-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing across over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at the integration of the PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with lightweight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments and has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the
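    The wrapper pattern described above, running many independent single-threaded payloads side by side to fill a multi-core worker node, can be sketched in plain Python with multiprocessing rather than MPI; the payload function and job counts are invented stand-ins, not PanDA code:

    ```python
    from concurrent.futures import ProcessPoolExecutor
    import os

    def payload(event_id):
        """Stand-in for one single-threaded simulation job (invented here)."""
        return (event_id, sum(i * i for i in range(1000)))

    def run_node(event_ids, cores=None):
        """Fan independent single-threaded payloads out across the cores of a
        worker node, one process per core, mirroring the wrapper approach."""
        with ProcessPoolExecutor(max_workers=cores or os.cpu_count()) as pool:
            return dict(pool.map(payload, event_ids))

    if __name__ == "__main__":
        results = run_node(range(8), cores=4)
    ```

    The real wrappers submit through a batch system and coordinate via MPI ranks rather than a local process pool, but the scheduling idea is the same: the node stays fully utilized even though each payload is single-threaded.
    
    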

  19. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    International Nuclear Information System (INIS)

    Klimentov, A; Maeno, T; Nilsson, P; Panitkin, S; Wenaus, T; De, K; Oleynik, D; Jha, S; Wells, J

    2016-01-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing across over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at the integration of the PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with lightweight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments and has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the

  20. Nurse managers' challenges in project management.

    Science.gov (United States)

    Suhonen, Marjo; Paasivaara, Leena

    2011-11-01

    To analyse the challenges that nurse managers meet in project management. Project management done by nurse managers has a significant role in the success of projects conducted in work units. The data were collected by open interviews (n = 14). The participants were nurse managers, nurses and public health nurses. Data analysis was carried out using qualitative content analysis. The three main challenges nurse managers faced in project management in health-care work units were: (1) apathetic organization and management, (2) paralysed work community and (3) cooperation between individuals being discouraged. Nurse managers' challenges in project management can be viewed from the perspective of the following paradoxes: (1) keeping up projects-ensuring patient care, (2) enthusiastic management-effective management of daily work and (3) supporting the work of a multiprofessional team-leadership of individual employees. It is important for nurse managers to learn to relate these paradoxes to one another in a positive way. Further research is needed, focusing on nurse managers' ability to promote workplace spirituality, nurse managers' emotional intelligence and their enthusiasm in small projects. © 2011 Blackwell Publishing Ltd.

  1. Archive & Data Management Activities for ISRO Science Archives

    Science.gov (United States)

    Thakkar, Navita; Moorthi, Manthira; Gopala Krishna, Barla; Prashar, Ajay; Srinivasan, T. P.

    2012-07-01

    ISRO has taken a step ahead by extending remote sensing missions to planetary and astronomical exploration. It started with Chandrayaan-1, which successfully completed moon imaging during its lifetime in orbit. Now ISRO is planning to launch Chandrayaan-2 (the next moon mission), a Mars mission, and the astronomical mission ASTROSAT. All these missions are characterized by the need to receive, process, archive, and disseminate the acquired science data to the user community for analysis and scientific use. These science missions will last from a few months to a few years, but the data received must remain archived, interoperable, and seamlessly accessible to the user community into the future. ISRO has laid out definite plans to archive these data sets in specified standards and to develop relevant access tools to serve the user community. To achieve this goal, a data center called the Indian Space Science Data Center (ISSDC) has been set up at Bangalore as the custodian of all data sets of the current and future science missions of ISRO. Chandrayaan-1 is the first among the planetary missions launched/to be launched by ISRO, and we took on the challenge of developing a system for the archival and dissemination of the payload data received. For Chandrayaan-1, the data collected from all the instruments are processed and archived in the archive layer in Planetary Data System (PDS 3.0) standards through an automated pipeline. But a dataset, once stored, is of no use unless it is made public, which requires a Web-based dissemination system accessible to all the planetary scientists/data users working in this field. Towards this, a Web-based browse and dissemination system has been developed, wherein users can register, search for their area of interest, and view the data archived for TMC & HYSI with relevant browse chips and metadata. 
Users can also order the data and get it on their desktop in the PDS
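    A PDS3 label is a plain-text series of keyword = value statements terminated by an END statement; a minimal sketch of machine-reading such a label might look like the following (the label content below is invented for illustration, and OBJECT nesting is ignored):

    ```python
    def parse_pds3_label(text):
        """Parse flat PDS3-style 'KEYWORD = value' lines into a dict,
        stopping at the END statement."""
        label = {}
        for line in text.splitlines():
            line = line.strip()
            if line == "END":
                break
            if "=" in line:
                key, _, value = line.partition("=")
                label[key.strip()] = value.strip().strip('"')
        return label

    # Invented example label in the spirit of a Chandrayaan-1 TMC product.
    sample = '''PDS_VERSION_ID       = PDS3
    INSTRUMENT_NAME      = "TERRAIN MAPPING CAMERA"
    TARGET_NAME          = MOON
    END
    '''
    label = parse_pds3_label(sample)
    ```

    A production archive pipeline would validate labels against the full PDS3 grammar (OBJECT blocks, units, pointer statements), but this illustrates why the keyword = value convention keeps archived products machine-readable decades later.
    
    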

  2. Strategic management cultures: historical connections with science

    OpenAIRE

    Abreu Pederzini, G.

    2016-01-01

    Purpose: The implicit and indirect influence of classical science on strategic management has been of utmost importance in the development of the discipline. Classical science has underpinned the main and even contrasting strategic management cultures. Classical science has undoubtedly allowed strategic management to thrive. Nevertheless, important limitations, roadblocks and challenges have also been produced. This paper aims to explore the influence of classical science on the main positivi...

  3. USGS Science Data Catalog - Open Data Advances or Declines

    Science.gov (United States)

    Frame, M. T.; Hutchison, V.; Zolly, L.; Wheeler, B.; Latysh, N.; Devarakonda, R.; Palanisamy, G.; Shrestha, B.

    2014-12-01

    The recent Office of Science and Technology Policy (OSTP) White House Open Data Policies (2013) have required Federal agencies to establish formal catalogues of their science data holdings and make these data easily available on Web sites, portals, and applications. As an organization, the USGS has historically excelled at making its data holdings freely available on its various Web sites (i.e., National, Scientific Programs, or local Science Center). In response to these requirements, the USGS Core Science Analytics, Synthesis, and Libraries program, in collaboration with DOE's Oak Ridge National Laboratory (ORNL) Mercury Consortium (funded by NASA, USGS, and DOE), and a number of other USGS organizations, established the Science Data Catalog (http://data.usgs.gov) cyberinfrastructure, content management processes/tools, and supporting policies. The USGS Science Data Catalog led the charge at USGS to improve the robustness of existing and future metadata collections; streamline and develop sustainable publishing to external aggregators (i.e., data.gov); and provide leadership to the U.S. Department of the Interior in emerging Open Data policies, techniques, and systems. The session will discuss the current successes, challenges, and movement toward meeting these Open Data policies for USGS scientific data holdings. A retrospective look at the last year of implementation of these efforts within USGS will be taken to determine whether these Open Data policies are improving data access or limiting data availability. To learn more about the USGS Science Data Catalog, visit us at http://data.usgs.gov/info/about.html

  4. Democratizing data science through data science training.

    Science.gov (United States)

    Van Horn, John Darrell; Fierro, Lily; Kamdar, Jeana; Gordon, Jonathan; Stewart, Crystal; Bhattrai, Avnish; Abe, Sumiko; Lei, Xiaoxiao; O'Driscoll, Caroline; Sinha, Aakanchha; Jain, Priyambada; Burns, Gully; Lerman, Kristina; Ambite, José Luis

    2018-01-01

    The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific lectures and tutorials. ERuDIte now indexes over 9,500 resources. Given the richness of online training materials and the constant evolution of biomedical data science, computational methods applying information retrieval, natural language processing, and machine learning techniques are required - in effect, using data science to inform training in data science. In so doing, the TCC seeks to democratize novel insights and discoveries brought forth via large-scale data science training.

  5. Towards Data Science

    OpenAIRE

    Zhu, Yangyong; Xiong, Yun

    2015-01-01

    Currently, a huge amount of data is being rapidly generated in cyberspace. Datanature (all data in cyberspace) is forming due to a data explosion. Exploring the patterns and rules in datanature is necessary but difficult. A new discipline called Data Science is emerging. It provides a type of novel research method (a data-intensive method) for natural and social sciences and goes beyond computer science in researching data. This paper presents the challenges posed by data and discusses what...

  6. Provenance Challenges for Earth Science Dataset Publication

    Science.gov (United States)

    Tilmes, Curt

    2011-01-01

    Modern science is increasingly dependent on computational analysis of very large data sets. Organizing, referencing, publishing those data has become a complex problem. Published research that depends on such data often fails to cite the data in sufficient detail to allow an independent scientist to reproduce the original experiments and analyses. This paper explores some of the challenges related to data identification, equivalence and reproducibility in the domain of data intensive scientific processing. It will use the example of Earth Science satellite data, but the challenges also apply to other domains.

  7. Dealing with Data: Science Librarians' Participation in Data Management at Association of Research Libraries Institutions

    Science.gov (United States)

    Antell, Karen; Foote, Jody Bales; Turner, Jaymie; Shults, Brian

    2014-01-01

    As long as empirical research has existed, researchers have been doing "data management" in one form or another. However, funding agency mandates for doing formal data management are relatively recent, and academic libraries' involvement has been concentrated mainly in the last few years. The National Science Foundation implemented a new…

  8. Data Access, Interoperability and Sustainability: Key Challenges for the Evolution of Science Capabilities

    Science.gov (United States)

    Walton, A. L.

    2015-12-01

    In 2016, the National Science Foundation (NSF) will support a portfolio of activities and investments focused upon challenges in data access, interoperability, and sustainability. These topics are fundamental to science questions of increasing complexity that require multidisciplinary approaches and expertise. Progress has become tractable because of (and sometimes complicated by) unprecedented growth in data (both simulations and observations) and rapid advances in technology (such as instrumentation in all aspects of the discovery process, together with ubiquitous cyberinfrastructure to connect, compute, visualize, store, and discover). The goal is an evolution of capabilities for the research community based on these investments, scientific priorities, technology advances, and policies. Examples from multiple NSF directorates, including investments by the Advanced Cyberinfrastructure Division, are aimed at these challenges and can provide the geosciences research community with models and opportunities for participation. Implications for the future are highlighted, along with the importance of continued community engagement on key issues.

  9. Web-scale data management for the cloud

    CERN Document Server

    Lehner, Wolfgang

    2013-01-01

    The efficient management of a consistent and integrated database is a central task in modern IT and highly relevant for science and industry. Hardly any critical enterprise solution comes without any functionality for managing data in its different forms. Web-Scale Data Management for the Cloud addresses fundamental challenges posed by the need and desire to provide database functionality in the context of the Database as a Service (DBaaS) paradigm for database outsourcing. This book also discusses the motivation of the new paradigm of cloud computing, and its impact to data outsourcing and se

  10. Opportunities and Challenges for the Life Sciences Community

    Science.gov (United States)

    Stewart, Elizabeth; Ozdemir, Vural

    2012-01-01

    Abstract Twenty-first century life sciences have transformed into data-enabled (also called data-intensive, data-driven, or big data) sciences. They principally depend on data-, computation-, and instrumentation-intensive approaches to seek comprehensive understanding of complex biological processes and systems (e.g., ecosystems, complex diseases, environmental, and health challenges). Federal agencies including the National Science Foundation (NSF) have played and continue to play an exceptional leadership role by innovatively addressing the challenges of data-enabled life sciences. Yet even more is required not only to keep up with the current developments, but also to pro-actively enable future research needs. Straightforward access to data, computing, and analysis resources will enable true democratization of research competitions; thus investigators will compete based on the merits and broader impact of their ideas and approaches rather than on the scale of their institutional resources. This is the Final Report for Data-Intensive Science Workshops DISW1 and DISW2. The first NSF-funded Data Intensive Science Workshop (DISW1, Seattle, WA, September 19–20, 2010) overviewed the status of the data-enabled life sciences and identified their challenges and opportunities. This served as a baseline for the second NSF-funded DIS workshop (DISW2, Washington, DC, May 16–17, 2011). Based on the findings of DISW2 the following overarching recommendation to the NSF was proposed: establish a community alliance to be the voice and framework of the data-enabled life sciences. After this Final Report was finished, Data-Enabled Life Sciences Alliance (DELSA, www.delsall.org) was formed to become a Digital Commons for the life sciences community. PMID:22401659

  11. Big data management challenges in health research-a literature review.

    Science.gov (United States)

    Wang, Xiaoming; Williams, Carolyn; Liu, Zhen Hua; Croghan, Joe

    2017-08-07

    Big data management for information centralization (i.e. making data of interest findable) and integration (i.e. making related data connectable) in health research is a defining challenge in biomedical informatics. While essential to create a foundation for knowledge discovery, optimized solutions to deliver high-quality and easy-to-use information resources are not thoroughly explored. In this review, we identify the gaps between current data management approaches and the need for new capacity to manage big data generated in advanced health research. Focusing on these unmet needs and well-recognized problems, we introduce state-of-the-art concepts, approaches and technologies for data management from computing academia and industry to explore improvement solutions. We explain the potential and significance of these advances for biomedical informatics. In addition, we discuss specific issues that have a great impact on technical solutions for developing the next generation of digital products (tools and data) to facilitate the raw-data-to-knowledge process in health research. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  12. Making USGS Science Data more Open, Accessible, and Usable: Leveraging ScienceBase for Success

    Science.gov (United States)

    Chang, M.; Ignizio, D.; Langseth, M. L.; Norkin, T.

    2016-12-01

    In 2013, the White House released initiatives requiring federally funded research to be made publicly available and machine readable. In response, the U.S. Geological Survey (USGS) has been developing a unified approach to make USGS data available and open. This effort has involved the establishment of internal policies and the release of a Public Access Plan, which outlines a strategy for the USGS to move forward into the modern era in scientific data management. Originally designed as a catalog and collaborative data management platform, ScienceBase (www.sciencebase.gov) is being leveraged to serve as a robust data hosting solution for USGS researchers to make scientific data accessible. With the goal of maintaining persistent access to formal data products and developing a management approach to facilitate stable data citation, the ScienceBase Data Release Team was established to ensure the quality, consistency, and meaningful organization of USGS data through standardized workflows and best practices. These practices include the creation and maintenance of persistent identifiers for data, improving the use of open data formats, establishing permissions for read/write access, validating the quality of standards compliant metadata, verifying that data have been reviewed and approved prior to release, and connecting to external search catalogs such as the USGS Science Data Catalog (data.usgs.gov) and data.gov. The ScienceBase team is actively building features to support this effort by automating steps to streamline the process, building metrics to track site visits and downloads, and connecting published digital resources in line with USGS and Federal policy. By utilizing ScienceBase to achieve stewardship quality and employing a dedicated team to help USGS scientists improve the quality of their data, the USGS is helping to meet today's data quality management challenges and ensure that reliable USGS data are available to and reusable for the public.
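
    One concrete piece of the workflow described above is "validating the quality of standards compliant metadata" before a data release. A minimal sketch of such a completeness check follows; the required field names here are hypothetical and do not reflect the actual ScienceBase or USGS metadata schema.

```python
# Required fields a release workflow might enforce (hypothetical list,
# not the real USGS/ScienceBase schema).
REQUIRED_FIELDS = ["title", "abstract", "doi", "contact", "license"]

def validate_record(record):
    """Return the required fields that are missing or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

# Example record with an unminted identifier and no license statement.
record = {
    "title": "Streamflow observations, 2015-2016",
    "abstract": "Daily discharge measurements ...",
    "doi": "",  # persistent identifier not yet minted
    "contact": "data-steward@example.gov",
}

missing = validate_record(record)
print(missing)  # fields that would block release
```

    A real workflow would layer checks like this with schema validation against a formal metadata standard and verification that review and approval steps were completed.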

  13. Data management and its role in delivering science at DOE BES user facilities - Past, Present, and Future

    Science.gov (United States)

    Miller, Stephen D.; Herwig, Kenneth W.; Ren, Shelly; Vazhkudai, Sudharshan S.; Jemian, Pete R.; Luitz, Steffen; Salnikov, Andrei A.; Gaponenko, Igor; Proffen, Thomas; Lewis, Paul; Green, Mark L.

    2009-07-01

    The primary mission of user facilities operated by Basic Energy Sciences under the Department of Energy is to produce data for users in support of open science and basic research [1]. We trace back almost 30 years of history across selected user facilities illustrating the evolution of facility data management practices and how these practices have related to performing scientific research. The facilities cover multiple techniques such as X-ray and neutron scattering, imaging and tomography sciences. Over time, detector and data acquisition technologies have dramatically increased the ability to produce prolific volumes of data challenging the traditional paradigm of users taking data home upon completion of their experiments to process and publish their results. During this time, computing capacity has also increased dramatically, though the size of the data has grown significantly faster than the capacity of one's laptop to manage and process this new facility produced data. Trends indicate that this will continue to be the case for yet some time. Thus users face a quandary for how to manage today's data complexity and size as these may exceed the computing resources users have available to themselves. This same quandary can also stifle collaboration and sharing. Realizing this, some facilities are already providing web portal access to data and computing thereby providing users access to resources they need [2]. Portal based computing is now driving researchers to think about how to use the data collected at multiple facilities in an integrated way to perform their research, and also how to collaborate and share data. In the future, inter-facility data management systems will enable next tier cross-instrument-cross facility scientific research fuelled by smart applications residing upon user computer resources. 
We can learn from the medical imaging community that has been working since the early 1990's to integrate data from across multiple modalities to achieve

  14. Data management and its role in delivering science at DOE BES user facilities - Past, Present, and Future

    International Nuclear Information System (INIS)

    Miller, Stephen D; Herwig, Kenneth W; Ren, Shelly; Vazhkudai, Sudharshan S; Jemian, Pete R; Luitz, Steffen; Salnikov, Andrei A; Gaponenko, Igor; Proffen, Thomas; Lewis, Paul; Green, Mark L

    2009-01-01

    The primary mission of user facilities operated by Basic Energy Sciences under the Department of Energy is to produce data for users in support of open science and basic research. We trace back almost 30 years of history across selected user facilities illustrating the evolution of facility data management practices and how these practices have related to performing scientific research. The facilities cover multiple techniques such as X-ray and neutron scattering, imaging and tomography sciences. Over time, detector and data acquisition technologies have dramatically increased the ability to produce prolific volumes of data challenging the traditional paradigm of users taking data home upon completion of their experiments to process and publish their results. During this time, computing capacity has also increased dramatically, though the size of the data has grown significantly faster than the capacity of one's laptop to manage and process this new facility produced data. Trends indicate that this will continue to be the case for yet some time. Thus users face a quandary for how to manage today's data complexity and size as these may exceed the computing resources users have available to themselves. This same quandary can also stifle collaboration and sharing. Realizing this, some facilities are already providing web portal access to data and computing thereby providing users access to resources they need. Portal based computing is now driving researchers to think about how to use the data collected at multiple facilities in an integrated way to perform their research, and also how to collaborate and share data. In the future, inter-facility data management systems will enable next tier cross-instrument-cross facility scientific research fuelled by smart applications residing upon user computer resources. 
We can learn from the medical imaging community that has been working since the early 1990's to integrate data from across multiple modalities to achieve better

  15. Data Management and its Role in Delivering Science at DOE BES User Facilities - Past, Present, and Future

    International Nuclear Information System (INIS)

    Miller, Stephen D.; Herwig, Kenneth W.; Ren, Shelly; Vazhkudai, Sudharshan S.; Jemian, Pete R.; Luitz, Steffen; Salnikov, Andrei; Gaponenko, Igor; Proffen, Thomas; Lewis, Paul; Hagen, Mark E.

    2009-01-01

    The primary mission of user facilities operated by Basic Energy Sciences under the Department of Energy is to produce data for users in support of open science and basic research. We trace back almost 30 years of history across selected user facilities illustrating the evolution of facility data management practices and how these practices have related to performing scientific research. The facilities cover multiple techniques such as X-ray and neutron scattering, imaging and tomography sciences. Over time, detector and data acquisition technologies have dramatically increased the ability to produce prolific volumes of data challenging the traditional paradigm of users taking data home upon completion of their experiments to process and publish their results. During this time, computing capacity has also increased dramatically, though the size of the data has grown significantly faster than the capacity of one's laptop to manage and process this new facility produced data. Trends indicate that this will continue to be the case for yet some time. Thus users face a quandary for how to manage today's data complexity and size as these may exceed the computing resources users have available to themselves. This same quandary can also stifle collaboration and sharing. Realizing this, some facilities are already providing web portal access to data and computing thereby providing users access to resources they need. Portal based computing is now driving researchers to think about how to use the data collected at multiple facilities in an integrated way to perform their research, and also how to collaborate and share data. In the future, inter-facility data management systems will enable next tier cross-instrument-cross facility scientific research fuelled by smart applications residing upon user computer resources. 
We can learn from the medical imaging community that has been working since the early 1990's to integrate data from across multiple modalities to achieve better

  16. Data Management and Its Role in Delivering Science at DOE BES User Facilities Past, Present, and Future

    International Nuclear Information System (INIS)

    Miller, Stephen D.; Herwig, Kenneth W.; Ren, Shelly; Vazhkudai, Sudharshan S.

    2009-01-01

    The primary mission of user facilities operated by Basic Energy Sciences under the Department of Energy is to produce data for users in support of open science and basic research (1). We trace back almost 30 years of history across selected user facilities illustrating the evolution of facility data management practices and how these practices have related to performing scientific research. The facilities cover multiple techniques such as X-ray and neutron scattering, imaging and tomography sciences. Over time, detector and data acquisition technologies have dramatically increased the ability to produce prolific volumes of data challenging the traditional paradigm of users taking data home upon completion of their experiments to process and publish their results. During this time, computing capacity has also increased dramatically, though the size of the data has grown significantly faster than the capacity of one's laptop to manage and process this new facility produced data. Trends indicate that this will continue to be the case for yet some time. Thus users face a quandary for how to manage today's data complexity and size as these may exceed the computing resources users have available to themselves. This same quandary can also stifle collaboration and sharing. Realizing this, some facilities are already providing web portal access to data and computing thereby providing users access to resources they need (2). Portal based computing is now driving researchers to think about how to use the data collected at multiple facilities in an integrated way to perform their research, and also how to collaborate and share data. In the future, inter-facility data management systems will enable next tier cross-instrument-cross facility scientific research fuelled by smart applications residing upon user computer resources. 
We can learn from the medical imaging community that has been working since the early 1990's to integrate data from across multiple modalities to achieve

  17. Citizen science can improve conservation science, natural resource management, and environmental protection

    Science.gov (United States)

    McKinley, Duncan C.; Miller-Rushing, Abe J.; Ballard, Heidi L.; Bonney, Rick; Brown, Hutch; Cook-Patton, Susan; Evans, Daniel M.; French, Rebecca A.; Parrish, Julia; Phillips, Tina B.; Ryan, Sean F.; Shanley, Lea A.; Shirk, Jennifer L.; Stepenuck, Kristine F.; Weltzin, Jake F.; Wiggins, Andrea; Boyle, Owen D.; Briggs, Russell D.; Chapin, Stuart F.; Hewitt, David A.; Preuss, Peter W.; Soukup, Michael A.

    2017-01-01

    Citizen science has advanced science for hundreds of years, contributed to many peer-reviewed articles, and informed land management decisions and policies across the United States. Over the last 10 years, citizen science has grown immensely in the United States and many other countries. Here, we show how citizen science is a powerful tool for tackling many of the challenges faced in the field of conservation biology. We describe the two interwoven paths by which citizen science can improve conservation efforts, natural resource management, and environmental protection. The first path includes building scientific knowledge, while the other path involves informing policy and encouraging public action. We explore how citizen science is currently used and describe the investments needed to create a citizen science program. We find that: Citizen science already contributes substantially to many domains of science, including conservation, natural resource, and environmental science. Citizen science informs natural resource management, environmental protection, and policymaking and fosters public input and engagement. Many types of projects can benefit from citizen science, but one must be careful to match the needs for science and public involvement with the right type of citizen science project and the right method of public participation. Citizen science is a rigorous process of scientific discovery, indistinguishable from conventional science apart from the participation of volunteers. When properly designed, carried out, and evaluated, citizen science can provide sound science, efficiently generate high-quality data, and help solve problems.

  18. Big Data and Data Science in Critical Care.

    Science.gov (United States)

    Sanchez-Pinto, L Nelson; Luo, Yuan; Churpek, Matthew M

    2018-05-09

    The digitalization of the healthcare system has resulted in a deluge of clinical Big Data and has prompted the rapid growth of data science in medicine. Data science, which is the field of study dedicated to the principled extraction of knowledge from complex data, is particularly relevant in the critical care setting. The availability of large amounts of data in the intensive care unit, the need for better evidence-based care, and the complexity of critical illness make the use of data science techniques and data-driven research particularly appealing to intensivists. Despite the increasing number of studies and publications in the field, so far there have been few examples of data science projects that have resulted in successful implementations of data-driven systems in the intensive care unit. However, given the expected growth in the field, intensivists should be familiar with the opportunities and challenges of Big Data and data science. In this paper, we review the definitions, types of algorithms, applications, challenges, and future of Big Data and data science in critical care. Copyright © 2018. Published by Elsevier Inc.

  19. Bringing Data Science, Xinformatics and Semantic eScience into the Graduate Curriculum

    Science.gov (United States)

    Fox, P.

    2012-04-01

    Recent advances in acquisition techniques quickly provide massive amounts of complex data characterized by source heterogeneity, multiple modalities, high volume, high dimensionality, and multiple scales (temporal, spatial, and functional). In turn, science and engineering disciplines are rapidly becoming more and more data driven, with goals of higher sample throughput, better understanding/modeling of complex systems and their dynamics, and ultimately engineering products for practical applications. However, analyzing libraries of complex data requires managing their complexity and integrating the information and knowledge across multiple scales over different disciplines. Attention to Data Science is now ubiquitous - The Fourth Paradigm publication, Nature and Science special issues on Data, and explicit emphasis on Data in national and international agency programs, foundations (Keck, Moore) and corporations (IBM, GE, Microsoft, etc.). Surrounding this attention is a proliferation of studies, reports, conferences and workshops on Data, Data Science and workforce. Examples include: "Train a new generation of data scientists, and broaden public understanding" from an EU Expert Group, "…the nation faces a critical need for a competent and creative workforce in science, technology, engineering and mathematics (STEM)...", "We note two possible approaches to addressing the challenge of this transformation: revolutionary (paradigmatic shifts and systemic structural reform) and evolutionary (such as adding data mining courses to computational science education or simply transferring textbook organized content into digital textbooks).", and "The training programs that NSF establishes around such a data infrastructure initiative will create a new generation of data scientists, data curators, and data archivists that is equipped to meet the challenges and jobs of the future." Further, interim report of the International Council for Science's (ICSU) Strategic Coordinating

  20. Investigating the challenges of biodiversity management of Sefidkuh ...

    African Journals Online (AJOL)

    Investigating the challenges of biodiversity management of Sefidkuh Khoramabad protected area ... Journal of Fundamental and Applied Sciences ... The basis of managing protected areas in Iran is based on protection, research, training and ...

  1. Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data.

    Science.gov (United States)

    Dinov, Ivo D

    2016-01-01

    Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy and their hallmark will be 'team science'.

  2. Science and technology in business management: Challenges for the training of professionals

    Directory of Open Access Journals (Sweden)

    Carlos Fernando Giler-Zúñiga

    2016-09-01

    Full Text Available Advances in science and technology are producing accelerated change, which calls for an analysis of how professionals must be trained to face, in enterprise practice, the challenges that contemporary innovation imposes on knowledge; education is thus obliged to take on new challenges. Training well-prepared professionals who can contribute to the development of the country links education with economic policy and with broader social policy, as well as with systems of production and management. The intended new approach is to train, enable, specialize, and update students and professionals, preparing professionals and leaders with critical thought and social consciousness, oriented toward intellectual and social goods and services and bound by the principle of belonging, that is, a responsibility for preparation and training in the service of society.

  3. Forging New Service Paths: Institutional Approaches to Providing Research Data Management Services

    Directory of Open Access Journals (Sweden)

    Regina Raboin

    2012-01-01

    Full Text Available Objective: This paper describes three different institutional experiences in developing research data management programs and services, along with challenges, opportunities, and lessons learned. Overview: This paper is based on the Librarian Panel Discussion during the 4th Annual University of Massachusetts and New England Region e-Science Symposium. Librarians representing large public and private research universities presented an overview of service models developed at their respective organizations to bring support for data management and eScience to their communities. The approaches described include two library-based, integrated service models and one collaboratively-staffed, center-based service model. Results: Three institutions describe their experiences in creating the organizational capacity for research data management support services. Although each institutional approach is unique, common challenges include garnering administrative support, managing the integration of services with new or existing staff structures, and continuing to meet researchers' needs as they evolve. Conclusions: There is no one way to provide research data management services, but any staff position, committee, or formalized center reflects an overarching organizational commitment to data management support.

  4. CitSci.org: A New Model for Managing, Documenting, and Sharing Citizen Science Data.

    Directory of Open Access Journals (Sweden)

    Yiwei Wang

    2015-10-01

    Full Text Available Citizen science projects have the potential to advance science by increasing the volume and variety of data, as well as innovation. Yet this potential has not been fully realized, in part because citizen science data are typically not widely shared and reused. To address this and related challenges, we built CitSci.org (see www.citsci.org), a customizable platform that allows users to collect and generate diverse datasets. We hope that CitSci.org will ultimately increase discoverability and confidence in citizen science observations, encouraging scientists to use such data in their own scientific research.

  5. CitSci.org: A New Model for Managing, Documenting, and Sharing Citizen Science Data.

    Science.gov (United States)

    Wang, Yiwei; Kaplan, Nicole; Newman, Greg; Scarpino, Russell

    2015-10-01

    Citizen science projects have the potential to advance science by increasing the volume and variety of data, as well as innovation. Yet this potential has not been fully realized, in part because citizen science data are typically not widely shared and reused. To address this and related challenges, we built CitSci.org (see www.citsci.org), a customizable platform that allows users to collect and generate diverse datasets. We hope that CitSci.org will ultimately increase discoverability and confidence in citizen science observations, encouraging scientists to use such data in their own scientific research.

  6. Ecoinformatics: supporting ecology as a data-intensive science.

    Science.gov (United States)

    Michener, William K; Jones, Matthew B

    2012-02-01

    Ecology is evolving rapidly and increasingly changing into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming the data into information and knowledge. Here, we review the state-of-the-art and recent advances in ecoinformatics that can benefit ecologists and environmental scientists as they tackle increasingly challenging questions that require voluminous amounts of data across disciplines and scales of space and time. We also highlight the challenges and opportunities that remain. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Forget the hype or reality. Big data presents new opportunities in Earth Science.

    Science.gov (United States)

    Lee, T. J.

    2015-12-01

    Earth science is arguably one of the most mature science disciplines, constantly acquiring, curating, and utilizing a large volume of highly varied data. We dealt with big data before the term existed. For example, while developing the EOS program in the 1980s, the EOS Data and Information System (EOSDIS) was built to manage the vast amount of data acquired by the EOS fleet of satellites. EOSDIS has remained a shining example of a modern science data system over the past two decades. With the explosion of the internet, the rise of social media, and the spread of sensors everywhere, the big data era has brought new challenges. First, Google developed its search algorithm and a distributed data management system. The open source communities quickly followed and developed the Hadoop file system to facilitate MapReduce workloads. The internet continues to generate tens of petabytes of data every day, and there is a significant shortage of algorithms and skilled manpower to mine these data. In response, the federal government developed big data programs that fund research and development projects and training programs to tackle these new challenges. Meanwhile, compared to the internet data explosion, the Earth science big data problem has become quite small. Nevertheless, the big data era presents an opportunity for Earth science to evolve. We have learned about MapReduce algorithms, in-memory data mining, machine learning, graph analysis, and semantic web technologies. How do we apply these new technologies to our discipline and bring the hype down to Earth? In this talk, I will discuss how we might apply some of the big data technologies to our discipline and solve many of our challenging problems. More importantly, I will propose a new Earth science data system architecture to enable new types of scientific inquiry.
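    The MapReduce pattern this abstract alludes to can be illustrated without a Hadoop cluster. The sketch below is purely illustrative (the documents and word-count task are invented, not taken from the talk): a map phase emits key/value pairs, a shuffle groups them by key, and a reduce phase aggregates each group.

```python
from collections import defaultdict

# Illustrative only: the classic MapReduce word-count pattern in plain
# Python, mirroring what a Hadoop job would do across a cluster.

def map_phase(document):
    # Emit (word, 1) pairs for every word in one input split.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped_splits):
    # Group intermediate pairs by key, as the framework does between phases.
    groups = defaultdict(list)
    for split in mapped_splits:
        for key, value in split:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts collected for each key.
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data in earth science", "earth science data systems"]
counts = reduce_phase(shuffle(map_phase(d) for d in docs))
print(counts["earth"])  # "earth" appears once in each document -> 2
```

    In a real deployment the map and reduce calls run on different nodes and the shuffle moves data over the network; the data flow, however, is exactly this.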

  8. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Energy Technology Data Exchange (ETDEWEB)

    De, K [University of Texas at Arlington; Jha, S [Rutgers University; Klimentov, A [Brookhaven National Laboratory (BNL); Maeno, T [Brookhaven National Laboratory (BNL); Nilsson, P [Brookhaven National Laboratory (BNL); Oleynik, D [University of Texas at Arlington; Panitkin, S [Brookhaven National Laboratory (BNL); Wells, Jack C [ORNL; Wenaus, T [Brookhaven National Laboratory (BNL)

    2016-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing across over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, the LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integrating the PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the MIRA supercomputer at the Argonne Leadership Computing Facility (ALCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes.
This implementation
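    The core idea of the wrapper described above, namely fanning many independent single-threaded payloads out across the cores of one worker node, can be sketched in a few lines. This is a simplified stand-in, not PanDA code: real pilots submit to LCF batch queues and use MPI ranks, whereas here a process pool plays that role, and the payload function is a hypothetical placeholder.

```python
from concurrent.futures import ProcessPoolExecutor

def payload(event_range):
    # Placeholder for one single-threaded processing task
    # (e.g. simulating one range of events).
    return sum(i * i for i in range(event_range))

def run_node(event_ranges, cores=4):
    # Run independent payloads in parallel on one multi-core worker node,
    # analogous to one MPI rank per core in the wrapper the abstract describes.
    with ProcessPoolExecutor(max_workers=cores) as pool:
        return list(pool.map(payload, event_ranges))

if __name__ == "__main__":
    results = run_node([10, 100, 1000])
    print(results[0])  # sum of squares below 10 -> 285
```

    The design point is that the payloads themselves stay single-threaded and unmodified; only the thin wrapper knows about the node's core count.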

  9. The Challenges Faced by New Science Teachers in Saudi Arabia

    Science.gov (United States)

    Alsharari, Salman

    Growing demand for science teachers in the Kingdom of Saudi Arabia, fed by increasing numbers of public school students, is forcing the Saudi government to attract, recruit and retain well-qualified science teachers. Beginning science teachers enter the profession with great fulfillment and satisfaction in their roles as teachers educating children in the science classroom. Nevertheless, over their early years of practice, teachers encounter numerous challenges in providing the most effective science instruction. Therefore, the current study aimed to identify academic and behavioral classroom challenges faced by science teachers in their first three years of teaching in the Kingdom of Saudi Arabia. In addition, differences in new science teachers' perceptions of the challenges they encountered at work were analyzed by gender, school level and years of teaching experience. The present study also investigated various types of support that new science teachers may need to overcome academic and behavioral classroom challenges. In order to gain insights about ways to adequately support novice science teachers, it was important to examine new science teachers' beliefs, ideas and perceptions about effective science teaching. Three survey questionnaires were developed and distributed to teachers of both sexes who had been teaching science subjects, for less than three years, to elementary, middle and high school students in Al Jouf public schools. A total of 49 novice science teachers responded to the survey and 9 of them agreed to participate voluntarily in a face-to-face interview. Different statistical procedures and multiple qualitative methodologies were used to analyze the collected data. Findings suggested that the top three academic challenges faced by new science teachers were: poor quality of teacher preparation programs, absence of appropriate school equipment and facilities and lack of classroom materials and instructional

  10. A Research Agenda and Vision for Data Science

    Science.gov (United States)

    Mattmann, C. A.

    2014-12-01

    Big Data has emerged as a first-class citizen in the research community, spanning disciplines in the domain sciences. Astronomy is pushing velocity with new ground-based instruments such as the Square Kilometre Array (SKA) and its unprecedented data rates (700 TB/sec!); Earth science is pushing the boundaries of volume, with experiments in the international Intergovernmental Panel on Climate Change (IPCC), climate modeling and remote sensing communities increasing total archive sizes into the exabyte scale; airborne missions from NASA such as the JPL Airborne Snow Observatory (ASO) are increasing velocity while decreasing the overall turnaround time required to receive products and make them available to water managers and decision makers. Proteomics and the computational biology community are sequencing genomes and providing near-real-time answers to clinicians, researchers, and ultimately to patients, helping to process and understand data and create diagnoses. Data complexity is on the rise: the norm is no longer hundreds of metadata attributes but thousands to hundreds of thousands, including complex interrelationships between data, metadata and knowledge. I published a vision for data science in Nature in 2013 that encapsulates four thrust areas and foci that I believe the computer science, Big Data, and data science communities need to attack over the next decade to make fundamental progress on the data volume, velocity and complexity challenges arising from domain sciences such as those described above. These areas include: (1) rapid and unobtrusive algorithm integration; (2) intelligent and automatic data movement; (3) automated and rapid extraction of text, metadata and language from heterogeneous file formats; and (4) participation and people power via open source communities.
In this talk I will revisit these four areas and describe current progress, future work, and the challenges ahead as we move forward in this exciting age

  11. Sustainable Materials Management (SMM) Federal Green Challenge (FGC) Data

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Federal Green Challenge (FGC) is a national effort under EPA's Sustainable Materials Management (SMM) Program, challenging EPA and other federal agencies...

  12. Meeting the Data Management Compliance Challenge: Funder Expectations and Institutional Reality

    Directory of Open Access Journals (Sweden)

    Catherine Pink

    2013-11-01

    In common with many global research funding agencies, in 2011 the UK Engineering and Physical Sciences Research Council (EPSRC) published its Policy Framework on Research Data along with a mandate that institutions be fully compliant with the policy by May 2015. The University of Bath has a strong applied science and engineering research focus and, as such, the EPSRC is a major funder of the university’s research. In this paper, the Jisc-funded Research360 project shares its experience in developing the infrastructure required to enable a research-intensive institution to achieve full compliance with a particular funder’s policy, in such a way as to support the varied data management needs of both the University of Bath and its external stakeholders. A key feature of the Research360 project was to ensure that after the project’s completion in summer 2013 the newly developed data management infrastructure would be maintained up to and beyond the EPSRC’s 2015 deadline. Central to these plans was the ‘University of Bath Roadmap for EPSRC’, which was identified as an exemplar response by the EPSRC. This paper explores how a roadmap designed to meet a single funder’s requirements can be compatible with the strategic goals of an institution. Also discussed is how the project worked with Charles Beagrie Ltd to develop a supporting business case, thus ensuring implementation of these long-term objectives. This paper describes how two new data management roles, the Institutional Data Scientist and Technical Data Coordinator, have contributed to delivery of the Research360 project and the importance of these new types of cross-institutional roles for embedding a new data management infrastructure within an institution. Finally, the experience of developing a new institutional data policy is shared. This policy represents a particular example of the need to reconcile a funder’s expectations with the needs of individual researchers and their

  13. Models meet data: Challenges and opportunities in implementing land management in Earth system models.

    Science.gov (United States)

    Pongratz, Julia; Dolman, Han; Don, Axel; Erb, Karl-Heinz; Fuchs, Richard; Herold, Martin; Jones, Chris; Kuemmerle, Tobias; Luyssaert, Sebastiaan; Meyfroidt, Patrick; Naudts, Kim

    2018-04-01

    As the applications of Earth system models (ESMs) move from general climate projections toward questions of mitigation and adaptation, the inclusion of land management practices in these models becomes crucial. We carried out a survey among modeling groups to show an evolution from models able only to deal with land-cover change to more sophisticated approaches that also allow for the partial integration of land management changes. For the longer term a comprehensive land management representation can be anticipated for all major models. To guide the prioritization of implementation, we evaluate ten land management practices (forestry harvest, tree species selection, grazing and mowing harvest, crop harvest, crop species selection, irrigation, wetland drainage, fertilization, tillage, and fire) for (1) their importance for the Earth system, (2) the possibility of implementing them in state-of-the-art ESMs, and (3) the availability of required input data. Matching these criteria, we identify "low-hanging fruits" for inclusion in ESMs, such as basic implementations of crop and forestry harvest and fertilization. We also identify research requirements for specific communities to address the remaining land management practices. Data availability severely hampers modeling the most extensive land management practice, grazing and mowing harvest, and is a limiting factor for a comprehensive implementation of most other practices. Inadequate process understanding hampers even a basic assessment of crop species selection and tillage effects. The need for multiple advanced model structures will be the challenge for a comprehensive implementation of most practices, but considerable synergy can be gained by using the same structures for different practices. A continuous and closer collaboration of the modeling, Earth observation, and land system science communities is thus required to achieve the inclusion of land management in ESMs. © 2017 John Wiley & Sons Ltd.

  14. Progress and Challenges in Assessing NOAA Data Management

    Science.gov (United States)

    de la Beaujardiere, J.

    2016-12-01

    The US National Oceanic and Atmospheric Administration (NOAA) produces large volumes of environmental data from a great variety of observing systems including satellites, radars, aircraft, ships, buoys, and other platforms. These data are irreplaceable assets that must be properly managed to ensure they are discoverable, accessible, usable, and preserved. A policy framework has been established which informs data producers of their responsibilities and which supports White House-level mandates such as the Executive Order on Open Data and the OSTP Memorandum on Increasing Access to the Results of Federally Funded Scientific Research. However, assessing the current state and progress toward completion for the many NOAA datasets is a challenge. This presentation will discuss work toward establishing assessment methodologies and dashboard-style displays. Ideally, metrics would be gathered through software and be automatically updated whenever an individual improvement was made. In practice, however, some level of manual information collection is required. Differing approaches to dataset granularity in different branches of NOAA yield additional complexity.
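    The dashboard-style roll-up the abstract describes amounts to aggregating per-dataset compliance flags into completion rates. The sketch below is a minimal illustration under invented assumptions: the dataset names and the four criteria flags are hypothetical, not NOAA's actual assessment schema.

```python
# Hypothetical per-dataset assessment flags for the four management goals
# named in the abstract (discoverable, accessible, usable, preserved).
CRITERIA = ("discoverable", "accessible", "usable", "preserved")

datasets = {
    "sst-satellite": {"discoverable": True, "accessible": True,
                      "usable": True, "preserved": False},
    "buoy-obs":      {"discoverable": True, "accessible": False,
                      "usable": False, "preserved": True},
}

def completion_rates(datasets):
    # Fraction of datasets satisfying each criterion -- one dashboard row.
    n = len(datasets)
    return {c: sum(d[c] for d in datasets.values()) / n for c in CRITERIA}

rates = completion_rates(datasets)
print(rates["discoverable"])  # both example datasets are discoverable -> 1.0
```

    The hard part in practice, as the abstract notes, is not this arithmetic but keeping the flags current, which is why some manual collection remains necessary.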

  15. University of Washington's eScience Institute Promotes New Training and Career Pathways in Data Science

    Science.gov (United States)

    Stone, S.; Parker, M. S.; Howe, B.; Lazowska, E.

    2015-12-01

    Rapid advances in technology are transforming nearly every field from "data-poor" to "data-rich." The ability to extract knowledge from this abundance of data is the cornerstone of 21st century discovery. At the University of Washington eScience Institute, our mission is to engage researchers across disciplines in developing and applying advanced computational methods and tools to real world problems in data-intensive discovery. Our research team consists of individuals with diverse backgrounds in domain sciences such as astronomy, oceanography and geology, with complementary expertise in advanced statistical and computational techniques such as data management, visualization, and machine learning. Two key elements are necessary to foster careers in data science: individuals with cross-disciplinary training in both method and domain sciences, and career paths emphasizing alternative metrics for advancement. We see persistent and deep-rooted challenges for the career paths of people whose skills, activities and work patterns don't fit neatly into the traditional roles and success metrics of academia. To address these challenges the eScience Institute has developed training programs and established new career opportunities for data-intensive research in academia. Our graduate students and post-docs have mentors in both a methodology and an application field. They also participate in coursework and tutorials to advance technical skill and foster community. Professional Data Scientist positions were created to support research independence while encouraging the development and adoption of domain-specific tools and techniques. The eScience Institute also supports the appointment of faculty who are innovators in developing and applying data science methodologies to advance their field of discovery. Our ultimate goal is to create a supportive environment for data science in academia and to establish global recognition for data-intensive discovery across all fields.

  16. The NIOSH Radiation Dose Reconstruction Project: managing technical challenges.

    Science.gov (United States)

    Moeller, Matthew P; Townsend, Ronald D; Dooley, David A

    2008-07-01

    Approximately two years after promulgation of the Energy Employees Occupational Illness Compensation Program Act, the National Institute for Occupational Safety and Health Office of Compensation and Analysis Support selected a contractor team to perform many aspects of the radiation dose reconstruction process. The project scope and schedule necessitated the development of an organization involving a comparatively large number of health physicists. From the initial stages, there were many technical and managerial challenges that required continuous planning, integration, and conflict resolution. This paper identifies those challenges and describes the resolutions and lessons learned. These insights are hopefully useful to managers of similar scientific projects, especially those requiring significant data, technical methods, and calculations. The most complex challenge has been to complete defensible, individualized dose reconstructions that support timely compensation decisions at an acceptable production level. Adherence to applying claimant-favorable and transparent science consistent with the requirements of the Act has been the key to establishing credibility, which is essential to this large and complex project involving tens of thousands of individual stakeholders. The initial challenges included garnering sufficient and capable scientific staff, developing an effective infrastructure, establishing necessary methods and procedures, and integrating activities to ensure consistent, quality products. The continuing challenges include maintaining the project focus on recommending a compensation determination (rather than generating an accurate dose reconstruction), managing the associated very large data and information management challenges, and ensuring quality control and assurance in the presence of an evolving infrastructure. The lessons learned concern project credibility, claimant favorability, project priorities, quality and consistency, and critical

  17. Challenges of the science data processing, analysis and archiving approach in BepiColombo

    Science.gov (United States)

    Martinez, Santa

    BepiColombo is a joint mission of the European Space Agency (ESA) and the Japan Aerospace Exploration Agency (JAXA) to the planet Mercury. It comprises two separate orbiters: the Mercury Planetary Orbiter (MPO) and the Mercury Magnetospheric Orbiter (MMO). After approximately 7.5 years of cruise, BepiColombo will arrive at Mercury in 2024 and will gather data during a 1-year nominal mission, with a possible 1-year extension. The approach selected for BepiColombo for the processing, analysis and archiving of the science data represents a significant change with respect to previous ESA planetary missions. Traditionally Instrument Teams are responsible for processing, analysing and preparing their science data for the long-term archive, however in BepiColombo, the Science Ground Segment (SGS), located in Madrid, Spain, will play a key role in these activities. Fundamental aspects of this approach include: the involvement of the SGS in the definition, development and operation of the instrument processing pipelines; the production of ready-to-archive science products compatible with NASA’s Planetary Data System (PDS) standards in all the processing steps; the joint development of a quick-look analysis system to monitor deviations between planned and executed observations to feed back the results into the different planning cycles when possible; and a mission archive providing access to the scientific products and to the operational data throughout the different phases of the mission (from the early development phase to the legacy phase). In order to achieve these goals, the SGS will need to overcome a number of challenges. The proposed approach requires a flexible infrastructure able to cope with a distributed data processing system, residing in different locations but designed as a single entity. For this, all aspects related to the integration of software developed by different Instrument Teams and the alignment of their development schedules will need to be

  18. Data management in astrobiology: challenges and opportunities for an interdisciplinary community.

    Science.gov (United States)

    Aydinoglu, Arsev Umur; Suomela, Todd; Malone, Jim

    2014-06-01

    Data management and sharing are growing concerns for scientists and funding organizations throughout the world. Funding organizations are implementing requirements for data management plans, while scientists are establishing new infrastructures for data sharing. One of the difficulties is sharing data among a diverse set of research disciplines. Astrobiology is a unique community of researchers, containing over 110 different disciplines. The current study reports the results of a survey of data management practices among scientists involved in the astrobiology community and the NASA Astrobiology Institute (NAI) in particular. The survey was administered over a 2-month period in the first half of 2013. Fifteen percent of the NAI community responded (n=114), and additional (n=80) responses were collected from members of an astrobiology Listserv. The results of the survey show that the astrobiology community shares many of the same concerns for data sharing as other groups. The benefits of data sharing are acknowledged by many respondents, but barriers to data sharing remain, including lack of acknowledgement, citation, time, and institutional rewards. Overcoming technical, institutional, and social barriers to data sharing will be a challenge into the future.

  19. Prospects and challenges for social media data in conservation science

    Directory of Open Access Journals (Sweden)

    Enrico eDi Minin

    2015-09-01

    Social media data have been extensively used in numerous fields of science, but examples of their use in conservation science are still very limited. In this paper, we propose a framework for how social media data could be useful for conservation science and practice. We present the commonly used social media platforms and discuss how their content could provide new data and information for conservation science. Based on this, we discuss how future work in conservation science and practice would benefit from social media data.

  20. The challenges associated with developing science-based landscape scale management plans.

    Science.gov (United States)

    Robert C. Szaro; Douglas A. Jr. Boyce; Thomas. Puchlerz

    2005-01-01

    Planning activities over large landscapes poses a complex set of challenges when trying to balance the implementation of a conservation strategy while still allowing for a variety of consumptive and nonconsumptive uses. We examine a case in southeast Alaska to illustrate the breadth of these challenges and an approach to developing a science-based resource plan. Not only...

  1. Interconnection Structures, Management and Routing Challenges in Cloud-Service Data Center Networks: A Survey

    Directory of Open Access Journals (Sweden)

    Ahmad Nahar Quttoum

    2018-01-01

    Today’s data center networks employ expensive networking equipment in structures that were not designed to meet the increasing requirements of current large-scale data center services. Limitations in reliability, resource utilization, and cost remain challenging. The era of cloud computing promises to enable large-scale data centers. The computing platforms of such cloud-service data centers consist of large numbers of commodity low-price servers that, with virtualization on top, can match the performance of expensive high-end servers at only a fraction of the price. Recently, research in data center networks has started to evolve rapidly, opening the path for addressing many design and management challenges, such as scalability, reliability, bandwidth capacity, virtual machine migration, and cost. Bandwidth resource fragmentation limits network agility and leads to low utilization rates, not only for bandwidth resources but also for the servers that run the applications. With traffic engineering methods, managers of such networks can adapt to rapid changes in the traffic among their servers, which can help provide better resource utilization and lower costs. The market is going through exciting changes, and the need to run services at demanding scale drives the work toward cloud networks, enabled by the notion of autonomic management and the availability of commodity low-price network equipment. This work provides readers with a survey that presents the management challenges and the design and operational constraints of cloud-service data center networks.

  2. Big Data, Computational Science, Economics, Finance, Marketing, Management, and Psychology: Connections

    NARCIS (Netherlands)

    C-L. Chang (Chia-Lin); M.J. McAleer (Michael); W.-K. Wong (Wing-Keung)

    2018-01-01

    The paper provides a review of the literature that connects Big Data, Computational Science, Economics, Finance, Marketing, Management, and Psychology, and discusses some research related to the seven disciplines. Academics could develop theoretical models and subsequent

  3. PanDA Beyond ATLAS: Workload Management for Data Intensive Science

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Klimentov, A; Maeno, T; Nilsson, P; Oleynik, D; Panitkin, S; Petrosyan, A; Vaniachine, A; Wenaus, T; Yu, D

    2013-01-01

    The PanDA Production ANd Distributed Analysis system has been developed by ATLAS to meet the experiment's requirements for a data-driven workload management system for production and distributed analysis processing capable of operating at LHC data processing scale. After 7 years of impressively successful PanDA operation in ATLAS there are also other experiments which can benefit from PanDA in the Big Data challenge, with several at various stages of evaluation and adoption. The new project "Next Generation Workload Management and Analysis System for Big Data" is extending PanDA to meet the needs of other data intensive scientific applications in HEP, astro-particle and astrophysics communities, bio-informatics and other fields as a general solution to large scale workload management. PanDA can utilize dedicated or opportunistic computing resources such as grids, clouds, and High Performance Computing facilities, and is being extended to leverage next generation intelligent networks in automated workflow mana...

  4. Foundations of data-intensive science: Technology and practice for high throughput, widely distributed, data management and analysis systems

    Science.gov (United States)

    Johnston, William; Ernst, M.; Dart, E.; Tierney, B.

    2014-04-01

    Today's large-scale science projects involve world-wide collaborations that depend on moving massive amounts of data from an instrument to potentially thousands of computing and storage systems at hundreds of collaborating institutions to accomplish their science. This is true for ATLAS and CMS at the LHC, and it is true for the climate sciences, Belle-II at the KEK collider, the genome sciences, the SKA radio telescope, and ITER, the international fusion energy experiment. DOE's Office of Science has been collecting science discipline and instrument requirements for network-based data management and analysis for more than a decade. As a result, certain key issues are seen across essentially all science disciplines that rely on the network for significant data transfer, even when the data quantities are modest compared to projects like the LHC experiments. These issues are what this talk will address, to wit: 1. Optical signal transport advances enabling 100 Gb/s circuits that span the globe on optical fiber, with each fiber carrying 100 such channels; 2. Network router and switch requirements to support high-speed international data transfer; 3. Data transport (TCP is still the norm) requirements to support high-speed international data transfer (e.g. error-free transmission); 4. Network monitoring and testing techniques and infrastructure to maintain the required error-free operation of the many R&E networks involved in international collaborations; 5. Operating system evolution to support very high-speed network I/O; 6. New network architectures and services in the LAN (campus) and WAN networks to support data-intensive science; 7. Data movement and management techniques and software that can maximize throughput on the network connections between distributed data handling systems; and 8. New approaches to widely distributed workflow systems that can support the data movement and analysis required by the science. All of these areas must be addressed to enable large
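    A concrete way to see why items 1 and 3 above interact: for loss-free TCP to fill a long-haul circuit, the sender's window must cover the path's bandwidth-delay product (BDP). The sketch below is a back-of-the-envelope illustration with figures chosen for the example, not taken from the talk.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    # Bytes "in flight" on the path at full utilization: the TCP window
    # (and end-host buffers) must be at least this large.
    return bandwidth_bps * rtt_s / 8

# Example: a 10 Gb/s transatlantic path with a 100 ms round-trip time.
window = bdp_bytes(10e9, 0.100)
print(window / 1e6)  # -> 125.0 (MB of window/buffer needed)
```

    At these window sizes a single packet loss collapses throughput for a long time, which is why the talk's items on error-free transmission and continuous network monitoring matter as much as raw link speed.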

  5. Data Science as an Innovation Challenge: From Big Data to Value Proposition

    Directory of Open Access Journals (Sweden)

    Victoria Kayser

    2018-03-01

    Analyzing “big data” holds huge potential for generating business value. The ongoing advancement of tools and technology over recent years has created a new ecosystem full of opportunities for data-driven innovation. However, as the amount of available data rises to new heights, so too does complexity. Organizations are challenged to create the right contexts, by shaping interfaces and processes, and by asking the right questions to guide the data analysis. Realizing the innovation potential requires teaming and focus to efficiently assign available resources to the most promising initiatives. With reference to the innovation process, this article concentrates on establishing a process for analytics projects from first ideas to realization (in most cases, a running application). The question we tackle is: what can the practical discourse on big data and analytics learn from innovation management? The insights presented in this article are built on our practical experiences in working with various clients. We classify analytics projects as well as discuss common innovation barriers along this process.

  6. Six Challenges for Ethical Conduct in Science.

    Science.gov (United States)

    Niemi, Petteri

    2016-08-01

    The realities of human agency and decision making pose serious challenges for research ethics. This article explores six major challenges that require more attention in the ethics education of students and scientists and in the research on ethical conduct in science. The first of them is the routinization of action, which makes the detection of ethical issues difficult. The social governance of action creates ethical problems related to power. The heuristic nature of human decision making implies the risk of ethical bias. Moral disengagement mechanisms represent a human tendency to evade personal responsibility. The greatest challenge of all might be the situational variation in people's ethical behaviour: even minor situational factors have a surprisingly strong influence on our actions. Finally, the nature of ethics itself also causes problems: instead of clear answers, we receive a multitude of theories and intuitions that may sometimes be contradictory. All these features of action and ethics represent significant risks for ethical conduct in science. I claim that they have to be managed within the everyday practices of science and addressed explicitly in research ethics education. I analyse them and suggest some ways in which their risks can be alleviated.

  7. INDIGO-DataCloud solutions for Earth Sciences

    Science.gov (United States)

    Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Fiore, Sandro; Monna, Stephen; Chen, Yin

    2017-04-01

    INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project aiming to develop a data and computing platform targeting scientific communities, deployable on multiple hardware platforms and provisioned over hybrid (private or public) e-infrastructures. The development of INDIGO solutions covers the different layers in cloud computing (IaaS, PaaS, SaaS) and provides tools to exploit resources like HPC or GPGPUs. INDIGO is oriented to support European scientific research communities, which are well represented in the project. Twelve different case studies have been analyzed in detail from different fields: Biological & Medical sciences, Social sciences & Humanities, Environmental and Earth sciences, and Physics & Astrophysics. INDIGO-DataCloud provides solutions to emerging challenges in Earth Science like: -Enabling an easy deployment of community services at different cloud sites. Many Earth Science research infrastructures involve observation stations distributed across countries, and also have distributed data centers to support the corresponding data acquisition and curation. There is a need to easily deploy new data center services while the research infrastructure continues to expand. As an example, LifeWatch (ESFRI, Ecosystems and Biodiversity) uses INDIGO solutions to manage the deployment of services to perform complex hydrodynamics and water quality modelling over a cloud computing environment, predicting algae blooms, using Docker technology: TOSCA requirement description, Docker repository, Orchestrator for deployment, AAI (AuthN, AuthZ) and OneData (Distributed Storage System). -Supporting Big Data analysis. Nowadays, many Earth Science research communities produce large amounts of data and are challenged by the difficulties of processing and analysing them. A climate models intercomparison data analysis case study for the European Network for Earth System Modelling (ENES) community has been set up, based on the Ophidia big

  8. Data science for dummies

    CERN Document Server

    Pierson, Lillian

    2015-01-01

    Discover how data science can help you gain in-depth insight into your business - the easy way! Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles in organizations. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in report

  9. The Big Challenge in Big Earth Science Data: Maturing to Transdisciplinary Data Platforms that are Relevant to Government, Research and Industry

    Science.gov (United States)

    Wyborn, Lesley; Evans, Ben

    2016-04-01

    Collecting data for the Earth Sciences has a particularly long history going back centuries. Initially, scientific data came only from simple human observations recorded by pen on paper. Scientific instruments soon supplemented data capture, and as these instruments became more capable (e.g., automation, more information captured, generation of digitally-born outputs), Earth scientists entered the 'Big Data' era, where data progressively became too big to store and process locally in the old-style vaults. To date, most funding initiatives for the collection and storage of large-volume data sets in the Earth Sciences have been specialised within a single discipline (e.g., climate, geophysics, and Earth observation) or specific to an individual institution. To undertake interdisciplinary research, it is hard for users to integrate data from these individual repositories, mainly due to limitations on physical access to/movement of the data, and/or data being organised without enough information to make sense of it without specialised disciplinary knowledge. Smaller repositories have also gradually been seen as inefficient in terms of the cost to manage and access (including scarce skills) and the effective implementation of new technology and techniques. Within the last decade, the trend is towards fewer and larger data repositories that are increasingly collocated with HPC/cloud resources. There has also been a growing recognition that digital data can be a valuable resource that can be reused and repurposed: publicly funded data from either the academic or government sector is seen as a shared resource, and efficiencies can be gained by co-location. These new, highly capable, 'transdisciplinary' data repositories are emerging as a fundamental 'infrastructure' both for research and other innovation.
The sharing of academic and government data resources on the same infrastructures is enabling new research programmes that will enable integration beyond the traditional physical

  10. 7th International Conference on Management Science and Engineering Management

    CERN Document Server

    Fry, John; Lev, Benjamin; Hajiyev, Asaf

    2014-01-01

    This book presents the proceedings of the Seventh International Conference on Management Science and Engineering Management (ICMSEM2013) held from November 7 to 9, 2013 at Drexel University, Philadelphia, Pennsylvania, USA and organized by the International Society of Management Science and Engineering Management, Sichuan University (Chengdu, China) and Drexel University (Philadelphia, Pennsylvania, USA).   The goals of the Conference are to foster international research collaborations in Management Science and Engineering Management as well as to provide a forum to present current research findings. The selected papers cover various areas in management science and engineering management, such as Decision Support Systems, Multi-Objective Decisions, Uncertain Decisions, Computational Mathematics, Information Systems, Logistics and Supply Chain Management, Relationship Management, Scheduling and Control, Data Warehousing and Data Mining, Electronic Commerce, Neural Networks, Stochastic Models and Simulation, F...

  11. Funding research data management and related infrastructures : Knowledge Exchange and Science Europe briefing paper

    NARCIS (Netherlands)

    Bijsterbosch, Magchiel; Duca, Daniela; Katerbow, Matthias; Kupiainen, Irina; Dillo, Ingrid; Doorn, P.K.; Enke, Harry; de Lucas, Jesus Eugenio Marco

    2016-01-01

    Research Funding Organisations (RFO) and Research Performing Organisations (RPO) throughout Europe are well aware that science and scholarship increasingly depend on infrastructures supporting sustainable Research Data Management (RDM). In two complementary surveys, the Science Europe Working Group

  12. Challenges of citizen science contributions to modelling hydrodynamics of floods

    Science.gov (United States)

    Assumpção, Thaine Herman; Popescu, Ioana; Jonoski, Andreja; Solomatine, Dimitri P.

    2017-04-01

    Citizen science is an established mechanism in many fields of science, including ecology, biology and astronomy. Citizen participation ranges from collecting and interpreting data to designing experiments with scientists and cooperating with water management authorities. In the environmental sciences, its potential has begun to be explored in recent decades, and many studies on its applicability to water resources have emerged. Citizen Observatories are at the core of several EU-funded projects such as WeSenseIt, GroundTruth, GroundTruth 2.0 and SCENT (Smart Toolbox for Engaging Citizens into a People-Centric Observation Web) that have already resulted in valuable contributions to the field. Buytaert et al. (2014) have already reviewed the role of citizen science in hydrology. The work presented here aims to complement it, reporting and discussing the use of citizen science for modelling the hydrodynamics of floods in a variety of studies. Additionally, it highlights the challenges that lie ahead in more fully utilizing the potential contribution of citizen science. In this work, focus is given to each component of hydrodynamic models: water level, velocity, flood extent, roughness and topography. We address how citizens have been contributing to each aspect, mainly considering citizens as sensors and citizens as data interpreters. We consider to which kind of model (1D or 2D) the discussed approaches contribute and what their limitations and potential uses are. We found that although certain mechanisms are well established (e.g. the use of Volunteered Geographic Information for soft validation of land-cover and land-use maps), the applications in a modelling context are rather modest. Also, most studies involving models are limited to replacing traditional data with citizen data. We recommend that citizen science continue to be explored in modelling frameworks, in different case studies, taking advantage of the discussed mechanisms and of new sensor technologies.

  13. Enabling the Usability of Earth Science Data Products and Services by Evaluating, Describing, and Improving Data Quality throughout the Data Lifecycle

    Science.gov (United States)

    Downs, R. R.; Peng, G.; Wei, Y.; Ramapriyan, H.; Moroni, D. F.

    2015-12-01

    Earth science data products and services are being used by representatives of various science and social science disciplines, by planning and decision-making professionals, by educators and learners ranging from primary through graduate and informal education, and by the general public. The diversity of users and uses of Earth science data is gratifying and offers new challenges for enabling the usability of these data by audiences with various purposes and levels of expertise. Users and other stakeholders need capabilities to efficiently find, explore, select, and determine the applicability and suitability of data products and services to meet their objectives and information needs. Similarly, they need to be able to understand the limitations of Earth science data, which can be complex, especially when considering combined or simultaneous use of multiple data products and services. Quality control efforts of stakeholders, throughout the data lifecycle, can contribute to the usability of Earth science data to meet the needs of diverse users. Such stakeholders include study design teams, data producers, data managers and curators, archives, systems professionals, data distributors, end-users, intermediaries, sponsoring organizations, hosting institutions, and others. Opportunities for engaging stakeholders to review, describe, and improve the quality of Earth science data products and services throughout the data lifecycle are identified and discussed. Insight is shared from the development of guidelines for implementing the Group on Earth Observations (GEO) Data Management Principles, the recommendations from the Earth Science Data System Working Group (ESDSWG) on Data Quality, and the efforts of the Information Quality Cluster of the Federation of Earth Science Information Partners (ESIP). 
Examples and outcomes from the quality control efforts of data facilities, such as scientific data centers, that contribute to the usability of Earth science data are also offered.

  14. Plant genetic resources management in Ghana: Some challenges in ...

    African Journals Online (AJOL)

    Plant genetic resources management in Ghana: Some challenges in legumes. ... Ghana Journal of Agricultural Science ... The Plant Genetic Resources Research Institute, serving as the national gene bank of Ghana, together with other stakeholders, had made strenuous efforts in managing the legume genetic resources in ...

  15. The grand challenge of managing the petascale facility.

    Energy Technology Data Exchange (ETDEWEB)

    Aiken, R. J.; Mathematics and Computer Science

    2007-02-28

    This report is the result of a study of networks and how they may need to evolve to support petascale leadership computing and science. As Dr. Ray Orbach, director of the Department of Energy's Office of Science, says in the spring 2006 issue of SciDAC Review, 'One remarkable example of growth in unexpected directions has been in high-end computation'. In the same article Dr. Michael Strayer states, 'Moore's law suggests that before the end of the next cycle of SciDAC, we shall see petaflop computers'. Given the Office of Science's strong leadership and support for petascale computing and facilities, we should expect to see petaflop computers in operation in support of science before the end of the decade, and DOE/SC Advanced Scientific Computing Research programs are focused on making this a reality. This study took its lead from this strong focus on petascale computing and the networks required to support such facilities, but it grew to include almost all aspects of the DOE/SC petascale computational and experimental science facilities, all of which will face daunting challenges in managing and analyzing the voluminous amounts of data expected. In addition, trends indicate the increased coupling of unique experimental facilities with computational facilities, along with the integration of multidisciplinary datasets and high-end computing with data-intensive computing; and we can expect these trends to continue at the petascale level and beyond. Coupled with recent technology trends, they clearly indicate the need for including capability petascale storage, networks, and experiments, as well as collaboration tools and programming environments, as integral components of the Office of Science's petascale capability metafacility. The objective of this report is to recommend a new cross-cutting program to support the management of petascale science and infrastructure. The appendices of the report document current and projected

  16. PanDA Beyond ATLAS : A Scalable Workload Management System For Data Intensive Science

    CERN Document Server

    Borodin, M; The ATLAS collaboration; Jha, S; Golubkov, D; Klimentov, A; Maeno, T; Nilsson, P; Oleynik, D; Panitkin, S; Petrosyan, A; Schovancova, J; Vaniachine, A; Wenaus, T

    2014-01-01

    The LHC experiments are today at the leading edge of large-scale distributed data-intensive computational science. The LHC's ATLAS experiment processes data volumes which are particularly extreme, over 140 PB to date, distributed worldwide at over 120 sites. An important element in the success of the exciting physics results from ATLAS is the highly scalable integrated workflow and dataflow management afforded by the PanDA workload management system, used for all the distributed computing needs of the experiment. The PanDA design is not experiment specific, and PanDA is now being extended to support other data-intensive scientific applications. PanDA was cited as an example of "a high performance, fault tolerant software for fast, scalable access to data repositories of many kinds" during the "Big Data Research and Development Initiative" announcement, a 200 million USD U.S. government investment in tools to handle huge volumes of digital data needed to spur science and engineering discoveries. In this talk...

  17. Locating ethics in data science: responsibility and accountability in global and distributed knowledge production systems.

    Science.gov (United States)

    Leonelli, Sabina

    2016-12-28

    The distributed and global nature of data science creates challenges for evaluating the quality, import and potential impact of the data and knowledge claims being produced. This has significant consequences for the management and oversight of responsibilities and accountabilities in data science. In particular, it makes it difficult to determine who is responsible for what output, and how such responsibilities relate to each other; what 'participation' means and which accountabilities it involves, with regard to data ownership, donation and sharing as well as data analysis, re-use and authorship; and whether the trust placed in automated tools for data mining and interpretation is warranted (especially as data processing strategies and tools are often developed separately from the situations of data use where ethical concerns typically emerge). To address these challenges, this paper advocates a participative, reflexive management of data practices. Regulatory structures should encourage data scientists to examine the historical lineages and ethical implications of their work at regular intervals. They should also foster awareness of the multitude of skills and perspectives involved in data science, highlighting how each perspective is partial and in need of confrontation with others. This approach has the potential to improve not only the ethical oversight of data science initiatives, but also the quality and reliability of research outputs. This article is part of the themed issue 'The ethical impact of data science'. © 2015 The Authors.

  18. Meeting report: Ocean 'omics science, technology and cyberinfrastructure: current challenges and future requirements (August 20-23, 2013).

    Science.gov (United States)

    Gilbert, Jack A; Dick, Gregory J; Jenkins, Bethany; Heidelberg, John; Allen, Eric; Mackey, Katherine R M; DeLong, Edward F

    2014-06-15

    The National Science Foundation's EarthCube End User Workshop was held at the USC Wrigley Marine Science Center on Catalina Island, California in August 2013. The workshop was designed to explore and characterize the needs and tools available to the community that is focusing on microbial and physical oceanography research with a particular emphasis on 'omic research. The assembled researchers outlined the existing concerns regarding the vast data resources that are being generated, and how we will deal with these resources as their volume and diversity increase. Particular attention was focused on the tools for handling and analyzing the existing data, on the need for the construction and curation of diverse federated databases, as well as on the development of shared, interoperable, "big-data capable" analytical tools. The key outputs from this workshop include (i) critical scientific challenges and cyberinfrastructure constraints, (ii) the current and future ocean 'omics science grand challenges and questions, and (iii) the data management, analytical, and associated cyberinfrastructure capabilities required to meet critical current and future scientific challenges. The main thrust of the meeting and the outcome of this report is a definition of the 'omics tools, technologies and infrastructures that facilitate continued advances in ocean science biology, marine biogeochemistry, and biological oceanography.

  19. Emerging Challenges and Opportunities for Education and Research in Weed Science

    Directory of Open Access Journals (Sweden)

    Bhagirath S. Chauhan

    2017-09-01

    In modern agriculture, with more emphasis on high-input systems, weed problems are likely to increase and become more complex. With heightened awareness of the adverse effects of herbicide residues on human health and the environment, and the evolution of herbicide-resistant weed biotypes, a significant focus within weed science has now shifted to the development of eco-friendly technologies with reduced reliance on herbicides. Further, with the large-scale adoption of herbicide-resistant crops, and uncertain climatic optima under climate change, the problems for weed science have become multi-faceted. To handle these complex weed problems, a holistic line of action with multi-disciplinary approaches is required, including adjustments to technology, management practices, and legislation. Improved knowledge of weed ecology, biology, genetics, and molecular biology is essential for developing sustainable weed control practices. Additionally, judicious use of advanced technologies, such as site-specific weed management systems and decision support modeling, will play a significant role in reducing the costs associated with weed control. Further, effective linkages between farmers and weed researchers will be necessary to facilitate the adoption of technological developments. To meet these challenges, priorities in research need to be determined and the education system for weed science needs to be reoriented. In respect of the latter imperative, closer collaboration between weed scientists and other disciplines can help in defining and solving the complex weed management challenges of the 21st century. This consensus will provide more versatile and diverse approaches to innovative teaching and training practices, which will be needed to prepare future weed science graduates capable of handling the challenges facing weed science in contemporary agriculture. To build this capacity, mobilizing additional funding for both weed research and

  20. A Position Statement on Population Data Science:

    Directory of Open Access Journals (Sweden)

    Kim McGrail

    2018-02-01

    Information is increasingly digital, creating opportunities to respond to pressing issues about human populations in near real time using linked datasets that are large, complex, and diverse. The potential social and individual benefits that can come from data-intensive science are large, but raise challenges of balancing individual privacy and the public good, building appropriate socio-technical systems to support data-intensive science, and determining whether defining a new field of inquiry might help move those collective interests and activities forward. A combination of expert engagement, literature review, and iterative conversations led to our conclusion that defining the field of Population Data Science (challenge 3) will help address the other two challenges as well. We define Population Data Science succinctly as the science of data about people and note that it is related to but distinct from the fields of data science and informatics. A broader definition names four characteristics: data use for positive impact on citizens and society; bringing together and analyzing data from multiple sources; finding population-level insights; and developing safe, privacy-sensitive and ethical infrastructure to support research. One implication of these characteristics is that few people possess all of the requisite knowledge and skills of Population Data Science, so this is by nature a multi-disciplinary field. Other implications include the need to advance various aspects of science, such as data linkage technology, various forms of analytics, and methods of public engagement. These implications are the beginnings of a research agenda for Population Data Science, which, if approached as a collective field, can catalyze significant advances in our understanding of trends in society, health, and human behavior.

  1. Summary report and strategy recommendations for EU citizen science gateway for biodiversity data

    Directory of Open Access Journals (Sweden)

    Veljo Runnel

    2016-12-01

    Citizen science is an approach of public participation in scientific research which has gained significant momentum in recent years. This is particularly evident in biology and the environmental sciences, where input from citizen scientists has greatly increased the number of publicly available observation data. However, there are still challenges in effective networking, data sharing and securing data quality. The EU BON project has analyzed the citizen science landscape in Europe with regard to biodiversity research and proposes several policy recommendations. One of the recommendations is a pan-European citizen science gateway for biodiversity data with dedicated tools for data collection and management. The prototypes of the gateway components are part of the EU BON biodiversity portal and are described in the current report.

  2. Data Intensive Science

    Directory of Open Access Journals (Sweden)

    Nicolas Schmelling

    2017-02-01

    A proposal to create a full-semester zero-entry level course about the responsible handling of research data and the associated analyses, storage, and sharing. The syllabus will comprise open science workflows and the creation of data management plans, as well as address issues of reproducibility and data sharing in science. The course and all its materials will be licensed under CC-BY or, if possible, CC0.

  3. Building Theory for Management Science and Practice

    DEFF Research Database (Denmark)

    Sanchez, Ron; Heene, Aimé

    2017-01-01

    In this paper we examine some fundamental epistemological issues in building theory for applied management science, by which we mean theory that can be usefully applied in a scientific approach to management research and practice. We first define and distinguish “grand theory” from “mid-range theory” in the social and management sciences. We then elaborate and contrast epistemologies for (i) building “grand theory” intended to be applicable to all cases and contexts, and (ii) building “mid-range theory” intended to apply to specific kinds of contexts. We illustrate the epistemological challenges in building grand theory in management science by considering important differences in the abilities of two “grand theories” in strategic management – industry structure theory and firm resources theory – to support development of conceptually consistent models and propositions for empirical...

  4. Data science and digital society

    Directory of Open Access Journals (Sweden)

    Chen Cathy Yi-Hsuan

    2017-07-01

    Data Science looks at raw numbers and informational objects created by different disciplines. The Digital Society creates information and numbers from many scientific disciplines. The sheer amassment of data, though, makes it hard to find structures and requires skillful analysis of this massive raw material. The thoughts presented here on DS2 (Data Science & Digital Society) analyze these challenges and offer ways to handle the questions arising in this evolving context. We propose three levels of analysis and lay out how one can react to the challenges that come about. Concrete examples concern credit default swaps, dynamic topic modeling, cryptocurrencies and, above all, the quantitative analysis of real data in a DS2 context.

  5. MERRA Analytic Services: Meeting the Big Data Challenges of Climate Science through Cloud-Enabled Climate Analytics-as-a-Service

    Science.gov (United States)

    Schnase, J. L.; Duffy, D.; Tamkin, G. S.; Nadeau, D.; Thompson, J. H.; Grieg, C. M.; McInerney, M.; Webster, W. P.

    2013-12-01

    Climate science is a Big Data domain that is experiencing unprecedented growth. In our efforts to address the Big Data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS). We focus on analytics, because it is the knowledge gained from our interactions with Big Data that ultimately produces societal benefits. We focus on CAaaS because we believe it provides a useful way of thinking about the problem: a specialization of the concept of business process-as-a-service, which is an evolving extension of IaaS, PaaS, and SaaS enabled by Cloud Computing. Within this framework, Cloud Computing plays an important role; however, we see it as only one element in a constellation of capabilities that are essential to delivering climate analytics as a service. These elements are essential because in the aggregate they lead to generativity, a capacity for self-assembly that we feel is the key to solving many of the Big Data challenges in this domain. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS built on this principle. MERRA/AS enables MapReduce analytics over NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA) data collection. The MERRA reanalysis integrates observational data with numerical models to produce a global temporally and spatially consistent synthesis of 26 key climate variables. It represents a type of data product that is of growing importance to scientists doing climate change research and a wide range of decision support applications. MERRA/AS brings together the following generative elements in a full, end-to-end demonstration of CAaaS capabilities: (1) high-performance, data proximal analytics, (2) scalable data management, (3) software appliance virtualization, (4) adaptive analytics, and (5) a domain-harmonized API. The effectiveness of MERRA/AS has been demonstrated in several applications. 
In our experience, Cloud Computing lowers the barriers and risk to

  6. MERRA Analytic Services: Meeting the Big Data Challenges of Climate Science Through Cloud-enabled Climate Analytics-as-a-service

    Science.gov (United States)

    Schnase, John L.; Duffy, Daniel Quinn; Tamkin, Glenn S.; Nadeau, Denis; Thompson, John H.; Grieg, Christina M.; McInerney, Mark A.; Webster, William P.

    2014-01-01

    Climate science is a Big Data domain that is experiencing unprecedented growth. In our efforts to address the Big Data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS). We focus on analytics, because it is the knowledge gained from our interactions with Big Data that ultimately produces societal benefits. We focus on CAaaS because we believe it provides a useful way of thinking about the problem: a specialization of the concept of business process-as-a-service, which is an evolving extension of IaaS, PaaS, and SaaS enabled by Cloud Computing. Within this framework, Cloud Computing plays an important role; however, we see it as only one element in a constellation of capabilities that are essential to delivering climate analytics as a service. These elements are essential because in the aggregate they lead to generativity, a capacity for self-assembly that we feel is the key to solving many of the Big Data challenges in this domain. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS built on this principle. MERRA/AS enables MapReduce analytics over NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA) data collection. The MERRA reanalysis integrates observational data with numerical models to produce a global temporally and spatially consistent synthesis of 26 key climate variables. It represents a type of data product that is of growing importance to scientists doing climate change research and a wide range of decision support applications. MERRA/AS brings together the following generative elements in a full, end-to-end demonstration of CAaaS capabilities: (1) high-performance, data proximal analytics, (2) scalable data management, (3) software appliance virtualization, (4) adaptive analytics, and (5) a domain-harmonized API. The effectiveness of MERRA/AS has been demonstrated in several applications. In our experience, Cloud Computing lowers the barriers and risk to

  7. Challenges in the Management of Bronchial Asthma Among Adults ...

    African Journals Online (AJOL)

    Challenges in the Management of Bronchial Asthma Among Adults in Nigeria: A Systematic ... Annals of Medical and Health Sciences Research ... Nigerian Thoracic Society, pharmaceutical industries, and the health‑care workers in general.

  8. Evolution of Information Management at the GSFC Earth Sciences (GES) Data and Information Services Center (DISC): 2006-2007

    Science.gov (United States)

    Kempler, Steven; Lynnes, Christopher; Vollmer, Bruce; Alcott, Gary; Berrick, Stephen

    2009-01-01

    Increasingly sophisticated National Aeronautics and Space Administration (NASA) Earth science missions have driven their associated data and data management systems from providing simple point-to-point archiving and retrieval to performing user-responsive distributed multisensor information extraction. To fully maximize the use of remote-sensor-generated Earth science data, NASA recognized the need for data systems that provide data access and manipulation capabilities responsive to research brought forth by advancing scientific analysis and the need to maximize the use and usability of the data. The decision by NASA to purposely evolve the Earth Observing System Data and Information System (EOSDIS) at the Goddard Space Flight Center (GSFC) Earth Sciences (GES) Data and Information Services Center (DISC) and other information management facilities was timely and appropriate. The GES DISC evolution was focused on replacing the EOSDIS Core System (ECS) by reusing the in-house-developed, disk-based Simple, Scalable, Script-based Science Product Archive (S4PA) data management system and migrating data to the disk archives. Transition was completed in December 2007.

  9. Environmental Management Science Program Workshop. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    None

    1998-07-01

    The Department of Energy Office of Environmental Management (EM), in partnership with the Office of Energy Research (ER), designed, developed, and implemented the Environmental Management Science Program as a basic research effort to fund the scientific and engineering understanding required to solve the most challenging technical problems facing the government's largest, most complex environmental cleanup program. The intent of the Environmental Management Science Program is to: (1) Provide scientific knowledge that will revolutionize technologies and cleanup approaches to significantly reduce future costs, schedules, and risks. (2) Bridge the gap between broad fundamental research that has wide-ranging applications such as that performed in the Department's Office of Energy Research and needs-driven applied technology development that is conducted in Environmental Management's Office of Science and Technology. (3) Focus the nation's science infrastructure on critical Department of Energy environmental problems. In an effort to share information regarding basic research efforts being funded by the Environmental Management Science Program and the Environmental Management/Energy Research Pilot Collaborative Research Program (Wolf-Broido Program), this CD includes summaries for each project. These project summaries, available in portable document format (PDF), were prepared in the spring of 1998 by the principal investigators and provide information about their most recent project activities and accomplishments.

  10. Mining the Quantified Self: Personal Knowledge Discovery as a Challenge for Data Science.

    Science.gov (United States)

    Fawcett, Tom

    2015-12-01

    The last several years have seen an explosion of interest in wearable computing, personal tracking devices, and the so-called quantified self (QS) movement. Quantified self involves ordinary people recording and analyzing numerous aspects of their lives to understand and improve themselves. This is now a mainstream phenomenon, attracting a great deal of attention, participation, and funding. As more people are attracted to the movement, companies are offering various new platforms (hardware and software) that allow ever more aspects of daily life to be tracked. Nearly every aspect of the QS ecosystem is advancing rapidly, except for analytic capabilities, which remain surprisingly primitive. With increasing numbers of quantified self participants collecting ever greater amounts and types of data, many people literally have more data than they know what to do with. This article reviews the opportunities and challenges posed by the QS movement. Data science provides well-tested techniques for knowledge discovery. But making these useful for the QS domain poses unique challenges that derive from the characteristics of the data collected as well as the specific types of actionable insights that people want from the data. Using a small sample of QS time series data containing information about personal health, we provide a formulation of the QS problem that connects data to the decisions of interest to the user.

  11. Integrating Social Science and Ecosystem Management: A National Challenge

    Science.gov (United States)

    Cordell, H. Ken; Caldwell, Linda

    1995-01-01

    These proceedings contain the contributed papers and panel presentations, as well as a paper presented at the National Workshop, of the Conference on Integrating Social Sciences and Ecosystem Management, which was held at Unicoi Lodge and Conference Center, Helen, GA, December 12-14, 1995. The overall purpose of this Conference was to improve understanding, integration...

  12. Cross-scale phenological data integration to benefit resource management and monitoring

    Science.gov (United States)

    Richardson, Andrew D.; Weltzin, Jake F.; Morisette, Jeffrey T.

    2017-01-01

    Climate change is presenting new challenges for natural resource managers charged with maintaining sustainable ecosystems and landscapes. Phenology, a branch of science dealing with seasonal natural phenomena (bird migration or plant flowering in response to weather changes, for example), bridges the gap between the biosphere and the climate system. Phenological processes operate across scales that span orders of magnitude—from leaf to globe and from days to seasons—making phenology ideally suited to multiscale, multiplatform data integration and delivery of information at spatial and temporal scales suitable to inform resource management decisions. A workshop held in June 2016 investigated opportunities and challenges facing multi-scale, multi-platform integration of phenological data to support natural resource management decision-making.

  13. Integrating Science into Management of Ecosystems in the Greater Blue Mountains

    Science.gov (United States)

    Chapple, Rosalie S.; Ramp, Daniel; Bradstock, Ross A.; Kingsford, Richard T.; Merson, John A.; Auld, Tony D.; Fleming, Peter J. S.; Mulley, Robert C.

    2011-10-01

    Effective management of large protected conservation areas is challenged by political, institutional and environmental complexity and inconsistency. Knowledge generation and its uptake into management are crucial to address these challenges. We reflect on practice at the interface between science and management of the Greater Blue Mountains World Heritage Area (GBMWHA), which covers approximately 1 million hectares west of Sydney, Australia. Multiple government agencies and other stakeholders are involved in its management, and decision-making is confounded by numerous plans of management and competing values and goals, reflecting the different objectives and responsibilities of stakeholders. To highlight the complexities of the decision-making process for this large area, we draw on the outcomes of a recent collaborative research project and focus on fire regimes and wild-dog control as examples of how existing knowledge is integrated into management. The collaborative research project achieved the objectives of collating and synthesizing biological data for the region; however, transfer of the project's outcomes to management has proved problematic. Reasons attributed to this include lack of clearly defined management objectives to guide research directions and uptake, and scientific information not being made more understandable and accessible. A key role of a local bridging organisation (e.g., the Blue Mountains World Heritage Institute) in linking science and management is ensuring that research results with management significance can be effectively transmitted to agencies and that outcomes are explained for nonspecialists as well as more widely distributed. We conclude that improved links between science, policy, and management within an adaptive learning-by-doing framework for the GBMWHA would assist the usefulness and uptake of future research.

  14. Meeting report: Ocean ‘omics science, technology and cyberinfrastructure: current challenges and future requirements (August 20-23, 2013)

    Science.gov (United States)

    Gilbert, Jack A; Dick, Gregory J.; Jenkins, Bethany; Heidelberg, John; Allen, Eric; Mackey, Katherine R. M.

    2014-01-01

    The National Science Foundation’s EarthCube End User Workshop was held at USC Wrigley Marine Science Center on Catalina Island, California in August 2013. The workshop was designed to explore and characterize the needs and tools available to the community that is focusing on microbial and physical oceanography research with a particular emphasis on ‘omic research. The assembled researchers outlined the existing concerns regarding the vast data resources that are being generated, and how we will deal with these resources as their volume and diversity increases. Particular attention was focused on the tools for handling and analyzing the existing data, on the need for the construction and curation of diverse federated databases, as well as development of shared, interoperable, “big-data capable” analytical tools. The key outputs from this workshop include (i) critical scientific challenges and cyberinfrastructure constraints, (ii) the current and future ocean ‘omics science grand challenges and questions, and (iii) the data management, analytical, and associated cyberinfrastructure capabilities required to meet critical current and future scientific challenges. The main thrust of the meeting and the outcome of this report is a definition of the ‘omics tools, technologies and infrastructures that facilitate continued advances in ocean science biology, marine biogeochemistry, and biological oceanography. PMID:25197495

  15. Physics Guided Data Science in the Earth Sciences

    Science.gov (United States)

    Ganguly, A. R.

    2017-12-01

    Even as the geosciences are becoming relatively data-rich owing to remote sensing and archived model simulations, established physical understanding and process knowledge cannot be ignored. The ability to leverage both physics and data-intensive sciences may lead to new discoveries and predictive insights. A principled approach to physics guided data science, where physics informs feature selection, output constraints, and even the architecture of the learning models, is motivated. The possibility of hybrid physics and data science models at the level of component processes is discussed. The challenges and opportunities, as well as the relations to other approaches such as data assimilation - which also bring physics and data together - are discussed. Case studies are presented in climate, hydrology and meteorology.

  16. Recommendations of Common Data Elements to Advance the Science of Self-Management of Chronic Conditions.

    Science.gov (United States)

    Moore, Shirley M; Schiffman, Rachel; Waldrop-Valverde, Drenna; Redeker, Nancy S; McCloskey, Donna Jo; Kim, Miyong T; Heitkemper, Margaret M; Guthrie, Barbara J; Dorsey, Susan G; Docherty, Sharron L; Barton, Debra; Bailey, Donald E; Austin, Joan K; Grady, Patricia

    2016-09-01

    Common data elements (CDEs) are increasingly being used by researchers to promote data sharing across studies. The purposes of this article are to (a) describe the theoretical, conceptual, and definition issues in the development of a set of CDEs for research addressing self-management of chronic conditions; (b) propose an initial set of CDEs and their measures to advance the science of self-management; and (c) recommend implications for future research and dissemination. Between July 2014 and December 2015 the directors of the National Institute of Nursing Research (NINR)-funded P20 and P30 centers of excellence and NINR staff met in a series of telephone calls and a face-to-face NINR-sponsored meeting to select a set of recommended CDEs to be used in self-management research. A list of potential CDEs was developed from examination of common constructs in current self-management frameworks, as well as identification of variables frequently used in studies conducted in the centers of excellence. The recommended CDEs include measures of three self-management processes: activation, self-regulation, and self-efficacy for managing chronic conditions, and one measure of a self-management outcome, global health. The self-management of chronic conditions, which encompasses a considerable number of processes, behaviors, and outcomes across a broad range of chronic conditions, presents several challenges in the identification of a parsimonious set of CDEs. This initial list of recommended CDEs for use in self-management research is provisional in that it is expected that over time it will be refined. Comment and recommended revisions are sought from the research and practice communities. The use of CDEs can facilitate generalizability of research findings across diverse population and interventions. © 2016 Sigma Theta Tau International.

  17. Enabling and Encouraging Transparency in Earth Science Data for Decision Making

    Science.gov (United States)

    Abbott, S. B.

    2010-12-01

    Our ability to understand, respond, and make decisions about our changing planet hinges on timely scientific information and situational awareness. Information and understanding will continue to be the foundations of decision support in the face of uncertainty. Over the last 40 years, investments in Earth observations have brought remarkable achievements in weather prediction, disaster prediction and response, land management, and our broad base of Earth science knowledge. The only way to know what is happening to our planet and to manage our resources wisely is to measure it. This means tracking changes decade after decade and reanalyzing the record in light of new insights, technologies, and methodologies. In order to understand and respond to climate change and other global challenges, there is a need for a high degree of transparency in the publication, management, traceability, and citability of science data, and particularly for Earth science data. In addition, it is becoming increasingly important that free, open, and authoritative sources of quality data are available for peer review. One important focus is on applications and opportunities for enhancing data exchange standards for use with Earth science data. By increasing the transparency of scientific work and providing incentives for researchers and institutions to openly share data, we will more effectively leverage the scientific capacity of our Nation to address climate change and to meet future challenges. It is an enormous challenge to collect, organize, and communicate the vast stores of data maintained across the government. The Administration is committed to moving past these barriers in providing the American public with unprecedented access to useful government data, including an open architecture and making data available in multiple formats. The goal is to enable better decision-making, drive transparency, and to help power innovation for a stronger America. Whether for a research project

  18. CyVerse Data Commons: lessons learned in cyberinfrastructure management and data hosting from the Life Sciences

    Science.gov (United States)

    Swetnam, T. L.; Walls, R.; Merchant, N.

    2017-12-01

    CyVerse is a US National Science Foundation funded initiative "to design, deploy, and expand a national cyberinfrastructure for life sciences research, and to train scientists in its use," supporting and enabling cross-disciplinary collaborations across institutions. CyVerse's free, open-source cyberinfrastructure is being adopted into biogeoscience and space sciences research. CyVerse's data-science-agnostic platforms provide shared data storage, high performance computing, and cloud computing that allow analysis of very large data sets (including incomplete or work-in-progress data sets). Part of CyVerse's success has been in addressing the handling of data through its entire lifecycle, from creation to final publication in a digital data repository to reuse in new analyses. CyVerse developers and user communities have learned many lessons that are germane to Earth and Environmental Science. We present an overview of the tools and services available through CyVerse including: interactive computing with the Discovery Environment (https://de.cyverse.org/), an interactive data science workbench featuring data storage and transfer via the Data Store; cloud computing with Atmosphere (https://atmo.cyverse.org); and access to HPC via the Agave API (https://agaveapi.co/). Each CyVerse service emphasizes access to long term data storage, including our own Data Commons (http://datacommons.cyverse.org), as well as external repositories. The Data Commons service manages, organizes, preserves, publishes, and allows for discovery and reuse of data. All data published to CyVerse's Curated Data receive a permanent identifier (PID) in the form of a DOI (Digital Object Identifier) or ARK (Archival Resource Key). Data that is more fluid can also be published in the Data Commons through Community Collaborated data. The Data Commons provides landing pages, permanent DOIs or ARKs, and supports data reuse and citation through features such as open data licenses and downloadable citations. The

  19. Data science ethics in government.

    Science.gov (United States)

    Drew, Cat

    2016-12-28

    Data science can offer huge opportunities for government. With the ability to process larger and more complex datasets than ever before, it can provide better insights for policymakers and make services more tailored and efficient. As with all new technologies, there is a risk that we do not take up its opportunities and miss out on its enormous potential. We want people to feel confident to innovate with data. So, over the past 18 months, the Government Data Science Partnership has taken an open, evidence-based and user-centred approach to creating an ethical framework. It is a practical document that brings all the legal guidance together in one place, and is written in the context of new data science capabilities. As part of its development, we ran a public dialogue on data science ethics, including deliberative workshops, an experimental conjoint survey and an online engagement tool. The research supported the principles set out in the framework as well as provided useful insight into how we need to communicate about data science. It found that people had a low awareness of the term 'data science', but that showing data science examples can increase broad support for government exploring innovative uses of data. But people's support is highly context driven. People consider acceptability on a case-by-case basis, first thinking about the overall policy goals and likely intended outcome, and then weighing up privacy and unintended consequences. The ethical framework is a crucial start, but it does not solve all the challenges it highlights, particularly as technology is creating new challenges and opportunities every day. Continued research is needed into data minimization and anonymization, robust data models, algorithmic accountability, and transparency and data security. 
It also has revealed the need to set out a renewed deal between the citizen and state on data, to maintain and solidify trust in how we use people's data for social good. This article is part

  20. An ecoinformatics application for forest dynamics plot data management and sharing

    Science.gov (United States)

    Chau-Chin Lin; Abd Rahman Kassim; Kristin Vanderbilt; Donald Henshaw; Eda C. Melendez-Colom; John H. Porter; Kaoru Niiyama; Tsutomu Yagihashi; Sek Aun Tan; Sheng-Shan Lu; Chi-Wen Hsiao; Li-Wan Chang; Meei-Ru. Jeng

    2011-01-01

    Several forest dynamics plot research projects in the East-Asia Pacific region of the International Long-Term Ecological Research network actively collect long-term data, and some of these large plots are members of the Center for Tropical Forest Science network. The wealth of forest plot data presents challenges in information management to researchers. In order to...

  1. Framework for Processing Citizens Science Data for Applications to NASA Earth Science Missions

    Science.gov (United States)

    Teng, William; Albayrak, Arif

    2017-01-01

    Citizen science (or crowdsourcing) has drawn much high-level recent and ongoing interest and support. It is poised to be applied, beyond the by-now fairly familiar use of, e.g., Twitter for natural hazards monitoring, to science research, such as augmenting the validation of NASA earth science mission data. This interest and support is seen in the 2014 National Plan for Civil Earth Observations, the 2015 White House forum on citizen science and crowdsourcing, the ongoing Senate Bill 2013 (Crowdsourcing and Citizen Science Act of 2015), the recent (August 2016) Open Geospatial Consortium (OGC) call for public participation in its newly-established Citizen Science Domain Working Group, and NASA's initiation of a new Citizen Science for Earth Systems Program (along with its first citizen science-focused solicitation for proposals). Over the past several years, we have been exploring the feasibility of extracting from the Twitter data stream useful information for application to NASA precipitation research, with both "passive" and "active" participation by the twitterers. The Twitter database, which recently passed its tenth anniversary, is potentially a rich source of real-time and historical global information for science applications. The time-varying set of "precipitation" tweets can be thought of as an organic network of rain gauges, potentially providing a widespread view of precipitation occurrence. The validation of satellite precipitation estimates is challenging, because many regions lack data or access to data, especially outside of the U.S. and in remote and developing areas. Mining the Twitter stream could augment these validation programs and, potentially, help tune existing algorithms. Our ongoing work, though exploratory, has resulted in key components for processing and managing tweets, including the capabilities to filter the Twitter stream in real time, to extract location information, to filter for exact phrases, and to plot tweet distributions. 
The
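    The tweet-processing components described above (real-time filtering, location extraction, exact-phrase matching) can be sketched in a few lines. This is a hypothetical illustration, not the authors' pipeline; the records, pattern, and function names are invented, and a live system would consume Twitter's streaming API rather than an in-memory list.

    ```python
    import re

    # Hypothetical tweet records; `coords` stands in for whatever geolocation
    # a real extractor recovers from the tweet or user profile.
    tweets = [
        {"text": "Heavy rain flooding the streets in Mumbai", "coords": (19.08, 72.88)},
        {"text": "Beautiful sunny day at the beach", "coords": (25.76, -80.19)},
        {"text": "Light drizzle all morning here", "coords": None},
    ]

    # Whole-word match on a small set of precipitation terms.
    PRECIP_PATTERN = re.compile(r"\b(rain|drizzle|snow|hail|downpour)\b", re.IGNORECASE)

    def filter_precip(stream):
        """Keep only geolocated tweets whose text mentions precipitation."""
        for tweet in stream:
            if tweet["coords"] is not None and PRECIP_PATTERN.search(tweet["text"]):
                yield tweet

    hits = list(filter_precip(tweets))
    print(len(hits))  # geolocated precipitation reports
    ```

    Each retained tweet behaves like one reading from an "organic rain gauge": a timestamped, geolocated report of precipitation occurrence that could be binned onto a grid and compared against satellite estimates.
    
    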

  2. IBM Watson: How Cognitive Computing Can Be Applied to Big Data Challenges in Life Sciences Research.

    Science.gov (United States)

    Chen, Ying; Elenee Argentinis, J D; Weber, Griff

    2016-04-01

    Life sciences researchers are under pressure to innovate faster than ever. Big data offer the promise of unlocking novel insights and accelerating breakthroughs. Ironically, although more data are available than ever, only a fraction is being integrated, understood, and analyzed. The challenge lies in harnessing volumes of data, integrating the data from hundreds of sources, and understanding their various formats. New technologies such as cognitive computing offer promise for addressing this challenge because cognitive solutions are specifically designed to integrate and analyze big datasets. Cognitive solutions can understand different types of data such as lab values in a structured database or the text of a scientific publication. Cognitive solutions are trained to understand technical, industry-specific content and use advanced reasoning, predictive modeling, and machine learning techniques to advance research faster. Watson, a cognitive computing technology, has been configured to support life sciences research. This version of Watson includes medical literature, patents, genomics, and chemical and pharmacological data that researchers would typically use in their work. Watson has also been developed with specific comprehension of scientific terminology so it can make novel connections in millions of pages of text. Watson has been applied to a few pilot studies in the areas of drug target identification and drug repurposing. The pilot results suggest that Watson can accelerate identification of novel drug candidates and novel drug targets by harnessing the potential of big data. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Sustainable Materials Management (SMM) Food Recovery Challenge (FRC) Data

    Data.gov (United States)

    U.S. Environmental Protection Agency — As part of EPA's Food Recovery Challenge (FRC), organizations pledge to improve their sustainable food management practices and report their results. The FRC is part...

  4. Advanced Technologies and Data Management Practices in Environmental Science: Lessons from Academia

    Science.gov (United States)

    Hernandez, Rebecca R.; Mayernik, Matthew S.; Murphy-Mariscal, Michelle L.; Allen, Michael F.

    2012-01-01

    Environmental scientists are increasing their capitalization on advancements in technology, computation, and data management. However, the extent of that capitalization is unknown. We analyzed the survey responses of 434 graduate students to evaluate the understanding and use of such advances in the environmental sciences. Two-thirds of the…

  5. Processes meet big data : connecting data science with process science

    NARCIS (Netherlands)

    van der Aalst, W.; Damiani, E.

    2015-01-01

    As more and more companies are embracing Big data, it has become apparent that the ultimate challenge is to relate massive amounts of event data to processes that are highly dynamic. To unleash the value of event data, events need to be tightly connected to the control and management of operational

  6. Operational research as implementation science: definitions, challenges and research priorities.

    Science.gov (United States)

    Monks, Thomas

    2016-06-06

    Operational research (OR) is the discipline of using models, either quantitative or qualitative, to aid decision-making in complex implementation problems. The methods of OR have been used in healthcare since the 1950s in diverse areas such as emergency medicine and the interface between acute and community care; hospital performance; scheduling and management of patient home visits; scheduling of patient appointments; and many other complex implementation problems of an operational or logistical nature. To date, there has been limited debate about the role that operational research should take within implementation science. I detail three such roles for OR all grounded in upfront system thinking: structuring implementation problems, prospective evaluation of improvement interventions, and strategic reconfiguration. Case studies from mental health, emergency medicine, and stroke care are used to illustrate each role. I then describe the challenges for applied OR within implementation science at the organisational, interventional, and disciplinary levels. Two key challenges include the difficulty faced in achieving a position of mutual understanding between implementation scientists and research users and a stark lack of evaluation of OR interventions. To address these challenges, I propose a research agenda to evaluate applied OR through the lens of implementation science, the liberation of OR from the specialist research and consultancy environment, and co-design of models with service users. Operational research is a mature discipline that has developed a significant volume of methodology to improve health services. OR offers implementation scientists the opportunity to do more upfront system thinking before committing resources or taking risks. OR has three roles within implementation science: structuring an implementation problem, prospective evaluation of implementation problems, and a tool for strategic reconfiguration of health services. 
Challenges facing OR

  7. Nursing Management Minimum Data Set: Cost-Effective Tool To Demonstrate the Value of Nurse Staffing in the Big Data Science Era.

    Science.gov (United States)

    Pruinelli, Lisiane; Delaney, Connie W; Garciannie, Amy; Caspers, Barbara; Westra, Bonnie L

    2016-01-01

    There is a growing body of evidence of the relationship of nurse staffing to patient, nurse, and financial outcomes. With the advent of big data science and developing big data analytics in nursing, data science with the reuse of big data is emerging as a timely and cost-effective approach to demonstrate nursing value. The Nursing Management Minimum Data Set (NMMDS) provides standard administrative data elements, definitions, and codes to measure the context where care is delivered and, consequently, the value of nursing. The integration of the NMMDS elements in the current health system provides evidence for nursing leaders to measure and manage decisions, leading to better patient, staffing, and financial outcomes. It also enables the reuse of data for clinical scholarship and research.

  8. High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

    International Nuclear Information System (INIS)

    Pop, Florin

    2014-01-01

    Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers the possibility to make simulations at large scale, in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
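    The Monte Carlo methods named above share a simple core idea that a minimal sketch can show: estimate an integral by averaging the integrand at random sample points. This is a generic illustration under invented names, not any of the HEP-specific codes the paper surveys; production simulations run this pattern in parallel over vastly larger sample counts, which is precisely what generates the data volumes discussed.

    ```python
    import random

    def mc_integrate(f, a, b, n=100_000, seed=42):
        """Plain Monte Carlo estimate of the integral of f over [a, b].

        The estimator is (b - a) * mean(f(x_i)) for uniform samples x_i,
        with statistical error shrinking like 1/sqrt(n).
        """
        rng = random.Random(seed)  # fixed seed for a reproducible sketch
        total = sum(f(a + (b - a) * rng.random()) for _ in range(n))
        return (b - a) * total / n

    # Example: the integral of x^2 on [0, 1] is exactly 1/3.
    estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
    print(round(estimate, 3))
    ```

    The 1/sqrt(n) error scaling is why large-scale HEP simulation is both embarrassingly parallel and data-intensive: each factor-of-10 accuracy gain costs a factor of 100 in samples.
    
    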

  9. dataMares - An online platform for the fast, effective dissemination of science

    Science.gov (United States)

    Johnson, A. F.; Aburto-Oropeza, O.; Moreno-Báez, M.; Giron-Nava, A.; Lopez-Sagástegui, R.; Lopez-Sagástegui, C.

    2016-02-01

    One of the current challenges in public policy development, especially related to natural resource management and conservation, is that there are very few tools that help easily identify and incorporate relevant scientific findings and data into public policy. This can also lead to a repetition of research efforts and the collection of information that in some cases might already exist. The key to addressing this challenge is to develop collaborative research tools, which can be used by different sectors of society including key stakeholder groups, managers, policy makers and the public. Here we present an "open science" platform capable of handling large data and disseminating results to a wide audience quickly. dataMares uses business intelligence software to allow the dynamic presentation of data quickly to a range of users online. In a nutshell, dataMares provides robust and up-to-date scientific information for resource managers, conservation practitioners, fishers, community members, and regional- and national-level decision-makers. It can also be used in the training of young scientists and allows quick and open connections with the journalism industry.

  10. Citizen science in hydrology and water resources: opportunities for knowledge generation, ecosystem service management, and sustainable development

    Directory of Open Access Journals (Sweden)

    Wouter eBuytaert

    2014-10-01

    Full Text Available The participation of the general public in the research design, data collection, and interpretation process together with scientists is often referred to as citizen science. While citizen science itself has existed since the start of scientific practice, developments in sensing technology, data processing and visualisation, and the communication of ideas and results are creating a wide range of new opportunities for public participation in scientific research. This paper reviews the state of citizen science in a hydrological context and explores the potential of citizen science to complement more traditional ways of scientific data collection and knowledge generation for hydrological sciences and water resources management. Although hydrological data collection often involves advanced technology, the advent of robust, cheap, and low-maintenance sensing equipment provides unprecedented opportunities for data collection in a citizen science context. These data have a significant potential to create new hydrological knowledge, especially in relation to the characterisation of process heterogeneity, remote regions, and human impacts on the water cycle. However, the nature and quality of data collected in citizen science experiments is potentially very different from those of traditional monitoring networks. This poses challenges in terms of their processing, interpretation, and use, especially with regard to the assimilation of traditional knowledge, the quantification of uncertainties, and their role in decision support. It also requires care in designing citizen science projects such that the generated data optimally complement other available knowledge. Lastly, we reflect on the challenges and opportunities in the integration of hydrologically-oriented citizen science in water resources management, the role of scientific knowledge in the decision-making process, and the potential contestation to established community institutions posed by co-generation of

  11. Research data management support for large-scale, long-term, interdisciplinary collaborative research centers with a focus on environmental sciences

    Science.gov (United States)

    Curdt, C.; Hoffmeister, D.; Bareth, G.; Lang, U.

    2017-12-01

    Science conducted in collaborative, cross-institutional research projects requires active sharing of research ideas, data, documents and further information in a well-managed, controlled and structured manner. Thus, it is important to establish corresponding infrastructures and services for the scientists. Regular project meetings and joint field campaigns support the exchange of research ideas. Technical infrastructures facilitate storage, documentation, exchange and re-use of data as results of scientific output. Additionally, publications, conference contributions, reports, pictures etc. should be managed. Both knowledge and data sharing are essential to create synergies. Within the coordinated programme `Collaborative Research Center' (CRC), the German Research Foundation offers funding to establish research data management (RDM) infrastructures and services. CRCs are large-scale, interdisciplinary, multi-institutional, long-term (up to 12 years), university-based research institutions (up to 25 sub-projects). These CRCs address complex and scientifically challenging research questions. This poster presents the RDM services and infrastructures that have been established for two CRCs, both focusing on environmental sciences. Since 2007, a RDM support infrastructure and associated services have been set up for the CRC/Transregio 32 (CRC/TR32) `Patterns in Soil-Vegetation-Atmosphere-Systems: Monitoring, Modelling and Data Assimilation' (www.tr32.de). The experiences gained have been used to arrange RDM services for the CRC1211 `Earth - Evolution at the Dry Limit' (www.crc1211.de), funded since 2016. In both projects scientists from various disciplines collect heterogeneous data at field campaigns or by modelling approaches. To manage the scientific output, the TR32DB data repository (www.tr32db.de) has been designed and implemented for the CRC/TR32. This system was transferred and adapted to the CRC1211 needs (www.crc1211db.uni-koeln.de) in 2016. Both

  12. Open Data for Global Science

    Directory of Open Access Journals (Sweden)

    Paul F Uhlir

    2007-06-01

    Full Text Available The digital revolution has transformed the accumulation of properly curated public research data into an essential upstream resource whose value increases with use. The potential contributions of such data to the creation of new knowledge and downstream economic and social goods can in many cases be multiplied exponentially when the data are made openly available on digital networks. Most developed countries spend large amounts of public resources on research and related scientific facilities and instruments that generate massive amounts of data. Yet precious little of that investment is devoted to promoting the value of the resulting data by preserving and making them broadly available. The largely ad hoc approach to managing such data, however, is now beginning to be understood as inadequate to meet the exigencies of the national and international research enterprise. The time has thus come for the research community to establish explicit responsibilities for these digital resources. This article reviews the opportunities and challenges to the global science system associated with establishing an open data policy.

  13. Paradigms and problems: The practice of social science in natural resource management

    Science.gov (United States)

    Michael E. Patterson; Daniel R. Williams

    1998-01-01

    Increasingly, natural resource management is seeing calls for new paradigms. These calls pose challenges that have implications not only for planning and management, but also for the practice of science. As a consequence, the profession needs to deepen its understanding of the nature of science by exploring recent advances in the philosophy of science....

  14. Data Sharing: Convert Challenges into Opportunities.

    Science.gov (United States)

    Figueiredo, Ana Sofia

    2017-01-01

    Initiatives for sharing research data are opportunities to increase the pace of knowledge discovery and scientific progress. The reuse of research data has the potential to avoid the duplication of data sets and to bring new views from multiple analyses of the same data set. For example, the study of genomic variations associated with cancer benefits from the universal collection of such data and helps in selecting the most appropriate therapy for a specific patient. However, data sharing poses challenges to the scientific community. These challenges are of ethical, cultural, legal, financial, or technical nature. This article reviews the impact that data sharing has in science and society and presents guidelines to improve the efficient sharing of research data.

  15. Toward A Science of Sustainable Water Management

    Science.gov (United States)

    Brown, C.

    2016-12-01

    Societal need for improved water management and concerns for the long-term sustainability of water resources systems are prominent around the world. The continued susceptibility of society to the harmful effects of hydrologic variability, pervasive concerns related to climate change, and the emergent awareness of the devastating effects of current practice on aquatic ecosystems all illustrate our limited understanding of how water ought to be managed in a dynamic world. The related challenges of resolving the competition for freshwater among competing uses (so-called "nexus" issues) and adapting water resources systems to climate change are prominent examples of sustainable water management challenges. In addition, largely untested concepts such as "integrated water resources management" have surfaced as Sustainable Development Goals. In this presentation, we argue that for research to improve water management, and for practice to inspire better research, a new focus is required: one that bridges the disciplinary barrier between the water resources research community's focus on infrastructure planning, management, and the role of human actors, and the geophysical sciences community's focus on physical processes in the absence of dynamical human response. Examples drawn from climate change adaptation for water resource systems and groundwater management policy provide evidence of initial progress towards a science of sustainable water management that links improved physical understanding of the hydrological cycle with the socioeconomic and ecological understanding of water and societal interactions.

  16. Data science for mental health: a UK perspective on a global challenge.

    Science.gov (United States)

    McIntosh, Andrew M; Stewart, Robert; John, Ann; Smith, Daniel J; Davis, Katrina; Sudlow, Cathie; Corvin, Aiden; Nicodemus, Kristin K; Kingdon, David; Hassan, Lamiece; Hotopf, Matthew; Lawrie, Stephen M; Russ, Tom C; Geddes, John R; Wolpert, Miranda; Wölbert, Eva; Porteous, David J

    2016-10-01

    Data science uses computer science and statistics to extract new knowledge from high-dimensional datasets (i.e., those with many different variables and data types). Mental health research, diagnosis, and treatment could benefit from data science that uses cohort studies, genomics, and routine health-care and administrative data. The UK is well placed to trial these approaches through robust NHS-linked data science projects, such as the UK Biobank, Generation Scotland, and the Clinical Record Interactive Search (CRIS) programme. Data science has great potential as a low-cost, high-return catalyst for improved mental health recognition, understanding, support, and outcomes. Lessons learnt from such studies could have global implications. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Stewardship and management challenges within a cloud-based open data ecosystem (Invited Paper 211863)

    Science.gov (United States)

    Kearns, E. J.

    2017-12-01

    NOAA's Big Data Project is conducting an experiment in the collaborative distribution of open government data to non-governmental cloud-based systems. Through Cooperative Research and Development Agreements signed in 2015 between NOAA and Amazon Web Services, Google Cloud Platform, IBM, Microsoft Azure, and the Open Commons Consortium, NOAA is distributing open government data to a wide community of potential users. There are a number of significant advantages related to the use of open data on commercial cloud platforms, but through this experiment NOAA is also discovering significant challenges for those stewarding and maintaining NOAA's data resources in support of users in the wider open data ecosystem. Among the challenges that will be discussed are: the need to provide effective interpretation of the data content to enable their use by data scientists from other expert communities; effective maintenance of Collaborators' open data stores through coordinated publication of new data and new versions of older data; the provenance and verification of open data as authentic NOAA-sourced data across multiple management boundaries and analytical tools; and keeping pace with the accelerating expectations of users with regard to improved quality control, data latency, availability, and discoverability. Suggested strategies to address these challenges will also be described.

  18. WCS Challenges for NASA's Earth Science Data

    Science.gov (United States)

    Cantrell, S.; Swentek, L.; Khan, A.

    2017-12-01

    In an effort to ensure that data in NASA's Earth Observing System Data and Information System (EOSDIS) is available to a wide variety of users through the tools of their choice, NASA continues to focus on exposing data and services using standards-based protocols. Specifically, this work has focused recently on the Web Coverage Service (WCS). Experience has been gained in data delivery via GetCoverage requests, starting out with WCS v1.1.1. The pros and cons of both the version itself and different implementation approaches will be shared during this session. Additionally, due to limitations in WCS v1.1.1's ability to work with NASA's Earth science data, this session will also discuss the benefit of migrating to WCS 2.0.1 with EO-x to enrich this capability to meet a wide range of anticipated user needs. This will enable subsetting and various types of data transformations to be performed on a variety of EOS data sets.
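As a sketch of what such a standards-based request looks like, the following builds a WCS 2.0.1 GetCoverage URL using the OGC KVP (key-value pair) binding. The endpoint and coverage identifier here are hypothetical, not actual EOSDIS values, and real servers may label the spatial axes differently.

```python
def getcoverage_url(endpoint, coverage_id, lat_range, lon_range, fmt="image/tiff"):
    """Build a WCS 2.0.1 GetCoverage request URL (OGC KVP binding).

    The repeated 'subset' parameters trim the coverage to a lat/lon box;
    axis labels (Lat/Long here) depend on the server's coverage description.
    """
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("subset", f"Lat({lat_range[0]},{lat_range[1]})"),
        ("subset", f"Long({lon_range[0]},{lon_range[1]})"),
        ("format", fmt),
    ]
    return endpoint + "?" + "&".join(f"{k}={v}" for k, v in params)

# Hypothetical endpoint and coverage name, for illustration only
url = getcoverage_url("https://example.nasa.gov/wcs", "MOD11A1_LST",
                      (30, 40), (-110, -100))
```

Because the request is a plain URL, any HTTP-capable client or GIS tool can retrieve the subsetted coverage, which is the interoperability benefit the abstract describes.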

  19. The Data Science Landscape

    Science.gov (United States)

    Mentzel, C.

    2017-12-01

    Modern scientific data continue to increase in volume, variety, and velocity, and though the hype of big data has subsided, its usefulness for scientific discovery has only just begun. Harnessing these data for new insights, more efficient decision making, and other mission-critical uses requires a combination of skills and expertise, often labeled data science. Data science can be thought of as a combination of statistics, computation, and the domain from which the data derive, and so is a true interdisciplinary pursuit. Though it has reaped large benefits in companies able to afford the high cost of a severely limited talent pool, it suffers from a lack of support in mission-driven organizations. Belonging purely to no one historical field, data science has proven difficult to place within traditional university academic departments and other research organizations. The landscape of data science efforts, from academia, industry, and government, can be characterized as nascent, enthusiastic, uneven, and highly competitive. Part of the challenge in documenting these trends is the lack of agreement about what data science is and who is a data scientist. Defining these terms too closely and too early runs the risk of cutting off a tremendous amount of productive creativity, but waiting too long leaves many people without a sustainable career, and many organizations without the necessary skills to gain value from their data. This talk will explore the landscape of data science efforts in the US, including how organizations are building and sustaining data science teams.

  20. Observation management challenges of the Square Kilometre Array

    Science.gov (United States)

    Bridger, Alan; Williams, Stewart J.; Nicol, Mark; Klaassen, Pamela; Thompson, Roger S.; Knapic, Cristina; Jerse, Giovanna; Orlati, Andrea; Messina, Marco; Valame, Snehal

    2016-07-01

    The Square Kilometre Array (SKA) will be the world's most advanced radio telescope, designed to explore some of the biggest questions in astronomy today, such as the epoch of re-ionization, the nature of gravity and the origins of cosmic magnetism. SKA1, the first phase of SKA construction, is currently being designed by a large team of experts world-wide. SKA1 comprises two telescopes: a 200-element dish interferometer in South Africa and a 130000-element dipole antenna aperture array in Australia. To enable the ground-breaking science of the SKA an advanced Observation Management system is required to support both the needs of the astronomical community users and the SKA Observatory staff. This system will ensure that the SKA realises its scientific aims and achieves optimal scientific throughput. This paper provides an overview of the design of the system that will accept proposals from SKA users, and result in the execution of the scripts that will obtain science data, taking in the stages of detailed preparation, planning and scheduling of the observations and onwards tracking. It describes the unique challenges of the differing requirements of two telescopes, one of which is very much a software telescope, including the need to schedule the data processing as well as the acquisition, and to react to both internally and externally discovered transient events. The scheduling of multiple parallel sub-array use is covered, along with the need to handle commensal observing - using the same data stream to satisfy the science goals of more than one project simultaneously. An international team from academia and industry, drawing on expertise and experience from previous telescope projects, the virtual observatory, and comparable problems in industry, has been assembled to design the solution to this challenging but exciting problem.

  1. A concept for performance management for Federal science programs

    Science.gov (United States)

    Whalen, Kevin G.

    2017-11-06

    The demonstration of clear linkages between planning, funding, outcomes, and performance management has created unique challenges for U.S. Federal science programs. An approach is presented here that characterizes science program strategic objectives by one of five “activity types”: (1) knowledge discovery, (2) knowledge development and delivery, (3) science support, (4) inventory and monitoring, and (5) knowledge synthesis and assessment. The activity types relate to performance measurement tools for tracking outcomes of research funded under the objective. The result is a multi-time scale, integrated performance measure that tracks individual performance metrics synthetically while also measuring progress toward long-term outcomes. Tracking performance on individual metrics provides explicit linkages to root causes of potentially suboptimal performance and captures both internal and external program drivers, such as customer relations and science support for managers. Functionally connecting strategic planning objectives with performance measurement tools is a practical approach for publicly funded science agencies that links planning, outcomes, and performance management—an enterprise that has created unique challenges for public-sector research and development programs.

  2. Big data, computational science, economics, finance, marketing, management, and psychology: connections

    OpenAIRE

    Chang, Chia-Lin; McAleer, Michael; Wong, Wing-Keung

    2018-01-01

    The paper provides a review of the literature that connects Big Data, Computational Science, Economics, Finance, Marketing, Management, and Psychology, and discusses some research that is related to the seven disciplines. Academics could develop theoretical models and subsequent econometric and statistical models to estimate the parameters in the associated models, as well as conduct simulation to examine whether the estimators in their theories on estimation and hypothesis testin...

  3. Challenges to implementing "best available science"

    Science.gov (United States)

    Vita Wright

    2010-01-01

    Interagency wildland fire policy directs managers to apply "best available science" to management plans and activities. But what does "best available science" mean? With a vague definition of this concept and few guidelines for delivering or integrating science into management, it can be difficult for scientists to effectively provide managers with...

  4. Data Science for Imbalanced Data: Methods and Applications

    Science.gov (United States)

    Johnson, Reid A.

    2016-01-01

    Data science is a broad, interdisciplinary field concerned with the extraction of knowledge or insights from data, with the classification of data as a core, fundamental task. One of the most persistent challenges faced when performing classification is the class imbalance problem. Class imbalance refers to when the frequency with which each class…
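One common remedy for the class imbalance problem the abstract names is random oversampling. The following is a minimal stdlib-only sketch of that idea (ours, not from the dissertation): minority-class samples are duplicated at random until every class matches the majority count.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Balance classes by duplicating randomly chosen minority-class
    samples until every class matches the majority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(samples) for samples in by_class.values())
    X_out, y_out = [], []
    for label, samples in by_class.items():
        extra = rng.choices(samples, k=target - len(samples))
        for xi in samples + extra:
            X_out.append(xi)
            y_out.append(label)
    return X_out, y_out
```

Oversampling is only one strategy; undersampling the majority class and cost-sensitive loss functions are the usual alternatives, each with different bias/variance trade-offs.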

  5. Data Management challenges in Astronomy and Astroparticle Physics

    Science.gov (United States)

    Lamanna, Giovanni

    2015-12-01

    Astronomy and Astroparticle Physics domains are experiencing a deluge of data with the next generation of facilities prioritised in the European Strategy Forum on Research Infrastructures (ESFRI), such as SKA, CTA, KM3Net and with other world-class projects, namely LSST, EUCLID, EGO, etc. The new ASTERICS-H2020 project brings together the concerned scientific communities in Europe to work together to find common solutions to their Big Data challenges, their interoperability, and their data access. The presentation will highlight these new challenges and the work being undertaken also in cooperation with e-infrastructures in Europe.

  6. A Big Data Guide to Understanding Climate Change: The Case for Theory-Guided Data Science.

    Science.gov (United States)

    Faghmous, James H; Kumar, Vipin

    2014-09-01

    Global climate change and its impact on human life has become one of our era's greatest challenges. Despite the urgency and the abundance of climate data, data science has had little impact on furthering our understanding of our planet. This is a stark contrast with other fields such as advertising or electronic commerce, where big data has been a great success story. The discrepancy stems from the complex nature of climate data as well as the scientific questions climate science brings forth. This article introduces a data science audience to the challenges and opportunities of mining large climate datasets, with an emphasis on the nuanced differences between mining climate data and traditional big data approaches. We focus on data, methods, and application challenges that must be addressed in order for big data to fulfill their promise with regard to climate science applications. More importantly, we highlight research showing that solely relying on traditional big data techniques results in dubious findings, and we instead propose a theory-guided data science paradigm that uses scientific theory to constrain both the big data techniques and the results-interpretation process to extract accurate insight from large climate data.

  7. Transforming Science Data for GIS: How to Find and Use NASA Earth Observation Data Without Being a Rocket Scientist

    Science.gov (United States)

    Bagwell, Ross; Peters, Byron; Berrick, Stephen

    2017-01-01

    NASA's Earth Observing System Data and Information System (EOSDIS) manages Earth Observation satellite data and the Distributed Active Archive Centers (DAACs), where the data are stored and processed. The challenge is that Earth Observation data is complicated. There is plenty of data available; however, the science teams have had a top-down approach: define what it is you are trying to study, select a set of satellite(s) and sensor(s), and drill down for the data. Our alternative is to take a bottom-up approach using eight environmental fields of interest defined by the Group on Earth Observations (GEO), called Societal Benefit Areas (SBAs): Disaster Resilience (DR), Public Health Surveillance (PHS), Energy and Mineral Resource Management (EMRM), Water Resources Management (WRM), Infrastructure and Transport Management (ITM), Sustainable Urban Development (SUD), Food Security and Sustainable Agriculture (FSSA), and Biodiversity and Ecosystems Sustainability (BES).

  8. Discovery informatics in biological and biomedical sciences: research challenges and opportunities.

    Science.gov (United States)

    Honavar, Vasant

    2015-01-01

    New discoveries in biological, biomedical and health sciences are increasingly being driven by our ability to acquire, share, integrate and analyze, and construct and simulate predictive models of biological systems. While much attention has focused on automating routine aspects of the management and analysis of "big data", realizing the full potential of "big data" to accelerate discovery calls for automating many other aspects of the scientific process that have so far largely resisted automation: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Against this background, the PSB workshop on Discovery Informatics in Biological and Biomedical Sciences explores the opportunities and challenges of automating discovery, or assisting humans in discovery, through advances in (i) the understanding, formalization, and information-processing accounts of the entire scientific process; (ii) the design, development, and evaluation of computational artifacts (representations, processes) that embody such understanding; and (iii) the application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science).

  9. Gait biomechanics in the era of data science.

    Science.gov (United States)

    Ferber, Reed; Osis, Sean T; Hicks, Jennifer L; Delp, Scott L

    2016-12-08

    Data science has transformed fields such as computer vision and economics. The ability of modern data science methods to extract insights from large, complex, heterogeneous, and noisy datasets is beginning to provide a powerful complement to the traditional approaches of experimental motion capture and biomechanical modeling. The purpose of this article is to provide a perspective on how data science methods can be incorporated into our field to advance our understanding of gait biomechanics and improve treatment planning procedures. We provide examples of how data science approaches have been applied to biomechanical data. We then discuss the challenges that remain for effectively using data science approaches in clinical gait analysis and gait biomechanics research, including the need for new tools, better infrastructure and incentives for sharing data, and education across the disciplines of biomechanics and data science. By addressing these challenges, we can revolutionize treatment planning and biomechanics research by capitalizing on the wealth of knowledge gained by gait researchers over the past decades and the vast, but often siloed, data that are collected in clinical and research laboratories around the world. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Data and Workflow Management Challenges in Global Adjoint Tomography

    Science.gov (United States)

    Lei, W.; Ruan, Y.; Smith, J. A.; Modrak, R. T.; Orsvuran, R.; Krischer, L.; Chen, Y.; Balasubramanian, V.; Hill, J.; Turilli, M.; Bozdag, E.; Lefebvre, M. P.; Jha, S.; Tromp, J.

    2017-12-01

    It is crucial to take the complete physics of wave propagation into account in seismic tomography to further improve the resolution of tomographic images. The adjoint method is an efficient way of incorporating 3D wave simulations in seismic tomography. However, global adjoint tomography is computationally expensive, requiring thousands of wavefield simulations and massive data processing. Through our collaboration with the Oak Ridge National Laboratory (ORNL) computing group and an allocation on Titan, ORNL's GPU-accelerated supercomputer, we are now performing our global inversions by assimilating waveform data from over 1,000 earthquakes. The first challenge we encountered is dealing with the sheer amount of seismic data. Data processing based on conventional data formats and processing tools (such as SAC), which are not designed for parallel systems, becomes our major bottleneck. To facilitate the data processing procedures, we designed the Adaptive Seismic Data Format (ASDF) and developed a set of Python-based processing tools to replace legacy FORTRAN-based software. These tools greatly enhance reproducibility and accountability while taking full advantage of highly parallel systems and showing superior scaling on modern computational platforms. The second challenge is that the data processing workflow contains more than 10 sub-procedures, making it cumbersome to handle and prone to human error. To reduce human intervention as much as possible, we are developing a framework specifically designed for seismic inversion based on state-of-the-art workflow management research, specifically the Ensemble Toolkit (EnTK), in collaboration with the RADICAL team from Rutgers University. Using the initial developments of the EnTK, we are able to utilize the full computing power of the data processing cluster RHEA at ORNL while keeping human interaction to a minimum and greatly reducing the data processing time. Thanks to all the improvements, we are now able to
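The ASDF format and EnTK workflow tooling mentioned above are specialized, but the underlying pattern of embarrassingly parallel per-trace processing can be sketched with the Python standard library alone. In this sketch (ours, not the authors' code) the demean step is a stand-in for a real processing chain:

```python
from multiprocessing import Pool

def demean(trace):
    """Remove the mean from one trace of samples; a stand-in for a real
    per-trace processing chain (detrend, filter, instrument correction, ...)."""
    mean = sum(trace) / len(trace)
    return [sample - mean for sample in trace]

if __name__ == "__main__":
    # Toy traces; a real workflow would stream millions of traces from disk
    traces = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
    with Pool(processes=2) as pool:
        processed = pool.map(demean, traces)  # one worker per trace
```

Because each trace is independent, the work distributes cleanly across processes or cluster nodes, which is the property the parallel ASDF tooling exploits at supercomputer scale.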

  11. A Big Data Task Force Review of Advances in Data Access and Discovery Within the Science Disciplines of the NASA Science Mission Directorate (SMD)

    Science.gov (United States)

    Walker, R. J.; Beebe, R. F.

    2017-12-01

    One of the basic problems the NASA Science Mission Directorate (SMD) faces when dealing with preservation of scientific data is the variety of the data. This stems from the fact that NASA's involvement in the sciences spans a broad range of disciplines across the Science Mission Directorate: Astrophysics, Earth Sciences, Heliophysics and Planetary Science. As the ability of some missions to produce large data volumes has accelerated, the range of problems associated with providing adequate access to the data has demanded diverse approaches for data access. Although mission types, complexity and duration vary across the disciplines, the data can be characterized by four characteristics: velocity, veracity, volume, and variety. The rate of arrival of the data (velocity) must be addressed at the individual mission level, while the validation and documentation of the data (veracity), the data volume, and the wide variety of data products present huge challenges as the science disciplines strive to provide transparent access to their available data. Astrophysics supports an integrated system of data archives based on frequencies covered (UV, visible, IR, etc.) or subject areas (extrasolar planets, extra galactic, etc.) and is accessed through the Astrophysics Data Center (https://science.nasa.gov/astrophysics/astrophysics-data-centers/). Earth Science supports the Earth Observing System (https://earthdata.nasa.gov/) that manages the earth science satellite data. The discipline supports 12 Distributed Active Archive Centers. Heliophysics provides the Space Physics Data Facility (https://spdf.gsfc.nasa.gov/) that supports the heliophysics community and the Solar Data Analysis Center (https://umbra.nascom.nasa.gov/index.html) that allows access to the solar data. The Planetary Data System (https://pds.nasa.gov) is the main archive for planetary science data. It consists of science discipline nodes (Atmospheres, Geosciences, Cartography and Imaging Sciences, Planetary Plasma Interactions

  12. Data Sharing: Convert Challenges into Opportunities

    Directory of Open Access Journals (Sweden)

    Ana Sofia Figueiredo

    2017-12-01

    Full Text Available Initiatives for sharing research data are opportunities to increase the pace of knowledge discovery and scientific progress. The reuse of research data has the potential to avoid the duplication of data sets and to bring new views from multiple analyses of the same data set. For example, the study of genomic variations associated with cancer benefits from the universal collection of such data and helps in selecting the most appropriate therapy for a specific patient. However, data sharing poses challenges to the scientific community. These challenges are of ethical, cultural, legal, financial, or technical nature. This article reviews the impact that data sharing has in science and society and presents guidelines to improve the efficient sharing of research data.

  13. The DKIST Data Center: Meeting the Data Challenges for Next-Generation, Ground-Based Solar Physics

    Science.gov (United States)

    Davey, A. R.; Reardon, K.; Berukoff, S. J.; Hays, T.; Spiess, D.; Watson, F. T.; Wiant, S.

    2016-12-01

    The Daniel K. Inouye Solar Telescope (DKIST) is under construction on the summit of Haleakalā in Maui, and scheduled to start science operations in 2020. The DKIST design includes a four-meter primary mirror coupled to an adaptive optics system, and a flexible instrumentation suite capable of delivering high-resolution optical and infrared observations of the solar chromosphere, photosphere, and corona. Through investigator-driven science proposals, the facility will generate an average of 8 TB of data daily, comprising millions of images and hundreds of millions of metadata elements. The DKIST Data Center is responsible for the long-term curation and calibration of data received from the DKIST, and for distributing it to the user community for scientific use. Two key elements necessary to meet the inherent big data challenge are the development of flexible public/private cloud computing and coupled relational and non-relational data storage mechanisms. We discuss how this infrastructure is being designed to meet the significant expectation of automatic and manual calibration of ground-based solar physics data, and to maximize the data's utility through efficient, long-term data management practices implemented with prudent process definition and technology exploitation.

  14. Challenges of Big Data in Educational Assessment

    Science.gov (United States)

    Gibson, David C.; Webb, Mary; Ifenthaler, Dirk

    2015-01-01

    This paper briefly discusses four measurement challenges of data science or "big data" in educational assessments that are enabled by technology: 1. Dealing with change over time via time-based data. 2. How a digital performance space's relationships interact with learner actions, communications and products. 3. How layers of…

  15. Global challenges in integrated coastal zone management

    DEFF Research Database (Denmark)

    integration of data and information in policy and management, combining expertise from nature and social science, to reach a balanced and sustainable development of the coastal zone. This important book comprises the proceedings of The International Symposium on Integrated Coastal Zone Management, which took....../mitigation to change in coastal systems; coastal governance; linking science and management. Comprising a huge wealth of information, this timely and well-edited volume is essential reading for all those involved in coastal zone management around the globe. All libraries in research establishments and universities where...

  16. Stakeholder-led science: engaging resource managers to identify science needs for long-term management of floodplain conservation lands

    Science.gov (United States)

    Bouska, Kristin L.; Lindner, Garth; Paukert, Craig P.; Jacobson, Robert B.

    2016-01-01

    Floodplains pose challenges to managers of conservation lands because of constantly changing interactions with their rivers. Although scientific knowledge and understanding of the dynamics and drivers of river-floodplain systems can provide guidance to floodplain managers, the scientific process often occurs in isolation from management. Further, communication barriers between scientists and managers can be obstacles to appropriate application of scientific knowledge. With the coproduction of science in mind, our objectives were the following: (1) to document management priorities of floodplain conservation lands, and (2) identify science needs required to better manage the identified management priorities under nonstationary conditions, i.e., climate change, through stakeholder queries and interactions. We conducted an online survey with 80 resource managers of floodplain conservation lands along the Upper and Middle Mississippi River and Lower Missouri River, USA, to evaluate management priority, management intensity, and available scientific information for management objectives and conservation targets. Management objectives with the least information available relative to priority included controlling invasive species, maintaining respectful relationships with neighbors, and managing native, nongame species. Conservation targets with the least information available to manage relative to management priority included pollinators, marsh birds, reptiles, and shore birds. A follow-up workshop and survey focused on clarifying science needs to achieve management objectives under nonstationary conditions. Managers agreed that metrics of inundation, including depth and extent of inundation, and frequency, duration, and timing of inundation would be the most useful metrics for management of floodplain conservation lands with multiple objectives. This assessment provides guidance for developing relevant and accessible science products to inform management of highly

  17. The Magnetospheric Multiscale (MMS) Mission Science Data Center: Technologies, Methods, and Experiences in Making Available Large Volumes of In-Situ Particle and Field Data

    Science.gov (United States)

    Pankratz, Christopher; Kokkonen, Kim; Larsen, Kristopher; Panneton, Russell; Putnam, Brian; Schafer, Corey; Baker, Daniel; Burch, James

    2016-04-01

    On September 1, 2015 the Magnetospheric MultiScale (MMS) constellation of four satellites completed their six-month commissioning period and began routine science data collection. Science operations for the mission are conducted at the Science Operations Center (SOC) at the Laboratory for Atmospheric and Space Physics, University of Colorado in Boulder, Colorado, USA. The MMS Science Data Center (SDC) is a component of the SOC responsible for the data production, management, dissemination, archiving, and visualization of the data from the extensive suite of 100 instruments onboard the four spacecraft. As of March 2016, MMS science data are openly available to the entire science community via the SDC. This includes hundreds of science parameters, and 50 gigabytes of data per day distributed across thousands of data files. Products are produced using integrated software systems developed and maintained by teams at other institutions using their own institutional software management procedures and made available via a centralized public web site and web services. To accomplish the data management, data processing, and system integration challenges present on this space mission, the MMS SDC incorporates a number of evolutionary techniques and technologies. This presentation will provide an informatics-oriented view of the MMS SDC, summarizing its technical aspects, novel technologies and data management practices that are employed, experiences with its design and development, and lessons learned. Also presented is the MMS "Scientist-in-the-Loop" (SITL) system, which is used to leverage human insight and expertise to optimize the data selected for transmission to the ground. This smoothly operating system entails the seamless interoperability of multiple mission facilities and data systems that ultimately translate scientist insight into uplink commands that trigger optimal data downlink to the ground.

  18. Building a Data Science capability for USGS water research and communication

    Science.gov (United States)

    Appling, A.; Read, E. K.

    2015-12-01

    Interpreting and communicating water issues in an era of exponentially increasing information requires a blend of domain expertise, computational proficiency, and communication skills. The USGS Office of Water Information has established a Data Science team to meet these needs, providing challenging careers for diverse domain scientists and innovators in the fields of information technology and data visualization. Here, we detail the experience of building a Data Science capability as a bridging element between traditional water resources analyses and modern computing tools and data management techniques. This approach includes four major components: 1) building reusable research tools, 2) documenting data-intensive research approaches in peer reviewed journals, 3) communicating complex water resources issues with interactive web visualizations, and 4) offering training programs for our peers in scientific computing. These components collectively improve the efficiency, transparency, and reproducibility of USGS data analyses and scientific workflows.

  19. Open science, e-science and the new technologies: Challenges and old problems in qualitative research in the social sciences

    Directory of Open Access Journals (Sweden)

    Ercilia García-Álvarez

    2012-12-01

    Full Text Available Purpose: As well as introducing the articles in the special issue titled "Qualitative Research in the Social Sciences", this article reviews the challenges, problems and main advances made by the qualitative paradigm in the context of the new European science policy based on open science and e-Science and analysis alternative technologies freely available in the 2.0 environment and their application to fieldwork and data analysis. Design/methodology: Theoretical review. Practical implications: The article identifies open access technologies with applications in qualitative research such as applications for smartphones and tablets, web platforms and specific qualitative data analysis software, all developed in both the e-Science context and the 2.0 environment. Social implications: The article discusses the possible role to be played by qualitative research in the open science and e-Science context and considers the impact of this new context on the size and structure of research groups, the development of truly collaborative research, the emergence of new ethical problems and quality assessment in review processes in an open environment. Originality/value: The article describes the characteristics that define the new scientific environment and the challenges posed for qualitative research, reviews the latest open access technologies available to researchers in terms of their main features and proposes specific applications suitable for fieldwork and data analysis.

  20. An Overview of the Challenges With and Proposed Solutions for the Ingest and Distribution Processes for Airborne Data Management

    Science.gov (United States)

    Beach, Aubrey; Northup, Emily; Early, Amanda; Wang, Dali; Kusterer, John; Quam, Brandi; Chen, Gao

    2015-01-01

    The current data management practices for NASA airborne field projects have successfully served science team data needs over the past 30 years to achieve project science objectives; however, users have discovered a number of issues in terms of data reporting and format. The ICARTT format, a NASA standard since 2010, is currently the most popular among the airborne measurement community. Although easy for humans to use, the format standard is not sufficiently rigorous to be machine-readable. This makes data use and management tedious and resource intensive, and also creates problems in Distributed Active Archive Center (DAAC) data ingest procedures and distribution. Further, most DAACs use metadata models that concentrate on satellite data observations, making them less prepared to deal with airborne data.

  1. Benefits, Challenges and Tools of Big Data Management

    Directory of Open Access Journals (Sweden)

    Fernando L. F. Almeida

    2017-10-01

    Full Text Available Big Data is one of the most predominant fields of knowledge and research and has generated high repercussions in the process of digital transformation of organizations in recent years. Big Data's main goal is to improve work processes through the analysis and interpretation of large amounts of data. Knowing how Big Data works, along with its benefits, challenges and tools, is essential for business success. Our study performs a systematic review of the Big Data field adopting a mind-map approach, which allows us to easily and visually identify its main elements and dependencies. The findings identified and mapped a total of 12 main branches of benefits, challenges and tools, and a total of 52 sub-branches across these main areas of the model.

  2. Expanding Role of Data Science and Bioinformatics in Drug Discovery and Development.

    Science.gov (United States)

    Fingert, Howard J

    2018-01-01

    Numerous barriers have been identified which detract from successful applications of clinical trial data and platforms. Despite the challenges, opportunities are growing to advance compliance, quality, and practical applications through top-down establishment of guiding principles, coupled with bottom-up approaches to promote data science competencies among data producers. Recent examples of successful applications include modern treatments for hematologic malignancies, developed with support from public-private partnerships, guiding principles for data-sharing, standards for protocol designs and data management, digital technologies, and quality analytics. © 2017 American Society for Clinical Pharmacology and Therapeutics.

  3. Challenges of Data-driven Healthcare Management

    DEFF Research Database (Denmark)

    Bossen, Claus; Danholt, Peter; Ubbesen, Morten Bonde

    This paper describes the new kind of data-work involved in developing data-driven healthcare, based on two cases from Denmark. The first case concerns a governance infrastructure based on Diagnosis-Related Groups (DRG), which was introduced in Denmark in the 1990s. The DRG-system links healthcare...... activity and financing and relies on extensive data entry, reporting and calculations. This has required the development of new skills, work and work roles. The second case concerns a New Governance project aimed at developing new performance indicators for healthcare delivery as an alternative to DRG....... Here, a core challenge is selecting indicators and actually being able to acquire data on them. The two cases point out that data-driven healthcare requires more and new kinds of work, for which new skills, functions and work roles have to be developed....

  4. Advancing Symptom Science Through Use of Common Data Elements.

    Science.gov (United States)

    Redeker, Nancy S; Anderson, Ruth; Bakken, Suzanne; Corwin, Elizabeth; Docherty, Sharron; Dorsey, Susan G; Heitkemper, Margaret; McCloskey, Donna Jo; Moore, Shirley; Pullen, Carol; Rapkin, Bruce; Schiffman, Rachel; Waldrop-Valverde, Drenna; Grady, Patricia

    2015-09-01

    Use of common data elements (CDEs), conceptually defined as variables that are operationalized and measured in identical ways across studies, enables comparison of data across studies in ways that would otherwise be impossible. Although healthcare researchers are increasingly using CDEs, there has been little systematic use of CDEs for symptom science. CDEs are especially important in symptom science because people experience common symptoms across a broad range of health and developmental states, and symptom management interventions may have common outcomes across populations. The purposes of this article are to (a) recommend best practices for the use of CDEs for symptom science within and across centers; (b) evaluate the benefits and challenges associated with the use of CDEs for symptom science; (c) propose CDEs to be used in symptom science to serve as the basis for this emerging science; and (d) suggest implications and recommendations for future research and dissemination of CDEs for symptom science. The National Institute of Nursing Research (NINR)-supported P20 and P30 Center directors applied published best practices, expert advice, and the literature to identify CDEs to be used across the centers to measure pain, sleep, fatigue, and affective and cognitive symptoms. We generated a minimum set of CDEs to measure symptoms. The CDEs identified through this process will be used across the NINR Centers and will facilitate comparison of symptoms across studies. We expect that additional symptom CDEs will be added and the list will be refined in future work. Symptoms are an important focus of nursing care. Use of CDEs will facilitate research that will lead to better ways to assist people to manage their symptoms. © 2015 Sigma Theta Tau International.
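    As a hypothetical illustration of what operationalizing a CDE might look like in software, the sketch below encodes one symptom measure with a fixed name, definition, and permissible range, so that two studies adopting it would record the variable identically. The element name and scale here are invented for illustration and are not the NINR-endorsed set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CommonDataElement:
    """One common data element: a variable operationalized identically across studies."""
    name: str         # canonical variable name shared across studies
    definition: str   # conceptual definition of the symptom
    minimum: float    # lowest permissible score
    maximum: float    # highest permissible score

    def is_valid(self, value: float) -> bool:
        # A recorded value must fall within the permissible range.
        return self.minimum <= value <= self.maximum

# Illustrative element: a 0-10 numeric pain-intensity rating that every
# adopting study would collect under the same name and scale.
PAIN_INTENSITY = CommonDataElement(
    name="pain_intensity_0_10",
    definition="Self-reported pain intensity on an 11-point numeric scale",
    minimum=0,
    maximum=10,
)
```

    Freezing the dataclass reflects the point of a CDE: once agreed, its definition and measurement scale are fixed, which is what makes cross-study comparison possible.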

  5. Life Sciences Data Archive (LSDA)

    Science.gov (United States)

    Fitts, M.; Johnson-Throop, Kathy; Thomas, D.; Shackelford, K.

    2008-01-01

    In the early days of spaceflight, space life sciences data were collected and stored in numerous databases, formats, media types and geographical locations. While serving the needs of individual research teams, these data were largely unknown and unavailable to the scientific community at large. As a result, the Space Act of 1958 and the Science Data Management Policy mandated that research data collected by the National Aeronautics and Space Administration be made available to the science community at large. The Biomedical Informatics and Health Care Systems Branch of the Space Life Sciences Directorate at JSC and the Data Archive Project at ARC, with funding from the Human Research Program through the Exploration Medical Capability Element, are fulfilling these requirements through the systematic population of the Life Sciences Data Archive. This program constitutes a formal system for the acquisition, archival and distribution of data for Life Sciences-sponsored experiments and investigations. The general goal of the archive is to acquire, preserve, and distribute these data using a variety of media which are accessible and responsive to inquiries from the science communities.

  6. SCIDIP-ES - A science data e-infrastructure for preservation of earth science data

    Science.gov (United States)

    Riddick, Andrew; Glaves, Helen; Marelli, Fulvio; Albani, Mirko; Tona, Calogera; Marketakis, Yannis; Tzitzikas, Yannis; Guarino, Raffaele; Giaretta, David; Di Giammatteo, Ugo

    2013-04-01

    The capability for long term preservation of earth science data is a key requirement to support on-going research and collaboration within and between many earth science disciplines. A number of critically important current research directions (e.g. understanding climate change, and ensuring sustainability of natural resources) rely on the preservation of data often collected over several decades in a form in which it can be accessed and used easily. In many branches of the earth sciences the capture of key observational data may be difficult or impossible to repeat. For example, a specific geological exposure or subsurface borehole may be only temporarily available, and deriving earth observation data from a particular satellite mission is clearly often a unique opportunity. At the same time such unrepeatable observations may be a critical input to environmental, economic and political decision making. Another key driver for strategic long term data preservation is that key research challenges (such as those described above) frequently require cross disciplinary research utilising raw and interpreted data from a number of earth science disciplines. Effective data preservation strategies can support this requirement for interoperability, and thereby stimulate scientific innovation. The SCIDIP-ES project (EC FP7 grant agreement no. 283401) seeks to address these and other data preservation challenges by developing a Europe-wide e-infrastructure for long term data preservation comprising appropriate software tools and infrastructure services to enable and promote long term preservation of earth science data. Because we define preservation in terms of continued usability of the digitally encoded information, the generic infrastructure services will allow a wide variety of data to be made usable by researchers from many different domains. This approach will enable the cost for long-term usability across disciplines to be shared, supporting the creation of strong

  7. The medical science DMZ: a network design pattern for data-intensive medical science

    Energy Technology Data Exchange (ETDEWEB)

    Peisert, Sean [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Davis, CA (United States). Dept. of Computer Science; Corporation for Education Network Initiatives in California (CENIC), Berkeley, CA (United States); Dart, Eli [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). ESnet; Barnett, William [Indiana Univ., Indianapolis, IN (United States). Indiana Clinical and Translational Sciences Inst., Regenstrief Inst.; Balas, Edward [Indiana Univ., Bloomington, IN (United States). Global Research Network Operations Center; Cuff, James [Harvard Univ., Cambridge, MA (United States). Research Computing; Grossman, Robert L. [Univ. of Chicago, IL (United States). Center for Data Intensive Science; Berman, Ari [BioTeam, Middleton, MA (United States); Shankar, Anurag [Indiana Univ., Bloomington, IN (United States). Pervasive Technology Inst.; Tierney, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). ESnet

    2017-10-06

    We describe a detailed solution for maintaining high-capacity, data-intensive network flows (e.g., 10, 40, 100 Gbps+) in a scientific, medical context while still adhering to security and privacy laws and regulations. High-end networking, packet-filter firewalls, network intrusion-detection systems. We describe a "Medical Science DMZ" concept as an option for secure, high-volume transport of large, sensitive datasets between research institutions over national research networks, and give 3 detailed descriptions of implemented Medical Science DMZs. The exponentially increasing amounts of "omics" data, high-quality imaging, and other rapidly growing clinical datasets have resulted in the rise of biomedical research "Big Data." The storage, analysis, and network resources required to process these data and integrate them into patient diagnoses and treatments have grown to scales that strain the capabilities of academic health centers. Some data are not generated locally and cannot be sustained locally, and shared data repositories such as those provided by the National Library of Medicine, the National Cancer Institute, and international partners such as the European Bioinformatics Institute are rapidly growing. The ability to store and compute using these data must therefore be addressed by a combination of local, national, and industry resources that exchange large datasets. Maintaining data-intensive flows that comply with the Health Insurance Portability and Accountability Act (HIPAA) and other regulations presents a new challenge for biomedical research. We describe a strategy that marries performance and security by borrowing from and redefining the concept of a Science DMZ, a framework that is used in physical sciences and engineering research to manage high-capacity data flows. By implementing a Medical Science DMZ architecture, biomedical researchers can leverage the scale provided by high-performance computer and cloud storage facilities and national high

  8. Introduction to the special section on peer-to-peer computing and web data management

    Institute of Scientific and Technical Information of China (English)

    Aoying ZHOU

    2008-01-01

    Peer-to-peer (P2P) computing has been attracting attention from quite a few researchers and practitioners from different fields of computer science, such as networking, distributed computing, and database. In a P2P environment, data management becomes a challenging issue.

  9. Computational intelligence as a platform for data collection methodology in management science

    DEFF Research Database (Denmark)

    Jespersen, Kristina Risom

    2006-01-01

    With the increased focus in management science on how to collect data close to the real world of managers, agent-based simulations offer interesting prospects for the design of business applications aimed at the collection of data. As a new generation of data collection...... methodologies, this chapter discusses and presents a behavioral simulation founded in the agent-based simulation life cycle and supported by Web technology. With agent-based modeling, the complexity of the method is increased without limiting the research, due to the technological support, because this makes...... it possible to exploit the advantages of a questionnaire, an experimental design, a role-play and a scenario, thus gaining the synergy effect of these methodologies. At the end of the chapter an example of a simulation is presented for researchers and practitioners to study....
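    A behavioral simulation of the kind the chapter describes can be sketched schematically: each participant is walked through a sequence of decision scenarios, and every choice is logged as a data point. The scenario names, options, and the random stand-in for a human answer below are all hypothetical placeholders for a real Web-based role-play.

```python
import random

def run_simulation(participants, scenarios, seed=0):
    """Schematic data-collection loop for a behavioral simulation.

    Each participant faces each scenario in turn; a random choice stands
    in for the human decision a real Web front end would capture.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    log = []  # the collected data set: one record per decision
    for pid in participants:
        for scenario in scenarios:
            choice = rng.choice(scenario["options"])
            log.append({
                "participant": pid,
                "scenario": scenario["name"],
                "choice": choice,
            })
    return log

# Hypothetical role-play scenarios a manager might be asked to decide on.
scenarios = [
    {"name": "new_product_launch", "options": ["launch", "delay", "abandon"]},
    {"name": "budget_cut", "options": ["cut_marketing", "cut_rnd"]},
]
data = run_simulation(["mgr_1", "mgr_2"], scenarios)
```

    The resulting log is the methodological point: the simulation doubles as a structured data-collection instrument, combining the coverage of a questionnaire with the situational realism of a role-play.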

  10. Gulf of Mexico Integrated Science - Tampa Bay Study - Data Information Management System (DIMS)

    Science.gov (United States)

    Johnston, James

    2004-01-01

    The Tampa Bay Integrated Science Study is an effort by the U.S. Geological Survey (USGS) that combines the expertise of federal, state and local partners to address some of the most pressing ecological problems of the Tampa Bay estuary. This project serves as a template for the application of integrated research projects in other estuaries in the Gulf of Mexico. Efficient information and data distribution for the Tampa Bay Study has required the development of a Data Information Management System (DIMS). This information system is being used as an outreach management tool, providing information to scientists, decision makers and the public on the coastal resources of the Gulf of Mexico.

  11. Big Data Science: Opportunities and Challenges to Address Minority Health and Health Disparities in the 21st Century

    Science.gov (United States)

    Zhang, Xinzhi; Pérez-Stable, Eliseo J.; Bourne, Philip E.; Peprah, Emmanuel; Duru, O. Kenrik; Breen, Nancy; Berrigan, David; Wood, Fred; Jackson, James S.; Wong, David W.S.; Denny, Joshua

    2017-01-01

    Addressing minority health and health disparities has been a missing piece of the puzzle in Big Data science. This article focuses on three priority opportunities that Big Data science may offer to the reduction of health and health care disparities. One opportunity is to incorporate standardized information on demographic and social determinants in electronic health records in order to target ways to improve quality of care for the most disadvantaged populations over time. A second opportunity is to enhance public health surveillance by linking geographical variables and social determinants of health for geographically defined populations to clinical data and health outcomes. Third and most importantly, Big Data science may lead to a better understanding of the etiology of health disparities and understanding of minority health in order to guide intervention development. However, the promise of Big Data needs to be considered in light of significant challenges that threaten to widen health disparities. Care must be taken to incorporate diverse populations to realize the potential benefits. Specific recommendations include investing in data collection on small sample populations, building a diverse workforce pipeline for data science, actively seeking to reduce digital divides, developing novel ways to assure digital data privacy for small populations, and promoting widespread data sharing to benefit under-resourced minority-serving institutions and minority researchers. With deliberate efforts, Big Data presents a dramatic opportunity for reducing health disparities but without active engagement, it risks further widening them. PMID:28439179

  13. Transdisciplinary synthesis for ecosystem science, policy and management: The Australian experience.

    Science.gov (United States)

    Lynch, A J J; Thackway, R; Specht, A; Beggs, P J; Brisbane, S; Burns, E L; Byrne, M; Capon, S J; Casanova, M T; Clarke, P A; Davies, J M; Dovers, S; Dwyer, R G; Ens, E; Fisher, D O; Flanigan, M; Garnier, E; Guru, S M; Kilminster, K; Locke, J; Mac Nally, R; McMahon, K M; Mitchell, P J; Pierson, J C; Rodgers, E M; Russell-Smith, J; Udy, J; Waycott, M

    2015-11-15

    Mitigating the environmental effects of global population growth, climatic change and increasing socio-ecological complexity is a daunting challenge. To tackle this requires synthesis: the integration of disparate information to generate novel insights from heterogeneous, complex situations where there are diverse perspectives. Since 1995, a structured approach to inter-, multi- and trans-disciplinary(1) collaboration around big science questions has been supported through synthesis centres around the world. These centres are finding an expanding role due to ever-accumulating data and the need for more and better opportunities to develop transdisciplinary and holistic approaches to solve real-world problems. The Australian Centre for Ecological Analysis and Synthesis (ACEAS) has been the pioneering ecosystem science synthesis centre in the Southern Hemisphere. Such centres provide analysis and synthesis opportunities for time-pressed scientists, policy-makers and managers. They provide the scientific and organisational environs for virtual and face-to-face engagement, impetus for integration, data and methodological support, and innovative ways to deliver synthesis products. We detail the contribution, role and value of synthesis using ACEAS to exemplify the capacity for synthesis centres to facilitate trans-organisational, transdisciplinary synthesis. We compare ACEAS to other international synthesis centres, and describe how it facilitated project teams and its objective of linking natural resource science to policy to management. Scientists and managers were brought together to actively collaborate in multi-institutional, cross-sectoral and transdisciplinary research on contemporary ecological problems. The teams analysed, integrated and synthesised existing data to co-develop solution-oriented publications and management recommendations that might otherwise not have been produced. We identify key outcomes of some ACEAS working groups which used synthesis to

  14. Data Management Challenges in a National Scientific Program of 55 Diverse Research Projects

    Science.gov (United States)

    De Bruin, T.

    2016-12-01

    In 2007-2015, the Dutch funding agency NWO funded the National Ocean and Coastal Research Program (in Dutch: ZKO). This program focused on `the scientific analysis of five societal challenges related to a sustainable use of the sea and coastal zones'. These five challenges were safety, economic yield, nature, spatial planning & development and water quality. The ZKO program was `set up to strengthen the cohesion and collaboration within Dutch marine research'. From the start of the program, data management was addressed, to allow data to be shared amongst the diverse research projects. The ZKO program was divided into four themes (or regions): Carrying Capacity (Wadden Sea), Oceans, North Sea and Transnational Wadden Sea Research. The `Carrying Capacity' theme was subdivided into three `research lines': Policy-relevant Research, Monitoring and Hypothesis-driven Research. 56 projects were funded, ranging from studies on the governance of the Wadden Sea to expeditions studying trace elements in the Atlantic Ocean. One of the first projects to be funded was the data management project. Its objectives were to allow data exchange between projects, to archive all relevant data from all ZKO projects and to make the data and publications publicly available, following the ZKO Data Policy. This project was carried out by the NIOZ Data Management Group. It turned out that the research projects had hardly any interest in sharing data between projects and had good (?) arguments not to share data at all until the end of the projects. A data portal was built, to host and make available all ZKO data and publications. When it came to submitting the data to this portal, most projects obliged willingly, though occasionally found it difficult to find time to do so. However, some projects refused to submit data to an open data portal, despite the rules set up by the funding agency and agreed by all. The take-home message of this presentation is that data sharing is a cultural and

  15. Data management of web archive research data

    DEFF Research Database (Denmark)

    Zierau, Eld; Jurik, Bolette

    This paper will provide recommendations to overcome various challenges for data management of web materials. The recommendations are based on results from two independent Danish research projects with different requirements to data management: The first project focuses on high precision on a par...

  16. Advances and challenges for nutrient management in china in the 21st century.

    Science.gov (United States)

    Sims, J T; Ma, L; Oenema, O; Dou, Z; Zhang, F S

    2013-07-01

    Managing agricultural nutrients to provide a safe and secure food supply while protecting the environment remains one of the great challenges for the 21st century. The fourth International Nutrient Management Symposium (INMS), held in 2011 at the University of Delaware, addressed these issues via presentations, panel sessions, and field tours focused on the latest technologies and policies available to increase nutrient use efficiency. Participants from the United States, Europe, Canada, and China discussed global trends and challenges, balancing food security and the environment in countries with struggling and emerging economies, nutrient management and transport at the catchment scale, new technologies for managing fertilizer and manure nutrients, and adaptive nutrient management practices for farm to watershed scales. A particular area of interest at the fourth INMS was nutrient management progress and challenges in China over the past 40 years. China's food security challenges and rapidly growing economy have led to major advances in agricultural production systems but also created severe nutrient pollution problems. This special collection of papers from the fourth INMS gives an overview of the remarkable progress China has made in nutrient management and highlights major challenges and changes in agri-environmental policies and practices needed today. Lessons learned in China are of value to both developing and developed countries facing the common task of providing adequate food for an expanding world population, while protecting air and water quality and restoring damaged ecosystems. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.

  17. Multidimensional Space-Time Methodology for Development of Planetary and Space Sciences, S-T Data Management and S-T Computational Tomography

    Science.gov (United States)

    Andonov, Zdravko

    This R&D presents an innovative multidimensional 6D-N(6n)D Space-Time (S-T) Methodology, 6D-6nD Coordinate Systems, 6D Equations, and a new 6D strategy and technology for the development of Planetary Space Sciences, S-T Data Management and S-T Computational Tomography. The Methodology is relevant to brand new RS Microwaves' Satellites and Computational Tomography Systems development, aimed at defending sustainable Earth, Moon & Sun System evolution. Especially important are innovations for monitoring and protection of the strategic trilateral system H-OH-H2O (Hydrogen, Hydroxyl and Water), corresponding to RS VHRS (Very High Resolution Systems) of 1.420-1.657-22.089 GHz microwaves. One of the greatest paradoxes and challenges of world science is the "transformation" of J. L. Lagrange's 4D Space-Time (S-T) System to H. Minkowski's 4D S-T System (O-X,Y,Z,icT) for Einstein's "Theory of Relativity". As a global result, contemporary advanced space sciences have no adequate 4D-6D Space-Time Coordinate System and no 6D Advanced Cosmos Strategy & Methodology for multidimensional and multitemporal Space-Time Data Management and Tomography. That is one of the most pressing S-T problems. The discovery of a simple and optimal nD S-T Methodology is extremely important for all universities' space sciences' education programs, for advances in space research and especially for all young space scientists' R&D. The top ten 21st-century challenges ahead of Planetary and Space Sciences, Space Data Management and Computational Space Tomography, important for the successful development of young scientist generations, are the following: 1. R&D of W. R. Hamilton's general idea of transforming all space sciences to time sciences, beginning with the 6D Eikonal for 6D anisotropic mediums & velocities. Development of IERS Earth & Space Systems (VLBI, LLR, GPS, SLR, DORIS etc.) for Planetary-Space Data Management & Computational Planetary & Space Tomography. 2. R&D of S. W. Hawking's Paradigm for 2D

  18. Essential Partnerships in the Data Management Life Cycle

    Science.gov (United States)

    Kinkade, D.; Allison, M. D.; Chandler, C. L.; Copley, N. J.; Gegg, S. R.; Groman, R. C.; Rauch, S.

    2015-12-01

    An obvious product of the scientific research process is data. Today's geoscience research efforts can rapidly produce an unprecedented volume of multidisciplinary data that can pose management challenges for the facility charged with curating that information. How do these facilities achieve efficient data management in a high volume, heterogeneous data world? Partnerships are critical, especially for small to mid-sized data management offices, such as those dedicated to academic research communities. The idea of partnerships can encompass a wide range of collaborative relationships aimed at helping these facilities meet the evolving needs of their communities. However, one basic and often overlooked partnership in the data management process is that of the information manager and the Principal Investigator (PI) or data originator. Such relationships are critical in discerning the best possible management strategy, and in obtaining the most robust metadata necessary for reuse of multidisciplinary datasets. Partnerships established early in the data life cycle enable efficient management and dissemination of data in high volumes and heterogeneous formats. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created to fulfill the data management needs of PIs funded by the NSF Ocean Sciences Biological and Chemical Sections, and Division of Polar Programs. Since its inception, the Office has relied upon the close relationships it cultivates between its data managers and PIs in order to provide effective data management for a wide variety of ecological and biogeochemical oceanographic data. This presentation will highlight some of the successful partnerships BCO-DMO has made with individual and collaborative investigators, as well as those with other data managers representing specific research communities.

  19. High Resolution Nature Runs and the Big Data Challenge

    Science.gov (United States)

    Webster, W. Phillip; Duffy, Daniel Q.

    2015-01-01

    NASA's Global Modeling and Assimilation Office at Goddard Space Flight Center is undertaking a series of very computationally intensive Nature Runs and a downscaled reanalysis. The nature runs use GEOS-5 as an Atmospheric General Circulation Model (AGCM), while the reanalysis uses GEOS-5 in data assimilation mode. This paper will present computational challenges from three runs, two of which are AGCM runs and one a downscaled reanalysis using the full DAS. The nature runs will be completed at two surface grid resolutions, 7 and 3 kilometers, and 72 vertical levels. The 7 km run spanned 2 years (2005-2006) and produced 4 PB of data, while the 3 km run will span one year and generate 4 PB of data. The downscaled reanalysis (MERRA-2, Modern-Era Retrospective analysis for Research and Applications) will cover 15 years and generate 1 PB of data. In our efforts to address the big data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS), a specialization of the concept of business-process-as-a-service that is an evolving extension of IaaS, PaaS, and SaaS enabled by cloud computing. In this presentation, we will describe two projects that demonstrate this shift. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS. MERRA/AS enables MapReduce analytics over the MERRA reanalysis data collection by bringing together high-performance computing, scalable data management, and a domain-specific climate data services API. NASA's High-Performance Science Cloud (HPSC) is an example of the type of compute-storage fabric required to support CAaaS. The HPSC comprises a high-speed InfiniBand network, high-performance file systems and object storage, and virtual system environments specific to data-intensive science applications. These technologies are providing a new tier in the data and analytic services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility
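The MapReduce pattern that MERRA/AS-style analytics apply to reanalysis collections can be illustrated with a minimal sketch. The temperature values and yearly chunking below are invented for illustration; this is not the actual MERRA/AS API, which operates on distributed reanalysis holdings rather than in-memory lists.

```python
from functools import reduce

# Hypothetical yearly chunks of a reanalysis variable (e.g., surface temperature in K).
# In a MERRA/AS-style system each chunk would live on a separate storage/compute node.
chunks = {
    2005: [288.1, 289.4, 287.9],
    2006: [288.6, 290.0, 288.2],
}

def map_chunk(values):
    """Map step: compute a partial (sum, count) for one chunk, locally."""
    return (sum(values), len(values))

def reduce_partials(a, b):
    """Reduce step: combine partial results into a single (sum, count)."""
    return (a[0] + b[0], a[1] + b[1])

# Map over every chunk independently, then reduce the partials to a global mean.
partials = [map_chunk(v) for v in chunks.values()]
total, count = reduce(reduce_partials, partials)
global_mean = total / count
```

Because only small (sum, count) pairs cross the network, this shape lets the heavy computation stay close to where each chunk of data is stored, which is the core idea behind moving analytics to the data.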

  20. Using GIS in an Earth Sciences Field Course for Quantitative Exploration, Data Management and Digital Mapping

    Science.gov (United States)

    Marra, Wouter A.; van de Grint, Liesbeth; Alberti, Koko; Karssenberg, Derek

    2017-01-01

    Field courses are essential for subjects like Earth Sciences, Geography and Ecology. In these topics, GIS is used to manage and analyse spatial data, and offers quantitative methods that are beneficial for fieldwork. This paper presents changes made to a first-year Earth Sciences field course in the French Alps, where new GIS methods were…

  1. Innovation in Extraterrestrial Service Systems - A Challenge for Service Science

    Science.gov (United States)

    Bergner, David

    2010-01-01

    This presentation was prepared at the invitation of Professor Yukio Ohsawa, Department of Systems Innovation, School of Engineering, The University of Tokyo, for delivery at the International Workshop on Innovating Service Systems, sponsored by the Japanese Society of Artificial Intelligence (JSAI) as part of the JSAI International Symposium on AI, 2010. It offers several challenges for Service Science and Service Innovation. The goal of the presentation is to stimulate thinking about how service systems will evolve in the future, as human society advances from its terrestrial base toward a permanent presence in space. First we will consider the complexity of the International Space Station (ISS) as it is today, with particular emphasis on its research facilities, and focus on a current challenge - to maximize the utilization of ISS research facilities for the benefit of society. After briefly reviewing the basic principles of Service Science, we will discuss the potential application of Service Innovation methodology to this challenge. Then we will consider how game-changing technologies - in particular Synthetic Biology - could accelerate the pace of sociocultural evolution and, consequently, the progression of human society into space. We will use this provocative vision to advance thinking about how the emerging field of Service Science, Management, and Engineering (SSME) might help us anticipate and better handle the challenges of this inevitable evolutionary process.

  2. NASA's Earth Science Data Systems

    Science.gov (United States)

    Ramapriyan, H. K.

    2015-01-01

    NASA's Earth Science Data Systems (ESDS) Program has evolved over the last two decades, and currently has several core and community components. Core components provide the basic operational capabilities to process, archive, manage and distribute data from NASA missions. Community components provide a path for peer-reviewed research in Earth Science Informatics to feed into the evolution of the core components. The Earth Observing System Data and Information System (EOSDIS) is a core component consisting of twelve Distributed Active Archive Centers (DAACs) and eight Science Investigator-led Processing Systems spread across the U.S. The presentation covers how the ESDS Program continues to evolve and benefits from as well as contributes to advances in Earth Science Informatics.

  3. Challenges and opportunities of open data in ecology.

    Science.gov (United States)

    Reichman, O J; Jones, Matthew B; Schildhauer, Mark P

    2011-02-11

    Ecology is a synthetic discipline benefiting from open access to data from the earth, life, and social sciences. Technological challenges exist, however, due to the dispersed and heterogeneous nature of these data. Standardization of methods and development of robust metadata can increase data access but are not sufficient. Reproducibility of analyses is also important, and executable workflows are addressing this issue by capturing data provenance. Sociological challenges, including inadequate rewards for sharing data, must also be resolved. The establishment of well-curated, federated data repositories will provide a means to preserve data while promoting attribution and acknowledgement of its use.
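The executable-workflow idea mentioned above, capturing data provenance as an analysis runs, can be sketched minimally. The decorator, step names and digest format below are illustrative inventions, not the mechanism of any specific workflow system:

```python
import datetime
import hashlib
import json

provenance = []  # ordered record of every processing step that ran

def tracked(step_name):
    """Decorator that logs each step's name, inputs, and a digest of its output."""
    def wrap(fn):
        def inner(*args):
            result = fn(*args)
            digest = hashlib.sha256(
                json.dumps(result, sort_keys=True).encode()
            ).hexdigest()
            provenance.append({
                "step": step_name,
                "inputs": args,
                "output_sha256": digest[:12],
                "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            })
            return result
        return inner
    return wrap

@tracked("clean")
def clean(values):
    """Drop missing observations (a hypothetical cleaning step)."""
    return [v for v in values if v is not None]

@tracked("mean")
def mean(values):
    return sum(values) / len(values)

result = mean(clean([4.0, None, 6.0]))
```

After the run, `provenance` holds an ordered, hash-stamped trace of every step, which is the raw material a repository needs to make the analysis reproducible and attributable.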

  4. Advanced statistical methods in data science

    CERN Document Server

    Chen, Jiahua; Lu, Xuewen; Yi, Grace; Yu, Hao

    2016-01-01

    This book gathers invited presentations from the 2nd Symposium of the ICSA- CANADA Chapter held at the University of Calgary from August 4-6, 2015. The aim of this Symposium was to promote advanced statistical methods in big-data sciences and to allow researchers to exchange ideas on statistics and data science and to embraces the challenges and opportunities of statistics and data science in the modern world. It addresses diverse themes in advanced statistical analysis in big-data sciences, including methods for administrative data analysis, survival data analysis, missing data analysis, high-dimensional and genetic data analysis, longitudinal and functional data analysis, the design and analysis of studies with response-dependent and multi-phase designs, time series and robust statistics, statistical inference based on likelihood, empirical likelihood and estimating functions. The editorial group selected 14 high-quality presentations from this successful symposium and invited the presenters to prepare a fu...

  5. Data quality management and data cleaning

    OpenAIRE

    Podobnikar , Uroš

    2016-01-01

    Today's enterprises are often challenged by managing the large amount of data used in their business operation. Assurance and maintenance of an adequate data quality level are important aspects of data quality management for many reasons. On the one hand, an adequate data quality level represents a competitive advantage, and on the other hand, a low data quality level leads to many unpleasant consequences. In the past, frameworks, methodologies, and tools to help ensuring adequate level of da...

  6. Big Data: New science, new challenges, new dialogical opportunities

    OpenAIRE

    Fuller, Michael

    2015-01-01

    The advent of extremely large datasets, known as “big data”, has been heralded as the instantiation of a new science, requiring a new kind of practitioner: the “data scientist”. This paper explores the concept of big data, drawing attention to a number of new issues – not least ethical concerns, and questions surrounding interpretation – which big data sets present. It is observed that the skills required for data scientists are in some respects closer to those traditionally associated with t...

  7. United States-Mexican Borderlands: Facing tomorrow's challenges through USGS science

    Science.gov (United States)

    Updike, Randall G.; Ellis, Eugene G.; Page, William R.; Parker, Melanie J.; Hestbeck, Jay B.; Horak, William F.

    2013-01-01

    Along the nearly 3,200 kilometers (almost 2,000 miles) of the United States–Mexican border, in an area known as the Borderlands, we are witnessing the expression of the challenges of the 21st century. This circular identifies several challenge themes and issues associated with life and the environment in the Borderlands, listed below. The challenges are not one-sided; they do not originate in one country only to become problems for the other. The issues and concerns of each challenge theme flow in both directions across the border, and both nations feel their effects throughout the Borderlands and beyond. The clear message is that our two nations, the United States and Mexico, face the issues in these challenge themes together, and the U.S. Geological Survey (USGS) understands it must work with its counterparts, partners, and customers in both countries.Though the mission of the USGS is not to serve as land manager, law enforcer, or code regulator, its innovation and creativity and the scientific and technical depth of its capabilities can be directly applied to monitoring the conditions of the landscape. The ability of USGS scientists to critically analyze the monitored data in search of signals and trends, whether they lead to negative or positive results, allows us to reach significant conclusions—from providing factual conclusions to decisionmakers, to estimating how much of a natural resource exists in a particular locale, to predicting how a natural hazard phenomenon will unfold, to forecasting on a scale from hours to millennia how ecosystems will behave.None of these challenge themes can be addressed strictly by one or two science disciplines; all require well-integrated, cross-discipline thinking, data collection, and analyses. The multidisciplinary science themes that have become the focus of the USGS mission parallel the major challenges in the border region between Mexico and the United States. Because of this multidisciplinary approach, the USGS

  8. Next Generation Cloud-based Science Data Systems and Their Implications on Data and Software Stewardship, Preservation, and Provenance

    Science.gov (United States)

    Hua, H.; Manipon, G.; Starch, M.

    2017-12-01

    NASA's upcoming missions are expected to generate data volumes at least an order of magnitude larger than current missions. A significant increase in data processing, data rates, data volumes, and long-term data archive capabilities is needed. Consequently, new challenges are emerging that impact traditional data and software management approaches. At large scales, next-generation science data systems are exploring the move onto cloud computing paradigms to support these increased needs. New implications, such as costs, data movement, collocation of data systems & archives, and moving processing closer to the data, may result in changes to the stewardship, preservation, and provenance of science data and software. With more science data systems being on-boarded onto cloud computing facilities, we can expect more Earth science data records to be both generated and kept in the cloud. But at large scales, the cost of processing and storing global data may impact architectural and system designs. Data systems will trade the cost of keeping data in the cloud against data life-cycle approaches of moving "colder" data back to traditional on-premise facilities. How will this impact data citation and processing-software stewardship? What are the impacts of cloud-based on-demand processing, and its effect on reproducibility and provenance? Similarly, with more science processing software being moved onto cloud, virtual machine, and container-based approaches, more opportunities arise for improved stewardship and preservation. But will the science community trust data reprocessed years or decades later? We will also explore emerging questions of the stewardship of the science data system software that is generating the science data records, both during and after the life of the mission.
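The hot/cold trade-off described above, keeping frequently used data in the cloud while migrating "colder" data back on premise, can be sketched as a simple cost-based policy. All prices, thresholds and parameter names here are made-up illustrations, not any mission's actual cost model:

```python
def choose_tier(monthly_accesses, size_tb,
                cloud_cost_per_tb=23.0,          # assumed $/TB/month in hot cloud storage
                onprem_cost_per_tb=5.0,          # assumed $/TB/month on premise
                retrieval_cost_per_access=2.0):  # assumed cost to pull data back per access
    """Return 'cloud' or 'on-premise' by comparing estimated monthly costs.

    On-premise storage is cheaper per TB, but every access to cold data
    incurs a retrieval cost; cloud storage costs more but access is cheap.
    """
    cloud_cost = size_tb * cloud_cost_per_tb
    onprem_cost = size_tb * onprem_cost_per_tb + monthly_accesses * retrieval_cost_per_access
    return "cloud" if cloud_cost < onprem_cost else "on-premise"
```

Under these toy numbers, a rarely touched 100 TB archive lands on premise, while the same archive accessed a thousand times a month stays in the cloud; a real policy would also weigh egress fees, latency and archival durability.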

  9. Data management for interdisciplinary field experiments: OTTER project support

    Science.gov (United States)

    Angelici, Gary; Popovici, Lidia; Skiles, J. W.

    1993-01-01

    The ability of investigators of an interdisciplinary science project to properly manage the data that are collected during the experiment is critical to the effective conduct of science. When the project becomes large, possibly including several scenes of large-format remotely sensed imagery shared by many investigators requiring several services, the data management effort can involve extensive staff and computerized data inventories. The OTTER (Oregon Transect Ecosystem Research) project was supported by the PLDS (Pilot Land Data System) with several data management services, such as data inventory, certification, and publication. After a brief description of these services, experiences in providing them are compared with earlier data management efforts and some conclusions regarding data management in support of interdisciplinary science are discussed. In addition to providing these services, a major goal of this data management capability was to adopt characteristics of a pro-active attitude, such as flexibility and responsiveness, believed to be crucial for the effective conduct of active, interdisciplinary science. These are also itemized and compared with previous data management support activities. Identifying and improving these services and characteristics can lead to the design and implementation of optimal data management support capabilities, which can result in higher quality science and data products from future interdisciplinary field experiments.

  10. Challenges in the Management and Stewardship of Airborne Observational Data at the National Center for Atmospheric Research (NCAR) Earth Observing Laboratory (EOL)

    Science.gov (United States)

    Aquino, J.; Daniels, M. D.

    2015-12-01

    The National Science Foundation (NSF) provides the National Center for Atmospheric Research (NCAR) Earth Observing Laboratory (EOL) funding for the operation, maintenance and upgrade of two research aircraft: the NSF/NCAR High-performance Instrumented Airborne Platform for Environmental Research (HIAPER) Gulfstream V and the NSF/NCAR Hercules C-130. A suite of in-situ and remote sensing airborne instruments housed at the EOL Research Aviation Facility (RAF) provides a basic set of measurements that are typically deployed on most airborne field campaigns. In addition, instruments to address more specific research requirements are provided by collaborating participants from universities, industry, NASA, NOAA or other agencies (referred to as Principal Investigator, or PI, instruments). At the 2014 AGU Fall Meeting, a poster (IN13B-3639) was presented outlining the components of Airborne Data Management, including field-phase data collection, formats, data archival and documentation, version control, storage practices, stewardship and obsolete data formats, and public data access. This talk will cover lessons learned, challenges associated with the above components, and current developments to address these challenges, including: tracking data workflows for aircraft instrumentation to facilitate identification, and correction, of gaps in these workflows; implementation of dataset versioning guidelines; and assignment of Digital Object Identifiers (DOIs) to data and instrumentation to facilitate tracking data and facility use in publications.
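The dataset-versioning and DOI-assignment practices mentioned above can be sketched with a toy catalog. The class, dataset identifier and DOIs below are hypothetical illustrations, not EOL's actual guidelines or real registered identifiers:

```python
class DatasetCatalog:
    """Tracks versioned dataset releases so every older version stays citable."""

    def __init__(self):
        self._versions = {}  # dataset id -> ordered list of (version, doi)

    def register(self, dataset_id, doi):
        """Record a new release; each release gets the next version number."""
        versions = self._versions.setdefault(dataset_id, [])
        version = f"{len(versions) + 1}.0"
        versions.append((version, doi))
        return version

    def latest(self, dataset_id):
        """Return the (version, doi) pair of the most recent release."""
        return self._versions[dataset_id][-1]

catalog = DatasetCatalog()
# Hypothetical dataset id and DOIs, for illustration only.
catalog.register("hiaper-gv-2014", "doi:10.0000/example.v1")
catalog.register("hiaper-gv-2014", "doi:10.0000/example.v2")
```

The key property is that reprocessing a dataset mints a new version and a new identifier instead of overwriting the old one, so publications citing version 1.0 remain resolvable after version 2.0 appears.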

  11. Data Management Practices and Perspectives of Atmospheric Scientists and Engineering Faculty

    Directory of Open Access Journals (Sweden)

    Christie Wiley

    2016-12-01

    This article analyzes 21 in-depth interviews of engineering and atmospheric science faculty at the University of Illinois Urbana-Champaign (UIUC) to determine faculty data management practices and needs within the context of their research activities. A detailed literature review of previous large-scale and institutional surveys and interviews revealed that researchers have a broad awareness of data-sharing mandates of federal agencies and journal publishers and a growing acceptance, with some concerns, of the value of data-sharing. However, the disciplinary differences in data management needs are significant and represent a set of challenges for libraries in setting up consistent and successful services. In addition, faculty have not yet significantly changed their data management practices to conform with the mandates. The interviews focused on current research projects and funding sources, data types and format, the use of disciplinary and institutional repositories, data-sharing, their awareness of university library data management and preservation services, funding agency review panel experiences, and struggles or challenges with managing research data. In general, the interviews corroborated the trends identified in the literature. One clear observation from the interviews was that scientists and engineers take a holistic view of the research lifecycle and treat data as one of many elements in the scholarly communication workflow. Data generation, usage, storage, and sharing are an integrated aspect of a larger scholarly workflow, and are not necessarily treated as a separate entity. Acknowledging this will allow libraries to develop programs that better integrate data management support into scholarly communication instruction and training.

  12. Characterizing Big Data Management

    OpenAIRE

    Rogério Rossi; Kechi Hirama

    2015-01-01

    Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: t...

  13. Translational Science Project Team Managers: Qualitative Insights and Implications from Current and Previous Postdoctoral Experiences.

    Science.gov (United States)

    Wooten, Kevin C; Dann, Sara M; Finnerty, Celeste C; Kotarba, Joseph A

    2014-07-01

    The development of leadership and project management skills is increasingly important to the evolution of translational science and team-based endeavors. Team science is dependent upon individuals at various stages in their careers, inclusive of postdocs. Data from case histories, as well as from interviews with current and former postdocs, and those supervising postdocs, indicate six essential tasks required of project managers in multidisciplinary translational teams, along with eight skill-related themes critical to their success. To optimize the opportunities available and to ensure sequential development of team project management skills, a life cycle model for the development of translational team skills is proposed, ranging from graduate trainees, postdocs, assistant professors, and finally to mature scientists. Specific goals, challenges and project management roles and tasks are recommended for each stage for the life cycle.

  14. Historical legacies, information and contemporary water science and management

    Science.gov (United States)

    Bain, Daniel J.; Arrigo, Jennifer A.S.; Green, Mark B.; Pellerin, Brian A.; Vörösmarty, Charles J.

    2011-01-01

    Hydrologic science has largely built its understanding of the hydrologic cycle using contemporary data sources (i.e., last 100 years). However, as we try to meet water demand over the next 100 years at scales from local to global, we need to expand our scope and embrace other data that address human activities and the alteration of hydrologic systems. For example, the accumulation of human impacts on water systems requires exploration of incompletely documented eras. When examining these historical periods, basic questions relevant to modern systems arise: (1) How is better information incorporated into water management strategies? (2) Does any point in the past (e.g., colonial/pre-European conditions in North America) provide a suitable restoration target? and (3) How can understanding legacies improve our ability to plan for future conditions? Beginning to answer these questions indicates the vital need to incorporate disparate data and less accepted methods to meet looming water management challenges.

  15. Globus Identity, Access, and Data Management: Platform Services for Collaborative Science

    Science.gov (United States)

    Ananthakrishnan, R.; Foster, I.; Wagner, R.

    2016-12-01

    Globus is software-as-a-service for research data management, developed at, and operated by, the University of Chicago. Globus, accessible at www.globus.org, provides high speed, secure file transfer; file sharing directly from existing storage systems; and data publication to institutional repositories. 40,000 registered users have used Globus to transfer tens of billions of files totaling hundreds of petabytes between more than 10,000 storage systems within campuses and national laboratories in the US and internationally. Web, command line, and REST interfaces support both interactive use and integration into applications and infrastructures. An important component of the Globus system is its foundational identity and access management (IAM) platform service, Globus Auth. Both Globus research data management and other applications use Globus Auth for brokering authentication and authorization interactions between end-users, identity providers, resource servers (services), and a range of clients, including web, mobile, and desktop applications, and other services. Compliant with important standards such as OAuth, OpenID, and SAML, Globus Auth provides mechanisms required for an extensible, integrated ecosystem of services and clients for the research and education community. It underpins projects such as the US National Science Foundation's XSEDE system, NCAR's Research Data Archive, and the DOE Systems Biology Knowledge Base. Current work is extending Globus services to be compliant with FedRAMP standards for security assessment, authorization, and monitoring for cloud services. We will present Globus IAM solutions and give examples of Globus use in various projects for federated access to resources. We will also describe how Globus Auth and Globus research data management capabilities enable rapid development and low-cost operations of secure data sharing platforms that leverage Globus services and integrate them with local policy and security.
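The brokering pattern described above, a service issuing scoped bearer tokens that resource servers then validate, can be sketched generically. This is a toy illustration of the OAuth-style pattern, not the Globus Auth API or the Globus SDK; the class, scope names and TTL are invented:

```python
import secrets
import time

class AuthBroker:
    """Toy OAuth-style broker: issues scoped bearer tokens and validates them.

    A generic illustration of the pattern only; real brokers like Globus Auth
    add identity federation, refresh tokens, consent, and token introspection.
    """

    def __init__(self):
        self._tokens = {}  # opaque token -> {user, scopes, expiry}

    def issue(self, user, scopes, ttl_seconds=3600):
        """Mint an opaque bearer token carrying a set of scopes."""
        token = secrets.token_urlsafe(16)
        self._tokens[token] = {
            "user": user,
            "scopes": set(scopes),
            "expires": time.time() + ttl_seconds,
        }
        return token

    def validate(self, token, required_scope):
        """Return the user if the token is live and carries the scope, else None."""
        info = self._tokens.get(token)
        if info is None or time.time() > info["expires"]:
            return None
        if required_scope not in info["scopes"]:
            return None
        return info["user"]

broker = AuthBroker()
token = broker.issue("alice", ["transfer:read"])
```

The resource server never sees the user's credentials, only the token; presenting a token lacking the required scope (say, `transfer:write`) is rejected, which is what lets one broker mediate many services and clients.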

  16. Science Support: The Building Blocks of Active Data Curation

    Science.gov (United States)

    Guillory, A.

    2013-12-01

    While the scientific method is built on reproducibility and transparency, and results are published in peer-reviewed literature, we have entered a digital age of very large datasets (now of the order of petabytes and soon exabytes) that cannot be published in the traditional way. To preserve reproducibility and transparency, active curation is necessary to keep and protect the information in the long term, and 'science support' activities provide the building blocks for active data curation. With the explosive growth of data in all fields in recent years, data centres face a pressing need to provide adequate services to ensure long-term preservation and digital curation of project data outputs, however complex those may be. Science support provides advice and support to science projects on data and information management, from file formats through to general data management awareness. Another purpose of science support is to raise awareness in the science community of data and metadata standards and best practice, engendering a culture where data outputs are seen as valued assets. At the heart of science support is the Data Management Plan (DMP), which sets out a coherent approach to data issues pertaining to the data-generating project. It provides an agreed record of the data management needs and issues within the project. The DMP is agreed upon with project investigators to ensure that a high-quality, documented data archive is created. It includes conditions of use and deposit to clearly express the ownership, responsibilities and rights associated with the data. Project-specific needs are also identified for data processing, visualization tools and data sharing services. 
As part of the National Centre for Atmospheric Science (NCAS) and National Centre for Earth Observation (NCEO), the Centre for Environmental Data Archival (CEDA) fulfills this science support role of facilitating atmospheric and Earth observation data generating projects to ensure

  17. NASA's Earth Science Enterprise: Future Science Missions, Objectives and Challenges

    Science.gov (United States)

    Habib, Shahid

    1998-01-01

    NASA has been actively involved in studying the planet Earth and its changing environment for well over thirty years. Within the last decade, NASA's Earth Science Enterprise has become a major observational and scientific element of the U.S. Global Change Research Program. NASA's Earth Science Enterprise management has developed a comprehensive observation-based research program addressing all the critical science questions that will take us into the next century. Furthermore, the entire program is being mapped to answer five Science Themes: (1) land-cover and land-use change research, (2) seasonal-to-interannual climate variability and prediction, (3) natural hazards research and applications, (4) long-term climate-natural variability and change research, and (5) atmospheric ozone research. The emergence of newer technologies on the horizon, combined with a continuously declining budget environment, has led to an effort to refocus the Earth Science Enterprise activities. The intent is not to compromise the overall scientific goals, but rather to strengthen them by enabling challenging detection, computational, and space flight technologies that have not been practically feasible to date. NASA is planning faster, more cost-effective, and relatively smaller missions to continue the science observations from space for the next decade. At the same time, there is growing worldwide interest in remote sensing, which will allow NASA to build strong coalitions with a number of international partners. The focus of this presentation is to provide a comprehensive look at NASA's Earth Science Enterprise in terms of its brief history, scientific objectives, organization, activities, and future direction.

  18. Using integrated research and interdisciplinary science: Potential benefits and challenges to managers of parks and protected areas

    Science.gov (United States)

    van Riper, Charles; Powell, Robert B.; Machlis, Gary; van Wagtendonk, Jan W.; van Riper, Carena J.; von Ruschkowski, Eick; Schwarzbach, Steven E.; Galipeau, Russell E.

    2012-01-01

    Our purpose in this paper is to build a case for utilizing interdisciplinary science to enhance the management of parks and protected areas. We suggest that interdisciplinary science is necessary for dealing with the complex issues of contemporary resource management, and that using the best available integrated scientific information be embraced and supported at all levels of agencies that manage parks and protected areas. It will take the commitment of park managers, scientists, and agency leaders to achieve the goal of implementing the results of interdisciplinary science into park management. Although such calls go back at least several decades, today interdisciplinary science is sporadically being promoted as necessary for supporting effective protected area management (e.g., Machlis et al. 1981; Kelleher and Kenchington 1991). Despite this history, rarely has "interdisciplinary science" been defined, its importance explained, or guidance provided on how to translate and then implement the associated research results into management actions (Tress et al. 2006; Margles et al. 2010). With the extremely complex issues that now confront protected areas (e.g., climate change influences, extinctions and loss of biodiversity, human and wildlife demographic changes, and unprecedented human population growth), information from more than one scientific discipline will need to be brought to bear in order to achieve sustained management solutions that resonate with stakeholders (Ostrom 2009). Although interdisciplinary science is not the solution to all problems, we argue that interdisciplinary research is an evolving and widely supported best practice. In the case of park and protected area management, interdisciplinary science is being driven by the increasing recognition of the complexity and interconnectedness of human and natural systems, and the notion that addressing many problems can be more rapidly advanced through interdisciplinary study and analysis.

  19. PanMetaDocs - A tool for collecting and managing the long tail of "small science data"

    Science.gov (United States)

    Klump, J.; Ulbricht, D.

    2011-12-01

    In the early days of thinking about cyberinfrastructure the focus was on "big science data". Today, the challenge is no longer to store several terabytes of data, but to manage data objects in a way that facilitates their re-use. Key to re-use by a data consumer is proper documentation of the data. Data consumers need discovery metadata to find the data they need, and they need descriptive metadata to be able to use the data they retrieve. Thus, data documentation faces the challenge of describing these objects extensively and completely while keeping them easily accessible at a sustainable cost. However, data curation and documentation do not rank high in the everyday work of a scientist as a data producer. Data producers are often frustrated by being asked to provide metadata on their data over and over again, information that seems obvious from the context of their work. A further challenge to data archives is the wide variety of metadata schemata in use, which creates a number of maintenance and design challenges of its own. PanMetaDocs addresses these issues by allowing an uploaded file to be described by more than one metadata object. PanMetaDocs, which was developed from PanMetaWorks, is a PHP-based web application that allows data to be described with any XML-based metadata schema. Its user interface is browser based and was developed to collect metadata and data in collaborative scientific projects situated at one or more institutions. The metadata fields can be filled with static or dynamic content to reduce manual entry to a minimum and to make use of contextual information in a project setting. In the development of PanMetaDocs the business logic of PanMetaWorks is reused, except for the authentication and data management functions, which are delegated to the eSciDoc framework. The eSciDoc repository framework is designed as a service-oriented architecture that can be controlled through a
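
The core idea of attaching more than one metadata object to a single uploaded file can be sketched as follows. This is an illustrative model only, assuming hypothetical schema names ("dc", "iso") and fields, not PanMetaDocs' actual data model.

```python
# Minimal sketch of the PanMetaDocs idea that one uploaded file can carry
# several metadata records, each in a different XML-based schema. The
# schema names and fields below are illustrative assumptions.
import xml.etree.ElementTree as ET

class DataObject:
    def __init__(self, filename):
        self.filename = filename
        self.metadata = {}            # schema name -> parsed XML record

    def attach_metadata(self, schema, xml_text):
        self.metadata[schema] = ET.fromstring(xml_text)

obj = DataObject("core_section_42.csv")
# Discovery metadata: enough to *find* the dataset.
obj.attach_metadata("dc", "<dc><title>Core section 42</title></dc>")
# Descriptive metadata: enough to *use* it (units, instrument, ...).
obj.attach_metadata(
    "iso", "<iso><unit>percent</unit><instrument>XRF</instrument></iso>")

print(sorted(obj.metadata))                  # ['dc', 'iso']
print(obj.metadata["dc"].findtext("title"))  # Core section 42
```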

  20. Using Smartphones to Collect Behavioral Data in Psychological Science: Opportunities, Practical Considerations, and Challenges.

    Science.gov (United States)

    Harari, Gabriella M; Lane, Nicholas D; Wang, Rui; Crosier, Benjamin S; Campbell, Andrew T; Gosling, Samuel D

    2016-11-01

    Smartphones now offer the promise of collecting behavioral data unobtrusively, in situ, as it unfolds in the course of daily life. Data can be collected from the onboard sensors and other phone logs embedded in today's off-the-shelf smartphone devices. These data permit fine-grained, continuous collection of people's social interactions (e.g., speaking rates in conversation, size of social groups, calls, and text messages), daily activities (e.g., physical activity and sleep), and mobility patterns (e.g., frequency and duration of time spent at various locations). In this article, we have drawn on the lessons from the first wave of smartphone-sensing research to highlight areas of opportunity for psychological research, present practical considerations for designing smartphone studies, and discuss the ongoing methodological and ethical challenges associated with research in this domain. It is our hope that these practical guidelines will facilitate the use of smartphones as a behavioral observation tool in psychological science. © The Author(s) 2016.
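
As a toy illustration of one mobility pattern mentioned above (duration of time spent at various locations), the following sketch aggregates a timestamped, place-labeled location log. The log values and the fixed-dwell assumption are fabricated for the example.

```python
# Illustrative sketch (not from the article) of turning a timestamped
# smartphone location log into time spent at each place. Assumes each
# sample represents a fixed sampling interval of dwell time.

# (timestamp in minutes since midnight, place label inferred from GPS)
log = [(540, "home"), (555, "home"), (570, "campus"),
       (585, "campus"), (600, "cafe"), (615, "home")]

def time_per_place(samples, step=15):
    """Credit `step` minutes of dwell time per sample to its place."""
    totals = {}
    for _, place in samples:
        totals[place] = totals.get(place, 0) + step
    return totals

print(time_per_place(log))  # {'home': 45, 'campus': 30, 'cafe': 15}
```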

  1. Data science in ALICE

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    ALICE is the LHC experiment dedicated to the study of heavy-ion collisions. In particular, the detector features low-momentum tracking and vertexing, and comprehensive particle identification capabilities. In a single central heavy-ion collision at the LHC, thousands of particles per unit rapidity are produced, making the data volume, track reconstruction, and search for rare signals particularly challenging. Data science and machine learning techniques could help to tackle some of the challenges outlined above. In this talk, we will discuss some early attempts to use these techniques for the processing of detector signals and for physics analysis. We will also highlight the most promising areas for the application of these methods.

  2. Data science, learning, and applications to biomedical and health sciences.

    Science.gov (United States)

    Adam, Nabil R; Wieder, Robert; Ghosh, Debopriya

    2017-01-01

    The last decade has seen an unprecedented increase in the volume and variety of electronic data related to research and development, health records, and patient self-tracking, collectively referred to as Big Data. Properly harnessed, Big Data can provide insights and drive discovery that will accelerate biomedical advances, improve patient outcomes, and reduce costs. However, the considerable potential of Big Data remains unrealized owing to obstacles including a limited ability to standardize and consolidate data, and challenges in sharing data among a variety of sources, providers, and facilities. Here, we discuss some of these challenges and potential solutions, as well as initiatives that are already underway to take advantage of Big Data. © 2017 New York Academy of Sciences.

  3. U.S. National forests adapt to climate change through science-management partnerships

    Science.gov (United States)

    Jeremy S. Littell; David L. Peterson; Constance I. Millar; Kathy A. O' Halloran

    2011-01-01

    Developing appropriate management options for adapting to climate change is a new challenge for land managers, and integration of climate change concepts into operational management and planning on United States national forests is just starting. We established science-management partnerships on the Olympic National Forest (Washington) and Tahoe National Forest (...

  4. Science for the Poor: How One Woman Challenged Researchers, Ranchers, and Loggers in Amazonia

    Directory of Open Access Journals (Sweden)

    Patricia Shanley

    2006-12-01

    Full Text Available In the lower Tocantins region of Brazil, one Amazonian woman questioned why scientists publish principally for elite audiences. Her experience suggests that the impact may be enhanced by also sharing data with people who depend upon forest goods. Having defended her family homestead near the city of Cameta against loggers in the late 1980s, Glória Gaia became interested in strengthening the information base of other villagers so that they would not lose their forests for meager sums. She challenged scientists to defy norms such as extracting data without giving back to rural villagers and publishing primarily for the privileged. Working with researchers, she helped them to publish an illustrated manual of the ecology, economics, management, and cultural importance of key Amazonian forest species. With and without funds or a formal project, she traveled by foot and boat to remote villages to disseminate the book. Using data, stories, and song, she brought cautionary messages to villages about the impacts of logging on livelihoods. She also brought locally useful processing techniques regarding medicinal plants, fruit, and tree oils. Her holistic teachings challenged traditional forestry to include the management of fruits, fibers, and medicines. A new version of the book, requested by the government of Brazil, contains the contributions of 90 leading Brazilian and international scientists and local people. Glória Gaia's story raises the questions: Who is science for and how can science reach disenfranchised populations? Lessons for scientists and practitioners from Glória's story include: broadening the range of products from research to reach local people, complementing local ecological knowledge with scientific data, sharing precautionary data demonstrating trends, and involving women and marginalized people in the research and outreach process.

  5. Intermediate Trends in Math and Science Partnership-Related Changes in Student Achievement with Management Information System Data

    Science.gov (United States)

    Dimitrov, Dimiter M.

    2009-01-01

    This substudy in the evaluation design of the Math and Science Partnership (MSP) Program Evaluation examines student proficiency in mathematics and science for the MSPs' schools in terms of changes across three years (2003/04, 2004/05, and 2005/06) and relationships with MSP-related variables using Management Information System data with the…

  6. NASA Langley Atmospheric Science Data Centers Near Real-Time Data Products

    Science.gov (United States)

    Davenport, T.; Parker, L.; Rinsland, P. L.

    2014-12-01

    Over the past decade the Atmospheric Science Data Center (ASDC) at NASA Langley Research Center has archived and distributed a variety of satellite mission data sets. NASA's goal in Earth science is to observe, understand, and model the Earth system to discover how it is changing, to better predict change, and to understand the consequences for life on Earth. The ASDC has collaborated with Science Teams to accommodate emerging science users in the climate and modeling communities. The ASDC has expanded its original role to support operational usage by related Earth Science satellites, land and ocean assimilations, field campaigns, outreach programs, and application projects for the agriculture and energy industries, bridging the gap between Earth science research results and the adoption of data and prediction capabilities for reliable and sustained use in Decision Support Systems (DSS). For example, these products are being used by the data assimilation community to regulate aerosol mass in global transport models to improve model response and forecast accuracy, to assess the performance of components of a global coupled atmospheric-ocean climate model, to improve the impact of atmospheric motion vectors (winds) on numerical weather prediction models, and to provide internet-based access to parameters specifically tailored to assist in the design of solar- and wind-powered renewable energy systems. These more focused applications often require Near Real-Time (NRT) products. Generating NRT products poses its own unique set of challenges for the ASDC and the Science Teams. Examples of ASDC NRT products and challenges will be discussed.

  7. A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data

    Science.gov (United States)

    Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.

    2014-12-01

    Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distributing scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics, and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth Science observing satellites and the magnitude of data from climate model output are predicted to grow into the tens of petabytes, challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines is defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges. Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while

  8. The medical science DMZ: a network design pattern for data-intensive medical science.

    Science.gov (United States)

    Peisert, Sean; Dart, Eli; Barnett, William; Balas, Edward; Cuff, James; Grossman, Robert L; Berman, Ari; Shankar, Anurag; Tierney, Brian

    2017-10-06

    We describe a detailed solution for maintaining high-capacity, data-intensive network flows (e.g., 10, 40, or 100+ Gbps) in a scientific, medical context while still adhering to security and privacy laws and regulations. The approach combines high-end networking, packet-filter firewalls, and network intrusion-detection systems. We describe a "Medical Science DMZ" concept as an option for secure, high-volume transport of large, sensitive datasets between research institutions over national research networks, and give three detailed descriptions of implemented Medical Science DMZs. The exponentially increasing amounts of "omics" data, high-quality imaging, and other rapidly growing clinical datasets have resulted in the rise of biomedical research "Big Data." The storage, analysis, and network resources required to process these data and integrate them into patient diagnoses and treatments have grown to scales that strain the capabilities of academic health centers. Some data are not generated locally and cannot be sustained locally, and shared data repositories such as those provided by the National Library of Medicine, the National Cancer Institute, and international partners such as the European Bioinformatics Institute are rapidly growing. The ability to store and compute using these data must therefore be addressed by a combination of local, national, and industry resources that exchange large datasets. Maintaining data-intensive flows that comply with the Health Insurance Portability and Accountability Act (HIPAA) and other regulations presents a new challenge for biomedical research. We describe a strategy that marries performance and security by borrowing from and redefining the concept of a Science DMZ, a framework that is used in physical sciences and engineering research to manage high-capacity data flows. By implementing a Medical Science DMZ architecture, biomedical researchers can leverage the scale provided by high-performance computer and cloud storage facilities and national high
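
One ingredient of the Science DMZ pattern is enforcing security with narrow, per-flow access-control lists rather than a deep-inspection firewall in the high-speed data path. A minimal sketch of such an ACL check, with made-up addresses and a port, might look like:

```python
# Hedged illustration of per-flow ACL enforcement in a Science DMZ-style
# design. The networks, host, and port below are documentation-range
# placeholders, not a real deployment's values.
import ipaddress

ACL = [  # (allowed source network, allowed destination host, dest port)
    (ipaddress.ip_network("192.0.2.0/24"),
     ipaddress.ip_address("198.51.100.10"), 2811),
]

def permits(src, dst, dport):
    """Return True only if (src, dst, dport) matches an explicit ACL entry."""
    src, dst = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return any(src in net and dst == host and dport == port
               for net, host, port in ACL)

print(permits("192.0.2.7", "198.51.100.10", 2811))    # True: listed flow
print(permits("203.0.113.5", "198.51.100.10", 2811))  # False: unknown source
```

The design choice is to keep the rule set small and flow-specific, so the data transfer nodes see line-rate traffic while everything not explicitly listed is dropped.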

  9. Clinical data management: Current status, challenges, and future directions from industry perspectives

    Directory of Open Access Journals (Sweden)

    Zhengwu Lu

    2010-06-01

    Full Text Available Zhengwu Lu (Smith Hanley Consulting, Houston, Texas, USA) and Jing Su (Department of Chemical Engineering, University of Massachusetts, Amherst, MA, USA). Abstract: To maintain a competitive position, the biopharmaceutical industry has been facing the challenge of increasing productivity both internally and externally. As the product of the clinical development process, clinical data are recognized to be the key corporate asset and provide critical evidence of a medicine’s efficacy and safety and of its potential economic value to the market. It is also well recognized that using effective technology-enabled methods to manage clinical data can enhance the speed with which the drug is developed and commercialized, hence enhancing the competitive advantage. The effective use of data-capture tools may ensure that high-quality data are available for early review and rapid decision-making. A well-designed, protocol-driven, standardized, site workflow-oriented and documented database, populated via efficient data feed mechanisms, will ensure regulatory and commercial questions receive rapid responses. When information from a sponsor’s clinical database or data warehouse develops into corporate knowledge, the value of the medicine can be realized. Moreover, regulators, payer groups, patients, activist groups, patient advocacy groups, and employers are becoming more educated consumers of medicine, requiring monetary value and quality, and seeking out up-to-date medical information supplied by biopharmaceutical companies. All these developments in the current biopharmaceutical arena demand that clinical data management (CDM is at the forefront, leading change, influencing direction, and providing objective evidence. Sustaining an integrated database or data repository for initial product registration and subsequent postmarketing uses is a long-term process to maximize return on investment for organizations. CDM should be the owner of driving clinical data

  10. The global nutrient challenge. From science to public engagement

    Energy Technology Data Exchange (ETDEWEB)

    Sutton, M.A.; Howard, C.M. [NERC Centre for Ecology and Hydrology, Edinburgh (United Kingdom); Bleeker, A. [Energy research Centre of the Netherlands, Petten (Netherlands); Datta, A. [United Nations Environment Programme, Nairobi (Kenya)

    2013-04-15

    Among the many environment and development challenges facing humanity, it is fair to say that nutrients do not currently feature so regularly in the newspapers, radio and television. The media tends to prefer easy single issues which affect our daily lives in a clear-cut way. The role of carbon in climate change is a good example. We all depend on climate. Burning fossil fuels makes more carbon dioxide, tending to change temperature and rainfall patterns, to which we can easily relate. The science is complex, but it is a simple message for the public to understand. It does not take long to think of several other easily grasped threats, like urban air pollution, poor drinking water, or even the occurrence of horsemeat in food chains. It is perhaps for these reasons that the role of nutrients in environmental change has received much less public attention. After all, nutrients - including nitrogen, phosphorus and many micronutrients - play multiple roles in our world; they affect many biogeochemical processes and they lead to a plethora of interacting threats. If we are not careful, we can quickly get buried in the complexity of the different ways in which our lives are affected by these elements. The outcome is that it can become hard to convey the science of global nutrient cycles in a way that the public can understand. These are points about which we have given substantial thought as we contributed to a recently launched report Our Nutrient World: The challenge to produce more food and energy with less pollution (Sutton et al., 2013). The report was commissioned by the United Nations Environment Programme (UNEP) and conducted by the Global Partnership on Nutrient Management in cooperation with the International Nitrogen Initiative. The commission was not to provide a full scientific assessment, but rather to develop a global overview of the challenges associated with nutrient management. Drawing on existing knowledge, the aim was to distill the nature of the

  11. Smart Data Infrastructure: The Sixth Generation of Mediation for Data Science

    Science.gov (United States)

    Fox, P. A.

    2014-12-01

    In the emergent "fourth paradigm" (data-driven) science, the scientific method is enhanced by the integration of significant data sources into the practice of scientific research. To address Big Science, there are challenges in understanding the role of data in enabling researchers to attack not just disciplinary issues, but also the system-level, large-scale, and transdisciplinary global scientific challenges facing society. Recognizing that the volume of data is only one of many dimensions to be considered, there is a clear need for improved data infrastructures to mediate data and information exchange, which we contend will need to be powered by semantic technologies. One clear need is to provide computational approaches for researchers to discover appropriate data resources, rapidly integrate data collections from heterogeneous resources or multiple data sets, and inter-compare results to allow generation and validation of hypotheses. Another trend is toward automated tools that allow researchers to better find and reuse data that they currently don't know they need, let alone know how to find. Again semantic technologies will be required. Finally, to turn data analytics from "art to science", technical solutions are needed for cross-dataset validation, reproducibility studies on data-driven results, and the concomitant citation of data products allowing recognition for those who curate and share important data resources.

  12. Bulk Data Movement for Climate Dataset: Efficient Data Transfer Management with Dynamic Transfer Adjustment

    International Nuclear Information System (INIS)

    Sim, Alexander; Balman, Mehmet; Williams, Dean; Shoshani, Arie; Natarajan, Vijaya

    2010-01-01

    Many scientific applications and experiments, such as high energy and nuclear physics, astrophysics, climate observation and modeling, combustion, nano-scale material sciences, and computational biology, generate extreme volumes of data with a large number of files. These data sources are distributed among national and international data repositories, and are shared by large numbers of geographically distributed scientists. A large portion of the data is frequently accessed, and a large volume of data is moved from one place to another for analysis and storage. One challenging issue in such efforts is the limited network capacity available for moving and managing large datasets. The Bulk Data Mover (BDM), a data transfer management tool in the Earth System Grid (ESG) community, has been managing massive dataset transfers efficiently with pre-configured transfer properties in environments where network bandwidth is limited. Dynamic transfer adjustment was studied to enhance the BDM to handle significant end-to-end performance changes in the dynamic network environment as well as to control the data transfers for the desired transfer performance. We describe the results from the BDM transfer management for the climate datasets. We also describe the transfer estimation model and results from the dynamic transfer adjustment.
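
The dynamic transfer adjustment described above can be sketched as a simple feedback loop: measure end-to-end throughput, then add or remove concurrent streams. The thresholds and limits below are illustrative assumptions, not BDM's actual algorithm.

```python
# Sketch of throughput-feedback concurrency control for bulk transfers.
# Tuning constants (5% bands, stream limits) are illustrative only.
def adjust_streams(current, last_tput, new_tput, lo=1, hi=16):
    """Return a new concurrency level given two throughput samples (MB/s)."""
    if new_tput > last_tput * 1.05:   # still improving: add a stream
        return min(current + 1, hi)
    if new_tput < last_tput * 0.95:   # degrading: back off
        return max(current - 1, lo)
    return current                    # within noise: hold steady

streams = 4
for last, new in [(100.0, 120.0), (120.0, 130.0), (130.0, 110.0)]:
    streams = adjust_streams(streams, last, new)
print(streams)  # 4 -> 5 -> 6 -> 5
```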

  13. 1998 Environmental Management Science Program Annual Report

    International Nuclear Information System (INIS)

    1999-01-01

    The Environmental Management Science Program (EMSP) is a collaborative partnership between the DOE Office of Environmental Management (EM), Office of Science (DOE-SC), and the Idaho Operations Office (DOE-ID) to sponsor basic environmental and waste management related research. Results are expected to lead to reduction of the costs, schedule, and risks associated with cleaning up the nation's nuclear complex. The EMSP research portfolio addresses the most challenging technical problems of the EM program related to high level waste, spent nuclear fuel, mixed waste, nuclear materials, remedial action, decontamination and decommissioning, and health, ecology, or risk. The EMSP was established in response to a mandate from Congress in the fiscal year 1996 Energy and Water Development Appropriations Act. Congress directed the Department to "provide sufficient attention and resources to longer-term basic science research which needs to be done to ultimately reduce cleanup costs, develop a program that takes advantage of laboratory and university expertise, and seek new and innovative cleanup methods to replace current conventional approaches which are often costly and ineffective". This mandate followed similar recommendations from the Galvin Commission to the Secretary of Energy Advisory Board. The EMSP also responds to needs identified by National Academy of Sciences experts, regulators, citizen advisory groups, and other stakeholders

  14. The role of metadata in managing large environmental science datasets. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Melton, R.B.; DeVaney, D.M. [eds.] [Pacific Northwest Lab., Richland, WA (United States); French, J. C. [Univ. of Virginia, (United States)

    1995-06-01

    The purpose of this workshop was to bring together computer science researchers and environmental sciences data management practitioners to consider the role of metadata in managing large environmental sciences datasets. The objectives included: establishing a common definition of metadata; identifying categories of metadata; defining problems in managing metadata; and defining problems related to linking metadata with primary data.

  15. The Science DMZ: A Network Design Pattern for Data-Intensive Science

    Energy Technology Data Exchange (ETDEWEB)

    Dart, Eli; Rotman, Lauren; Tierney, Brian; Hester, Mary; Zurawski, Jason

    2013-08-13

    The ever-increasing scale of scientific data has become a significant challenge for researchers who rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data transfer performance and impedes scientific progress. The Science DMZ paradigm comprises a proven set of network design patterns that collectively address these problems for scientists. We explain the Science DMZ model, including the network architecture, system configuration, cybersecurity, and performance tools that together create an optimized network environment for science. We describe use cases from universities, supercomputing centers, and research laboratories, highlighting the effectiveness of the Science DMZ model in diverse operational settings. In all, the Science DMZ model is a solid platform that supports any science workflow and flexibly accommodates emerging network technologies. As a result, the Science DMZ vastly improves collaboration, accelerating scientific discovery.

  16. WFIRST: Microlensing Analysis Data Challenge

    Science.gov (United States)

    Street, Rachel; WFIRST Microlensing Science Investigation Team

    2018-01-01

WFIRST will produce thousands of high-cadence, high-photometric-precision lightcurves of microlensing events, from which a wealth of planetary and stellar systems will be discovered. However, the analysis of such lightcurves has historically been very time-consuming and expensive in both labor and computing facilities. This poses a potential bottleneck to deriving the full science potential of the WFIRST mission. To address this problem, the WFIRST Microlensing Science Investigation Team is designing a series of data challenges to stimulate research on outstanding problems of microlensing analysis. These range from the classification and modeling of triple-lens events to methods for efficiently yet thoroughly searching a high-dimensional parameter space for the best-fitting models.

  17. Data audit as a management tool.

    Science.gov (United States)

    Baldwin, J K; Hoover, B K

    1989-09-01

    Management faces the three basic challenges of achieving cost efficiency, successfully applying new technology, and minimizing risk. This paper examines the ways in which data audits can help managers come to terms with each of these challenges. It emphasizes the role of data audits in providing reliable information on which to base decisions and stresses the importance of auditing data in process rather than retrospectively. The paper examines frequently encountered sources of error and discusses the ways audits benefit management by assuring the integrity of experimental design, by preventing misassignment of observations, and by preventing the loss of data.

  18. Chemistry Students' Challenges in Using MBL's in Science Laboratories.

    Science.gov (United States)

    Atar, Hakan Yavuz

Understanding students' challenges in using microcomputer-based laboratories (MBLs) would provide important data for assessing the appropriateness of using MBLs in high school chemistry laboratories. Identifying students' concerns about this technology will in part help educators identify the obstacles to science learning when using this…

  19. Big Data, Computational Science, Economics, Finance, Marketing, Management, and Psychology: Connections

    Directory of Open Access Journals (Sweden)

    Chia-Lin Chang

    2018-03-01

Full Text Available The paper provides a review of the literature that connects Big Data, Computational Science, Economics, Finance, Marketing, Management, and Psychology, and discusses research issues related to these disciplines. Academics could develop theoretical models, and subsequent econometric and statistical models to estimate their parameters, as well as conduct simulations to examine whether the estimators in their theories of estimation and hypothesis testing have good size and high power. Thereafter, academics and practitioners could apply the theory to analyse interesting issues in the seven disciplines and cognate areas.

  20. Science and data science.

    Science.gov (United States)

    Blei, David M; Smyth, Padhraic

    2017-08-07

    Data science has attracted a lot of attention, promising to turn vast amounts of data into useful predictions and insights. In this article, we ask why scientists should care about data science. To answer, we discuss data science from three perspectives: statistical, computational, and human. Although each of the three is a critical component of data science, we argue that the effective combination of all three components is the essence of what data science is about.

  1. PANGAEA® - Data Publisher for Earth & Environmental Science - Research data enters scholarly communication and big data analysis

    Science.gov (United States)

    Diepenbroek, Michael; Schindler, Uwe; Riedel, Morris; Huber, Robert

    2014-05-01

The ICSU World Data Center PANGAEA is an information system for the acquisition, processing, long-term storage, and publication of geo-referenced data related to earth science fields. Storing more than 350,000 data sets from all fields of the geosciences, it is among the largest archives for observational earth science data. Standards-conformant interfaces (ISO, OGC, W3C, OAI) enable access from a variety of data and information portals, among them PANGAEA's own search engine (www.pangaea.de) and, e.g., GBIF. All data sets in PANGAEA are citable, fully documented, and can be referenced via persistent identifiers (Digital Object Identifiers - DOIs) - a premise for data publication. Together with other ICSU World Data Centers (www.icsu-wds.org) and the Technical Information Library in Germany (TIB), PANGAEA had a share in implementing a DOI-based registry for scientific data, which is now supported by a worldwide consortium of libraries (www.datacite.org). A further milestone was building strong co-operations with science publishers such as Elsevier, Springer, Wiley, AGU, Nature, and others. A common web service allows supplementary data in PANGAEA to be referenced directly from an article's abstract page (e.g., on Science Direct). The next step with science publishers is to further integrate the editorial process for the publication of supplementary data with the publication procedures on the journal side. Data-centric research efforts such as environmental modelling and big-data analysis approaches represent new challenges for PANGAEA. Integrated data-warehouse technologies are used for highly efficient retrieval and compilation of time slices or surface data matrices for any measured parameters out of the whole data continuum. Further, new and emerging big-data approaches are currently being investigated within PANGAEA, e.g., to evaluate their usability for quality control or data clustering.
PANGAEA is operated as a joint long-term facility by MARUM at the University of Bremen

  2. Towards Data Management Planning Support for Research Data

    NARCIS (Netherlands)

    Görzig, Heike; Engel, Felix; Brocks, Holger; Vogel, Tobias; Hemmje, Matthias

    2015-01-01

    Görzig, H., Engel, F., Brocks, H., Vogel, T. & Hemmje, M. (2015, August). Towards Data Management Planning Support for Research Data. Paper presented at the ASE International Conference on Data Science, Stanford, United States of America.

  3. Characterizing Big Data Management

    Directory of Open Access Journals (Sweden)

    Rogério Rossi

    2015-06-01

Full Text Available Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by three dimensions: technology, people and processes. Hence, this article discusses these dimensions: the technological dimension, related to the storage, analytics and visualization of big data; the human dimension of big data; and the process-management dimension, which addresses big data management from both technological and business perspectives.

  4. Leveraging High Performance Computing for Managing Large and Evolving Data Collections

    Directory of Open Access Journals (Sweden)

    Ritu Arora

    2014-10-01

    Full Text Available The process of developing a digital collection in the context of a research project often involves a pipeline pattern during which data growth, data types, and data authenticity need to be assessed iteratively in relation to the different research steps and in the interest of archiving. Throughout a project’s lifecycle curators organize newly generated data while cleaning and integrating legacy data when it exists, and deciding what data will be preserved for the long term. Although these actions should be part of a well-oiled data management workflow, there are practical challenges in doing so if the collection is very large and heterogeneous, or is accessed by several researchers contemporaneously. There is a need for data management solutions that can help curators with efficient and on-demand analyses of their collection so that they remain well-informed about its evolving characteristics. In this paper, we describe our efforts towards developing a workflow to leverage open science High Performance Computing (HPC resources for routinely and efficiently conducting data management tasks on large collections. We demonstrate that HPC resources and techniques can significantly reduce the time for accomplishing critical data management tasks, and enable a dynamic archiving throughout the research process. We use a large archaeological data collection with a long and complex formation history as our test case. We share our experiences in adopting open science HPC resources for large-scale data management, which entails understanding usage of the open source HPC environment and training users. These experiences can be generalized to meet the needs of other data curators working with large collections.
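
    The on-demand collection analysis described in this abstract can be sketched in miniature as a parallel fixity inventory. This is an illustrative sketch, not the authors' actual workflow: the directory layout is arbitrary, and a real HPC deployment would distribute the work through the site's batch scheduler rather than a local thread pool.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def checksum(path):
    """Return (path, sha256 hex digest) for one file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return str(path), h.hexdigest()

def inventory(root, workers=8):
    """Compute a fixity inventory of every file under root, hashing in parallel.

    Re-running the inventory and diffing the results is one routine data
    management task that benefits from parallel resources on large collections.
    """
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(checksum, files))
```

    Comparing two such inventories reveals files that were added, removed, or silently changed since the last archiving pass.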

  5. Using Grand Challenges to Teach Science: A Biology-Geology Collaboration

    Science.gov (United States)

    Lyford, M.; Myers, J. D.

    2012-12-01

Three science courses at the University of Wyoming explore the inextricable connections between science and society by centering on grand challenges. Two are introductory integrated science courses for non-majors, while the third is an upper-level course for majors and non-majors. Through collaboration, the authors have developed these courses to explore the grand challenges of energy, water, and climate. Each course focuses on the fundamental STEM principles a citizen needs in order to understand each grand challenge. However, the courses also emphasize the non-STEM perspectives (e.g., economics, politics, human well-being, externalities) that underlie each grand challenge and argue that creating equitable, sustainable, and just solutions to the grand challenges hinges on an understanding of both STEM and non-STEM perspectives. The authors also consider the multitude of personal perspectives individuals bring to the classroom (e.g., values, beliefs, empathy, misconceptions) that influence any stakeholder's ability to engage in fruitful discussions about grand-challenge solutions. Discovering Science (LIFE 1002) focuses on the grand challenges of energy and climate. Students attend three one-hour lectures, one two-hour lab, and a one-hour discussion each week. Lectures emphasize the STEM and non-STEM principles underlying each grand challenge. Laboratory activities are designed to be interdisciplinary and engage students in inquiry-driven activities that reinforce concepts from lecture and model how science is conducted. Labs also expose students to the difficulties often associated with scientific studies, the limits of science, and the inherent uncertainties of scientific findings. Discussion sessions provide an opportunity for students to explore the complexity of the grand challenges from STEM and non-STEM perspectives, and expose the multitude of personal perspectives an individual might harbor related to each grand challenge

  6. Sound data management as a foundation for natural resources management and science

    Science.gov (United States)

    Burley, Thomas E.

    2012-01-01

    Effective decision making is closely related to the quality and completeness of available data and information. Data management helps to ensure data quality in any discipline and supports decision making. Managing data as a long-term scientific asset helps to ensure that data will be usable beyond the original intended application. Emerging issues in water-resources management and climate variability require the ability to analyze change in the conditions of natural resources over time. The availability of quality, well-managed, and documented data from the past and present helps support this requirement.

  7. Data Management Activities of Canada's National Science Library - 2010 Update and Prospective

    Directory of Open Access Journals (Sweden)

    Mary Zborowski

    2011-01-01

Full Text Available NRC-CISTI serves Canada as its National Science Library (as mandated by Canada's Parliament in 1924) and also provides direct support to researchers of the National Research Council of Canada (NRC). By reason of its mandate, vision, and strategic positioning, NRC-CISTI has been rapidly and effectively mobilizing Canadian stakeholders and resources to become a lead player on both the Canadian national and international scenes in matters relating to the organization and management of scientific research data. In a previous communication (CODATA International Conference, 2008), the orientation of NRC-CISTI towards this objective and its short- and medium-term plans and strategies were presented. Since then, significant milestones have been achieved. This paper presents NRC-CISTI's most recent activities in these areas, which are progressing well alongside a strategic organizational redesign process that is realigning NRC-CISTI's structure, mission, and mandate to better serve its clients. Throughout this transformational phase, activities relating to data management remain vibrant.

  8. The Science DMZ: A Network Design Pattern for Data-Intensive Science

    Directory of Open Access Journals (Sweden)

    Eli Dart

    2014-01-01

Full Text Available The ever-increasing scale of scientific data has become a significant challenge for researchers who rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data-transfer performance and impedes scientific progress. The Science DMZ paradigm comprises a proven set of network design patterns that collectively address these problems. We explain the Science DMZ model, including the network architecture, system configuration, cybersecurity, and performance tools that together create an optimized network environment for science. We describe use cases from universities, supercomputing centers, and research laboratories, highlighting the effectiveness of the Science DMZ model in diverse operational settings. In all, the Science DMZ model is a solid platform that supports any science workflow and flexibly accommodates emerging network technologies. As a result, the Science DMZ vastly improves collaboration, accelerating scientific discovery.

  9. Infectious disease in cervids of North America: data, models, and management challenges.

    Science.gov (United States)

    Conner, Mary Margaret; Ebinger, Michael Ryan; Blanchong, Julie Anne; Cross, Paul Chafee

    2008-01-01

Over the past two decades there has been a steady increase in the study and management of wildlife diseases. This trend has been driven by the perception of an increase in emerging zoonotic diseases and the recognition that wildlife can be a critical factor in controlling infectious diseases in domestic animals. Cervids are of particular concern because, as a group, they present a number of unique challenges. Their close ecological and phylogenetic relationship to livestock species places them at risk of receiving infections from, and reinfecting, livestock. In addition, cervids are an important resource; revenue from hunting and viewing contributes substantially to agency budgets and local economies. Comprehensive coverage of infectious diseases in cervids is well beyond the scope of this chapter. In North America alone there are a number of infectious diseases that can potentially impact cervid populations, but for most of these, management is not feasible or the diseases are only a potential or future concern. We focus this chapter on three diseases that are of major management concern and the center of most disease research for cervids in North America: bovine tuberculosis, chronic wasting disease, and brucellosis. We discuss the available data and recent advances in modeling and management of these diseases.

  10. The Grand Challenges Discourse: Transforming Identity Work in Science and Science Policy.

    Science.gov (United States)

    Kaldewey, David

    2018-01-01

    This article analyzes the concept of "grand challenges" as part of a shift in how scientists and policymakers frame and communicate their respective agendas. The history of the grand challenges discourse helps to understand how identity work in science and science policy has been transformed in recent decades. Furthermore, the question is raised whether this discourse is only an indicator, or also a factor in this transformation. Building on conceptual history and historical semantics, the two parts of the article reconstruct two discursive shifts. First, the observation that in scientific communication references to "problems" are increasingly substituted by references to "challenges" indicates a broader cultural trend of how attitudes towards what is problematic have shifted in the last decades. Second, as the grand challenges discourse is rooted in the sphere of sports and competition, it introduces a specific new set of societal values and practices into the spheres of science and technology. The article concludes that this process can be characterized as the sportification of science, which contributes to self-mobilization and, ultimately, to self-optimization of the participating scientists, engineers, and policymakers.

  11. Data Logging and Data Modelling: Using seismology and seismic data to create challenge in the academic classroom.

    Science.gov (United States)

    Neighbour, Gordon

    2013-04-01

In 2012 Computing and Information Technology was disapplied from the English National Curriculum and therefore no longer has a compulsory programme of study. Data logging and data modelling are nevertheless essential components of the curriculum in the Computing and Information Technology classroom. Once students have mastered the basics of both spreadsheet and information-handling software, they need to be challenged further. All too often the data used for data-logging and data-handling work is not realistic enough to really challenge very able students. However, using data from seismology allows students to manipulate "real" data and enhances their experience of geo-science, developing their skills and then allowing them to build on this work in both the science and geography classroom. This new scheme of work, "Seismology at School", has allowed the students to work and develop skills beyond those normally expected for their age group and to better appreciate their later learning about "Natural Hazards" in the science and geography classroom. The students undertake research to help them develop their understanding of earthquakes. This includes using materials from other nations within the European Economic Area, which also develops and challenges their use of Modern Foreign Languages. They are then challenged to create their own seismometers using simple kits and 'free' software; this "problem-solving" approach to their work is designed to enhance team-work and to extend the challenge they experience in the classroom. The students are then asked to manipulate a "real" set of data using international earthquake data from the most recent whole year. This allows the students to make use of many of the analytical and statistical functions of both spreadsheet software and information handling software in a meaningful way.
The students will need to have developed a hypothesis which their work should have provided either validation
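
    The statistical part of an exercise like the one described above can be sketched with a few lines of code; the CSV column name `magnitude` and the sample values are invented for the example, and a classroom would more likely use spreadsheet functions than a script.

```python
import csv
from collections import Counter
from io import StringIO

def magnitude_histogram(csv_text, bin_width=1.0):
    """Count earthquake events per magnitude bin from catalogue CSV text.

    Assumes a 'magnitude' column (hypothetical; real catalogues vary).
    Returns a mapping of bin lower-edge -> event count.
    """
    counts = Counter()
    for row in csv.DictReader(StringIO(csv_text)):
        m = float(row["magnitude"])
        counts[int(m // bin_width) * bin_width] += 1
    return dict(counts)
```

    With a year of catalogue data, such a histogram lets students test a hypothesis, e.g. that small earthquakes greatly outnumber large ones.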

  12. Effective Management of Ocean Biogeochemistry and Ecological Data: the BCO-DMO Story

    Science.gov (United States)

    Chandler, C. L.; Groman, R. C.; Allison, M. D.; Wiebe, P. H.; Glover, D. M.; Gegg, S. R.

    2012-04-01

    Data availability expectations of the research community, environmental management decision makers, and funding agency representatives are changing. Consequently, data management practices in many science communities are changing as well. In an effort to improve access to data generated by ocean biogeochemistry and ecological researchers funded by the United States (US) National Science Foundation (NSF) Division of Ocean Sciences (OCE), the Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created in late 2006. Currently, the main BCO-DMO objective is to ensure availability of data resulting from select OCE and Office of Polar Programs (OPP) research awards granted by the US NSF. An important requirement for the BCO-DMO data management system is that it provides open access to data that are supported by sufficient metadata to enable data discovery and accurate reuse. The office manages and serves all types of oceanographic data (in situ, experimental, model results) generated during the research process and contributed by the originating investigators from large national programs and medium-sized collaborative research projects, as well as researchers with single investigator awards. BCO-DMO staff members have made strategic use of standards and use of terms from controlled vocabularies while balancing the need to maintain flexible data ingest systems that accommodate the heterogeneous nature of ocean biogeochemistry and ecological research data. Many of the discrete ocean biogeochemistry data sets managed by BCO-DMO are still acquired manually, often with prototype sensor systems. Data sets such as these that are not "born-digital" present a significant management challenge. Use of multiple levels of term-mappings and development of an ontology has enabled BCO-DMO to incorporate a semantically enabled faceted search into the data access system that will improve data access through enhanced data discovery. BCO-DMO involves an ongoing
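
    The term-mapping idea described above, mapping heterogeneous investigator-supplied parameter names onto controlled terms so a faceted search can group datasets, can be illustrated with a toy sketch. All vocabulary terms, raw parameter names, and dataset fields here are invented for the example and are not BCO-DMO's actual vocabulary.

```python
# Hypothetical mapping from raw, investigator-supplied parameter names
# to a single controlled-vocabulary term per concept.
TERM_MAP = {
    "temp": "sea_water_temperature",
    "water temp (degC)": "sea_water_temperature",
    "SST": "sea_water_temperature",
    "sal": "sea_water_salinity",
    "salinity_psu": "sea_water_salinity",
}

def facets(datasets):
    """Group dataset ids by controlled term, for a faceted-search index.

    Each dataset is a dict with 'id' and a list of raw 'parameters'.
    Unrecognized names are kept visible under an 'unmapped:' prefix
    so curators can extend the mapping.
    """
    index = {}
    for ds in datasets:
        for raw in ds["parameters"]:
            term = TERM_MAP.get(raw, "unmapped:" + raw)
            index.setdefault(term, set()).add(ds["id"])
    return index
```

    A search interface built on such an index can then offer one "sea water temperature" facet that matches datasets regardless of how each contributor originally labeled the column.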

  13. Science Diplomacy: New Global Challenges, New Trend

    OpenAIRE

    Van Langenhove, Luk

    2016-01-01

    As new challenges such as the critical need for a universal sustainable development agenda confront mankind, science and diplomacy are converging as common tools for trouble-shooting. Science Diplomacy can be seen as a new phenomenon involving the role of science in diplomacy.

  14. Quality-assurance and data-management plan for water-quality activities in the Kansas Water Science Center, 2014

    Science.gov (United States)

    Rasmussen, Teresa J.; Bennett, Trudy J.; Foster, Guy M.; Graham, Jennifer L.; Putnam, James E.

    2014-01-01

    As the Nation’s largest water, earth, and biological science and civilian mapping information agency, the U.S. Geological Survey is relied on to collect high-quality data, and produce factual and impartial interpretive reports. This quality-assurance and data-management plan provides guidance for water-quality activities conducted by the Kansas Water Science Center. Policies and procedures are documented for activities related to planning, collecting, storing, documenting, tracking, verifying, approving, archiving, and disseminating water-quality data. The policies and procedures described in this plan complement quality-assurance plans for continuous water-quality monitoring, surface-water, and groundwater activities in Kansas.

  15. Sustainability Science Needs Sustainable Data!

    Science.gov (United States)

    Downs, R. R.; Chen, R. S.

    2013-12-01

    Sustainability science (SS) is an 'emerging field of research dealing with the interactions between natural and social systems, and with how those interactions affect the challenge of sustainability: meeting the needs of present and future generations while substantially reducing poverty and conserving the planet's life support systems' (Kates, 2011; Clark, 2007). Bettencourt & Kaur (2011) identified more than 20,000 scientific papers published on SS topics since the 1980s with more than 35,000 distinct authors. They estimated that the field is currently growing exponentially, with the number of authors doubling approximately every 8 years. These scholars are undoubtedly using and generating a vast quantity and variety of data and information for both SS research and applications. Unfortunately we know little about what data the SS community is actually using, and whether or not the data that SS scholars generate are being preserved for future use. Moreover, since much SS research is conducted by cross-disciplinary, multi-institutional teams, often scattered around the world, there could well be increased risks of data loss, reduced data quality, inadequate documentation, and poor long-term access and usability. Capabilities and processes therefore need to be established today to support continual, reliable, and efficient preservation of and access to SS data in the future, especially so that they can be reused in conjunction with future data and for new studies not conceived in the original data collection activities. Today's long-term data stewardship challenges include establishing sustainable data governance to facilitate continuing management, selecting data to ensure that limited resources are focused on high priority SS data holdings, securing sufficient rights to allow unforeseen uses, and preparing data to enable use by future communities whose specific research and information needs are not yet known. Adopting sustainable models for archival

  16. Integrating adaptive management and ecosystem services concepts to improve natural resource management: Challenges and opportunities

    Science.gov (United States)

    Epanchin-Niell, Rebecca S.; Boyd, James W.; Macauley, Molly K.; Scarlett, Lynn; Shapiro, Carl D.; Williams, Byron K.

    2018-05-07

Executive Summary (Overview): Natural resource managers must make decisions that affect broad-scale ecosystem processes involving large spatial areas, complex biophysical interactions, numerous competing stakeholder interests, and highly uncertain outcomes. Natural and social science information and analyses are widely recognized as important for informing effective management. Chief among the systematic approaches for improving the integration of science into natural resource management are two emergent science concepts, adaptive management and ecosystem services. Adaptive management (also referred to as "adaptive decision making") is a deliberate process of learning by doing that focuses on reducing uncertainties about management outcomes and system responses to improve management over time. Ecosystem services is a conceptual framework that refers to the attributes and outputs of ecosystems (and their components and functions) that have value for humans. This report explores how ecosystem services can be moved from concept into practice through connection to a decision framework, adaptive management, that accounts for inherent uncertainties. Simultaneously, the report examines the value of incorporating ecosystem services framing and concepts into adaptive management efforts. Adaptive management and ecosystem services analyses have not typically been used jointly in decision making. However, as frameworks, they have a natural, but to date underexplored, affinity. Both are policy- and decision-oriented in that they attempt to represent the consequences of resource management choices on outcomes of interest to stakeholders. Both adaptive management and ecosystem services analysis take an empirical approach to the analysis of ecological systems. This systems orientation is a byproduct of the fact that natural resource actions affect ecosystems, and corresponding societal outcomes, often across large geographic scales. Moreover, because both frameworks focus on

  17. Health Sciences

    OpenAIRE

    McEntyre, Johanna; Swan, Alma; Meier zu Verl, Christian; Horstmann, Wolfram

    2011-01-01

    This chapter provides an overview of research data management in the health sciences, primarily focused upon the sort of data curated by the European Bioinformatics Institute and similar organisations. In this field, data management is well-advanced, with a sophisticated infrastructure created and maintained by the community for the benefit of all. These advances have been brought about because the field has been data-intense for many years and has been driven by the challenges biology fac...

  18. Forest Management Challenges for Sustaining Water Resources in the Anthropocene

    Directory of Open Access Journals (Sweden)

    Ge Sun

    2016-03-01

Full Text Available The Earth has entered the Anthropocene, an epoch dominated by humans who demand unprecedented quantities of goods and services from forests. The science of forest hydrology and watershed management generated during the past century provides a basic understanding of the relationships between forests and water and offers management principles that maximize the benefits of forests for people while sustaining watershed ecosystems. However, the rapid pace of changes in climate, disturbance regimes, invasive species, human population growth, and land use expected in the 21st century is likely to create substantial challenges for watershed management that may require new approaches, models, and best management practices. These challenges are likely to be complex and large in scale, involving a combination of direct and indirect biophysical watershed responses, as well as socioeconomic impacts and feedbacks. We discuss the complex relationships between forests and water in a rapidly changing environment, examine the trade-offs and conflicts between water and other resources, and propose new management approaches for sustaining water resources in the Anthropocene.

  19. Communications among data and science centers

    Science.gov (United States)

    Green, James L.

    1990-01-01

The ability to electronically access and query the contents of remote computer archives is of singular importance in the space and earth sciences; the present evaluation of such on-line information networks' development status foresees swift expansion of their data capabilities and complexity, in view of the volumes of data that will continue to be generated by NASA missions. The U.S. National Space Science Data Center (NSSDC) manages NASA's largest science computer network, the Space Physics Analysis Network; a comprehensive account is given of the structure of NSSDC international access through BITNET, and of connections to the NSSDC available in the Americas via the International X.25 network.

  20. Data, data everywhere ...

    Science.gov (United States)

    Chandler, C. L.

    2016-12-01

The scientific research endeavor requires data, and in some cases massive amounts of complex and highly diverse data. From experimental design, through data acquisition and analysis, hypothesis testing, and finally drawing conclusions, data collection and proper stewardship are critical to science. Even a single experiment conducted by a single researcher will produce data to test the working hypothesis. The types of complex science questions being tackled today often require large, diverse, multi-disciplinary teams of researchers who must be prepared to exchange their data. This 2016 AGU Leptoukh Lecture comprises a series of vignettes that illustrate a brief history of data stewardship: where we have come from, how and why we have arrived where we are today, and where we are headed with respect to data management. The specific focus will be on management of marine ecosystem research data and will include observations on the drivers, challenges, strategies, and solutions that have evolved over time. The lessons learned should be applicable to other disciplines and the hope is that many will recognize parallels in their chosen domain. From historical shipboard logbooks to the high-volume, digital, quality-controlled ocean science data sets created by today's researchers, there have been enormous changes in the way ocean data are collected and reported. Rapid change in data management practices is being driven by new data exchange requirements, by modern expectations for machine-interoperable exchange, and by the desire to achieve research transparency. Advances in technology and cultural shifts contribute to the changing conditions through which data managers and informatics specialists must navigate. The unique challenges associated with collecting and managing environmental data, complicated by the onset of the big data era, make this a fascinating time to be responsible for data. It seems there are data everywhere, being collected by everyone, for all sorts of

  1. Big Data Challenges in Climate Science: Improving the Next-Generation Cyberinfrastructure

    Science.gov (United States)

    Schnase, John L.; Lee, Tsengdar J.; Mattmann, Chris A.; Lynnes, Christopher S.; Cinquini, Luca; Ramirez, Paul M.; Hart, Andre F.; Williams, Dean N.; Waliser, Duane; Rinsland, Pamela; hide

    2016-01-01

    The knowledge we gain from research in climate science depends on the generation, dissemination, and analysis of high-quality data. This work comprises technical practice as well as social practice, both of which are distinguished by their massive scale and global reach. As a result, the amount of data involved in climate research is growing at an unprecedented rate. Climate model intercomparison (CMIP) experiments, the integration of observational data and climate reanalysis data with climate model outputs, as seen in the Obs4MIPs, Ana4MIPs, and CREATE-IP activities, and the collaborative work of the Intergovernmental Panel on Climate Change (IPCC) provide examples of the types of activities that increasingly require an improved cyberinfrastructure for dealing with large amounts of critical scientific data. This paper provides an overview of some of climate science's big data problems and the technical solutions being developed to advance data publication, climate analytics as a service, and interoperability within the Earth System Grid Federation (ESGF), the primary cyberinfrastructure currently supporting global climate research activities.

  2. Web Coverage Service Challenges for NASA's Earth Science Data

    Science.gov (United States)

    Cantrell, Simon; Khan, Abdul; Lynnes, Christopher

    2017-01-01

    In an effort to ensure that data in NASA's Earth Observing System Data and Information System (EOSDIS) are available to a wide variety of users through the tools of their choice, NASA continues to focus on exposing data and services using standards-based protocols. Specifically, this work has recently focused on the Web Coverage Service (WCS). Experience has been gained in data delivery via GetCoverage requests, starting out with WCS v1.1.1. The pros and cons of both the version itself and different implementation approaches will be shared during this session. Additionally, due to limitations in WCS v1.1.1's ability to work with NASA's Earth science data, this session will also discuss the benefit of migrating to WCS 2.0.1 with EO-x to enrich this capability to meet a wide range of anticipated users' needs. This will enable subsetting and various types of data transformations to be performed on a variety of EOS data sets.
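    As a rough illustration of the kind of request the abstract describes, a WCS 2.0.1 GetCoverage call with spatial subsetting can be encoded as key-value pairs in a URL. The endpoint, coverage identifier, and bounds below are invented placeholders, not actual EOSDIS service names:

```python
from urllib.parse import urlencode

# Hypothetical WCS endpoint; the coverage id and bounds are placeholders.
BASE_URL = "https://example.nasa.gov/wcs"

params = {
    "service": "WCS",
    "version": "2.0.1",
    "request": "GetCoverage",
    "coverageId": "MOD11A1_LST_Day",    # invented coverage identifier
    "format": "application/x-netcdf",   # requested output encoding
}
# WCS 2.0.1 KVP encoding expresses spatial trimming as repeatable
# subset=axis(lo,hi) parameters; appended raw to keep parentheses readable.
subsets = ["Lat(30,40)", "Long(-110,-90)"]

query = urlencode(params) + "".join(f"&subset={s}" for s in subsets)
request_url = f"{BASE_URL}?{query}"
print(request_url)
```

    The same trimming could not be expressed this way in WCS v1.1.1, which is one reason the session argues for migrating to 2.0.1.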

  3. Barriers and opportunities for integrating social science into natural resource management: lessons from National Estuarine Research Reserves.

    Science.gov (United States)

    Robinson, Patrick; Genskow, Ken; Shaw, Bret; Shepard, Robin

    2012-12-01

    The need for cross-disciplinary scientific inquiries that facilitate improved natural resource management outcomes through increased understanding of both the biophysical and human dimensions of management issues has been widely recognized. Despite this broad recognition, a number of obstacles and barriers still sometimes challenge the successful implementation of cross-disciplinary approaches. Improving understanding of these challenges and barriers will help address them and thereby foster appropriate and effective utilization of cross-disciplinary approaches to solve natural resource management challenges. This research uses a case study analysis of the United States National Estuarine Research Reserve System to improve understanding of the critical factors that influence practitioners' decisions related to incorporating social science into their natural resource management work. The case study research is analyzed and evaluated within a Theory of Planned Behavior framework to (1) determine and describe the factors that predict practitioners' intent to incorporate social science into their natural resource related activities and (2) recommend potential strategies for encouraging and enabling cross-disciplinary approaches to natural resource management. The results indicate that National Estuarine Research Reserve practitioners' decisions related to incorporating social science are primarily influenced by (1) confidence in their own capability to incorporate social science into their work and (2) beliefs about whether the outcomes of incorporating social science into their work would be valuable or beneficial.

  4. ESSD: Real World Issues and Challenges of High-Quality Data Publication

    Science.gov (United States)

    Pfeiffenberger, Hans; Carlson, David

    2013-04-01

    The Copernicus data publication journal Earth System Science Data (ESSD) represents an important and unique (and by no means final!) step forward in the larger world of data publication. Working with authors, reviewers, editors and data centres, ESSD has successfully produced many high-quality data publications across a wide variety of scientific disciplines, for individual data sets, multiple data sets as the product of scientific consortia, and in special issues coordinated with other science journals. The ESSD success also exposes issues and challenges for present and future data publication, particularly around the topic and implementation of persistent identifiers.
    • As ESSD encourages redundant data sets across multiple data centres for access and archive purposes, how will DOIs be employed to accurately point to those distributed or replicated data? How can authenticity and integrity be verified?
    • How can or should object identifiers be employed in pointing from raw to quality-controlled and finally derived data processing levels, and how can we designate or distinguish among these, particularly as those terms vary substantially among, for example, geophysical and ecological communities? Likewise, how to distinguish an auto-generated data product (e.g. a species identification from GBIF) from a high-effort, expertly reviewed data product (e.g. an ESSD publication)?
    • For a growing number of ESSD data publications with expected annual or periodic revisions and updates, how should data journals' and repositories' use of persistent identifiers best record the subsequent versions, extensions or corrections?
    • As published data sets become a valued part of high-profile science, with attendant deadlines, announcements and publicity, do the various DOI policies and minting practices among cooperating publishers, data centres and journals represent a help or a hindrance?
    These questions evolve directly from increasing interest in and activity by ESSD and, as

  5. The computational challenges of Earth-system science.

    Science.gov (United States)

    O'Neill, Alan; Steenman-Clark, Lois

    2002-06-15

    The Earth system--comprising atmosphere, ocean, land, cryosphere and biosphere--is an immensely complex system, involving processes and interactions on a wide range of space- and time-scales. To understand and predict the evolution of the Earth system is one of the greatest challenges of modern science, with success likely to bring enormous societal benefits. High-performance computing, along with the wealth of new observational data, is revolutionizing our ability to simulate the Earth system with computer models that link the different components of the system together. There are, however, considerable scientific and technical challenges to be overcome. This paper will consider four of them: complexity, spatial resolution, inherent uncertainty and time-scales. Meeting these challenges requires a significant increase in the power of high-performance computers. The benefits of being able to make reliable predictions about the evolution of the Earth system should, on their own, amply repay this investment.

  6. NASA UAV Airborne Science Capabilities in Support of Water Resource Management

    Science.gov (United States)

    Fladeland, Matthew

    2015-01-01

    This workshop presentation focuses on potential uses of unmanned aircraft observations in support of water resource management and agriculture. The presentation will provide an overview of NASA Airborne Science capabilities with an emphasis on past UAV missions to provide context on accomplishments as well as technical challenges. I will also focus on recent NASA Ames efforts to assist in irrigation management and invasive species management using airborne and satellite datasets.

  7. AUTHENTIC SCIENCE EXPERIENCES: PRE-COLLEGIATE SCIENCE EDUCATORS’ SUCCESSES AND CHALLENGES DURING PROFESSIONAL DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    Andrea C. Burrows

    2016-04-01

    Twenty-three pre-collegiate educators of elementary students (ages 5-10 years) and secondary students (ages 11-18 years) attended a two-week science, technology, engineering, and mathematics (STEM) astronomy-focused professional development in the summer of 2015, with activities focused on authentic science experiences, inquiry, and partnership building. ‘Authentic’ in this research refers to scientific skills and is defined in the study. The study explores the authentic science education experience of the pre-collegiate educators, detailing the components of authentic science as seen through a social constructionism lens. Using qualitative and quantitative methods, the researchers analyzed the successes and challenges of pre-collegiate science and mathematics educators when immersed in STEM and astronomy authentic science practices, the educators’ perceptions before and after the authentic science practices, and the educators’ performance on pre- to post-content tests during the authentic science practices. Findings show that the educators were initially engaged, then disengaged, and then finally re-engaged with the authentic experience. Qualitative responses are shared, as are the significant results of the quantitative pre- to post-content learning scores of the educators. Conclusions include the necessity for the PD team to deliver detailed explanations to the participants, before, during, and after the entire authentic science experience and partnership-building processes. Furthermore, expert structure and support are vital for participant research question generation, data collection, and data analysis (successes, failures, and reattempts). Overall, in order to include authentic science in pre-collegiate classrooms, elementary and secondary educators need experience, instruction, scaffolding, and continued support with the STEM processes.

  8. Earth Science Data Education through Cooking Up Recipes

    Science.gov (United States)

    Weigel, A. M.; Maskey, M.; Smith, T.; Conover, H.

    2016-12-01

    One of the major challenges in Earth science research and applications is understanding and applying the proper methods, tools, and software for using scientific data. These techniques are often difficult and time consuming to identify, requiring new users to conduct extensive research, take classes, and reach out for assistance, thus hindering scientific discovery and real-world applications. To address these challenges, the Global Hydrology Resource Center (GHRC) DAAC has developed a series of data recipes that new users such as students, decision makers, and general Earth scientists can leverage to learn how to use Earth science datasets. Once the data recipe content had been finalized, GHRC computer and Earth scientists collaborated with a web and graphic designer to ensure the content is both attractively presented to data users and clearly communicated, to promote the education and use of Earth science data. The completed data recipes include, but are not limited to, tutorials, iPython Notebooks, resources, and tools necessary for addressing key difficulties in data use across a broad user base. These recipes not only enable non-traditional users to learn how to use data, but also curate and communicate common methods and approaches that may be difficult and time consuming for these users to identify.

  9. Big Data Goes Personal: Privacy and Social Challenges

    Science.gov (United States)

    Bonomi, Luca

    2015-01-01

    The Big Data phenomenon is posing new challenges in our modern society. In addition to requiring information systems to effectively manage high-dimensional and complex data, the privacy and social implications associated with the data collection, data analytics, and service requirements create new important research problems. First, the high…

  10. LIMS and Clinical Data Management.

    Science.gov (United States)

    Chen, Yalan; Lin, Yuxin; Yuan, Xuye; Shen, Bairong

    2016-01-01

    In order to achieve more accurate disease prevention, diagnosis, and treatment, clinical and genetic data need to be studied in extensive and systematic association. As one way to achieve precision medicine, a laboratory information management system (LIMS) can effectively associate clinical data in a macrocosmic aspect with genomic data in a microcosmic aspect. This chapter summarizes the application of the LIMS in clinical data management and its implementation mode. It also discusses the principles of a LIMS in clinical data management, as well as the opportunities and challenges in the context of medical informatics.

  11. Big questions, big science: meeting the challenges of global ecology.

    Science.gov (United States)

    Schimel, David; Keller, Michael

    2015-04-01

    Ecologists are increasingly tackling questions that require significant infrastructure, large experiments, networks of observations, and complex data and computation. Key hypotheses in ecology increasingly require more investment and larger data sets than can be collected by a single investigator's or a group of investigators' labs, sustained for longer than a typical grant. Large-scale projects are expensive, so their scientific return on the investment has to justify the opportunity cost: the science foregone because resources were expended on a large project rather than supporting a number of individual projects. In addition, their management must be accountable and efficient in the use of significant resources, requiring the use of formal systems engineering and project management to mitigate the risk of failure. Mapping the scientific method into formal project management requires both scientists able to work in that context and a project implementation team sensitive to the unique requirements of ecology. Sponsoring agencies, under pressure from external and internal forces, experience many pressures that push them towards counterproductive project management, but a scientific community aware of and experienced in large-project science can mitigate these tendencies. For big ecology to result in great science, ecologists must become informed, aware, and engaged in the advocacy and governance of large ecological projects.

  12. Technical challenges for big data in biomedicine and health: data sources, infrastructure, and analytics.

    Science.gov (United States)

    Peek, N; Holmes, J H; Sun, J

    2014-08-15

    To review technical and methodological challenges for big data research in biomedicine and health. We discuss sources of big datasets, survey infrastructures for big data storage and big data processing, and describe the main challenges that arise when analyzing big data. The life and biomedical sciences are massively contributing to the big data revolution through secondary use of data that were collected during routine care and through new data sources such as social media. Efficient processing of big datasets is typically achieved by distributing computation over a cluster of computers. Data analysts should be aware of pitfalls related to big data such as bias in routine care data and the risk of false-positive findings in high-dimensional datasets. The major challenge for the near future is to transform analytical methods that are used in the biomedical and health domain, to fit the distributed storage and processing model that is required to handle big data, while ensuring confidentiality of the data being analyzed.
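    The false-positive risk in high-dimensional datasets mentioned above is commonly controlled with multiple-testing procedures. As an illustrative sketch (not a method from this paper), the Benjamini-Hochberg step-up procedure controls the false discovery rate by rejecting only the k smallest p-values, where k is the largest rank whose p-value falls under a rank-scaled threshold:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices of hypotheses rejected at FDR level alpha (BH step-up)."""
    m = len(p_values)
    # Sort p-values ascending, remembering their original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses holding the k_max smallest p-values.
    return sorted(order[:k_max])

# Five tests: ranks 1 and 2 pass their thresholds (0.01 and 0.02), rank 3 fails.
print(benjamini_hochberg([0.001, 0.8, 0.02, 0.5, 0.04]))  # -> [0, 2]
```

    Note how the third-smallest p-value (0.04) survives a naive per-test cutoff of 0.05 but is rejected here, which is exactly the kind of correction routine-care data analyses need at scale.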

  13. Using and Distributing Spaceflight Data: The Johnson Space Center Life Sciences Data Archive

    Science.gov (United States)

    Cardenas, J. A.; Buckey, J. C.; Turner, J. N.; White, T. S.; Havelka, J. A.

    1995-01-01

    Life sciences data collected before, during and after spaceflight are valuable and often irreplaceable. The Johnson Space Center Life Sciences Data Archive has been designed to provide researchers, engineers, managers and educators interactive access to information about and data from human spaceflight experiments. The archive system consists of a Data Acquisition System, Database Management System, CD-ROM Mastering System and Catalog Information System (CIS). The catalog information system is the heart of the archive. The CIS provides detailed experiment descriptions (both written and as QuickTime movies), hardware descriptions, hardware images, documents, and data. An initial evaluation of the archive at a scientific meeting showed that 88% of those who evaluated the catalog want to use the system when completed. The majority of the evaluators found the archive flexible, satisfying and easy to use. We conclude that the data archive effectively provides key life sciences data to interested users.

  14. The MMS Science Data Center: Operations, Capabilities, and Resource.

    Science.gov (United States)

    Larsen, K. W.; Pankratz, C. K.; Giles, B. L.; Kokkonen, K.; Putnam, B.; Schafer, C.; Baker, D. N.

    2015-12-01

    The Magnetospheric Multiscale (MMS) constellation of satellites completed its six-month commissioning period in August 2015 and began science operations. Science operations for the Solving Magnetospheric Acceleration, Reconnection, and Turbulence (SMART) instrument package occur at the Laboratory for Atmospheric and Space Physics (LASP). The Science Data Center (SDC) at LASP is responsible for the production, management, distribution, and archiving of the data received. The mission will collect several gigabytes per day of particle and field data. Management of these data requires effective selection, transmission, analysis, and storage of data in the ground segment of the mission, including efficient distribution paths to enable the science community to answer the key questions regarding magnetic reconnection. Due to the constraints on download volume, this includes the Scientist-in-the-Loop program that identifies high-value science data needed to answer the outstanding questions of magnetic reconnection. Of particular interest to the community are the tools and associated website we have developed to provide convenient access to the data, first by the mission science team and, beginning March 1, 2016, by the entire community. This presentation will demonstrate the data and tools available to the community via the SDC and discuss the technologies we chose and the lessons learned.

  15. Management of the science ground segment for the Euclid mission

    Science.gov (United States)

    Zacchei, Andrea; Hoar, John; Pasian, Fabio; Buenadicha, Guillermo; Dabin, Christophe; Gregorio, Anna; Mansutti, Oriana; Sauvage, Marc; Vuerli, Claudio

    2016-07-01

    Euclid is an ESA mission aimed at understanding the nature of dark energy and dark matter by using two probes simultaneously (weak lensing and baryon acoustic oscillations). The mission will observe galaxies and clusters of galaxies out to z ~ 2, in a wide extra-galactic survey covering 15,000 deg², plus a deep survey covering an area of 40 deg². The payload is composed of two instruments, an imager in the visible domain (VIS) and an imager-spectrometer (NISP) covering the near-infrared. The launch is planned for Q4 of 2020. The elements of the Euclid Science Ground Segment (SGS) are the Science Operations Centre (SOC) operated by ESA and nine Science Data Centres (SDCs) in charge of data processing, provided by the Euclid Consortium (EC), formed by over 110 institutes spread over 15 countries. The SOC and the EC started a tight collaboration several years ago in order to design and develop a single, cost-efficient and truly integrated SGS. The distributed nature, the size of the data set, and the needed accuracy of the results are the main challenges expected in the design and implementation of the SGS. In particular, the huge volume of data (not only Euclid data but also ground-based data) to be processed in the SDCs will require distributed storage to avoid data migration across SDCs. This paper describes the management challenges that the Euclid SGS is facing while dealing with such complexity. The main aspect is related to the organisation of a geographically distributed software development team. In principle, algorithms and code are developed in a large number of institutes, while data are actually processed at fewer centres (the national SDCs) where the operational computational infrastructures are maintained. The software produced for data handling, processing and analysis is built within a common development environment defined by the SGS System Team, common to the SOC and the EC SGS, which has already been active for several years. The code is built incrementally through

  16. Fostering Data Openness by Enabling Science: A Proposal for Micro-Funding

    Directory of Open Access Journals (Sweden)

    Brian Rappert

    2017-09-01

    In recent years, the promotion of data sharing has come with the recognition that not all scientists around the world are equally placed to partake in such activities. Notably, those within developing countries are sometimes regarded as experiencing hardware infrastructure challenges and data management skill shortages. Proposed remedies often focus on the provision of information and communication technology as well as enhanced data management training. Building on prior empirical social research undertaken in sub-Saharan Africa, this article provides a complementary but alternative proposal; namely, fostering data openness by enabling research. Towards this end, the underlying rationale is outlined for a ‘bottom-up’ system of research support that addresses the day-to-day demands in low-resourced environments. This approach draws on lessons from development financial assistance programs in recent decades. In doing so, this article provides an initial framework for science funding that holds together concerns for ensuring research can be undertaken in low-resourced laboratory environments with concerns about sharing the data generated in such settings.

  17. Establishing Long Term Data Management Research Priorities via a Data Decadal Survey

    Science.gov (United States)

    Wilson, A.; Uhlir, P.; Meyer, C. B.; Robinson, E.

    2013-12-01

    We live in a time of unprecedented collection of and access to scientific data. Improvements in sensor technologies and modeling capabilities are constantly producing new data sources. Data sets are being used for unexpected purposes far from their point of origin, as research spans projects, discipline domains, and temporal and geographic boundaries. The nature of science is evolving, with more open science, open publications, and changes to the nature of peer review and data "publication". Data-intensive, or computational science, has been identified as a new research paradigm. There is recognition that the creation of a data set can be a contribution to science deserving of recognition comparable to other scientific publications. Federally funded projects are generally expected to make their data open and accessible to everyone. In this dynamic environment, scientific progress is ever more dependent on good data management practices and policies. Yet current data management and stewardship practices are insufficient. Data sets created at great, and often public, expense are at risk of being lost for technological or organizational reasons. Insufficient documentation and understanding of data can mean that the data are used incorrectly or not at all. Scientific results are being scrutinized and questioned, and occasionally retracted due to problems in data management. The volume of data is greatly increasing while funding for data management is meager and generally must be found within existing budgets. Many federal government agencies, including NASA, USGS, NOAA and NSF are already making efforts to address data management issues. Executive memos and directives give substantial impetus to those efforts, such as the May 9 Executive Order directing agencies to implement Open Data Policy requirements and regularly report their progress. However, these distributed efforts risk duplicating effort, lack a unifying, long-term strategic vision, and too often work in

  18. Next-Generation Climate Modeling Science Challenges for Simulation, Workflow and Analysis Systems

    Science.gov (United States)

    Koch, D. M.; Anantharaj, V. G.; Bader, D. C.; Krishnan, H.; Leung, L. R.; Ringler, T.; Taylor, M.; Wehner, M. F.; Williams, D. N.

    2016-12-01

    We will present two examples of current and future high-resolution climate-modeling research that are challenging existing simulation run-time I/O, model-data movement, storage and publishing, and analysis. In each case, we will consider lessons learned as current workflow systems are broken by these large-data science challenges, as well as strategies to repair or rebuild the systems. First we consider the science and workflow challenges to be posed by the CMIP6 multi-model HighResMIP, involving around a dozen modeling groups performing quarter-degree simulations, in 3-member ensembles for 100 years, with high-frequency (1-6 hourly) diagnostics, which is expected to generate over 4PB of data. An example of science derived from these experiments will be to study how resolution affects the ability of models to capture extreme-events such as hurricanes or atmospheric rivers. Expected methods to transfer (using parallel Globus) and analyze (using parallel "TECA" software tools) HighResMIP data for such feature-tracking by the DOE CASCADE project will be presented. A second example will be from the Accelerated Climate Modeling for Energy (ACME) project, which is currently addressing challenges involving multiple century-scale coupled high resolution (quarter-degree) climate simulations on DOE Leadership Class computers. ACME is anticipating production of over 5PB of data during the next 2 years of simulations, in order to investigate the drivers of water cycle changes, sea-level-rise, and carbon cycle evolution. The ACME workflow, from simulation to data transfer, storage, analysis and publication will be presented. Current and planned methods to accelerate the workflow, including implementing run-time diagnostics, and implementing server-side analysis to avoid moving large datasets will be presented.

  19. Challenges for data storage in medical imaging research.

    Science.gov (United States)

    Langer, Steve G

    2011-04-01

    Researchers in medical imaging face multiple challenges in storing, indexing, maintaining the viability of, and sharing their data. Addressing all these concerns requires a constellation of tools, but not all of them need to be local to the site. In particular, the data storage challenges faced by researchers can begin to require professional information technology skills. With limited human resources and funds, the medical imaging researcher may be better served by an outsourcing strategy for some management aspects. This paper outlines an approach to manage the main objectives faced by medical imaging scientists whose work includes processing and data mining on non-standard file formats, and relating those files to their DICOM standard descendants. The capacity of the approach scales as the researcher's need grows, by leveraging the on-demand provisioning ability of cloud computing.
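    One lightweight way to relate non-standard research files to their DICOM ancestors, as the abstract discusses, is a provenance index keyed by DICOM Series Instance UID. This is only a sketch of the idea, not the paper's implementation; the file paths, UIDs, and processing notes below are invented:

```python
import json

# Hypothetical mapping from derived research files (non-standard formats)
# to the DICOM series, identified by SeriesInstanceUID, they came from.
provenance = {
    "lesion_masks/patient07_mask.nii": {
        "series_instance_uid": "1.2.840.113619.2.0.0.1",  # invented UID
        "processing": "semi-automatic segmentation, v0.3",
    },
    "radiomics/patient07_features.csv": {
        "series_instance_uid": "1.2.840.113619.2.0.0.1",
        "processing": "texture features computed from the mask",
    },
}

def files_for_series(index, uid):
    """Return all derived files that trace back to a given DICOM series."""
    return sorted(path for path, meta in index.items()
                  if meta["series_instance_uid"] == uid)

# Persisting the index as JSON keeps cloud-hosted copies self-describing.
manifest = json.dumps(provenance, indent=2)
print(files_for_series(provenance, "1.2.840.113619.2.0.0.1"))
```

    A plain manifest like this can live next to the data in cloud storage, so the DICOM lineage survives even when the files themselves are opaque to standard tooling.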

  20. The ethics of data and of data science: an economist's perspective.

    Science.gov (United States)

    Cave, Jonathan

    2016-12-28

    Data collection and modelling are increasingly important in social science and science-based policy, but threaten to crowd out other ways of thinking. Economists recognize that markets embody and shed light on human sentiments. However, their ethical consequences have been difficult to interpret, let alone manage. Although economic mechanisms are changed by data intensity, they can be redesigned to restore their benefits. We conclude with four cautions: if data are good, more may not be better; scientifically desirable data properties may not help policy; consent is a double-edged tool; and data exist only because someone thought to capture and codify them. This article is part of the themed issue 'The ethical impact of data science'. © 2016 The Author(s).

  1. NASA's Earth Science Data Systems Standards Process Experiences

    Science.gov (United States)

    Ullman, Richard E.; Enloe, Yonsook

    2007-01-01

    NASA has impaneled several internal working groups to provide recommendations to NASA management on ways to evolve and improve Earth Science Data Systems. One of these working groups is the Standards Process Group (SPG). The SPG is drawn from NASA-funded Earth Science Data Systems stakeholders, and it directs a process of community review and evaluation of proposed NASA standards. The working group's goal is to promote interoperability and interuse of NASA Earth Science data through broader use of standards that have proven implementation and operational benefit to NASA Earth science, by facilitating NASA management endorsement of proposed standards. The SPG now has two years of experience with this approach to the identification of standards. We will discuss real examples of the different types of candidate standards that have been proposed to NASA's Standards Process Group, such as OPeNDAP's Data Access Protocol, the Hierarchical Data Format, and the Open Geospatial Consortium's Web Map Server. Each of the three types of proposals requires a different sort of criteria for understanding the broad concepts of "proven implementation" and "operational benefit" in the context of NASA Earth Science data systems. We will discuss how our Standards Process has evolved with our experiences with the three candidate standards.

  2. Guidance for Science Data Centers through Understanding Metrics

    Science.gov (United States)

    Moses, J. F.

    2006-12-01

    NASA has built a multi-year set of transaction and user satisfaction information about the evolving, broad collection of Earth science products from a diverse set of users of the Earth Observing System Data and Information System (EOSDIS). The transaction and satisfaction trends provide corroborative information to support perception and intuition, and can often be the basis for understanding the results of cross-cutting initiatives and for management decisions about future strategies. The information is available through two fundamental, complementary methods: product and user transaction data collected regularly from the major science data centers, and user satisfaction information collected through the American Customer Satisfaction Index survey. The combination provides the fundamental data needed to understand utilization trends in the research community. This paper will update trends based on 2006 metrics from the NASA Earth science data centers and results from the 2006 EOSDIS ACSI survey. Principal concepts are explored that lead to sound guidance for data center managers and strategists over the next year.

  3. Data management in maintenance outsourcing

    International Nuclear Information System (INIS)

    Murthy, D.N.P.; Karim, M.R.; Ahmadi, A.

    2015-01-01

    Most businesses view maintenance as tasks carried out by technicians, and the data collected are mostly cost-related. There is a growing trend towards outsourcing maintenance, yet data collection issues are not properly addressed in most maintenance service contracts. Effective maintenance management requires proper data management - data collection and analysis for decision-making. This requires a proper framework, and when maintenance is outsourced it raises several issues and challenges. The paper develops a framework for data management when maintenance is outsourced and examines a real case study that highlights the need for proper data management. - Highlights: • Framework for data management in maintenance outsourcing. • Black-box to grey-box approaches for modelling. • Improvements to maintenance decision-making. • Case study to illustrate the approaches and the shortcomings in data collection.

  4. Building Bridges Between Geoscience and Data Science through Benchmark Data Sets

    Science.gov (United States)

    Thompson, D. R.; Ebert-Uphoff, I.; Demir, I.; Gel, Y.; Hill, M. C.; Karpatne, A.; Güereque, M.; Kumar, V.; Cabral, E.; Smyth, P.

    2017-12-01

    The changing nature of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. Thus, among other efforts, the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO) is developing a framework to strengthen such collaborations through the creation of benchmark datasets. Benchmark datasets provide an interface between disciplines without requiring extensive background knowledge. The goals are to create (1) a means for two-way communication between geoscience and data science researchers; (2) new collaborations, which may lead to new approaches for data analysis in the geosciences; and (3) a public, permanent repository of complex data sets, representative of geoscience problems, useful for coordinating efforts in research and education. The group identified ten key elements and characteristics of an ideal benchmark. High impact: a problem with high potential impact. Active research area: a group of geoscientists should be eager to continue working on the topic. Challenge: the problem should be challenging for data scientists. Data science generality and versatility: it should stimulate the development of new, general, and versatile data science methods. Rich information content: ideally the data set provides stimulus for analysis at many different levels. Hierarchical problem statement: a hierarchy of suggested analysis tasks, from relatively straightforward to open-ended. Means for evaluating success: data scientists and geoscientists need means to evaluate whether the algorithms are successful and achieve their intended purpose. Quick start guide: an introduction for data scientists on how to easily read the data to enable rapid initial data exploration. Geoscience context: a summary for data scientists of the specific data collection process, the instruments used, any pre-processing, and the science questions to be answered. Citability: A suitable identifier to
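    The ten criteria above amount to a checklist that a candidate benchmark can be screened against. A minimal sketch of such a screening, where the candidate dataset and its ratings are hypothetical:

```python
# Illustrative sketch: screening a candidate benchmark data set against
# the working group's ten criteria. Criterion names follow the abstract;
# the example candidate and its ratings are invented.

CRITERIA = [
    "high impact", "active research area", "challenge",
    "data science generality and versatility", "rich information content",
    "hierarchical problem statement", "means for evaluating success",
    "quick start guide", "geoscience context", "citability",
]

def screen(ratings):
    """ratings: dict mapping criterion -> bool. Return criteria still unmet."""
    return [c for c in CRITERIA if not ratings.get(c, False)]

# Hypothetical candidate meeting all criteria except one.
candidate = dict.fromkeys(CRITERIA, True)
candidate["quick start guide"] = False
print(screen(candidate))
```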

  5. A Case Study: Data Management in Biomedical Engineering

    Directory of Open Access Journals (Sweden)

    Glenn R. Gaudette

    2012-01-01

    In a biomedical engineering lab at Worcester Polytechnic Institute, co-author Dr. Glenn R. Gaudette and his research team are investigating the effects of stem cell therapy on the regeneration of function in damaged cardiac tissue in laboratory rats. Each instance of stem cell experimentation on a rat yields hundreds of data sets that must be carefully captured, documented, and securely stored so that the data can be easily accessed and retrieved for papers, reports, further research, and validation of findings, while meeting NIH guidelines for data sharing. After a brief introduction to the bioengineering field and stem cell research, this paper focuses on the experimental workflow and the data generated in one instance of stem cell experimentation; the lab's data management practices; and how Dr. Gaudette teaches data management to the lab's incoming graduate students each semester. The co-authors discuss the haphazard manner in which engineering and science students typically learn data management practices, and advocate for the integration of formal data management instruction in higher education STEM curricula. The paper concludes with a discussion of the Frameworks for a Data Management Curriculum developed collaboratively by the co-authors' institutions -- the University of Massachusetts Medical School and Worcester Polytechnic Institute -- to teach data management best practices to students in the sciences, health sciences, and engineering.

  6. Major Management Challenges and Program Risks: Small Business Administration

    Science.gov (United States)

    2001-01-01

    Major Management Challenges and Program Risks: Small Business Administration (GAO-01-260), January 2001. The report discusses the major management challenges and program risks facing the Small Business Administration (SBA) as it seeks to aid, counsel, assist, and protect the interests of the nation's small businesses and help businesses and families

  7. Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

    International Nuclear Information System (INIS)

    Klimentov, A; Maeno, T; Nilsson, P; Panitkin, S; Wenaus, T; Buncic, P; De, K; Oleynik, D; Petrosyan, A; Jha, S; Mount, R; Porter, R J; Read, K F; Wells, J C; Vaniachine, A

    2015-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, and O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center 'Kurchatov Institute' together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the
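    The brokering idea sketched in the abstract, one logical facility over physically scattered resources, can be illustrated with a toy scheduler. This is not PanDA's actual algorithm; the site names, capacities, and the greedy policy below are invented for illustration:

```python
# Illustrative sketch: greedy brokering of jobs across heterogeneous sites,
# loosely in the spirit of a workload management system. All names and
# numbers are hypothetical.

def broker(jobs, sites):
    """Assign each (name, cores) job to the site with the most free cores."""
    free = dict(sites)               # site -> currently free cores
    assignment = {}
    for name, cores in jobs:
        site = max(free, key=free.get)
        if free[site] < cores:
            assignment[name] = None  # no site can run the job right now
            continue
        free[site] -= cores
        assignment[name] = site
    return assignment

sites = {"grid_site": 800, "cloud": 400, "hpc_lcf": 5000}
jobs = [("simulation", 5000), ("reconstruction", 600), ("analysis", 64)]
print(broker(jobs, sites))
```

    The large simulation job lands on the leadership-class facility, leaving the grid and cloud sites to absorb the smaller workloads, which is the kind of heterogeneity the abstract describes.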

  8. Science Education: Issues, Approaches and Challenges

    Directory of Open Access Journals (Sweden)

    Shairose Irfan Jessani

    2015-06-01

    In today's global education system, science education is much more than fact-based knowledge. Science education becomes meaningless and incomprehensible for learners if they are unable to relate it to their lives. It is thus recommended that Pakistan, like many other countries worldwide, should adopt the Science Technology Society (STS) approach for the delivery of science education. The purpose of the STS approach lies in developing scientifically literate citizens who can make conscious decisions about the socio-scientific issues that impact their lives. The challenges in adopting this approach for Pakistan lie in four areas that will need to be completely revamped according to the STS approach. These areas include: the examination system; science textbooks; science teacher education programs; and available resources and school facilities.

  9. Data management for environmental research

    International Nuclear Information System (INIS)

    Strand, R.H.

    1976-01-01

    The objective of managing environmental research data is to develop a resource sufficient for the study and potential solution of environmental problems. Consequently, environmental data management must include a broad spectrum of activities ranging from statistical analysis and modeling, through data set archiving, to computer hardware procurement. This paper briefly summarizes the data management requirements for environmental research and the techniques and automated procedures which are currently used by the Environmental Sciences Division at Oak Ridge National Laboratory. Included in these requirements are readily retrievable data, data indexed by categories for retrieval and application, data documentation (including collection methods), design and error bounds, easily used analysis and display programs, and file manipulation routines. The statistical analysis system (SAS) and other systems provide the automated procedures and techniques for analysis and management of environmental research data.

  10. INTEGRATION CHALLENGES OF UNIVERSITY AND INFORMATION MANAGEMENT SYSTEM (UIMS) TO MOODLE

    Directory of Open Access Journals (Sweden)

    Jyldyzbek J. Jakshylykov

    2016-06-01

    Introduction: in 2006 the International Ataturk Ala-Too University (IAAU) began to adopt the internationally recognised Bologna system, an initiative of the Ministry of Science and Education of the Kyrgyz Republic prompted by managerial and educational problems at universities in Kyrgyzstan. Consequently, IAAU had to improve its information and grading system and created the University Information and Management System (UIMS), developed by a university team that included the author. At the same time, the university began to use Moodle, an open-source course management system, to manage teaching processes and run courses online. Materials and Methods: the methodological basis of the research is the descriptive method, analysis, and comparison. Results: the IAAU faced challenging issues in applying the two innovations, namely the integration of Moodle and UIMS. Hence, the main purpose of this study is to demonstrate the benefits of Moodle and the challenges of linking it with UIMS. The paper first briefly describes Moodle's functions, features, advantages, and disadvantages, and then UIMS's management features and primary functions, which comprise six fundamental processes, with some graphical representations. It then analyses the two systems, identifying advantages and disadvantages relevant to the possible integration. Discussion and Conclusions: finally, several challenging issues were identified from the analysis results, and the benefits of Moodle and UIMS were demonstrated at the International Ataturk Ala-Too University.

  11. Management challenges in creating value from business analytics

    OpenAIRE

    Vidgen, Richard; Shaw, S.; Grant, D.G.

    2017-01-01

    The popularity of big data and business analytics has increased tremendously in the last decade, and a key challenge for organizations is understanding how to leverage them to create business value. However, while the literature acknowledges the importance of these topics, little work has addressed them from the organization's point of view. This paper investigates the challenges faced by organizational managers seeking to become more data- and information-driven in order to create value. Emp...

  12. Data management for community research projects: A JGOFS case study

    Science.gov (United States)

    Lowry, Roy K.

    1992-01-01

    Since the mid 1980s, much of the marine science research effort in the United Kingdom has been focused into large scale collaborative projects involving public sector laboratories and university departments, termed Community Research Projects. Two of these, the Biogeochemical Ocean Flux Study (BOFS) and the North Sea Project incorporated large scale data collection to underpin multidisciplinary modeling efforts. The challenge of providing project data sets to support the science was met by a small team within the British Oceanographic Data Centre (BODC) operating as a topical data center. The role of the data center was to both work up the data from the ship's sensors and to combine these data with sample measurements into online databases. The working up of the data was achieved by a unique symbiosis between data center staff and project scientists. The project management, programming and data processing skills of the data center were combined with the oceanographic experience of the project communities to develop a system which has produced quality controlled, calibrated data sets from 49 research cruises in 3.5 years of operation. The data center resources required to achieve this were modest and far outweighed by the time liberated in the scientific community by the removal of the data processing burden. Two online project databases have been assembled containing a very high proportion of the data collected. As these are under the control of BODC their long term availability as part of the UK national data archive is assured. The success of the topical data center model for UK Community Research Project data management has been founded upon the strong working relationships forged between the data center and project scientists. These can only be established by frequent personal contact and hence the relatively small size of the UK has been a critical factor. However, projects covering a larger, even international scale could be successfully supported by a

  13. Data-centric Science: New challenges for long-term archives and data publishers

    Science.gov (United States)

    Stockhause, Martina; Lautenschlager, Michael

    2016-04-01

    In recent years the publication of data has become more and more common. Data and metadata for a single project are often disseminated by multiple data centers in federated data infrastructures. At the same time, data is shared earlier to enable collaboration within research projects. The research data environment has become more heterogeneous and the data more dynamic. Only a few data or metadata repositories are long-term archives (LTAs) with WDS/DSA certificates, complying with Force 11's 'Joint Declaration of Data Citation Principles'. Therefore, to keep these data and information usable in the long term, a small number of LTAs have the task of preserving them. They replicate, connect, quality-assure, harmonize, archive, and curate these different types of data from multiple data centers with different operating procedures and data standards. Consortia or federations of certified LTAs are needed to meet the challenges of big data storage and citation. Data publishers play a central role in storing, preserving, and disseminating scientific information. Portals of these federations of LTAs, or data registration agencies like DataCite, might even become the portals of the future for scientific knowledge discovery. The example of CMIP6 is used to illustrate this future perspective on the role of LTAs/data publishers.

  14. A Network Enabled Platform for Canadian Space Science Data

    Science.gov (United States)

    Rankin, R.; Boteler, D. R.; Jayachandran, T. P.; Mann, I. R.; Sofko, G.; Yau, A. W.

    2008-12-01

    The internet is an example of a pervasive disruptive technology that has transformed society on a global scale. The term "cyberinfrastructure" refers to technology underpinning the collaborative aspect of large science projects and is synonymous with terms such as e-Science, intelligent infrastructure, and/or e-infrastructure. In the context of space science, a significant challenge is to exploit the internet and cyberinfrastructure to form effective virtual organizations (VOs) of scientists that have common or agreed-upon objectives. A typical VO is likely to include universities and government agencies specializing in types of instrumentation (ground and/or space based), which in deployment produce large quantities of space data. Such data is most effectively described by metadata, which, if defined in a standard way, facilitates discovery and retrieval of data over the internet by intelligent interfaces and cyberinfrastructure. One recent and significant approach is SPASE, which is being developed by NASA as a data standard for its Virtual Observatories (VxOs) programs. The space science community in Canada has recently formed a VO designed to complement the e-POP microsatellite mission, and new ground-based observatories (GBOs) that collect data over a large fraction of the Canadian land-mass. The VO includes members of the CGSM community (www.cgsm.ca), which is funded operationally by the Canadian Space Agency. It also includes the UCLA VMO team, and scientists in the NASA THEMIS mission. CANARIE (www.canarie.ca), the federal agency responsible for management, design and operation of Canada's research internet, has recently recognized the value of cyberinfrastructure through the creation of a Network-Enabled-Platforms (NEPs) program. An NEP for space science was funded by CANARIE in its first competition. When fully implemented, the Space Science NEP will consist of a front-end portal providing access to CGSM data. It will utilize an adaptation of the SPASE
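    Standardized metadata of the kind SPASE defines is what makes discovery across a VO's holdings possible. A minimal sketch of a SPASE-like record and a catalogue search, using a simplified, hypothetical subset of fields rather than the real SPASE schema:

```python
# Illustrative sketch: SPASE-like metadata records and a simple catalogue
# search. Field names and resource IDs are a simplified, invented subset,
# not the actual SPASE data model.

catalogue = [
    {"ResourceID": "spase://Example/GBO/Magnetometer",
     "InstrumentType": "Magnetometer",
     "ObservedRegion": "Earth.Magnetosphere"},
    {"ResourceID": "spase://Example/ePOP/GAP",
     "InstrumentType": "GPS Receiver",
     "ObservedRegion": "Earth.Ionosphere"},
]

def find(catalogue, **criteria):
    """Return records whose fields match all of the given criteria."""
    return [r for r in catalogue
            if all(r.get(k) == v for k, v in criteria.items())]

hits = find(catalogue, InstrumentType="Magnetometer")
print([r["ResourceID"] for r in hits])
```

    A portal front-end would run queries like this against harvested metadata, so that users discover data sets without knowing which observatory or archive holds them.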

  15. NASA's Earth Science Flight Program Meets the Challenges of Today and Tomorrow

    Science.gov (United States)

    Ianson, Eric E.

    2016-01-01

    NASA's Earth science flight program is a dynamic undertaking that consists of a large fleet of operating satellites, an array of satellite and instrument projects in various stages of development, a robust airborne science program, and a massive data archiving and distribution system. Each element of the flight program is complex and presents unique challenges. NASA builds upon its successes and learns from its setbacks to manage this evolving portfolio to meet NASA's Earth science objectives. NASA's fleet of 16 operating missions provides a wide range of scientific measurements made from dedicated Earth science satellites and from instruments mounted on the International Space Station. For operational missions, the program must address issues such as aging satellites operating well beyond their prime mission, constellation flying, and collision avoidance with other spacecraft and orbital debris. Projects in development are divided into two broad categories: systematic missions and pathfinders. The Earth Systematic Missions (ESM) include a broad range of multi-disciplinary Earth-observing research satellite missions aimed at understanding the Earth system and its response to natural and human-induced forces and changes. Understanding these forces will help determine how to predict future changes, and how to mitigate or adapt to them. The Earth System Science Pathfinder (ESSP) program provides frequent, regular, competitively selected Earth science research opportunities that accommodate new and emerging scientific priorities and measurement capabilities. This results in a series of relatively low-cost, small-sized investigations and missions. Principal investigators whose scientific objectives support a variety of studies lead these missions, including studies of the atmosphere, oceans, land surface, polar ice regions, or solid Earth.
This portfolio of missions and investigations provides opportunity for investment in innovative Earth science that enhances

  16. Giovanni - The Bridge Between Data and Science

    Science.gov (United States)

    Liu, Zhong; Acker, James

    2017-01-01

    This article describes new features in the Geospatial Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni), a user-friendly online tool that enables visualization, analysis, and assessment of NASA Earth science data sets without downloading data and software. Since the satellite era began, data collected from Earth-observing satellites have been widely used in research and applications; however, using satellite-based data sets can still be a challenge to many. To facilitate data access and evaluation, as well as scientific exploration and discovery, the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) has developed Giovanni for a wide range of users around the world. This article describes the latest capabilities of Giovanni with examples, and discusses future plans for this innovative system.

  17. Symposium 1: Challenges in science education and popularization of Science

    Directory of Open Access Journals (Sweden)

    Ildeo de Castro Moreira

    2014-08-01

    Science education and the popularization of science are important elements for social inclusion. Brazil exhibits strong inequalities in the distribution of wealth, access to cultural assets, and appropriation of scientific and technological knowledge. Every Brazilian should have the opportunity to acquire basic knowledge of science and how it operates, allowing them to understand their environment and expand their professional opportunities. However, the overall performance of Brazilian students in science and mathematics is poor. Basic science education most often has few resources and is discouraging, with little appreciation of experimentation, interdisciplinarity, and creativity. Besides the shortage of science teachers, especially well-trained ones, poor wages and working conditions predominate, along with deficiencies in instructional materials and laboratories. Although access to basic education has expanded significantly, the challenge remains to improve its quality. According to the last National Conference on STI, a profound educational reform is needed at all levels, particularly with regard to science education. The popularization of science, in turn, can be an important tool for building a scientific culture and refining formal teaching. However, we still lack a comprehensive and adequate public policy for it. Clearly, recent decades have seen progress: the creation of science centers and museums; a greater media presence; use of the internet and social networks; and outreach events such as the National Week of S&T. But the scenario remains fragile, with broad swathes of Brazilians lacking access to scientific education and qualified information on S&T. In this presentation, starting from a general diagnosis of the situation, some of the main challenges related to science education and popularization in the country will be addressed.

  18. Microbiome Data Science: Understanding Our Microbial Planet.

    Science.gov (United States)

    Kyrpides, Nikos C; Eloe-Fadrosh, Emiley A; Ivanova, Natalia N

    2016-06-01

    Microbiology is experiencing a revolution brought on by recent developments in sequencing technology. The unprecedented volume of microbiome data being generated poses significant challenges that are currently hindering progress in the field. Here, we outline the major bottlenecks and propose a vision to advance microbiome research as a data-driven science. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. The science and management of sex verification in sport | Tucker ...

    African Journals Online (AJOL)

    The verification of gender eligibility in sporting competition poses a biological and management challenge for sports science and medicine, as well as for sporting authorities. It has been established that in most sporting events, the strength and power advantage possessed by males as a result of the virilising action of ...

  20. Data and Network Science for Noisy Heterogeneous Systems

    Science.gov (United States)

    Rider, Andrew Kent

    2013-01-01

    Data in many growing fields has an underlying network structure that can be taken advantage of. In this dissertation we apply data and network science to problems in the domains of systems biology and healthcare. Data challenges in these fields include noisy, heterogeneous data, and a lack of ground truth. The primary thesis of this work is that…

  1. Plagiarism challenges at Ukrainian science and education

    Directory of Open Access Journals (Sweden)

    Denys Svyrydenko

    2016-12-01

    The article analyzes the types and severity of plagiarism violations in the modern educational and scientific spheres using philosophical methodological approaches. The author analyzes the Ukrainian context as well as the global one and formulates an "order of the day" for plagiarism challenges. The plagiarism phenomenon is intuitively comprehensible to academics, but in reality it has a very complex nature and many manifestations. Using approaches from ethics, philosophical anthropology, and the philosophy of science and education, the author formulates a series of recommendations for overcoming plagiarism challenges in Ukrainian science and education.

  2. Managing globally distributed expertise with new competence management solutions: a big-science collaboration as a pilot case.

    OpenAIRE

    Ferguson, J; Koivula, T; Livan, M; Nordberg, M; Salmia, T; Vuola, O

    2003-01-01

    In today's global organisations and networks, a critical factor for effective innovation and project execution is appropriate competence and skills management. The challenges include selection of strategic competences, competence development, and leveraging the competences and skills to drive innovation and collaboration for shared goals. This paper presents a new industrial web-enabled competence management and networking solution and its implementation and piloting in a complex big-science ...

  3. 10th International Conference on Management Science and Engineering Management

    CERN Document Server

    Hajiyev, Asaf; Nickel, Stefan; Gen, Mitsuo

    2017-01-01

    This book presents the proceedings of the Tenth International Conference on Management Science and Engineering Management (ICMSEM2016), held from August 30 to September 2, 2016 in Baku, Azerbaijan, and organized by the International Society of Management Science and Engineering Management, Sichuan University (Chengdu, China), and the Ministry of Education of Azerbaijan. The aim of the conference was to foster international research collaborations in management science and engineering management and to provide a forum for presenting current research findings. The papers were selected and reviewed by the Program Committee, made up of respected experts in the area of management science and engineering management from around the globe. The contributions focus on identifying management science problems in engineering, innovatively using management theory and methods to solve engineering problems effectively, and establishing novel management theories and methods to address new engineering management issues.

  4. Major ecosystems in China: dynamics and challenges for sustainable management.

    Science.gov (United States)

    Lü, Yihe; Fu, Bojie; Wei, Wei; Yu, Xiubo; Sun, Ranhao

    2011-07-01

    Ecosystems, though impacted by global environmental change, can also contribute to the adaptation and mitigation of such large scale changes. Therefore, sustainable ecosystem management is crucial in reaching a sustainable future for the biosphere. Based on the published literature and publicly accessible data, this paper discusses the status and trends of forest, grassland, and wetland ecosystems in China that play important roles in the ecological integrity and human welfare of the nation. Ecological degradation has been observed in these ecosystems at various levels and geographic locations. Biophysical (e.g., climate change) and socioeconomic factors (e.g., intensive human use) are the main reasons for ecosystem degradation, with the latter serving as the dominant driving forces. The three broad categories of ecosystems in China have partially recovered from degradation thanks to large scale ecological restoration projects implemented in the last few decades. China, as the largest and most populated developing nation, still faces huge challenges regarding ecosystem management in a changing and globalizing world. To further improve ecosystem management in China, four recommendations are proposed: (1) advance ecosystem management towards an application-oriented, multidisciplinary science; (2) establish a well-functioning national ecological monitoring and data sharing mechanism; (3) develop impact and effectiveness assessment approaches for policies, plans, and ecological restoration projects; and (4) promote legal and institutional innovations to balance the intrinsic needs of ecological and socioeconomic systems. Any change in China's ecosystem management approach towards a more sustainable one will benefit the whole world. Therefore, international collaborations on ecological and environmental issues need to be expanded.

  5. A Framework for the Strategic Management of Science & Technology Parks

    Directory of Open Access Journals (Sweden)

    Juliane Ribeiro

    2016-12-01

    Science and technology parks (STPs) have been playing an increasingly influential role in stimulating and growing the knowledge economy. However, the spread of STPs faces relevant challenges, such as the development of robust performance management systems able to demonstrate results and indicate improvement opportunities. This paper therefore proposes a theoretical model of performance management that combines premises of the Service-Dominant Logic (S-D Logic), the Balanced Scorecard (BSC), and the General Hierarchical Model (GHM). Based on a multiple-case exploratory and qualitative study, relevant information about the strategic planning and management of these projects was extracted and paved the way for the construction of a hierarchical performance model composed of five perspectives, in line with the BSC. Given these outcomes, the proposed model is expected to provide useful insights for the consolidation of a framework for the strategic management of science and technology parks.
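    A hierarchical, BSC-style performance model of the kind proposed can be illustrated by aggregating indicator scores into perspectives and an overall score. The perspective names, weights, and scores below are invented for illustration, not taken from the paper's model:

```python
# Illustrative sketch: hierarchical aggregation of STP performance
# indicators into five BSC-style perspectives and an overall score.
# Perspective names, weights, and scores are all hypothetical.

def weighted_score(indicators):
    """indicators: list of (score 0-100, weight) pairs; weights sum to 1."""
    return sum(score * weight for score, weight in indicators)

perspectives = {
    "financial":       [(70, 0.5), (80, 0.5)],
    "stakeholders":    [(65, 1.0)],
    "processes":       [(90, 0.6), (60, 0.4)],
    "learning_growth": [(75, 1.0)],
    "innovation":      [(85, 1.0)],
}

scores = {p: weighted_score(ind) for p, ind in perspectives.items()}
overall = sum(scores.values()) / len(scores)
print(round(overall, 1))
```

    In a real scorecard the perspective weights themselves would be set strategically rather than averaged equally, which is precisely the kind of design decision a performance management framework has to make explicit.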

  6. NSF-Sponsored Biological and Chemical Oceanography Data Management Office

    Science.gov (United States)

    Allison, M. D.; Chandler, C. L.; Copley, N.; Galvarino, C.; Gegg, S. R.; Glover, D. M.; Groman, R. C.; Wiebe, P. H.; Work, T. T.; Biological; Chemical Oceanography Data Management Office

    2010-12-01

    Ocean biogeochemistry and marine ecosystem research projects are inherently interdisciplinary and benefit from improved access to well-documented data. Improved data sharing practices are important to the continued exploration of research themes that are a central focus of the ocean science community and are essential to interdisciplinary and international collaborations that address complex, global research themes. In 2006, the National Science Foundation Division of Ocean Sciences (NSF OCE) funded the Biological and Chemical Oceanography Data Management Office (BCO-DMO) to serve the data management requirements of scientific investigators funded by the National Science Foundation’s Biological and Chemical Oceanography Sections. BCO-DMO staff members work with investigators to manage marine biogeochemical, ecological, and oceanographic data and information developed in the course of scientific research. These valuable data sets are documented, stored, disseminated, and protected over short and intermediate time frames. One of the goals of the BCO-DMO is to facilitate regional, national, and international data and information exchange through improved data discovery, access, display, downloading, and interoperability. In May 2010, NSF released a statement to the effect that in October 2010, it is planning to require that all proposals include a data management plan in the form of a two-page supplementary document. The data management plan would be an element of the merit review process. NSF has long been committed to making data from NSF-funded research publicly available and the new policy will strengthen this commitment. BCO-DMO is poised to assist in creating the data management plans and in ultimately serving the data and information resulting from NSF OCE funded research. 
We will present an overview of the data management system capabilities including: geospatial and text-based data discovery and access systems; recent enhancements to data search tools; data

  7. Challenges in Request Management

    DEFF Research Database (Denmark)

    Sommer, Anita Friis

    2014-01-01

    Request management (RQM) is a new term used for managing customer requests for new products. It is the counterpart to typical product development processes, which have no direct customer involvement. It is essential to manage customer requests in a structured and efficient way to obtain... profitability. This research study seeks to investigate the challenges of RQM in practice. Existing demand chain management literature is used as a basis for developing an RQM framework. RQM is investigated through an explorative research design in a dyadic B2B case study including a global industrial company... and its customers. The study provides an insight into a new area of supply chain management, including the process activity flow and challenges involved across the process. Furthermore, the method is dyadic, including the customer in the case study, which is rare in related research...

  8. The Community for Data Integration (CDI): Building Knowledge, Networks, and Integrated Science Capacity

    Science.gov (United States)

    Hsu, L.

    2017-12-01

    In 2009, the U.S. Geological Survey determined that a focused effort on data integration was necessary to capture the full scientific potential of its topically and geographically diverse data assets. The Community for Data Integration was established to fill this role, and an emphasis emerged on grassroots learning and solving of shared data integration and management challenges. Now, eight years later, the CDI has grown to over 700 members and runs monthly presentations, working groups, special training events, and an annual USGS-wide grants program. With a diverse membership of scientists, technologists, data managers, program managers, and others, there are a wide range of motivations and interests competing to drive the direction of the community. Therefore, an important role of the community coordinators is to prioritize member interests while valuing and considering many different viewpoints. To do this, new tools and mechanisms are frequently introduced to circulate information and obtain community input and feedback. The coordinators then match community interests with opportunities to address USGS priorities. As a result, the community has facilitated the implementation of USGS-wide data policies and data management procedures, produced guidelines and lessons learned for technologies like mobile applications and use of semantic web technologies, and developed technical recommendations to enable integrated science capacity for USGS leadership.

  9. Big data - a 21st century science Maginot Line? No-boundary thinking: shifting from the big data paradigm.

    Science.gov (United States)

    Huang, Xiuzhen; Jennings, Steven F; Bruce, Barry; Buchan, Alison; Cai, Liming; Chen, Pengyin; Cramer, Carole L; Guan, Weihua; Hilgert, Uwe Kk; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Donald F; Nanduri, Bindu; Perkins, Andy; Rekepalli, Bhanu; Salem, Saeed; Specker, Jennifer; Walker, Karl; Wunsch, Donald; Xiong, Donghai; Zhang, Shuzhong; Zhang, Yu; Zhao, Zhongming; Moore, Jason H

    2015-01-01

    Whether your interests lie in scientific arenas, the corporate world, or in government, you have certainly heard the praises of big data: Big data will give you new insights, allow you to become more efficient, and/or will solve your problems. While big data has had some outstanding successes, many are now beginning to see that it is not the silver bullet it has been touted to be. Here our main concern is the overall impact of big data; the current manifestation of big data is constructing a Maginot Line in science in the 21st century. Big data is no longer simply "lots of data" as a phenomenon; the big data paradigm is putting the spirit of the Maginot Line into lots of data. Big data overall is disconnecting researchers from science challenges. We propose No-Boundary Thinking (NBT), applying no-boundary thinking in problem definition to address science challenges.

  10. Nuclear Data for Astrophysics: Resources, Challenges, Strategies, and Software Solutions

    International Nuclear Information System (INIS)

    Smith, Michael Scott; Lingerfelt, Eric J.; Nesaraja, Caroline D.; Hix, William Raphael; Roberts, Luke F.; Koura, Hiroyuki; Fuller, George M.; Tytler, David

    2008-01-01

    One of the most exciting utilizations of nuclear data is to help unlock the mysteries of the Cosmos -- the creation of the chemical elements, the evolution and explosion of stars, and the origin and fate of the Universe. There are now many nuclear data sets, tools, and other resources online to help address these important questions. However, numerous serious challenges make it important to develop strategies now to ensure a sustainable future for this work. A number of strategies are advocated, including: enlisting additional manpower to evaluate the newest data; devising ways to streamline evaluation activities; and improving communication and coordination between existing efforts. Software projects are central to some of these strategies. Examples include: creating a virtual 'pipeline' leading from the nuclear laboratory to astrophysics simulations; improving data visualization and management to get the most science out of the existing datasets; and creating a nuclear astrophysics data virtual (online) community. Recent examples will be detailed, including the development of two first-generation software pipelines, the Computational Infrastructure for Nuclear Astrophysics for stellar astrophysics and the bigbangonline suite of codes for cosmology, and the coupling of nuclear data to sensitivity studies with astrophysical simulation codes to guide future research.

  11. The data deluge can libraries cope with e-science ?

    CERN Document Server

    Marcum, Deanna B

    2010-01-01

    From the frontiers of contemporary information science research comes this helpful and timely volume for libraries preparing for the deluge of data that E-science can deliver to their patrons and institutions. The Data Deluge: Can Libraries Cope with E-Science? brings together nine of the world's foremost authorities on the capabilities and requirements of E-science, offering their perspectives to librarians hoping to develop similar programs for their own institutions. The essays contained in The Data Deluge were adapted from papers first delivered at the prestigious annual Library Round Table at the Kanazawa Institute of Technology, where E-science has been the theme of the past two annual conferences. Now this groundbreaking work is available in convenient printed format for the first time. The essays are divided into three parts: an overview of E-science challenges for libraries; perspectives on E-science; and perspectives from individual research libraries.

  12. 9th International Conference on Management Science and Engineering Management

    CERN Document Server

    Nickel, Stefan; Machado, Virgilio; Hajiyev, Asaf

    2015-01-01

    This is the Proceedings of the Ninth International Conference on Management Science and Engineering Management (ICMSEM), held July 21-23, 2015 at Karlsruhe, Germany. The goals of the conference are to foster international research collaborations in Management Science and Engineering Management and to provide a forum for presenting current findings. These proceedings cover various areas in management science and engineering management, focusing on identifying management science problems in engineering and on using management theory and methods innovatively to solve engineering problems effectively. They also establish new management theory and methods based on experience with new management issues in engineering. Readers interested in the fields of management science and engineering management will benefit from the latest cutting-edge innovations and research advances presented in these proceedings and will find new ideas and research directions. A total number of 132 papers from 15 countries a...

  13. Invasive Species Science Branch: research and management tools for controlling invasive species

    Science.gov (United States)

    Reed, Robert N.; Walters, Katie D.

    2015-01-01

    Invasive, nonnative species of plants, animals, and disease organisms adversely affect the ecosystems they enter. Like “biological wildfires,” they can quickly spread and affect nearly all terrestrial and aquatic ecosystems. Invasive species have become one of the greatest environmental challenges of the 21st century in economic, environmental, and human health costs, with an estimated effect in the United States of more than $120 billion per year. Managers of the Department of the Interior and other public and private lands often rank invasive species as their top resource management problem. The Invasive Species Science Branch of the Fort Collins Science Center provides research and technical assistance relating to management concerns for invasive species, including understanding how these species are introduced, identifying areas vulnerable to invasion, forecasting invasions, and developing control methods. To disseminate this information, branch scientists are developing platforms to share invasive species information with DOI cooperators, other agency partners, and the public. From these and other data, branch scientists are constructing models to understand and predict invasive species distributions for more effective management. The branch also has extensive herpetological and population biology expertise that is applied to harmful reptile invaders such as the Brown Treesnake on Guam and Burmese Python in Florida.

  14. Physical Science Informatics: Providing Open Science Access to Microheater Array Boiling Experiment Data

    Science.gov (United States)

    McQuillen, John; Green, Robert D.; Henrie, Ben; Miller, Teresa; Chiaramonte, Fran

    2014-01-01

    The Physical Science Informatics (PSI) system is the next step in an effort to make NASA-sponsored flight data available to the scientific and engineering community, along with the general public. The experimental data, drawn from six overall disciplines including Combustion Science, Fluid Physics, Complex Fluids, Fundamental Physics, and Materials Science, will present some unique challenges. Besides data in textual or numerical format, large portions of both the raw and analyzed data for many of these experiments are digital images and video, which impose large data storage requirements. In addition, the accessible data will include experiment design and engineering data (including applicable drawings), any analytical or numerical models, publications, reports, and patents, and any commercial products developed as a result of the research. The objectives of this paper include the following: present the preliminary layout (Figure 2) of MABE data within the PSI database; obtain feedback on the layout; and present the procedure for obtaining access to this database.

  15. The International Data Sharing Challenge: Realities and Lessons Learned from International Field Projects and Data Analysis Efforts

    Science.gov (United States)

    Williams, S. F.; Moore, J. A.

    2014-12-01

    One of the major challenges facing science in general is how to foster trust and cooperation between nations so as to allow the free and open exchange of data. The rich data coming from many nations conducting Arctic research must be brought together to understand and assess the huge changes now underway in the Arctic regions. The NCAR Earth Observing Laboratory (EOL) has been supporting a variety of international field process studies and WCRP-sponsored international projects that require international data collection and exchange in order to be successful. These programs include the Surface Heat Budget of the Arctic (SHEBA), the International Tundra Experiment (ITEX), the Arctic Climate System Study (ACSYS), the Distributed Biological Observatory (DBO), and the Coordinated Energy and water-cycle Observations Project (CEOP), to name a few. EOL played a major role in the data management of these projects, but the CEOP effort in particular involved coordinating common site documentation and data formatting across a global network (28 sites). All these unique projects occurred over 25 years but had similar challenges in the international collection, archival, and access to the rich datasets that are their legacy. The Belmont Forum offers as its main challenge to deliver knowledge needed for action to avoid or adapt to environmental change. One of its major themes is the study of these changes in the Arctic. The development of capable e-infrastructure (technologies and groups supporting international collaborative environments, networks, and data centers) to allow access to large, diverse data collections is key to meeting this challenge. Meeting this challenge in practice, however, is much more difficult. The authors will provide several specific examples of successes and failures when trying to meet the needs of an international community of researchers specifically related to Belmont Forum Work Package Themes regarding standards of

  16. Challenges in combining different data sets during analysis when using grounded theory.

    Science.gov (United States)

    Rintala, Tuula-Maria; Paavilainen, Eija; Astedt-Kurki, Päivi

    2014-05-01

    To describe the challenges in combining two data sets during grounded theory analysis. The use of grounded theory in nursing research is common. It is a suitable method for studying human action and interaction. It is recommended that many alternative sources of data are collected to create as rich a dataset as possible. Data from interviews with people with diabetes (n=19) and their family members (n=19). Combining two data sets. When using grounded theory, there are numerous challenges in collecting and managing data, especially for the novice researcher. One challenge is to combine different data sets during the analysis. There are many methodological textbooks about grounded theory but there is little written in the literature about combining different data sets. Discussion is needed on the management of data and the challenges of grounded theory. This article provides a means for combining different data sets in the grounded theory analysis process.

  17. From Utopia to Science: Challenges of Personalised Genomics Information for Health Management and Health Enhancement

    Science.gov (United States)

    2009-01-01

    From 1900 onwards, scientists and novelists have explored the contours of a future society based on the use of “anthropotechnologies” (techniques applicable to human beings for the purpose of performance enhancement ranging from training and education to genome-based biotechnologies). Gradually but steadily, the technologies involved migrated from (science) fiction into scholarly publications, and from “utopia” (or “dystopia”) into science. Building on seminal ideas borrowed from Nietzsche, Peter Sloterdijk has outlined the challenges inherent in this development. Since time immemorial, and at least since the days of Plato’s Academy, human beings have been interested in possibilities for (physical or mental) performance enhancement. We are constantly trying to improve ourselves, both collectively and individually, for better or for worse. At present, however, new genomics-based technologies are opening up new avenues for self-amelioration. Developments in research facilities using animal models may to a certain extent be seen as expeditions into our own future. Are we able to address the bioethical and biopolitical issues awaiting us? After analyzing and assessing Sloterdijk’s views, attention will shift to a concrete domain of application, namely sport genomics. For various reasons, top athletes are likely to play the role of genomics pioneers by using personalized genomics information to adjust diet, life-style, training schedules and doping intake to the strengths and weaknesses of their personalized genome information. Thus, sport genomics may be regarded as a test bed where the contours of genomics-based self-management are tried out. PMID:20234832

  18. Materials Data Science: Current Status and Future Outlook

    Science.gov (United States)

    Kalidindi, Surya R.; De Graef, Marc

    2015-07-01

    The field of materials science and engineering is on the cusp of a digital data revolution. After reviewing the nature of data science and Big Data, we discuss the features of materials data that distinguish them from data in other fields. We introduce the concept of process-structure-property (PSP) linkages and illustrate how the determination of PSPs is one of the main objectives of materials data science. Then we review a selection of materials databases, as well as important aspects of materials data management, such as storage hardware, archiving strategies, and data access strategies. We introduce the emerging field of materials data analytics, which focuses on data-driven approaches to extract and curate materials knowledge from available data sets. The critical need for materials e-collaboration platforms is highlighted, and we conclude the article with a number of suggestions regarding the near-term future of the materials data science field.

  19. Establishing ecological and social continuities: new challenges to optimize urban watershed management

    Science.gov (United States)

    Mitroi, V.; de Coninck, A.; Vinçon-Leite, B.; Deroubaix, J.-F.

    2014-09-01

    The (re)construction of ecological continuity is stated as one of the main objectives of the European Water Framework Directive for watershed management in Europe. Analysing the social, political, technical, and scientific processes characterising the implementation of different ecological continuity projects in two adjacent peri-urban territories in Ile-de-France, we observed science-driven approaches that disregard the social context. We show that, in urbanized areas, ecological continuity requires not only substantial technical and ecological expertise but also social and political participation in defining a common vision and action plan. As this is a challenge both for technical water management institutions and for "classical" ecological policies, we propose some social science contributions to deal with ecological unpredictability and to reconsider stakeholder resistance to this kind of project.

  20. Challenges in managing and sustaining urban slum health ...

    African Journals Online (AJOL)

    Challenges in managing and sustaining urban slum health programmes in Kenya. ... These were hardly implemented in the projects, according to the data gathered. ... Conclusion: Land and income were big issues according to the responses.

  1. Between Scylla and Charybdis: reconciling competing data management demands in the life sciences.

    Science.gov (United States)

    Bezuidenhout, Louise M; Morrison, Michael

    2016-05-17

    The widespread sharing of biological and biomedical data is recognised as a key element in facilitating translation of scientific discoveries into novel clinical applications and services. At the same time, twenty-first century states are increasingly concerned that this data could also be used for purposes of bioterrorism. There is thus a tension between the desire to promote the sharing of data, as encapsulated by the Open Data movement, and the desire to prevent this data from 'falling into the wrong hands' as represented by 'dual use' policies. Both frameworks posit a moral duty for life sciences researchers with respect to how they should make their data available. However, Open data and dual use concerns are rarely discussed in concert and their implementation can present scientists with potentially conflicting ethical requirements. Both dual use and Open data policies frame scientific data and data dissemination in particular, though different, ways. As such they contain implicit models for how data is translated. Both approaches are limited by a focus on abstract conceptions of data and data sharing. This works to impede consensus-building between the two ethical frameworks. As an alternative, this paper proposes that an ethics of responsible management of scientific data should be based on a more nuanced understanding of the everyday data practices of life scientists. Responsibility for these 'micromovements' of data must consider the needs and duties of scientists as individuals and as collectively-organised groups. Researchers in the life sciences are faced with conflicting ethical responsibilities to share data as widely as possible, but prevent it being used for bioterrorist purposes. In order to reconcile the responsibilities posed by the Open Data and dual use frameworks, approaches should focus more on the everyday practices of laboratory scientists and less on abstract conceptions of data.

  2. Homeland Security. Management Challenges Facing Federal Leadership

    Science.gov (United States)

    2002-12-01

    Homeland Security: Management Challenges Facing Federal Leadership (www.gao.gov). ...including attention to management practices and key success factors. ...significant management and coordination challenges if it is to provide this leadership and be successful in preventing and responding to any future

  3. The role of administrative data in the big data revolution in social science research.

    Science.gov (United States)

    Connelly, Roxanne; Playford, Christopher J; Gayle, Vernon; Dibben, Chris

    2016-09-01

    The term big data is currently a buzzword in social science; however, its precise meaning is ambiguous. In this paper we focus on administrative data, which is a distinctive form of big data. Exciting new opportunities for social science research will be afforded by new administrative data resources, but these are currently underappreciated by the research community. The central aim of this paper is to discuss the challenges associated with administrative data. We emphasise that it is critical for researchers to carefully consider how administrative data has been produced. We conclude that administrative datasets have the potential to contribute to the development of high-quality and impactful social science research, and should not be overlooked in the emerging field of big data.

  4. Developing and Teaching a Two-Credit Data Management Course for Graduate Students in Climate and Space Sciences

    Science.gov (United States)

    Thielen, Joanna; Samuel, Sara M.; Carlson, Jake; Moldwin, Mark

    2017-01-01

    Engineering researchers face increasing pressure to manage, share, and preserve their data, but these subjects are not typically a part of the curricula of engineering graduate programs. To address this situation, librarians at the University of Michigan, in partnership with the Climate and Space Sciences and Engineering Department, developed a…

  5. Cloud computing with e-science applications

    CERN Document Server

    Terzo, Olivier

    2015-01-01

    The amount of data in everyday life has been exploding. This data increase has been especially significant in scientific fields, where substantial amounts of data must be captured, communicated, aggregated, stored, and analyzed. Cloud Computing with e-Science Applications explains how cloud computing can improve data management in data-heavy fields such as bioinformatics, earth science, and computer science. The book begins with an overview of cloud models supplied by the National Institute of Standards and Technology (NIST), and then:Discusses the challenges imposed by big data on scientific

  6. Data use and information creation: challenges for marine scientists and for managers.

    Science.gov (United States)

    Hiscock, Keith; Elliott, Michael; Laffoley, Dan; Rogers, Stuart

    2003-05-01

    In the coastal waters of European countries and in the offshore waters of the north-east Atlantic, there is an increasing need for scientists to meet challenging objectives, such as to identify meaningful measures of 'quality', and to recommend 'indicators' to underpin implementation of directives, conventions, statutes and other more informal national and international initiatives. Those indicators may relate to particular species or habitats, to changes in physical and chemical characteristics, and even to the use to which the system is put. The problems to be overcome are difficult, but new and developing approaches will make a significant contribution. The approaches include: criteria to identify 'sensitivity' and 'importance', structures to organise information and electronic information resources to access data. The real challenge is to make the results of the various scientific initiatives relevant to and understandable by a wide range of customers with similar overlapping requirements, and thus make a genuine contribution to protecting the marine environment. Above and beyond that is the need for scientists to drive the agenda to enable real and lasting progress to be made towards ecosystem-based management of our seas and a proper consideration of what 'sustainability' may mean in the marine environment and how we utilise its resources.

  7. Data use and information creation: challenges for marine scientists and for managers

    International Nuclear Information System (INIS)

    Hiscock, Keith; Elliott, Michael; Laffoley, Dan; Rogers, Stuart

    2003-01-01

    In the coastal waters of European countries and in the offshore waters of the north-east Atlantic, there is an increasing need for scientists to meet challenging objectives, such as to identify meaningful measures of 'quality', and to recommend 'indicators' to underpin implementation of directives, conventions, statutes and other more informal national and international initiatives. Those indicators may relate to particular species or habitats, to changes in physical and chemical characteristics, and even to the use to which the system is put. The problems to be overcome are difficult, but new and developing approaches will make a significant contribution. The approaches include: criteria to identify 'sensitivity' and 'importance', structures to organise information and electronic information resources to access data. The real challenge is to make the results of the various scientific initiatives relevant to and understandable by a wide range of customers with similar overlapping requirements, and thus make a genuine contribution to protecting the marine environment. Above and beyond that is the need for scientists to drive the agenda to enable real and lasting progress to be made towards ecosystem-based management of our seas and a proper consideration of what 'sustainability' may mean in the marine environment and how we utilise its resources

  8. Data management on the fusion computational pipeline

    International Nuclear Information System (INIS)

    Klasky, S; Beck, M; Bhat, V; Feibush, E; Ludaescher, B; Parashar, M; Shoshani, A; Silver, D; Vouk, M

    2005-01-01

    Fusion energy science, like other science areas in DOE, is becoming increasingly data intensive and network distributed. We discuss data management techniques that are essential for scientists making discoveries from their simulations and experiments, with special focus on the techniques and support that Fusion Simulation Project (FSP) scientists may need. However, the discussion applies to a broader audience, since most of the fusion SciDACs and FSP proposals include a strong data management component. Simulations on ultrascale computing platforms imply an ability to efficiently integrate and network heterogeneous components (computational, storage, networks, codes, etc.), and to move large amounts of data over large distances. We discuss the workflow categories needed to support such research as well as the automation and other aspects that can allow an FSP scientist to focus on the science and spend less time tending information technology.

  9. ICSU and the Challenges of Data and Information Management for International Science

    Directory of Open Access Journals (Sweden)

    Peter Fox

    2013-01-01

    The International Council for Science (ICSU) vision explicitly recognises the value of data and information to science and particularly emphasises the urgent requirement for universal and equitable access to high-quality scientific data and information. A universal public domain for scientific data and information will be transformative for both science and society. Over the last several years, two ad hoc ICSU committees, the Strategic Committee on Information and Data (SCID) and the Strategic Coordinating Committee on Information and Data (SCCID), produced key reports that make 5 and 14 recommendations, respectively, aimed at improving universal and equitable access to data and information for science and at providing direction for key international scientific bodies, such as the Committee on Data for Science and Technology (CODATA), as well as for the formation, ratified by ICSU in 2008, of the World Data System. This contribution outlines the framing context for both committees based on the changed world scene for scientific data conduct in the 21st century. We include details on the relevant recommendations and important consequences for the worldwide community of data providers and consumers, ultimately leading to a conclusion and avenues for advancement that must be carried to the many thousands of data scientists worldwide.

  10. Data Quality Control: Challenges, Methods, and Solutions from an Eco-Hydrologic Instrumentation Network

    Science.gov (United States)

    Eiriksson, D.; Jones, A. S.; Horsburgh, J. S.; Cox, C.; Dastrup, D.

    2017-12-01

    Over the past few decades, advances in electronic dataloggers and in situ sensor technology have revolutionized our ability to monitor air, soil, and water to address questions in the environmental sciences. The increased spatial and temporal resolution of in situ data is alluring. However, an often overlooked aspect of these advances is the challenge data managers and technicians face in performing quality control on millions of data points collected every year. While there is general agreement that high quantities of data offer little value unless the data are of high quality, it is commonly understood that, despite efforts toward quality assurance, environmental data collection occasionally goes wrong. After identifying erroneous data, data managers and technicians must determine whether to flag, delete, leave unaltered, or retroactively correct suspect data. While individual instrumentation networks often develop their own QA/QC procedures, there is a scarcity of consensus and literature regarding specific solutions and methods for correcting data. This may be because back-correction efforts are time consuming, so suspect data are often simply abandoned. Correction techniques are also rarely reported in the literature, likely because corrections are often performed by technicians rather than the researchers who write the scientific papers. Details of correction procedures are often glossed over as a minor component of data collection and processing. To help address this disconnect, we present case studies of quality control challenges, solutions, and lessons learned from a large-scale, multi-watershed environmental observatory in Northern Utah that monitors Gradients Along Mountain to Urban Transitions (GAMUT). The GAMUT network consists of over 40 individual climate, water quality, and storm drain monitoring stations that have collected more than 200 million unique data points in four years of operation. In all of our examples, we emphasize that scientists

  11. Data Management for Mars Exploration Rovers

    Science.gov (United States)

    Snyder, Joseph F.; Smyth, David E.

    2004-01-01

    Data Management for the Mars Exploration Rovers (MER) project is a comprehensive system addressing the needs of development, test, and operations phases of the mission. During development of flight software, including the science software, the data management system can be simulated using any POSIX file system. During testing, the on-board file system can be bit compared with files on the ground to verify proper behavior and end-to-end data flows. During mission operations, end-to-end accountability of data products is supported, from science observation concept to data products within the permanent ground repository. Automated and human-in-the-loop ground tools allow decisions regarding retransmitting, re-prioritizing, and deleting data products to be made using higher level information than is available to a protocol-stack approach such as the CCSDS File Delivery Protocol (CFDP).
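The bit-comparison of on-board and ground files mentioned above is, at heart, a file-integrity check. A minimal sketch of the idea (helper names are hypothetical; the actual MER tooling, which worked against the flight file system, is not shown here) compares chunked digests so that large data products never need to be held in memory:

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """SHA-256 digest of a file, read in chunks so large products stream."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def files_match(onboard_copy, ground_copy):
    """True when the two copies are bit-for-bit identical."""
    return file_digest(onboard_copy) == file_digest(ground_copy)
```

A digest comparison confirms that two copies agree; locating *which* bits differ, as a testbed would need, requires a direct byte-wise diff instead.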

  12. Management, Analysis, and Visualization of Experimental and Observational Data -- The Convergence of Data and Computing

    Energy Technology Data Exchange (ETDEWEB)

    Bethel, E. Wes; Greenwald, Martin; Kleese van Dam, Kersten; Parashar, Manish; Wild, Stefan, M.; Wiley, H. Steven

    2016-10-27

    Scientific user facilities---particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more---operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever-increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time there is a growing community of experimentalists who require real-time data analysis feedback to steer their complex experimental instruments toward optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity in the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.
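Real-time feedback of the kind described often begins with one-pass streaming summaries, because a high-rate stream cannot be revisited after the fact. A generic sketch (illustrative only, not tied to any particular DOE facility) using Welford's online algorithm, which updates mean and variance per sample without storing the stream:

```python
class RunningStats:
    """One-pass mean/variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0       # samples seen
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two samples arrive."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

An instrument-steering loop could feed each new reading through `update` and raise an alert when a reading falls several standard deviations from the running mean, all in O(1) memory.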

  13. Assessment of Data Management Services at New England Region Resource Libraries

    Directory of Open Access Journals (Sweden)

    Julie Goldman

    2015-07-01

    Full Text Available Objective: To understand how New England medical libraries are addressing scientific research data management and providing services to their communities. Setting: The National Network of Libraries of Medicine, New England Region (NN/LM NER contains 17 Resource Libraries. The University of Massachusetts Medical School serves as the New England Regional Medical Library (RML. Sixteen of the NER Resource Libraries completed this survey. Methods: A 40-question online survey assessed libraries’ services and programs for providing research data management education and support. Libraries shared their current plans and institutional challenges associated with developing data services. Results: This study shows few NER Resource Libraries currently integrate scientific research data management into their services and programs, and highlights the region’s use of resources provided by the NN/LM NER RML at the University of Massachusetts Medical School. Conclusions: Understanding the types of data services being delivered at NER libraries helps to inform the NN/LM NER about the eScience learning needs of New England medical librarians and helps in the planning of professional development programs that foster effective biomedical research data services.

  14. Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges.

    Science.gov (United States)

    Stein, Lincoln D

    2008-09-01

    Biology is an information-driven science. Large-scale data sets from genomics, physiology, population genetics and imaging are driving research at a dizzying rate. Simultaneously, interdisciplinary collaborations among experimental biologists, theorists, statisticians and computer scientists have become the key to making effective use of these data sets. However, too many biologists have trouble accessing and using these electronic data sets and tools effectively. A 'cyberinfrastructure' is a combination of databases, network protocols and computational services that brings people, information and computational tools together to perform science in this information-driven world. This article reviews the components of a biological cyberinfrastructure, discusses current and pending implementations, and notes the many challenges that lie ahead.

  15. Big Data Challenges for Large Radio Arrays

    Science.gov (United States)

    Jones, Dayton L.; Wagstaff, Kiri; Thompson, David; D'Addario, Larry; Navarro, Robert; Mattmann, Chris; Majid, Walid; Lazio, Joseph; Preston, Robert; Rebbapragada, Umaa

    2012-01-01

    Future large radio astronomy arrays, particularly the Square Kilometre Array (SKA), will be able to generate data at rates far higher than can be analyzed or stored affordably with current practices. This is, by definition, a "big data" problem, and requires an end-to-end solution if future radio arrays are to reach their full scientific potential. Similar data processing, transport, storage, and management challenges face next-generation facilities in many other fields.

  16. BIG DATA-Related Challenges and Opportunities in Earth System Modeling

    Science.gov (United States)

    Bamzai, A. S.

    2012-12-01

    Knowledge of the Earth's climate has increased immensely in recent decades, both through observational analysis and modeling. BIG DATA-related challenges emerge in our quest for understanding the variability and predictability of the climate and earth system on a range of time scales, as well as in our endeavor to improve predictive capability using state-of-the-science models. To enable further scientific discovery, bottlenecks in current paradigms need to be addressed. An overview of current NSF activities in Earth System Modeling, with a focus on associated data-related challenges and opportunities, will be presented.

  17. Research Data Management Education for Future Curators

    Directory of Open Access Journals (Sweden)

    Mark Scott

    2013-06-01

    Full Text Available Science has progressed by “standing on the shoulders of giants” and for centuries research and knowledge have been shared through the publication and dissemination of books, papers and scholarly communications. Moving forward, much of our understanding builds on (large-scale) datasets, which have been collected or generated as part of the scientific process of discovery. How will this be made available for future generations? How will we ensure that, once collected or generated, others can stand on the shoulders of the data we produce? Educating students about the challenges and opportunities of data management is a key part of the solution and helps the researchers of the future to start to think about the problems early on in their careers. We have compiled a set of case studies to show the similarities and differences in data between disciplines, and produced a booklet for students containing the case studies and an introduction to the data lifecycle and other data management practices. This has already been used at the University of Southampton within the Faculty of Engineering and is now being adopted centrally for use in other faculties. In this paper, we will provide an overview of the case studies and the guide, and reflect on the reception the guide has had to date.

  18. The role of nature-conformity presentation of data in the creation of an information system for the management of science and education

    Directory of Open Access Journals (Sweden)

    Sergey A. Saltykov

    2017-01-01

    Full Text Available The role of nature-conformity, in its expanded interpretation, in the presentation of open data during the creation of an information system for the management of science and education is determined. The principle of nature-conformity in our definition is represented as the genesis and development of systems according to their own internal (immanent, natural and/or cultural) and external (the surrounding socio-cultural and natural-biological) nature. In this context, the novelty of the research is to develop open data presentation, a parameter of great importance for the modern era in the creation of information systems. The paper is also unique in its development of technical requirements, and in exploring the possibility of filling the developed information management system of science and education with open data. The article outlines the prospects for the practical use of the information system for the management of science and education. It is emphasized that, through the use of open data, it will be possible to integrate all the developed models, tools, and principles and create a modern Russian information management system for science and education in accordance with the principle of prudence of forming systems. The research addressed the following issues: a structural and semantic analysis of the concept of "open data" was carried out; examples of successful work with open data by various organizations (state, commercial, banking, etc.) are presented for discussion; an analysis of some provisions of the state strategy of scientific and technical development of Russia is made; and requirements are created for the information system of expert-textual analysis of scientific and educational research. The conducted research has shown that the role of nature-conformity in the presentation of open data in the creation of such an information management system is great and continues to grow in connection with the development

  19. Information Quality as a Foundation for User Trustworthiness of Earth Science Data.

    Science.gov (United States)

    Wei, Y.; Moroni, D. F.; Ramapriyan, H.; Peng, G.

    2017-12-01

    Information quality is multidimensional. Four different aspects of information quality can be defined based on the lifecycle stages of Earth Science data products: science, product, stewardship and services. With increasing requirements on ensuring and improving information quality coming from multiple government agencies and throughout industry, there have been considerable efforts toward improving information quality during the last decade, much of which has not been well vetted in a collective sense until recently. Given this rich background of prior work, the Information Quality Cluster (IQC), established within the Federation of Earth Science Information Partners (ESIP) in 2011, and reactivated in the summer of 2014, has been active with membership from multiple organizations. The IQC's objectives and activities, aimed at ensuring and improving information quality for Earth science data and products, are also considered vital toward improving the trustworthiness of Earth science data to a vast and interdisciplinary community of data users. During 2016, several members of the IQC led the development and assessment of four use cases. This was followed up in 2017 with multiple panel sessions at the 2017 Winter and Summer ESIP Meetings to survey the challenges posed in the various aspects of information quality. What was discovered to be most lacking is the transparency of data lineage (i.e., provenance and maturity), uniform methods for uncertainty characterization, and uniform quality assurance data and metadata. While solutions to these types of issues exist, most data producers have little time to investigate and collaborate to arrive at and conform to a consensus approach. The IQC has positioned itself as a community platform to bring together all relevant stakeholders from data producers, repositories, program managers, and the end users. A combination of both well-vetted and "trailblazing" solutions is presented to address how data trustworthiness can

  20. The Green Challenge in Construction Management

    DEFF Research Database (Denmark)

    Christensen, Knud

    1999-01-01

    In the years to come, the building and construction industry will be met with an increasing amount of environmental management demands. Contractors can prepare themselves to meet these challenges by developing environmental management systems.

  1. Molecular Science Computing Facility Scientific Challenges: Linking Across Scales

    Energy Technology Data Exchange (ETDEWEB)

    De Jong, Wibe A.; Windus, Theresa L.

    2005-07-01

    The purpose of this document is to define the evolving science drivers for performing environmental molecular research at the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL) and to provide guidance associated with the next-generation high-performance computing center that must be developed at EMSL's Molecular Science Computing Facility (MSCF) in order to address this critical research. The MSCF is the pre-eminent computing facility, supported by the U.S. Department of Energy's (DOE's) Office of Biological and Environmental Research (BER), tailored to provide the fastest time-to-solution for current computational challenges in chemistry and biology, as well as providing the means for broad research in the molecular and environmental sciences. The MSCF provides integral resources and expertise to emerging EMSL Scientific Grand Challenges and Collaborative Access Teams that are designed to leverage the multiple integrated research capabilities of EMSL, thereby creating a synergy between computation and experiment to address environmental molecular science challenges critical to DOE and the nation.

  2. Managing globally distributed expertise with new competence management solutions a big-science collaboration as a pilot case.

    CERN Document Server

    Ferguson, J; Livan, M; Nordberg, M; Salmia, T; Vuola, O

    2003-01-01

    In today's global organisations and networks, a critical factor for effective innovation and project execution is appropriate competence and skills management. The challenges include selection of strategic competences, competence development, and leveraging the competences and skills to drive innovation and collaboration for shared goals. This paper presents a new industrial web-enabled competence management and networking solution and its implementation and piloting in a complex big-science environment of globally distributed competences.

  3. Science priorities for seamounts: research links to conservation and management.

    Directory of Open Access Journals (Sweden)

    Malcolm R Clark

    Full Text Available Seamounts shape the topography of all ocean basins and can be hotspots of biological activity in the deep sea. The Census of Marine Life on Seamounts (CenSeam) was a field program that examined seamounts as part of the global Census of Marine Life (CoML) initiative from 2005 to 2010. CenSeam progressed seamount science by collating historical data, collecting new data, undertaking regional and global analyses of seamount biodiversity, mapping species and habitat distributions, challenging established paradigms of seamount ecology, developing new hypotheses, and documenting the impacts of human activities on seamounts. However, because of the large number of seamounts globally, much about the structure, function and connectivity of seamount ecosystems remains unexplored and unknown. Continual, and potentially increasing, threats to seamount resources from fishing and seabed mining are creating a pressing demand for research to inform conservation and management strategies. To meet this need, intensive science effort in the following areas will be needed: 1) Improved physical and biological data; of particular importance is information on seamount location, physical characteristics (e.g. habitat heterogeneity and complexity), more complete and intensive biodiversity inventories, and increased understanding of seamount connectivity and faunal dispersal; 2) New human impact data; these shall encompass better studies on the effects of human activities on seamount ecosystems, as well as monitoring long-term changes in seamount assemblages following impacts (e.g. recovery); 3) Global data repositories; there is a pressing need for more comprehensive fisheries catch and effort data, especially on the high seas, and compilation or maintenance of geological and biodiversity databases that underpin regional and global analyses; 4) Application of support tools in a data-poor environment; conservation and management will have to increasingly rely on predictive

  4. Science priorities for seamounts: research links to conservation and management.

    Science.gov (United States)

    Clark, Malcolm R; Schlacher, Thomas A; Rowden, Ashley A; Stocks, Karen I; Consalvey, Mireille

    2012-01-01

    Seamounts shape the topography of all ocean basins and can be hotspots of biological activity in the deep sea. The Census of Marine Life on Seamounts (CenSeam) was a field program that examined seamounts as part of the global Census of Marine Life (CoML) initiative from 2005 to 2010. CenSeam progressed seamount science by collating historical data, collecting new data, undertaking regional and global analyses of seamount biodiversity, mapping species and habitat distributions, challenging established paradigms of seamount ecology, developing new hypotheses, and documenting the impacts of human activities on seamounts. However, because of the large number of seamounts globally, much about the structure, function and connectivity of seamount ecosystems remains unexplored and unknown. Continual, and potentially increasing, threats to seamount resources from fishing and seabed mining are creating a pressing demand for research to inform conservation and management strategies. To meet this need, intensive science effort in the following areas will be needed: 1) Improved physical and biological data; of particular importance is information on seamount location, physical characteristics (e.g. habitat heterogeneity and complexity), more complete and intensive biodiversity inventories, and increased understanding of seamount connectivity and faunal dispersal; 2) New human impact data; these shall encompass better studies on the effects of human activities on seamount ecosystems, as well as monitoring long-term changes in seamount assemblages following impacts (e.g. recovery); 3) Global data repositories; there is a pressing need for more comprehensive fisheries catch and effort data, especially on the high seas, and compilation or maintenance of geological and biodiversity databases that underpin regional and global analyses; 4) Application of support tools in a data-poor environment; conservation and management will have to increasingly rely on predictive modelling

  6. Value-added Data Services at the Goddard Earth Sciences Data and Information Services Center

    Science.gov (United States)

    Leptoukh, G. G.; Alcott, G. T.; Kempler, S. J.; Lynnes, C. S.; Vollmer, B. E.

    2004-05-01

    The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC), in addition to serving the Earth Science community as one of the major Distributed Active Archive Centers (DAACs), provides much more than just data. Among the value-added services available to general users are subsetting data spatially and/or by parameter, online analysis (to avoid unnecessarily downloading all the data), and assistance in obtaining data from other centers. Services available to data producers and high-volume users include consulting on building new products with standard formats and metadata and construction of data management systems. A particularly useful service is data processing at the DISC (i.e., close to the input data) with the users' algorithms. This can take a number of different forms: as a configuration-managed algorithm within the main processing stream; as a stand-alone program next to the on-line data storage; as build-it-yourself code within the Near-Archive Data Mining (NADM) system; or as an on-the-fly analysis with simple algorithms embedded into the web-based tools. Partnerships between the GES DISC and scientists, both producers and users, allow the scientists to concentrate on science, while the GES DISC handles the details of data management, e.g., formats, integration, and data processing. The existing data management infrastructure at the GES DISC supports a wide spectrum of options: from simple data support to sophisticated on-line analysis tools, producing economies of scale and rapid time-to-deploy. At the same time, such partnerships allow the GES DISC to serve the user community more efficiently and to better prioritize on-line holdings. Several examples of successful partnerships are described in the presentation.
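Spatial subsetting of the kind offered by the GES DISC can be illustrated, in much-simplified form, as selecting the grid cells whose coordinates fall inside a bounding box. The following is a toy sketch on a regular latitude/longitude grid (not GES DISC code; names and signature are invented for illustration):

```python
def subset_grid(grid, lats, lons, lat_range, lon_range):
    """Extract the sub-grid whose cell coordinates fall inside the bounding box.

    grid      -- 2-D list of values, indexed as grid[lat_index][lon_index]
    lats/lons -- 1-D coordinate lists matching the grid axes
    lat_range, lon_range -- inclusive (min, max) bounds
    """
    rows = [i for i, la in enumerate(lats) if lat_range[0] <= la <= lat_range[1]]
    cols = [j for j, lo in enumerate(lons) if lon_range[0] <= lo <= lon_range[1]]
    return [[grid[i][j] for j in cols] for i in rows]
```

Performing this selection server-side is what makes the service valuable: only the boxed cells (and, with parameter subsetting, only the requested variables) ever cross the network.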

  7. The Glen Canyon Dam adaptive management program: progress and immediate challenges

    Science.gov (United States)

    Hamill, John F.; Melis, Theodore S.; Boon, Philip J.; Raven, Paul J.

    2012-01-01

    Adaptive management emerged as an important resource management strategy for major river systems in the United States (US) in the early 1990s. The Glen Canyon Dam Adaptive Management Program (‘the Program’) was formally established in 1997 to fulfill a statutory requirement in the 1992 Grand Canyon Protection Act (GCPA). The GCPA aimed to improve natural resource conditions in the Colorado River corridor in the Glen Canyon National Recreation Area and Grand Canyon National Park, Arizona that were affected by the Glen Canyon dam. The Program achieves this by using science and a variety of stakeholder perspectives to inform decisions about dam operations. Since the Program started, the ecosystem has become much better understood and several biological and physical improvements have been achieved. These improvements include: (i) an estimated 50% increase in the adult population of endangered humpback chub (Gila cypha) between 2001 and 2008, following previous decline; (ii) a 90% decrease in non-native rainbow trout (Oncorhynchus mykiss), which are known to compete with and prey on native fish, as a result of removal experiments; and (iii) the widespread reappearance of sandbars in response to an experimental high-flow release of dam water in March 2008. Although substantial progress has been made, the Program faces several immediate challenges. These include: (i) defining specific, measurable objectives and desired future conditions for important natural, cultural and recreational attributes to inform science and management decisions; (ii) implementing structural and operational changes to improve collaboration among stakeholders; (iii) establishing a long-term experimental programme and management plan; and (iv) securing long-term funding for monitoring programmes to assess ecosystem and other responses to management actions. Addressing these challenges and building on recent progress will require strong and consistent leadership from the US Department of the Interior

  8. Report on the first round of the Mock LISA Data Challenges

    International Nuclear Information System (INIS)

    Arnaud, K A; Auger, G; Babak, S

    2007-01-01

    The Mock LISA Data Challenges (MLDCs) have the dual purpose of fostering the development of LISA data analysis tools and capabilities, and demonstrating the technical readiness already achieved by the gravitational-wave community in distilling a rich science payoff from the LISA data output. The first round of MLDCs has just been completed: nine challenges consisting of data sets containing simulated gravitational-wave signals produced either by galactic binaries or massive black hole binaries embedded in simulated LISA instrumental noise were released in June 2006, with a deadline for submission of results at the beginning of December 2006. Ten groups participated in this first round of challenges. Every challenge had at least one entry that successfully characterized the signal to better than 95%, as assessed via correlation with phasing ambiguities accounted for. Here, we describe the challenges, summarize the results and provide a first critical assessment of the entries.
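A correlation "with phasing ambiguities accounted for" has a convenient closed form when the ambiguity is a constant phase offset: maximizing Re(e^{iφ}⟨d, h⟩) over φ simply yields |⟨d, h⟩|. A toy sketch of that normalized figure of merit for complex-valued series follows; it uses a plain (not noise-weighted) inner product, so it is far simpler than an actual LISA analysis, and the function name is invented for illustration:

```python
import math

def phase_maximized_correlation(data, template):
    """Normalized correlation of two equal-length complex series,
    maximized analytically over a constant phase offset:
    max over phi of Re(e^{i*phi} <d, h>) equals |<d, h>|."""
    inner = sum(d * h.conjugate() for d, h in zip(data, template))
    norm_d = math.sqrt(sum(abs(d) ** 2 for d in data))
    norm_h = math.sqrt(sum(abs(h) ** 2 for h in template))
    return abs(inner) / (norm_d * norm_h)  # assumes non-zero signals
```

By construction the result is 1.0 whenever the data equal the template up to an overall amplitude and constant phase, which is exactly the ambiguity being discounted.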

  9. Sustainable Materials Management (SMM) Electronics Challenge Data

    Data.gov (United States)

    U.S. Environmental Protection Agency — On September 22, 2012, EPA launched the SMM Electronics Challenge. The Challenge encourages electronics manufacturers, brand owners and retailers to strive to send...

  10. The challenge of archiving and preserving remotely sensed data

    Directory of Open Access Journals (Sweden)

    John L Faundeen

    2003-10-01

    Full Text Available Few would question the need to archive the scientific and technical (S&T) data generated by researchers. At a minimum, the data are needed for change analysis. Likewise, most people would value efforts to ensure the preservation of the archived S&T data. Future generations will use analysis techniques not even considered today. Until recently, archiving and preserving these data were usually accomplished within existing infrastructures and budgets. As the volume of archived data increases, however, organizations charged with archiving S&T data will be increasingly challenged (U.S. General Accounting Office, 2002). The U.S. Geological Survey has had experience in this area and has developed strategies to deal with the mountain of land remote sensing data currently being managed and the tidal wave of expected new data. The Agency has dealt with archiving issues, such as selection criteria, purging, advisory panels, and data access, and has met with preservation challenges involving photographic and digital media. That experience has allowed the USGS to develop management approaches, which this paper outlines.

  11. The challenge of archiving and preserving remotely sensed data

    Science.gov (United States)

    Faundeen, John L.

    2003-01-01

    Few would question the need to archive the scientific and technical (S&T) data generated by researchers. At a minimum, the data are needed for change analysis. Likewise, most people would value efforts to ensure the preservation of the archived S&T data. Future generations will use analysis techniques not even considered today. Until recently, archiving and preserving these data were usually accomplished within existing infrastructures and budgets. As the volume of archived data increases, however, organizations charged with archiving S&T data will be increasingly challenged (U.S. General Accounting Office, 2002). The U.S. Geological Survey has had experience in this area and has developed strategies to deal with the mountain of land remote sensing data currently being managed and the tidal wave of expected new data. The Agency has dealt with archiving issues, such as selection criteria, purging, advisory panels, and data access, and has met with preservation challenges involving photographic and digital media. That experience has allowed the USGS to develop management approaches, which this paper outlines.
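A core operation behind the preservation strategies these papers describe is periodic fixity checking: record a cryptographic digest for every archived file at ingest, then re-verify the holdings against those digests over time to catch media decay or accidental modification. A minimal sketch (function names invented for illustration):

```python
import hashlib

def _sha256(path, chunk_size=65536):
    """Chunked SHA-256 so multi-gigabyte holdings are never fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(paths):
    """Map each archived file path to its digest, recorded at ingest time."""
    return {p: _sha256(p) for p in paths}

def verify_manifest(manifest):
    """Return the paths whose current digest no longer matches the manifest."""
    return [p for p, digest in manifest.items() if _sha256(p) != digest]
```

In practice the manifest itself must be stored redundantly (and migrated along with the data), since a lost manifest makes silent corruption undetectable.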

  12. Collaborative Development of e-Infrastructures and Data Management Practices for Global Change Research

    Science.gov (United States)

    Samors, R. J.; Allison, M. L.

    2016-12-01

    An e-infrastructure that supports data-intensive, multidisciplinary research is being organized under the auspices of the Belmont Forum consortium of national science funding agencies to accelerate the pace of science to address 21st century global change research challenges. The pace and breadth of change in information management across the data lifecycle means that no one country or institution can unilaterally provide the leadership and resources required to use data and information effectively, or needed to support a coordinated, global e-infrastructure. The five action themes adopted by the Belmont Forum are: 1. Adopt and make enforceable Data Principles that establish a global, interoperable e-infrastructure. 2. Foster communication, collaboration and coordination between the wider research community and Belmont Forum and its projects through an e-Infrastructure Coordination, Communication, & Collaboration Office. 3. Promote effective data planning and stewardship in all Belmont Forum agency-funded research with a goal to make it enforceable. 4. Determine international and community best practice to inform Belmont Forum research e-infrastructure policy through identification and analysis of cross-disciplinary research case studies. 5. Support the development of a cross-disciplinary training curriculum to expand human capacity in technology and data-intensive analysis methods. The Belmont Forum is ideally poised to play a vital and transformative leadership role in establishing a sustained human and technical international data e-infrastructure to support global change research. In 2016, members of the 23-nation Belmont Forum began a collaborative implementation phase. Four multi-national teams are undertaking Action Themes based on the recommendations above. Tasks include mapping the landscape, identifying and documenting existing data management plans, and scheduling a series of workshops that analyse trans-disciplinary applications of existing Belmont Forum

  13. Partnering for science: proceedings of the USGS Workshop on Citizen Science

    Science.gov (United States)

    Hines, Megan; Benson, Abigail; Govoni, David; Masaki, Derek; Poore, Barbara; Simpson, Annie; Tessler, Steven

    2013-01-01

    What U.S. Geological Survey (USGS) programs use citizen science? How can projects be best designed while meeting policy requirements? What are the most effective volunteer recruitment methods? What data should be collected to ensure validation and how should data be stored? What standard protocols are most easily used by volunteers? Can data from multiple projects be integrated to support new research or existing science questions? To help answer these and other questions, the USGS Community of Data Integration (CDI) supported the development of the Citizen Science Working Group (CSWG) in August 2011 and funded the working group’s proposal to hold a USGS Citizen Science Workshop in fiscal year 2012. The stated goals for our workshop were: raise awareness of programs and projects in the USGS that incorporate citizen science, create a community of practice for the sharing of knowledge and experiences, provide a forum to discuss the challenges of, and opportunities for, incorporating citizen science into USGS projects, and educate and support scientists and managers whose projects may benefit from public participation in science. To meet these goals, the workshop brought together 50 attendees (see appendix A for participant details) representing the USGS, partners, and external citizen science practitioners from diverse backgrounds (including scientists, managers, project coordinators, and technical developers, for example) to discuss these topics at the Denver Federal Center in Colorado on September 11–12, 2012. Over two and a half days, attendees participated in four major plenary sessions (Citizen Science Policy and Challenges, Engaging the Public in Scientific Research, Data Collection and Management, and Technology and Tools) comprising 25 invited presentations, each followed by structured discussions designed to address both prepared and ad hoc "big questions." A number of important community support and infrastructure needs were identified

  14. Managing Science: Management for R&D Laboratories

    Science.gov (United States)

    Gelès, Claude; Lindecker, Gilles; Month, Mel; Roche, Christian

    1999-10-01

    A unique "how-to" manual for the management of scientific laboratories This book presents a complete set of tools for the management of research and development laboratories and projects. With an emphasis on knowledge rather than profit as a measure of output and performance, the authors apply standard management principles and techniques to the needs of high-flux, open-ended, separately funded science and technology enterprises. They also propose the novel idea that failure, and incipient failure, is an important measure of an organization's potential. From the management of complex, round-the-clock, high-tech operations to strategies for long-term planning, Managing Science: Management for R&D Laboratories discusses how to build projects with the proper research and development, obtain and account for funding, and deal with rapidly changing technologies, facilities, and trends. The entire second part of the book is devoted to personnel issues and the impact of workplace behavior on the various functions of a knowledge-based organization. Drawing on four decades of involvement with the management of scientific laboratories, the authors thoroughly illustrate their philosophy with real-world examples from the physics field and provide tables and charts. Managers of scientific laboratories as well as scientists and engineers expecting to move into management will find Managing Science: Management for R&D Laboratories an invaluable practical guide.

  15. Research Data Management Training for Geographers: First Impressions

    Directory of Open Access Journals (Sweden)

    Kerstin Helbig

    2016-03-01

    Full Text Available Sharing and secondary analysis of data have become increasingly important for research. Especially in geography, the collection of digital data has grown due to technological changes. Responsible handling and proper documentation of research data have therefore become essential for funders, publishers and higher education institutions. To achieve this goal, universities offer support and training in research data management. This article presents the experiences from a pilot workshop in research data management, designed especially for geographers. A discipline-specific approach to research data management training is recommended. The focus of this approach increases researchers’ interest and allows for more specific guidance. The instructors identified problems and challenges of research data management for geographers. With regard to training, communicating the benefits and reaching the target groups seem to be the biggest challenges. Consequently, better incentive structures as well as communication channels have to be established.

  16. Establishing ecological and social continuities: new challenges to optimize urban watershed management

    Directory of Open Access Journals (Sweden)

    V. Mitroi

    2014-09-01

    Full Text Available The (re)construction of ecological continuity is stated as one of the main objectives of the European Water Framework Directive for watershed management in Europe. Analysing the social, political, technical and scientific processes characterising the implementation of different projects of ecological continuity in two adjacent peri-urban territories in Ile-de-France, we observed science-driven approaches disregarding the social contexts. We show that, in urbanized areas, ecological continuity requires not only important technical and ecological expertise, but also social and political participation in the definition of a common vision and action plan. As this is a challenge for both technical water management institutions and “classical” ecological policies, we propose some social science contributions to deal with ecological unpredictability and to reconsider stakeholder resistance to this kind of project.

  17. Hanford Site Cleanup Challenges and Opportunities for Science and Technology - A Strategic Assessment

    International Nuclear Information System (INIS)

    Johnson, W.; Reichmuth, B.; Wood, T.; Glasper, M.; Hanson, J.

    2002-01-01

    In November 2000, the U.S. Department of Energy (DOE) Richland Operations Office (RL) initiated an effort to produce a single, strategic perspective of RL Site closure challenges and potential Science and Technology (S and T) opportunities. This assessment was requested by DOE Headquarters (HQ), Office of Science and Technology, EM-50, as a means to provide a site level perspective on S and T priorities in the context of the Hanford 2012 Vision. The objectives were to evaluate the entire cleanup lifecycle (estimated at over $24 billion through 2046), to identify where the greatest uncertainties exist, and where investments in S and T can provide the maximum benefit. The assessment identified and described the eleven strategic closure challenges associated with the cleanup of the Hanford Site. The assessment was completed in the spring of 2001 and provided to DOE-HQ and the Hanford Site Technology Coordination Group (STCG) for review and input. It is the first step in developing a Site-level S and T strategy for RL. To realize the full benefits of this assessment, RL and Site contractors will work with the Hanford STCG to ensure: identified challenges and opportunities are reflected in project baselines; detailed S and T program-level road maps reflecting both near- and long-term investments are prepared using this assessment as a starting point; and integrated S and T priorities are incorporated into Environmental Management (EM) Focus Areas, Environmental Management Science Program (EMSP) and other research and development (R and D) programs to meet near-term and longer-range challenges. Hanford is now poised to begin the detailed planning and road mapping necessary to ensure that the integrated Site level S and T priorities are incorporated into the national DOE S and T program and formally incorporated into the relevant project baselines. DOE-HQ's response to this effort has been very positive and similar efforts are likely to be undertaken at other sites

  18. Strategies to address management challenges in larger intensive care units.

    Science.gov (United States)

    Matlakala, M C; Bezuidenhout, M C; Botha, A D H

    2015-10-01

    To illustrate the need for and suggest strategies that will enhance sustainable management of a large intensive care unit (ICU). The challenges faced by intensive care nursing in South Africa are well documented. However, there appear to be no strategies available to assist nurses to manage large ICUs or for ICU managers to deal with problems as they arise. Data sources to illustrate the need for strategies were challenges described by ICU managers in the management of large ICUs. A purposive sample of managers was included in individual interviews during compilation of evidence regarding the challenges experienced in the management of large ICUs. The challenges were presented at the Critical Care Society of Southern Africa Congress held on 28 August to 2 September 2012 in Sun City North-West province, South Africa. Five strategies are suggested for the challenges identified: divide the units into sections; develop a highly skilled and effective nursing workforce to ensure delivery of quality nursing care; create a culture to retain an effective ICU nursing team; manage assets; and determine the needs of ICU nurses. ICUs need measures to drive the desired strategies into actions to continuously improve the management of the unit. Future research should be aimed at investigating the effectiveness of the strategies identified. This research highlights issues relating to large ICUs and the strategies will assist ICU managers to deal with problems related to large unit sizes, shortage of trained ICU nurses, use of agency nurses, shortage of equipment and supplies and stressors in the ICU. The article will make a contribution to the body of nursing literature on management of ICUs. © 2014 John Wiley & Sons Ltd.

  19. Coding the Biodigital Child: The Biopolitics and Pedagogic Strategies of Educational Data Science

    Science.gov (United States)

    Williamson, Ben

    2016-01-01

    Educational data science is an emerging transdisciplinary field formed from an amalgamation of data science and elements of biological, psychological and neuroscientific knowledge about learning, or learning science. This article conceptualises educational data science as a biopolitical strategy focused on the evaluation and management of the…

  20. 8th International Conference on Management Science and Engineering Management

    CERN Document Server

    Cruz-Machado, Virgílio; Lev, Benjamin; Nickel, Stefan

    2014-01-01

    This is the Proceedings of the Eighth International Conference on Management Science and Engineering Management (ICMSEM) held from July 25 to 27, 2014 at Universidade Nova de Lisboa, Lisbon, Portugal and organized by International Society of Management Science and Engineering Management (ISMSEM), Sichuan University (Chengdu, China) and Universidade Nova de Lisboa (Lisbon, Portugal). The goals of the conference are to foster international research collaborations in Management Science and Engineering Management as well as to provide a forum to present current findings. A total number of 138 papers from 14 countries are selected for the proceedings by the conference scientific committee through rigorous referee review. The selected papers in the first volume are focused on Intelligent System and Management Science covering areas of Intelligent Systems, Decision Support Systems, Manufacturing and Supply Chain Management.

  1. The challenge of managing laboratory information in a managed care environment.

    Science.gov (United States)

    Friedman, B A

    1996-04-01

    This article considers some of the major changes that are occurring in pathology and pathology informatics in response to the shift to managed care in the United States. To better understand the relationship between information management in clinical laboratories and managed care, a typology of integrated delivery systems is presented. Following this is a discussion of the evolutionary trajectory for the computer networks that serve these large consolidated healthcare delivery organizations. The most complex of these computer networks is a community health information network. Participation in the planning and deployment of community health information networks will be important for pathologists because information management within pathology will be inexorably integrated into the larger effort by integrated delivery systems to share clinical, financial, and administrative data on a regional basis. Finally, four laboratory information management challenges under managed care are discussed, accompanied by possible approaches to each of them. The challenges presented are (1) organizational integration of departmental information systems such as the laboratory information system; (2) weakening of the best-of-breed approach to laboratory information system selection; (3) the shift away from the centralized laboratory paradigm; and (4) the development of rule-based systems to monitor and control laboratory utilization.
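    The fourth challenge above, rule-based monitoring of laboratory utilization, lends itself to a small illustration. The sketch below is hypothetical: the two rules, the test names, and the order/history record shapes are invented for the example, not taken from the article.

```python
"""Toy rule-based review of laboratory test orders (illustrative only)."""
from datetime import date, timedelta

def duplicate_rule(order, history):
    """Flag a repeat of the same test within 24 hours."""
    for prior in history:
        if prior["test"] == order["test"] and \
           abs(order["date"] - prior["date"]) < timedelta(days=1):
            return "duplicate within 24h"
    return None

def panel_rule(order, history):
    """Flag an expensive panel ordered without the cheaper screen first."""
    if order["test"] == "thyroid_panel" and \
       not any(p["test"] == "tsh_screen" for p in history):
        return "order TSH screen before full panel"
    return None

RULES = [duplicate_rule, panel_rule]

def review(order, history):
    """Run every rule; collect the messages of those that fire."""
    return [msg for rule in RULES if (msg := rule(order, history)) is not None]

history = [{"test": "cbc", "date": date(2024, 5, 1)}]
print(review({"test": "cbc", "date": date(2024, 5, 1)}, history))
# -> ['duplicate within 24h']
```

    New rules are added by appending plain functions to RULES, which keeps utilization policy separate from the laboratory information system itself.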

  2. InfoSymbiotics/DDDAS - The power of Dynamic Data Driven Applications Systems for New Capabilities in Environmental -, Geo-, and Space- Sciences

    Science.gov (United States)

    Darema, F.

    2016-12-01

    InfoSymbiotics/DDDAS embodies the power of Dynamic Data Driven Applications Systems (DDDAS), a concept whereby an executing application model is dynamically integrated, in a feed-back loop, with the real-time data-acquisition and control components, as well as other data sources of the application system. Advanced capabilities can be created through such new computational approaches in modeling and simulations, and in instrumentation methods, including: enhancing the accuracy of the application model; speeding up the computation to allow faster and more comprehensive models of a system; and creating decision support systems with the accuracy of full-scale simulations. In addition, the notion of controlling instrumentation processes by the executing application results in more efficient management of application data and addresses the challenge of how to architect and dynamically manage large sets of heterogeneous sensors and controllers, an advance over the static and ad hoc ways of today; with DDDAS these sets of resources can be managed adaptively and in optimized ways. Large-Scale-Dynamic-Data encompasses the next wave of Big Data, namely dynamic data arising from ubiquitous sensing and control in engineered, natural, and societal systems, through multitudes of heterogeneous sensors and controllers instrumenting these systems, where opportunities and challenges at these "large scales" relate not only to data size but also to the heterogeneity in data, data collection modalities, fidelities, and timescales, ranging from real-time data to archival data. In tandem with this important dimension of dynamic data, there is an extended view of Big Computing, which includes the collective computing by networked assemblies of multitudes of sensors and controllers, ranging from the high end to the real time, seamlessly integrated and unified, and comprising Large-Scale-Big-Computing. InfoSymbiotics/DDDAS engenders transformative impact in many application domains.
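    The feed-back loop at the heart of DDDAS, measurements correcting an executing model while the model reconfigures the instruments, can be sketched in miniature. Everything below is an invented toy (the class names, the scalar-gain update, the rate-switching threshold), not code from any DDDAS system.

```python
"""Minimal sketch of a DDDAS-style feed-back loop (illustrative only)."""
import random

class Sensor:
    """Toy instrument: returns noisy readings of a hidden true value."""
    def __init__(self, truth=10.0):
        self.truth = truth
        self.rate = 1          # readings per cycle, steered by the model

    def read(self):
        return [self.truth + random.gauss(0, 1.0) for _ in range(self.rate)]

class Model:
    """Toy executing model: a running estimate plus a crude uncertainty."""
    def __init__(self):
        self.estimate = 0.0
        self.uncertainty = 5.0

    def assimilate(self, readings):
        # Feed-back half of the loop: measurements correct the model.
        for r in readings:
            gain = self.uncertainty / (self.uncertainty + 1.0)
            self.estimate += gain * (r - self.estimate)
            self.uncertainty *= (1.0 - gain)

    def steer(self, sensor):
        # Control half of the loop: the model reconfigures the instrument,
        # sampling densely only while its own uncertainty is still high.
        sensor.rate = 5 if self.uncertainty > 0.5 else 1

random.seed(0)
sensor, model = Sensor(), Model()
for _ in range(10):
    model.assimilate(sensor.read())
    model.steer(sensor)
print(round(model.estimate, 1))  # converges near the true value of 10.0
```

    The point of the loop is the second half: a conventional pipeline only assimilates, whereas here the executing model also decides how densely the instrument should sample next.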

  3. Let's Make Gender Diversity in Data Science a Priority Right from the Start.

    Science.gov (United States)

    Berman, Francine D; Bourne, Philip E

    2015-07-01

    The emergent field of data science is a critical driver for innovation in all sectors, a focus of tremendous workforce development, and an area of increasing importance within science, technology, engineering, and math (STEM). In all of its aspects, data science has the potential to narrow the gender gap and set a new bar for inclusion. To evolve data science in a way that promotes gender diversity, we must address two challenges: (1) how to increase the number of women acquiring skills and working in data science and (2) how to evolve organizations and professional cultures to better retain and advance women in data science. Everyone can contribute.

  4. Let's Make Gender Diversity in Data Science a Priority Right from the Start.

    Directory of Open Access Journals (Sweden)

    Francine D Berman

    2015-07-01

    Full Text Available The emergent field of data science is a critical driver for innovation in all sectors, a focus of tremendous workforce development, and an area of increasing importance within science, technology, engineering, and math (STEM). In all of its aspects, data science has the potential to narrow the gender gap and set a new bar for inclusion. To evolve data science in a way that promotes gender diversity, we must address two challenges: (1) how to increase the number of women acquiring skills and working in data science and (2) how to evolve organizations and professional cultures to better retain and advance women in data science. Everyone can contribute.

  5. Data Stewardship in the Ocean Sciences Needs to Include Physical Samples

    Science.gov (United States)

    Carter, M.; Lehnert, K.

    2016-02-01

    Across the Ocean Sciences, research involves the collection and study of samples collected above, at, and below the seafloor, including but not limited to rocks, sediments, fluids, gases, and living organisms. Many domains in the Earth Sciences have recently expressed the need for better discovery, access, and sharing of scientific samples and collections (EarthCube End-User Domain workshops, 2012 and 2013, http://earthcube.org/info/about/end-user-workshops), as has the US government (OSTP Memo, March 2014). iSamples (Internet of Samples in the Earth Sciences) is a Research Coordination Network within the EarthCube program that aims to advance the use of innovative cyberinfrastructure to support and advance the utility of physical samples and sample collections for science and ensure reproducibility of sample-based data and research results. iSamples strives to build, grow, and foster a new community of practice, in which domain scientists, curators of sample repositories and collections, computer and information scientists, software developers and technology innovators engage in and collaborate on defining, articulating, and addressing the needs and challenges of physical samples as a critical component of digital data infrastructure. A primary goal of iSamples is to deliver a community-endorsed set of best practices and standards for the registration, description, identification, and citation of physical specimens and define an actionable plan for implementation. iSamples conducted a broad community survey about sample sharing and has created 5 different working groups to address the different challenges of developing the internet of samples - from metadata schemas and unique identifiers to an architecture for a shared cyberinfrastructure to manage collections, to digitization of existing collections, to education, and ultimately to establishing the physical infrastructure that will ensure preservation and access of the physical samples. 
Repositories that curate

  6. Waste management in Greenland: current situation and challenges

    DEFF Research Database (Denmark)

    Eisted, Rasmus; Christensen, Thomas Højlund

    2011-01-01

    Waste management in Greenland (56 000 inhabitants) is characterized by landfilling, incineration and export to Denmark of small quantities of metals and hazardous waste. The annual amount of waste is estimated to about 50 000 tons but actual data are scarce. Data on the waste composition is basically lacking. The scattered small towns and settlements, the climate and the long transport distances between towns and also to recycling industries abroad constitute a complex situation with respect to waste management. The landfills have no collection of gas and leachate and the incinerators are small and equipped with only moderate flue gas cleaning technology. This report summarizes the current waste management situation in Greenland and identifies important challenges in improving the waste management.

  7. Big Data in Plant Science: Resources and Data Mining Tools for Plant Genomics and Proteomics.

    Science.gov (United States)

    Popescu, George V; Noutsos, Christos; Popescu, Sorina C

    2016-01-01

    In modern plant biology, progress is increasingly defined by the scientists' ability to gather and analyze data sets of high volume and complexity, otherwise known as "big data". Arguably, the largest increase in the volume of plant data sets over the last decade is a consequence of the application of the next-generation sequencing and mass-spectrometry technologies to the study of experimental model and crop plants. The increase in quantity and complexity of biological data brings challenges, mostly associated with data acquisition, processing, and sharing within the scientific community. Nonetheless, big data in plant science create unique opportunities in advancing our understanding of complex biological processes at a level of accuracy without precedence, and establish a base for the plant systems biology. In this chapter, we summarize the major drivers of big data in plant science and big data initiatives in life sciences with a focus on the scope and impact of iPlant, a representative cyberinfrastructure platform for plant science.

  8. Nuclear data for astrophysics: resources, challenges, strategies, and software solutions

    International Nuclear Information System (INIS)

    Smith, M.S.; Lingerfelt, E.J.; Nesaraja, C.D.; Raphael Hix, W.; Roberts, L.F.; Hiroyuki, Koura; Fuller, G.M.; Tytler, D.

    2008-01-01

    One of the most exciting utilizations of nuclear data is to help unlock the mysteries of the Cosmos: the creation of the chemical elements, the evolution and explosion of stars, and the origin and fate of the Universe. There are now many nuclear data sets, tools, and other resources online to help address these important questions. However, numerous serious challenges make it important to develop strategies now to ensure a sustainable future for this work. A number of strategies are advocated, including: enlisting additional manpower to evaluate the newest data; devising ways to streamline evaluation activities; and improving communication and coordination between existing efforts. Software projects are central to some of these strategies. Examples include: creating a virtual "pipeline" leading from the nuclear laboratory to astrophysics simulations; improving data visualization and management to get the most science out of the existing datasets; and creating a nuclear astrophysics data virtual (online) community. Recent examples will be detailed, including the development of two first-generation software pipelines, the Computational Infrastructure for Nuclear Astrophysics for stellar astrophysics and the Bigbangonline suite of codes for cosmology, and the coupling of nuclear data to sensitivity studies with astrophysical simulation codes to guide future research. (authors)

  9. Challenges of Virtual and Open Distance Science Teacher Education in Zimbabwe

    Science.gov (United States)

    Mpofu, Vongai; Samukange, Tendai; Kusure, Lovemore M.; Zinyandu, Tinoidzwa M.; Denhere, Clever; Huggins, Nyakotyo; Wiseman, Chingombe; Ndlovu, Shakespear; Chiveya, Renias; Matavire, Monica; Mukavhi, Leckson; Gwizangwe, Isaac; Magombe, Elliot; Magomelo, Munyaradzi; Sithole, Fungai; Bindura University of Science Education (BUSE),

    2012-01-01

    This paper reports on a study of the implementation of science teacher education through virtual and open distance learning in the Mashonaland Central Province, Zimbabwe. The study provides insight into challenges faced by students and lecturers on inception of the program at four centres. Data was collected from completed evaluation survey forms…

  10. Knowledge Management Challenges For Global Business

    OpenAIRE

    Veli Denizhan Kalkan

    2011-01-01

    Managing organizational knowledge effectively is a prerequisite for securing competitive advantages in the global marketplace. The field of knowledge management brings out important challenges for global business practices. Based on a comprehensive academic and popular literature review, this paper identifies six main knowledge management challenges faced by global business today. These are developing a working definition of knowledge, dealing with tacit knowledge and utilization of informati...

  11. Smarter Earth Science Data System

    Science.gov (United States)

    Huang, Thomas

    2013-01-01

    The explosive growth in Earth observational data over the recent decade demands a better method of interoperability across heterogeneous systems. The Earth science data system community has mastered the art of storing large volumes of observational data, but it is still unclear how this traditional method scales over time as we enter the age of Big Data. Indexed search solutions such as Apache Solr (Smiley and Pugh, 2011) provide fast, scalable search via keywords or phrases, without any reasoning or inference. Modern search solutions such as Google's Knowledge Graph (Singhal, 2012) and Microsoft Bing all utilize semantic reasoning to improve their accuracy in searches. The Earth science user community is demanding an intelligent solution to help them find the right data for their research. The Ontological System for Context Artifacts and Resources (OSCAR) (Huang et al., 2012) was created in response to the DARPA Adaptive Vehicle Make (AVM) program's need for an intelligent context models management system to empower its terrain simulation subsystem. The core component of OSCAR, the Environmental Context Ontology (ECO), is built using the Semantic Web for Earth and Environmental Terminology (SWEET) (Raskin and Pan, 2005). This paper presents the current data archival methodology within NASA Earth science data centers and discusses using the semantic web to improve the way we capture and serve data to our users.
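    The difference between keyword indexing and ontology-backed search that the abstract describes can be shown with a toy query expansion. The taxonomy fragment and dataset tags below are invented for illustration; a real system would draw classes from SWEET and query an RDF triple store rather than Python dictionaries.

```python
"""Toy ontology-backed query expansion (not OSCAR/SWEET code)."""

# Hypothetical taxonomy fragment: term -> narrower terms.
taxonomy = {
    "oceanPhenomena": ["elNino", "upwelling"],
    "elNino": [],
    "upwelling": [],
}

# Hypothetical dataset catalog: name -> tags.
datasets = {
    "sst_anomaly_1997": ["elNino", "seaSurfaceTemperature"],
    "coastal_nutrients": ["upwelling"],
    "rainfall_grid": ["precipitation"],
}

def expand(term):
    """All terms reachable from `term` via the narrower-than relation."""
    terms, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in terms:
            terms.add(t)
            stack.extend(taxonomy.get(t, []))
    return terms

def search(term):
    """Return datasets matching the query term or any narrower term."""
    wanted = expand(term)
    return sorted(name for name, tags in datasets.items()
                  if wanted.intersection(tags))

print(search("oceanPhenomena"))
# -> ['coastal_nutrients', 'sst_anomaly_1997']
```

    A plain keyword match on "oceanPhenomena" returns nothing, since no dataset carries that broad tag; expanding the query through the taxonomy recovers both subclass-tagged datasets.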

  12. Big data challenges

    DEFF Research Database (Denmark)

    Bachlechner, Daniel; Leimbach, Timo

    2016-01-01

    Although reports on big data success stories have been accumulating in the media, most organizations dealing with high-volume, high-velocity and high-variety information assets still face challenges. Only a thorough understanding of these challenges puts organizations into a position in which they can make an informed decision for or against big data, and, if the decision is positive, overcome the challenges smoothly. The combination of a series of interviews with leading experts from enterprises, associations and research institutions, and focused literature reviews allowed not only ... framework are also relevant. For large enterprises and startups specialized in big data, it is typically easier to overcome the challenges than it is for other enterprises and public administration bodies.

  13. 300 Area Integrated Field-Scale Subsurface Research Challenge (IFRC) Field Site Management Plan

    Energy Technology Data Exchange (ETDEWEB)

    Freshley, Mark D.

    2008-12-31

    Pacific Northwest National Laboratory (PNNL) has established the 300 Area Integrated Field-Scale Subsurface Research Challenge (300 Area IFRC) on the Hanford Site in southeastern Washington State for the U.S. Department of Energy’s (DOE) Office of Biological and Environmental Research (BER) within the Office of Science. The project is funded by the Environmental Remediation Sciences Division (ERSD). The purpose of the project is to conduct research at the 300 IFRC to investigate multi-scale mass transfer processes associated with a subsurface uranium plume impacting both the vadose zone and groundwater. The management approach for the 300 Area IFRC requires that a Field Site Management Plan be developed. This is an update of the plan to reflect the installation of the well network and other changes.

  14. Mash-up of techniques between data crawling/transfer, data preservation/stewardship and data processing/visualization technologies on a science cloud system designed for Earth and space science: a report of successful operation and science projects of the NICT Science Cloud

    Science.gov (United States)

    Murata, K. T.

    2014-12-01

    Data-intensive or data-centric science is the 4th paradigm, after observational and/or experimental science (1st paradigm), theoretical science (2nd paradigm) and numerical science (3rd paradigm). A science cloud is an infrastructure for this 4th methodology. The NICT Science Cloud is designed for big data sciences of Earth, space and other fields based on modern informatics and information technologies [1]. Data flow on the cloud relies on the following three techniques: (1) data crawling and transfer, (2) data preservation and stewardship, and (3) data processing and visualization. Original tools and applications for these techniques have been designed and implemented. We mash up these tools and applications on the NICT Science Cloud to build customized systems for each project. In this paper, we discuss science data processing through these three steps. For big data science, data file deployment on a distributed storage system should be well designed in order to save storage cost and transfer time. We developed a high-bandwidth virtual remote storage system (HbVRS) and data crawling tools, NICTY/DLA and the Wide-area Observation Network Monitoring (WONM) system, respectively. Data files are saved on the cloud storage system according to both the data preservation policy and the data processing plan. The storage system is built on distributed file system middleware (Gfarm: GRID datafarm). It is effective since disaster recovery (DR) and parallel data processing are carried out simultaneously, without moving these big data from storage to storage. Data files are managed via our Web application, WSDBank (World Science Data Bank). The big data on the cloud are processed via Pwrake, a workflow tool with high I/O bandwidth. There are several visualization tools on the cloud: VirtualAurora for the magnetosphere and ionosphere, VDVGE for Google Earth, STICKER for urban environment data and STARStouch for multi-disciplinary data.
There are 30 projects running on the NICT

  15. Challenges of Women in Science: Bangladesh Perspectives

    Indian Academy of Sciences (India)

    ranjeetha

    Presentation excerpts: Director, Bose Centre for Advanced Study and Research in Natural Sciences; enrolment in universities by management and gender; encouragement in the classroom, family and environment; desired strategy.

  16. Big Data Challenges

    Directory of Open Access Journals (Sweden)

    Alexandru Adrian TOLE

    2013-10-01

    Full Text Available The amount of data that is traveling across the internet today is not only large, but complex as well. Companies, institutions, the healthcare system, etc. all use piles of data which are further used for creating reports in order to ensure continuity regarding the services that they have to offer. The process behind the results that these entities request represents a challenge for software developers and the companies that provide IT infrastructure. The challenge is how to manipulate an impressive volume of data that has to be securely delivered through the internet and reach its destination intact. This paper treats the challenges that Big Data creates.

  17. The Brazilian Science Data Center (BSDC)

    Science.gov (United States)

    de Almeida, Ulisses Barres; Bodmann, Benno; Giommi, Paolo; Brandt, Carlos H.

    Astrophysics and Space Science are becoming increasingly characterised by what is now known as “big data”, with the bottlenecks for progress partly shifting from data acquisition to “data mining”. The truth is that the amount and rate of data accumulation in many fields already surpasses local capabilities for processing and exploitation, and the efficient conversion of scientific data into knowledge is everywhere a challenge. The result is that, to a large extent, isolated data archives risk being progressively likened to “data graveyards”, where the information stored is not reused for scientific work. Responsible and efficient use of these large datasets means democratising access and extracting the most science possible from them, which in turn means improving data accessibility and integration. Improving data processing capabilities is another important issue specific to researchers and computer scientists of each field. The project presented here wishes to exploit the enormous potential opened up by the information technology of our age to advance a model for a science data center in astronomy that aims to expand data accessibility and integration to the largest possible extent and with the greatest efficiency for scientific and educational use. Greater access to data means more people producing and benefiting from information, whereas larger integration of related data from different origins means greater research potential and increased scientific impact. The project of the BSDC is concerned, primarily, with providing tools and solutions for the Brazilian astronomical community. It nevertheless capitalizes on extensive international experience, and is developed in full cooperation with the ASI Science Data Center (ASDC) of the Italian Space Agency, granting it an essential ingredient of internationalisation. The BSDC is Virtual Observatory-compliant and part of the “Open Universe”, a global initiative built under the auspices of the

  18. Facing tomorrow's challenges: U.S. Geological Survey science in the decade 2007-2017

    Science.gov (United States)

    ,

    2007-01-01

    In order for the U.S. Geological Survey (USGS) to respond to evolving national and global priorities, it must periodically reflect on, and optimize, its strategic directions. This report is the first comprehensive science strategy since the early 1990s to examine critically major USGS science goals and priorities. The development of this science strategy comes at a time of global trends and rapidly evolving societal needs that pose important natural-science challenges. The emergence of a global economy affects the demand for all resources. The last decade has witnessed the emergence of a new model for managing Federal lands: ecosystem-based management. The U.S. Climate Change Science Program predicts that the next few decades will see rapid changes in the Nation's and the Earth's environment. Finally, the natural environment continues to pose risks to society in the form of volcanoes, earthquakes, wildland fires, floods, droughts, invasive species, variable and changing climate, and natural and anthropogenic toxins, as well as animal-borne diseases that affect humans. The use of, and competition for, natural resources on the global scale, and natural threats to those resources, have the potential to impact the Nation's ability to sustain its economy, national security, quality of life, and natural environment. Responding to these national priorities and global trends requires a science strategy that not only builds on existing USGS strengths and partnerships but also demands the innovation made possible by integrating the full breadth and depth of USGS capabilities. The USGS chooses to go forward in the science directions proposed here because the societal issues addressed by these science directions represent major challenges for the Nation's future and for the stewards of Federal lands, both onshore and offshore. The six science directions proposed in this science strategy are listed as follows.
The ecosystems strategy is listed first because it has a dual nature

  19. DIII-D DATA MANAGEMENT

    International Nuclear Information System (INIS)

    McHARG, B.B; BURUSS, J.R. Jr.; FREEMAN, J.; PARKER, C.T.; SCHACHTER, J.; SCHISSEL, D.P.

    2001-08-01

    OAK-B135 The DIII-D tokamak at the DIII-D National Fusion Facility routinely acquires ∼ 500 Megabytes of raw data per pulse of the experiment through a centralized data management system. It is expected that in FY01, nearly one Terabyte of data will be acquired. In addition, there are several diagnostics that are not part of the centralized system which acquire hundreds of megabytes of raw data per pulse. There is also a growing suite of codes running between pulses that produce analyzed data, adding ∼ 10 Megabytes per pulse with total disk usage of about 100 Gigabytes. A relational database system has been introduced which further adds to the overall data load. In recent years there has been an order of magnitude increase in magnetic disk space devoted to raw data, and a Hierarchical Storage Management (HSM) system was implemented to allow 24/7 unattended access to raw data. The management of all of this data is a significant and growing challenge, as the quantities of both raw and analyzed data are expected to continue to increase in the future. This paper will examine the experience gained with the approaches taken in managing the data and plans for the continued growth of the data quantity
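The per-pulse figures quoted above lend themselves to simple capacity arithmetic. The pulses-per-year value in the sketch below is an illustrative assumption, not a DIII-D operating statistic.

```python
# Back-of-envelope storage growth for the per-pulse figures quoted above:
# ~500 MB raw plus ~10 MB analyzed data per pulse. The pulse count (2000/year)
# is a hypothetical illustrative value.
RAW_MB_PER_PULSE = 500
ANALYZED_MB_PER_PULSE = 10
PULSES_PER_YEAR = 2000  # assumed, for illustration only

def yearly_growth_tb(raw_mb, analyzed_mb, pulses):
    """Total data accumulated in one year, in terabytes (1 TB = 1e6 MB)."""
    return (raw_mb + analyzed_mb) * pulses / 1e6

growth = yearly_growth_tb(RAW_MB_PER_PULSE, ANALYZED_MB_PER_PULSE, PULSES_PER_YEAR)
# → 1.02 TB/year under these assumptions, of the same order as the
# "nearly one Terabyte" the abstract expects for FY01
```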

  20. The AmeriFlux data activity and data system: an evolving collection of data management techniques, tools, products and services

    Science.gov (United States)

    Boden, T. A.; Krassovski, M.; Yang, B.

    2013-06-01

    The Carbon Dioxide Information Analysis Center (CDIAC) at Oak Ridge National Laboratory (ORNL), USA has provided scientific data management support for the US Department of Energy and international climate change science since 1982. Among the many data archived and available from CDIAC are collections from long-term measurement projects. One current example is the AmeriFlux measurement network. AmeriFlux provides continuous measurements from forests, grasslands, wetlands, and croplands in North, Central, and South America and offers important insight about carbon cycling in terrestrial ecosystems. To successfully manage AmeriFlux data and support climate change research, CDIAC has designed flexible data systems using proven technologies and standards blended with new, evolving technologies and standards. The AmeriFlux data system, composed primarily of a relational database, a PHP-based data interface and an FTP server, offers a broad suite of AmeriFlux data. The data interface allows users to query the AmeriFlux collection in a variety of ways and then subset, visualize and download the data. From the perspective of data stewardship, on the other hand, this system is designed for CDIAC to easily control database content, automate data movement, track data provenance, manage metadata content, and handle frequent additions and corrections. CDIAC and researchers in the flux community developed data submission guidelines to enhance the AmeriFlux data collection, enable automated data processing, and promote standardization across regional networks. Both continuous flux and meteorological data and irregular biological data collected at AmeriFlux sites are carefully scrutinized by CDIAC using established quality-control algorithms before the data are ingested into the AmeriFlux data system. Other tasks at CDIAC include reformatting and standardizing the diverse and heterogeneous datasets received from individual sites into a uniform and consistent network database
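The relational back end and query interface described above can be sketched with a toy schema. SQLite and the table and column names here are stand-ins, not the actual CDIAC/AmeriFlux database.

```python
# Minimal sketch of a flux-data relational store with quality-control flags,
# in the spirit of the AmeriFlux system described above. Schema, site code
# and values are hypothetical; SQLite stands in for the real database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE flux (
    site TEXT, ts TEXT, co2_flux REAL, qc_flag INTEGER)""")
rows = [
    ("US-Ha1", "2013-06-01T00:00", -2.1, 0),
    ("US-Ha1", "2013-06-01T00:30", -1.8, 0),
    ("US-Ha1", "2013-06-01T01:00",  9.9, 1),  # flagged by quality control
]
conn.executemany("INSERT INTO flux VALUES (?, ?, ?, ?)", rows)

# A user-style query: subset one site's quality-controlled flux values,
# analogous to the subset-and-download interface the abstract mentions.
good = conn.execute(
    "SELECT ts, co2_flux FROM flux WHERE site = ? AND qc_flag = 0",
    ("US-Ha1",)).fetchall()
```

The `qc_flag` column plays the role of the established quality-control step: flagged rows stay in the archive but are excluded from user-facing subsets.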

  1. Visions of the Future - the Changing Role of Actors in Data-Intensive Science

    Science.gov (United States)

    Schäfer, L.; Klump, J. F.

    2013-12-01

    Around the world, scientific disciplines are increasingly facing the challenge of a burgeoning volume of research data. This data avalanche consists of streams of information generated by sensors and scientific instruments, digital recordings and social-science surveys, or drawn from the World Wide Web. All areas of the scientific economy are affected by this rapid growth in data, from the logging of digs in Archaeology, to telescope observations of distant galaxies in Astrophysics, to data from polls and surveys in the Social Sciences. The challenge for science is not only to process the data through analysis, reduction and visualization, but also to set up infrastructures for provisioning and storing the data. The rise of new technologies and developments also poses new challenges for the actors in the area of research data infrastructures. Libraries, as one set of actors, enable access to digital media and support the publication of research data and its long-term archiving. Digital media and research data, however, introduce new aspects into the libraries' range of activities. How are we to imagine the library of the future? The library as an interface to the computer centers? Will library and computer center fuse into a new service unit? What role will scientific publishers play in future? Currently the traditional forms of publication still carry greater weight: articles for conferences and journals. But will this still be the case in future? New forms of publication are already making their presence felt. The tasks of the computer centers may also change. Yesterday their remit was the provisioning of fast hardware, whereas now everything revolves around the topic of data and services. Finally, what about the researchers themselves? Not so long ago, Geoscience was not necessarily seen as linked to Computer Science. Nowadays, modern Geoscience relies heavily on IT and its techniques.
Thus, in how far will the profile of the modern geoscientist change

  2. Lessons from COASST: How Does Citizen Science Contribute to Natural Resource Management & Decision-Making?

    Science.gov (United States)

    Metes, J.; Ballard, H. L.; Parrish, J.

    2016-12-01

    As many scholars and practitioners in the environmental field turn to citizen science to collect robust scientific data as well as engage with wider audiences, it is crucial to build a more complete understanding of how citizen science influences and affects different interests within a social-ecological system. This research investigates how federal, state, and tribal natural resource managers interact with data from the Coastal Observation & Seabird Survey Team (COASST) project—a citizen science program that trains participants to monitor species and abundance of beach-cast birds on the Pacific Northwest Coast. Fifteen coastal and fisheries managers who previously requested COASST data were interviewed about how and why they used data from the project and were asked to describe how information gained from COASST affected their management decisions. Results suggest that, broadly, managers value and learn from the program's capacity to gather data spanning a wide spatial-temporal range. This contribution to baseline monitoring helps managers signal and track both short- and long-term environmental change. More specifically, managers use COASST data in conjunction with other professional monitoring programs, such as the National Marine Fisheries Observer Program, to build higher degrees of reliability into management decisions. Although managers offered diverse perspectives and experiences about what the role of citizen science in natural resource management generally should be, there was agreement that agencies on their own often lack the personnel and funding required to sufficiently monitor many crucial resources. Additionally, managers strongly suggested that COASST and other citizen science projects increased public awareness and support for agency decision-making and policies, an indirect yet important contribution to natural resource management.

  3. NASA Life Sciences Data Repositories: Tools for Retrospective Analysis and Future Planning

    Science.gov (United States)

    Thomas, D.; Wear, M.; VanBaalen, M.; Lee, L.; Fitts, M.

    2011-01-01

    As NASA transitions from the Space Shuttle era into the next phase of space exploration, the need to ensure the capture, analysis, and application of its research and medical data is of greater urgency than at any previous time. In this era of limited resources and challenging schedules, the Human Research Program (HRP) based at NASA's Johnson Space Center (JSC) recognizes the need to extract the greatest possible amount of information from the data already captured, as well as to focus current and future research funding on addressing the HRP goal to provide human health and performance countermeasures, knowledge, technologies, and tools to enable safe, reliable, and productive human space exploration. To this end, the Science Management Office and the Medical Informatics and Health Care Systems Branch within the HRP and the Space Medicine Division have been working to make both research data and clinical data more accessible to the user community. The Life Sciences Data Archive (LSDA), the research repository housing data and information regarding the physiologic effects of microgravity, and the Lifetime Surveillance of Astronaut Health (LSAH-R), the clinical repository housing astronaut data, have joined forces to achieve this goal. The task of both repositories is to acquire, preserve, and distribute data and information both within the NASA community and to the science community at large. This is accomplished via the LSDA's public website (http://lsda.jsc.nasa.gov), which allows access to experiment descriptions including hardware, datasets, key personnel, and mission descriptions, and provides a mechanism for researchers to request additional data, both research and clinical, that are not accessible from the public website. This will make the work of NASA and its partners available to the wider science community, both domestic and international. The desired outcome is the use of these data for knowledge discovery, retrospective analysis, and planning of future

  4. Disciplinary differences in faculty research data management practices and perspectives

    Directory of Open Access Journals (Sweden)

    Katherine G. Akers

    2013-11-01

    Full Text Available Academic librarians are increasingly engaging in data curation by providing infrastructure (e.g., institutional repositories) and offering services (e.g., data management plan consultations) to support the management of research data on their campuses. Efforts to develop these resources may benefit from a greater understanding of disciplinary differences in research data management needs. After conducting a survey of data management practices and perspectives at our research university, we categorized faculty members into four research domains—arts and humanities, social sciences, medical sciences, and basic sciences—and analyzed variations in their patterns of survey responses. We found statistically significant differences among the four research domains for nearly every survey item, revealing important disciplinary distinctions in data management actions, attitudes, and interest in support services. Serious consideration of both the similarities and dissimilarities among disciplines will help guide academic librarians and other data curation professionals in developing a range of data management services that can be tailored to the unique needs of different scholarly researchers.

  5. Big data related technologies, challenges and future prospects

    CERN Document Server

    Chen, Min; Zhang, Yin; Leung, Victor CM

    2014-01-01

    This Springer Brief provides a comprehensive overview of the background and recent developments of big data. The value chain of big data is divided into four phases: data generation, data acquisition, data storage and data analysis. For each phase, the book introduces the general background, discusses technical challenges and reviews the latest advances. Technologies under discussion include cloud computing, the Internet of Things, data centers, Hadoop and more. The authors also explore several representative applications of big data, such as enterprise management, online social networks, and healthcare

  6. Data quality in citizen science urban tree inventories

    Science.gov (United States)

    Lara A. Roman; Bryant C. Scharenbroch; Johan P.A. Ostberg; Lee S. Mueller; Jason G. Henning; Andrew K. Koeser; Jessica R. Sanders; Daniel R. Betz; Rebecca C. Jordan

    2017-01-01

    Citizen science has been gaining popularity in ecological research and resource management in general and in urban forestry specifically. As municipalities and nonprofits engage volunteers in tree data collection, it is critical to understand data quality. We investigated observation error by comparing street tree data collected by experts to data collected by less...

  7. Scientific Grand Challenges: Challenges in Climate Change Science and the Role of Computing at the Extreme Scale

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.; Johnson, Gary M.; Washington, Warren M.

    2009-07-02

    The U.S. Department of Energy (DOE) Office of Biological and Environmental Research (BER), in partnership with the Office of Advanced Scientific Computing Research (ASCR), held a workshop on the challenges in climate change science and the role of computing at the extreme scale, November 6-7, 2008, in Bethesda, Maryland. At the workshop, participants identified the scientific challenges facing the field of climate science and outlined the research directions of highest priority that should be pursued to meet these challenges. Representatives from the national and international climate change research community as well as representatives from the high-performance computing community attended the workshop. This group represented a broad mix of expertise. Of the 99 participants, 6 were from international institutions. Before the workshop, each of the four panels prepared a white paper, which provided the starting point for the workshop discussions. The four panels devoted their efforts to the following themes: Model Development and Integrated Assessment; Algorithms and Computational Environment; Decadal Predictability and Prediction; and Data, Visualization, and Computing Productivity. The recommendations of the panels are summarized in the body of this report.

  8. Data-Oriented Astrophysics at NOAO: The Science Archive & The Data Lab

    Science.gov (United States)

    Juneau, Stephanie; NOAO Data Lab, NOAO Science Archive

    2018-06-01

    As we keep progressing into an era of increasingly large astronomy datasets, NOAO’s data-oriented mission is growing in prominence. The NOAO Science Archive, which captures and processes the pixel data from mountaintops in Chile and Arizona, now contains holdings at Petabyte scales. Working at the intersection of astronomy and data science, the main goal of the NOAO Data Lab is to provide users with a suite of tools to work close to the data, the catalogs derived from it, and externally provided datasets, and thus to optimize the scientific productivity of the astronomy community. These tools and services include databases, query tools, virtual storage space, workflows through our Jupyter Notebook server, and scripted analysis. We currently host datasets from NOAO facilities such as the Dark Energy Survey (DES), the DESI imaging Legacy Surveys (LS), the Dark Energy Camera Plane Survey (DECaPS), and the nearly all-sky NOAO Source Catalog (NSC). We are further preparing for large spectroscopy datasets such as DESI. After a brief overview of the Science Archive, the Data Lab and the datasets, I will showcase scientific applications that make use of our data holdings. Lastly, I will describe our vision for future developments as we tackle the next technical and scientific challenges.

  9. Data Curation Education in Research Centers (DCERC)

    Science.gov (United States)

    Marlino, M. R.; Mayernik, M. S.; Kelly, K.; Allard, S.; Tenopir, C.; Palmer, C.; Varvel, V. E., Jr.

    2012-12-01

    Digital data both enable and constrain scientific research. Scientists are enabled by digital data to develop new research methods, utilize new data sources, and investigate new topics, but they also face new data collection, management, and preservation burdens. The current data workforce consists primarily of scientists who receive little formal training in data management and data managers who are typically educated through on-the-job training. The Data Curation Education in Research Centers (DCERC) program is investigating a new model for educating data professionals to contribute to scientific research. DCERC is a collaboration between the University of Illinois at Urbana-Champaign Graduate School of Library and Information Science, the University of Tennessee School of Information Sciences, and the National Center for Atmospheric Research. The program is organized around a foundations course in data curation and provides field experiences in research and data centers for both master's and doctoral students. This presentation will outline the aims and the structure of the DCERC program and discuss results and lessons learned from the first set of summer internships in 2012. Four master's students participated and worked with both data mentors and science mentors, gaining first-hand experience in the issues, methods, and challenges of scientific data curation. They engaged in a diverse set of topics, including climate model metadata, observational data management workflows, and data cleaning, documentation, and ingest processes within a data archive. The students learned current data management practices and challenges while developing expertise and conducting research. They also made important contributions to NCAR data and science teams by evaluating data management workflows and processes, preparing data sets to be archived, and developing recommendations for particular data management activities. The master's student interns will return in summer of 2013

  10. Benchmarking and improving point cloud data management in MonetDB

    NARCIS (Netherlands)

    O. Martinez-Rubi (Oscar); P. van Oosterom; R.A. Goncalves (Romulo); T. Tijssen; M.G. Ivanova (Milena); M.L. Kersten (Martin); F. Alvanaki (Foteini)

    2014-01-01

    The popularity, availability and sizes of point cloud data sets are increasing, thus raising interesting data management and processing challenges. Various software solutions are available for the management of point cloud data. A benchmark for point cloud data management systems was
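One storage model commonly benchmarked for point clouds in column stores is a flat x/y/z table queried with range predicates. The sketch below uses SQLite and synthetic data purely as a stand-in; the benchmark above targeted MonetDB.

```python
# Flat-table point cloud storage with a rectangular range query, the kind of
# workload a point cloud benchmark exercises. SQLite stands in for a column
# store such as MonetDB; the 100 synthetic points form a 10x10 grid.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL, z REAL)")
pts = [(float(i % 10), float(i // 10), 0.1 * i) for i in range(100)]
conn.executemany("INSERT INTO points VALUES (?, ?, ?)", pts)

# Typical benchmark query: count all points inside a rectangular region.
n = conn.execute(
    "SELECT COUNT(*) FROM points WHERE x BETWEEN 2 AND 4 AND y BETWEEN 0 AND 3"
).fetchone()[0]
# 3 x-values times 4 y-values from the grid fall in the box, so n == 12
```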

  11. Managing complexity challenges for industrial engineering and operations management

    CERN Document Server

    López-Paredes, Adolfo; Pérez-Ríos, José

    2014-01-01

    This book presents papers by experts in the field of Industrial Engineering, covering topics in business strategy; modelling and simulation in operations research; logistics and production; service systems; innovation and knowledge; and project management. The focus of operations and production management has evolved from product and manufacturing to the capabilities of firms and collaborative management. Nowadays, Industrial Engineering is concerned with the study of how to design, modify, control and improve the performance of complex systems. It has extended its scope to any physical landscape populated by social agents. This raises a major challenge for Industrial Engineering: managing complexity. This volume shows how experts are dealing with this challenge.

  12. Using NASA Data in the Classroom: Promoting STEM Learning in Formal Education using Real Space Science Data

    Science.gov (United States)

    Lawton, B.; Hemenway, M. K.; Mendez, B.; Odenwald, S.

    2013-04-01

    Among NASA's major education goals is the training of students in the Science, Technology, Engineering, and Math (STEM) disciplines. The use of real data, from some of the most sophisticated observatories in the world, provides formal educators the opportunity to teach their students real-world applications of the STEM subjects. Combining real space science data with lessons aimed at meeting state and national education standards provides a memorable educational experience that students can build upon throughout their academic careers. Many of our colleagues have adopted the use of real data in their education and public outreach (EPO) programs. There are challenges in creating resources using real data for classroom use that include, but are not limited to, accessibility to computers/Internet and proper instruction. Understanding and sharing these difficulties and best practices with the larger EPO community is critical to the development of future resources. In this session, we highlight three examples of how NASA data is being utilized in the classroom: the Galaxies and Cosmos Explorer Tool (GCET) that utilizes real Hubble Space Telescope data; the computer image-analysis resources utilized by the NASA WISE infrared mission; and the space science derived math applications from SpaceMath@NASA featuring the Chandra and Kepler space telescopes. Challenges and successes are highlighted for these projects. We also facilitate small-group discussions that focus on additional benefits and challenges of using real data in the formal education environment. The report-outs from those discussions are given here.

  13. Data Governance - Defining Accountabilities for Data Quality Management

    OpenAIRE

    Wende, Kristin

    2007-01-01

    Enterprises need data quality management (DQM) to respond to strategic and operational challenges demanding high-quality corporate data. Hitherto, companies have assigned accountabilities for DQM mostly to IT departments. They have thereby ignored the organisational issues that are critical to the success of DQM. With data governance, however, companies implement corporate-wide accountabilities for DQM that encompass professionals from business and IT. This paper proposes a contingency approa...

  14. Enhancing Diversity in Biomedical Data Science.

    Science.gov (United States)

    Canner, Judith E; McEligot, Archana J; Pérez, María-Eglée; Qian, Lei; Zhang, Xinzhi

    2017-01-01

    The gap in educational attainment separating underrepresented minorities from Whites and Asians remains wide. Such a gap has significant impact on workforce diversity and inclusion among cross-cutting Biomedical Data Science (BDS) research, which presents great opportunities as well as major challenges for addressing health disparities. This article provides a brief description of the newly established National Institutes of Health Big Data to Knowledge (BD2K) diversity initiatives at four universities: California State University, Monterey Bay; Fisk University; University of Puerto Rico, Río Piedras Campus; and California State University, Fullerton. We emphasize three main barriers to BDS careers (ie, preparation, exposure, and access to resources) experienced among those pioneer programs and recommendations for possible solutions (ie, early and proactive mentoring, enriched research experience, and data science curriculum development). The diversity disparities in BDS demonstrate the need for educators, researchers, and funding agencies to support evidence-based practices that will lead to the diversification of the BDS workforce.

  15. AMS data production facilities at science operations center at CERN

    Science.gov (United States)

    Choutko, V.; Egorov, A.; Eline, A.; Shan, B.

    2017-10-01

    The Alpha Magnetic Spectrometer (AMS) is a high energy physics experiment on board the International Space Station (ISS). This paper presents the hardware and software facilities of the Science Operations Center (SOC) at CERN. Data Production is built around the production server, a scalable distributed service which links together a set of different programming modules for science data transformation and reconstruction. The server has the capacity to manage 1000 parallel job producers, i.e. up to 32K logical processors. A monitoring and management tool with a Production GUI is also described.
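The pattern of a central server farming reconstruction work out to many parallel job producers can be sketched at a much smaller scale. The names below and the use of a thread pool are illustrative only, not the AMS production server's actual interfaces.

```python
# Scaled-down sketch of a production server dispatching work to parallel job
# producers, in the spirit of the AMS setup described above. Threads stand in
# for the farm of producer processes; the "reconstruction" job is a stub.
from concurrent.futures import ThreadPoolExecutor

def reconstruct(run_id):
    """Stand-in for one science-data reconstruction job."""
    return run_id, sum(i * i for i in range(1000))

runs = list(range(8))  # 8 hypothetical data runs instead of thousands
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(reconstruct, runs))
```

The real system's value is in scaling this dispatch loop to ~1000 producers and 32K logical processors while monitoring each job, which the GUI described above supports.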

  16. The Square Kilometre Array Science Data Processor. Preliminary compute platform design

    International Nuclear Information System (INIS)

    Broekema, P.C.; Nieuwpoort, R.V. van; Bal, H.E.

    2015-01-01

    The Square Kilometre Array is a next-generation radio telescope, to be built in South Africa and Western Australia. It is currently in its detailed design phase, with procurement and construction scheduled to start in 2017. The SKA Science Data Processor is the high-performance computing element of the instrument, responsible for producing science-ready data. This is a major IT project, with the Science Data Processor expected to challenge the computing state of the art even in 2020. In this paper we introduce the preliminary Science Data Processor design and the principles that guide the design process, as well as the constraints on the design. We introduce a highly scalable and flexible system architecture capable of handling the SDP workload

  17. The opportunities and challenges for ICT in science education

    OpenAIRE

    Ferk Savec, Vesna

    2017-01-01

    This article examines the opportunities and challenges for the use of ICT in science education in the light of science teachers’ Technological Pedagogical Content Knowledge (TPACK). Some of the variables that have been studied with regard to the TPACK framework in science classrooms (such as teachers’ self-efficacy, gender, teaching experience, teachers’ beliefs, etc.) are reviewed, and variations of the TPACK framework specific to science education ...

  18. Benchmarking and improving point cloud data management in MonetDB

    NARCIS (Netherlands)

    Martinez-Rubi, O.; Van Oosterom, P.J.M.; Goncalves, R.; Tijssen, T.P.M.; Ivanova, M.; Kersten, M.L.; Alvanaki, F.

    2015-01-01

    The popularity, availability and sizes of point cloud data sets are increasing, thus raising interesting data management and processing challenges. Various software solutions are available for the management of point cloud data. A benchmark for point cloud data management systems was defined and it

  19. Sustainable Materials Management Challenge Data

    Data.gov (United States)

    U.S. Environmental Protection Agency — Sustainable Materials Management (SMM) is a systemic approach to using and reusing materials more productively over their entire lifecycles. It represents a change...

  20. Processing and Managing the Kepler Mission's Treasure Trove of Stellar and Exoplanet Data

    Science.gov (United States)

    Jenkins, Jon M.

    2016-01-01

    The Kepler telescope launched into orbit in March 2009, initiating NASA's first mission to discover Earth-size planets orbiting Sun-like stars. Kepler collected data simultaneously for 160,000 target stars at a time over its four-year mission, identifying over 4700 planet candidates, 2300 confirmed or validated planets, and over 2100 eclipsing binaries. While Kepler was designed to discover exoplanets, the long-term, ultra-high photometric precision measurements it achieved made it a premier observational facility for stellar astrophysics, especially in the field of asteroseismology, and for variable stars, such as RR Lyraes. The Kepler Science Operations Center (SOC) was developed at NASA Ames Research Center to process the data acquired by Kepler, from pixel-level calibrations all the way to identifying transiting planet signatures and subjecting them to a suite of diagnostic tests to establish or break confidence in their planetary nature. Detecting small, rocky planets transiting Sun-like stars presents a variety of daunting challenges, from achieving an unprecedented photometric precision of 20 parts per million (ppm) on 6.5-hour timescales to supporting the science operations and the management, processing, and repeated reprocessing of the accumulating data stream. This paper describes how the design of the SOC meets these varied challenges, discusses the architecture of the SOC and how the SOC pipeline is operated and run on the NAS Pleiades supercomputer, and summarizes the most important pipeline features addressing the multiple computational, image and signal processing challenges posed by Kepler.
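The reason a 20 ppm noise floor matters can be seen from the depth of an Earth-size transit, which dims a Sun-like star by roughly (R_planet/R_star)². The sketch below uses nominal Earth and Sun radii; it is illustrative arithmetic, not part of the SOC pipeline.

```python
# Why 20 ppm matters: the fractional dimming during an Earth-size transit of
# a Sun-like star is roughly (R_planet / R_star)^2. Nominal radii in km.
R_EARTH_KM = 6371.0
R_SUN_KM = 695700.0

def transit_depth_ppm(r_planet_km, r_star_km):
    """Fractional dimming during transit, in parts per million."""
    return (r_planet_km / r_star_km) ** 2 * 1e6

depth = transit_depth_ppm(R_EARTH_KM, R_SUN_KM)
# ~84 ppm: only a few times the 20 ppm precision quoted above, which is why
# such photometric performance was required to detect Earth analogs
```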

  1. NASA's Earth Science Data Systems Standards Endorsement Process

    National Research Council Canada - National Science Library

    Ullman, Richard E; Enloe, Yonsook

    2005-01-01

    Starting in January 2004, NASA instituted a set of internal working groups to develop ongoing recommendations for the continuing broad evolution of Earth Science Data Systems development and management within NASA...

  2. Safari Science: Assessing the reliability of citizen science data for wildlife surveys

    Science.gov (United States)

    Steger, Cara; Butt, Bilal; Hooten, Mevin B.

    2017-01-01

    Protected areas are the cornerstone of global conservation, yet financial support for basic monitoring infrastructure is lacking in 60% of them. Citizen science holds potential to address these shortcomings in wildlife monitoring, particularly for resource-limited conservation initiatives in developing countries – if we can account for the reliability of data produced by volunteer citizen scientists (VCS). This study tests the reliability of VCS data vs. data produced by trained ecologists, presenting a hierarchical framework for integrating diverse datasets to assess extra variability from VCS data. Our results show that while VCS data are likely to be overdispersed for our system, the overdispersion varies widely by species. We contend that citizen science methods, within the context of East African drylands, may be more appropriate for species with large body sizes, which are relatively rare, or those that form small herds. VCS perceptions of the charisma of a species may also influence their enthusiasm for recording it. Tailored programme design (such as incentives for VCS) may mitigate the biases in citizen science data and improve overall participation. However, the cost of designing and implementing high-quality citizen science programmes may be prohibitive for the small protected areas that would most benefit from these approaches. Synthesis and applications. As citizen science methods continue to gain momentum, it is critical that managers remain cautious in their implementation of these programmes while working to ensure methods match data purpose. Context-specific tests of citizen science data quality can improve programme implementation, and separate data models should be used when volunteer citizen scientists' variability differs from trained ecologists' data. Partnerships across protected areas and between protected areas and other conservation institutions could help to cover the costs of citizen science programme design and implementation.

  3. Reference Data Layers for Earth and Environmental Science: History, Frameworks, Science Needs, Approaches, and New Technologies

    Science.gov (United States)

    Lenhardt, W. C.

    2015-12-01

    The Global Mapping Project, Web-enabled Landsat Data (WELD), the International Satellite Land Surface Climatology Project (ISLSCP), hydrology, solid earth dynamics, sedimentary geology, climate modeling, integrated assessments, and so on all have needed, or have worked to develop, consistently integrated data layers for Earth and environmental science. This paper will present an overview of an abstract notion of data layers of this type, which we refer to as reference data layers for Earth and environmental science, highlight some historical examples, and delve into new approaches. The concept of reference data layers in this context combines data availability, cyberinfrastructure and data science, and domain science drivers. We argue that current advances in cyberinfrastructure, such as iPython notebooks and integrated science processing environments like iPlant's Discovery Environment, coupled with vast arrays of new data sources, warrant another look at how to create, maintain, and provide reference data layers. The goal is to provide a context for understanding scientists' needs for reference data layers in conducting their research. In addition to the topics described above, this presentation will outline some of the challenges and present some ideas for new approaches to addressing these needs. Promoting the idea of reference data layers is relevant to a number of existing related activities such as EarthCube, RDA, ESIP, the nascent NSF Regional Big Data Innovation Hubs, and others.

  4. Challenges of Virtual and Open Distance Science Teacher Education in Zimbabwe

    OpenAIRE

    Vongai Mpofu; Tendai Samukange; Lovemore M Kusure; Tinoidzwa M Zinyandu; Clever Denhere; Nyakotyo Huggins; Chingombe Wiseman; Shakespear Ndlovu; Rennias Chiveya; Monica Matavire; Leckson Mukavhi; Isaac Gwizangwe; Elliot Magombe; Munyaradzi Magomelo; Fungai Sithole

    2012-01-01

    This paper reports on a study of the implementation of science teacher education through virtual and open distance learning in the Mashonaland Central Province, Zimbabwe. The study provides insight into challenges faced by students and lecturers on inception of the program at four centres. Data were collected from completed evaluation survey forms of the forty-two lecturers directly involved at the launch of the program and from in-depth interviews. Qualitative data analysis revealed that the ...

  5. Science Drivers and Technical Challenges for Advanced Magnetic Resonance

    Energy Technology Data Exchange (ETDEWEB)

    Mueller, Karl T.; Pruski, Marek; Washton, Nancy M.; Lipton, Andrew S.

    2013-03-07

    This report recaps the "Science Drivers and Technical Challenges for Advanced Magnetic Resonance" workshop, held in late 2011. This exploratory workshop's goal was to discuss and address challenges for the next generation of magnetic resonance experimentation. During the workshop, participants from throughout the world outlined the science drivers and instrumentation demands for high-field dynamic nuclear polarization (DNP) and associated magnetic resonance techniques, discussed barriers to their advancement, and deliberated the path forward for significant and impactful advances in the field.

  6. High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations

    Science.gov (United States)

    Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.

    2003-01-01

    Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission-critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost-effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high-end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.

  7. Minnesota 4-H Science of Agriculture Challenge: Infusing Agricultural Science and Engineering Concepts into 4-H Youth Development

    Science.gov (United States)

    Rice, Joshua E.; Rugg, Bradley; Davis, Sharon

    2016-01-01

    Youth involved in 4-H projects have been engaged in science-related endeavors for years. Since 2006, 4-H has invested considerable resources in the advancement of science learning. The new Minnesota 4-H Science of Agriculture Challenge program challenges 4-H youth to work together to identify agriculture-related issues in their communities and to…

  8. Advancing Water Science through Data Visualization

    Science.gov (United States)

    Li, X.; Troy, T.

    2014-12-01

    As water scientists, we increasingly handle larger and larger datasets with many variables, making it easy to lose ourselves in the details. Advanced data visualization will play an increasingly significant role in propelling the development of water science in research, economy, policy, and education. It can enable analysis within research, further data scientists' understanding of behavior and processes, and potentially affect how the public, whom we often want to inform, understands our work. Unfortunately for water scientists, data visualization is approached in an ad hoc manner when a more formal methodology or understanding could significantly improve both research within the academy and outreach to the public. First, to broaden and deepen scientific understanding, data visualization allows more targets of analysis to be processed simultaneously and can represent variables effectively, revealing patterns, trends, and relationships; it can even suggest new research directions or branches of water science. Through visualization, we can more clearly detect and separate pivotal from trivial influencing factors when abstracting a complex target system. By providing direct visual perception of the differences between observational data and model predictions, data visualization allows researchers to quickly examine the quality of models in water science. Second, data visualization can also improve public awareness and perhaps influence behavior. By offering decision makers clearer perspectives on the potential value of water, data visualization can amplify the economic value of water science and increase relevant employment. By providing policymakers compelling visuals of the role of water in social and natural systems, data visualization can advance water management and legislation on water conservation. By building the public's own data visualizations through apps and games about water

  9. Experiences with Deriva: An Asset Management Platform for Accelerating eScience.

    Science.gov (United States)

    Bugacov, Alejandro; Czajkowski, Karl; Kesselman, Carl; Kumar, Anoop; Schuler, Robert E; Tangmunarunkit, Hongsuda

    2017-10-01

    The pace of discovery in eScience is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. It is all too common for investigators to spend inordinate amounts of time developing ad hoc procedures to manage their data. In previous work, we presented Deriva, a Scientific Asset Management System designed to accelerate data-driven discovery. In this paper, we report on the use of Deriva in a number of substantial and diverse eScience applications. We describe the lessons we have learned, both from the perspective of the Deriva technology and from the ability and willingness of scientists to incorporate Scientific Asset Management into their daily workflows.

  10. Challenges Facing Managers in Managing Conflict in Schools in the South and South Central Regions of Botswana

    Science.gov (United States)

    Morake, Nnior Machomi; Monobe, Ratau John; Dingwe, Stephonia

    2011-01-01

    The purpose of this study was to examine the challenges facing managers in managing conflict in schools of the South and South Central Regions of Botswana. In this study, an interview schedule was used to collect empirical data. A random sample of 50 school managers and deputy school managers was selected for interviews. Major findings of the…

  11. The Emirates Mars Mission Science Data Center

    Science.gov (United States)

    Craft, J.; Al Hammadi, O.; DeWolfe, A. W.; Staley, B.; Schafer, C.; Pankratz, C. K.

    2017-12-01

    The Emirates Mars Mission (EMM), led by the Mohammed Bin Rashid Space Center (MBRSC) in Dubai, United Arab Emirates, is expected to arrive at Mars in January 2021. The EMM Science Data Center (SDC) is being developed as a joint effort between MBRSC and the University of Colorado's Laboratory for Atmospheric and Space Physics (LASP). The EMM SDC is responsible for the production, management, distribution, and archiving of science data collected from the three instruments on board the Hope spacecraft. With the respective SDC teams on opposite sides of the world, evolutionary techniques and cloud-based technologies are being utilized in the development of the EMM SDC. This presentation will provide a top-down view of the EMM SDC, summarizing the cloud-based technologies being implemented in the design, as well as the tools, best practices, and lessons learned for software development and management in a geographically distributed team.

  12. NASA Johnson Space Center Life Sciences Data System

    Science.gov (United States)

    Rahman, Hasan; Cardenas, Jeffery

    1994-01-01

    The Life Sciences Project Division (LSPD) at JSC, which manages human life sciences flight experiments for the NASA Life Sciences Division, augmented its Life Sciences Data System (LSDS) in support of the Spacelab Life Sciences-2 (SLS-2) mission, October 1993. The LSDS is a portable ground system supporting Shuttle, Spacelab, and Mir based life sciences experiments. The LSDS supports acquisition, processing, display, and storage of real-time experiment telemetry in a workstation environment. The system may acquire digital or analog data, storing the data in experiment packet format. Data packets from any acquisition source are archived and meta-parameters are derived through the application of mathematical and logical operators. Parameters may be displayed in text and/or graphical form, or output to analog devices. Experiment data packets may be retransmitted through the network interface and database applications may be developed to support virtually any data packet format. The user interface provides menu- and icon-driven program control and the LSDS system can be integrated with other workstations to perform a variety of functions. The generic capabilities, adaptability, and ease of use make the LSDS a cost-effective solution to many experiment data processing requirements. The same system is used for experiment systems functional and integration tests, flight crew training sessions and mission simulations. In addition, the system has provided the infrastructure for the development of the JSC Life Sciences Data Archive System scheduled for completion in December 1994.
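
    The derived meta-parameter mechanism this abstract describes — applying mathematical and logical operators to archived telemetry parameters — might look like the following in outline. This is an illustrative sketch only; the parameter names and the derivation formulas are hypothetical, not actual LSDS definitions:

    ```python
    # Derive meta-parameters from raw telemetry parameters by applying
    # mathematical and logical operators, in the style the LSDS abstract
    # describes. All names and formulas here are hypothetical examples.
    raw = {"heart_rate": 72.0, "systolic": 118.0, "diastolic": 76.0}

    derived = {
        # mathematical operator: mean arterial pressure from a standard
        # clinical approximation (diastolic + one third of pulse pressure)
        "map": raw["diastolic"] + (raw["systolic"] - raw["diastolic"]) / 3.0,
        # logical operator: flag an out-of-range heart rate
        "hr_alarm": not (40.0 <= raw["heart_rate"] <= 180.0),
    }

    print(derived)  # {'map': 90.0, 'hr_alarm': False}
    ```

    In a real system of this kind, the operator expressions would be defined per experiment and evaluated against each incoming data packet rather than a single dictionary.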

  13. The tale of two stories: Challenges and innovations in breast cancer management.

    Science.gov (United States)

    Henry-Tillman, Ronda S

    2018-06-01

    The keynote address "The Tale of Two Stories: Challenges and Innovations in Breast Cancer Management" was presented on March 19-20, 2017 at the celebratory Festschrift Lecture honoring the great Dr. LaSalle Leffall for his impact and contributions to the field of breast cancer science and treatment as a leader, surgeon, and mentor. This presentation and follow-up opinion paper in the field of breast disease highlight the challenges that have baffled us, the innovations that have changed practice and translated into outcomes, and those that have not. Where do they parallel, and what are the gaps? Copyright © 2018. Published by Elsevier Inc.

  14. 8th International Conference on Management Science and Engineering Management

    CERN Document Server

    Cruz-Machado, Virgílio; Lev, Benjamin; Nickel, Stefan

    2014-01-01

    This is the Proceedings of the Eighth International Conference on Management Science and Engineering Management (ICMSEM), held from July 25 to 27, 2014 at Universidade Nova de Lisboa, Lisbon, Portugal, and organized by the International Society of Management Science and Engineering Management (ISMSEM), Sichuan University (Chengdu, China), and Universidade Nova de Lisboa (Lisbon, Portugal). The goals of the conference are to foster international research collaborations in Management Science and Engineering Management and to provide a forum for presenting current findings. A total of 138 papers from 14 countries were selected for the proceedings by the conference scientific committee through rigorous referee review. The selected papers in the second volume focus on Computing and Engineering Management, covering the areas of Computing Methodology, Project Management, Industrial Engineering, and Information Technology.

  15. DZero data-intensive computing on the Open Science Grid

    International Nuclear Information System (INIS)

    Abbott, B; Baranovski, A; Diesburg, M; Garzoglio, G; Mhashilkar, P; Kurca, T

    2008-01-01

    High energy physics experiments periodically reprocess data in order to take advantage of improved understanding of the detector and the data processing code. Between February and May 2007, the DZero experiment reprocessed a substantial fraction of its dataset: half a billion events, corresponding to about 100 TB of data, organized in 300,000 files. The activity utilized resources from sites around the world, including a dozen sites participating in the Open Science Grid consortium (OSG). About 1,500 jobs were run every day across the OSG, consuming and producing hundreds of gigabytes of data. Access to OSG computing and storage resources was coordinated by the SAM-Grid system, which organized job access to a complex topology of data queues and scheduled jobs to clusters using a SAM-Grid to OSG job forwarding infrastructure. For the first time in the lifetime of the experiment, a data-intensive production activity was managed on a general-purpose grid such as OSG. This paper describes the implications of using OSG, where all resources are granted following an opportunistic model; the challenges of operating a data-intensive activity over such a large computing infrastructure; and the lessons learned throughout the project.
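
    The figures reported in the abstract imply some useful averages for the campaign; a quick back-of-envelope sketch (inputs taken directly from the abstract, outputs simple arithmetic):

    ```python
    # Back-of-envelope averages from the DZero reprocessing campaign figures.
    events = 5e8          # half a billion events
    data_tb = 100.0       # ~100 TB total
    files = 300_000       # file count

    bytes_total = data_tb * 1e12
    avg_file_mb = bytes_total / files / 1e6
    events_per_file = events / files
    kb_per_event = bytes_total / events / 1e3

    print(f"average file size : {avg_file_mb:.0f} MB")   # ~333 MB
    print(f"events per file   : {events_per_file:.0f}")  # ~1667
    print(f"size per event    : {kb_per_event:.0f} kB")  # 200 kB
    ```

    Averages of this kind (hundreds of MB per file, ~200 kB per event) are typical of the file-granularity bookkeeping that systems like SAM-Grid must track across an opportunistic grid.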

  16. DZero data-intensive computing on the Open Science Grid

    International Nuclear Information System (INIS)

    Abbott, B.; Baranovski, A.; Diesburg, M.; Garzoglio, G.; Kurca, T.; Mhashilkar, P.

    2007-01-01

    High energy physics experiments periodically reprocess data in order to take advantage of improved understanding of the detector and the data processing code. Between February and May 2007, the DZero experiment reprocessed a substantial fraction of its dataset: half a billion events, corresponding to about 100 TB of data, organized in 300,000 files. The activity utilized resources from sites around the world, including a dozen sites participating in the Open Science Grid consortium (OSG). About 1,500 jobs were run every day across the OSG, consuming and producing hundreds of gigabytes of data. Access to OSG computing and storage resources was coordinated by the SAM-Grid system, which organized job access to a complex topology of data queues and scheduled jobs to clusters using a SAM-Grid to OSG job forwarding infrastructure. For the first time in the lifetime of the experiment, a data-intensive production activity was managed on a general-purpose grid such as OSG. This paper describes the implications of using OSG, where all resources are granted following an opportunistic model; the challenges of operating a data-intensive activity over such a large computing infrastructure; and the lessons learned throughout the project.

  17. Past, Current, and Future Challenges in Linking Data to Publications

    Science.gov (United States)

    Hanson, B.

    2015-12-01

    Data are the currency of science and assure the integrity of published research. As the ability to collect, analyze, and visualize data has grown beyond what could be included in a publication, and as the value of the data has become more clear (or the lack of availability of data has been criticized), publishers and the scientific community have developed several solutions to enhance access to underlying data. Most leading journals now require authors to agree, as a condition of submission, that underlying data will be included or made available; indeed, publication is the key leverage point in exposing much scholarly data. Most journals allow PDF or other supplements and links to data sets hosted by authors or labs or, better, data repositories such as Dryad, and some have banned "data not shown" or any reference to unpublished work. Many of these solutions have proven problematic, and recent studies have found that a large fraction of data are undiscoverable even a few years after publication. The best solution has been dedicated domain repositories, collectively supported by publishers, funders, and the scientific community, where deposition is required before or at the time of publication. These provide quality control and curation and facilitate reuse. However, expanding these beyond a few key repositories and developing standardized workflows and functionality among repositories and between them and publishers has been problematic. Addressing these and other data challenges requires collaborative efforts among funders, publishers, repositories, societies, and researchers. One example is the Coalition on Publishing Data in the Earth and Space Sciences, where most major publishers and repositories have signed a joint statement of commitment (COPDESS.org) and are starting work to direct and link published data to domain repositories. Much work remains to be done. Major challenges include embedding data curation practices in the workflow of science, from data collection

  18. High Performance Multivariate Visual Data Exploration for Extremely Large Data

    International Nuclear Information System (INIS)

    Ruebel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat

    2008-01-01

    One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.
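
    The histogram-based parallel-coordinates idea — binning adjacent axes so that rendering and queries operate on bin counts rather than on every raw record — can be sketched as follows. This is a minimal NumPy illustration with synthetic data, not the paper's production index/query implementation:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic particle data standing in for two adjacent
    # parallel-coordinate axes (names are illustrative only).
    energy = rng.gamma(shape=2.0, scale=1.0, size=100_000)
    momentum = energy * (1.0 + 0.1 * rng.standard_normal(100_000))

    # Bin the joint distribution: each non-empty 2D bin becomes one
    # weighted line segment between the axes instead of 100k polylines.
    counts, e_edges, p_edges = np.histogram2d(energy, momentum, bins=32)

    # Query-driven subsetting: keep only bins above a density threshold,
    # a stand-in for an index/query engine selecting regions of interest.
    dense = counts > counts.max() * 0.05
    segments = np.argwhere(dense)  # (energy_bin, momentum_bin) pairs to draw

    print(f"{dense.sum()} segments instead of {len(energy)} polylines")
    ```

    The payoff is that display cost scales with the number of occupied bins (at most 32 x 32 here) rather than with the dataset size, which is what makes the technique viable at extreme scale.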

  19. High Performance Multivariate Visual Data Exploration for Extremely Large Data

    Energy Technology Data Exchange (ETDEWEB)

    Rubel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat

    2008-08-22

    One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.

  20. Business, Economics, Financial Sciences, and Management

    CERN Document Server

    2011 International Conference on Business, Economics, and Financial Sciences, Management (BEFM 2011)

    2012-01-01

    A series of papers on business, economics, financial sciences, and management, selected from the International Conference on Business, Economics, and Financial Sciences, Management, is included in this volume. Management in all business and organizational activities is the act of getting people together to accomplish desired goals and objectives using available resources efficiently and effectively. Management comprises planning, organizing, staffing, leading or directing, and controlling an organization (a group of one or more people or entities) or effort for the purpose of accomplishing a goal. Resourcing encompasses the deployment and manipulation of human resources, financial resources, technological resources, and natural resources. The proceedings of BEFM 2011 focus on the various aspects of advances in Business, Economics, and Financial Sciences, Management and provide a chance for academic and industry professionals to discuss recent progress in the area of Business, Economics, and Financial Scienc...

  1. The Biological and Chemical Oceanography Data Management Office

    Science.gov (United States)

    Allison, M. D.; Chandler, C. L.; Groman, R. C.; Wiebe, P. H.; Glover, D. M.; Gegg, S. R.

    2011-12-01

    Oceanography and marine ecosystem research are inherently interdisciplinary fields of study that generate and require access to a wide variety of measurements. In late 2006 the Biological and Chemical Oceanography Sections of the National Science Foundation (NSF) Geosciences Directorate Division of Ocean Sciences (OCE) funded the Biological and Chemical Oceanography Data Management Office (BCO-DMO). In late 2010 additional funding was contributed to support management of research data from the NSF Office of Polar Programs Antarctic Organisms & Ecosystems Program. The BCO-DMO is recognized in the 2011 Division of Ocean Sciences Sample and Data Policy as one of several program specific data offices that support NSF OCE funded researchers. BCO-DMO staff members offer data management support throughout the project life cycle to investigators from large national programs and medium-sized collaborative research projects, as well as researchers from single investigator awards. The office manages and serves all types of oceanographic data and information generated during the research process and contributed by the originating investigators. BCO-DMO has built a data system that includes the legacy data from several large ocean research programs (e.g. United States Joint Global Ocean Flux Study and United States GLOBal Ocean ECosystems Dynamics), to which data have been contributed from recently granted NSF OCE and OPP awards. The BCO-DMO data system can accommodate many different types of data including: in situ and experimental biological, chemical, and physical measurements; modeling results and synthesis data products. The system enables reuse of oceanographic data for new research endeavors, supports synthesis and modeling activities, provides availability of "real data" for K-12 and college level use, and provides decision-support field data for policy-relevant investigations. We will present an overview of the data management system capabilities including: map

  2. Making Data Management Accessible in the Undergraduate Chemistry Curriculum

    Science.gov (United States)

    Reisner, Barbara A.; Vaughan, K. T. L.; Shorish, Yasmeen L.

    2014-01-01

    In the age of "big data" science, data management is becoming a key information literacy skill for chemistry professionals. To introduce this skill in the undergraduate chemistry major, an activity has been developed to familiarize undergraduates with data management. In this activity, students rename and organize cards that represent…

  3. Data structure and software engineering challenges and improvements

    CERN Document Server

    Antonakos, James L

    2011-01-01

    Data structure and software engineering are an integral part of computer science. This volume presents new approaches and methods for knowledge sharing, brain mapping, data integration, and data storage. The author describes how to manage an organization's business process and domain data and presents new software and hardware testing methods. The book introduces a game development framework used as a learning aid in a software engineering course at the university level. It also features a review of social software engineering metrics and methods for processing business information. It explains how to

  4. The Unstructured Data Sharing System for Natural resources and Environment Science Data of the Chinese Academy of Science

    Directory of Open Access Journals (Sweden)

    Dafang Zhuang

    2007-10-01

    The data sharing system for the resource and environment science databases of the Chinese Academy of Sciences (CAS) has an open three-tiered architecture, which integrates the geographical databases of about 9 CAS institutes through mechanisms for distributed unstructured data management, metadata integration, catalogue services, and security control. The data tier consists of several distributed data servers, located in each CAS institute, that support unstructured data formats such as vector files, remote sensing images and other raster files, documents, multimedia files, tables, and files in other formats. For spatial data files, a format transformation service is provided. The middle tier involves a centralized metadata server, which stores metadata records for the data on all data servers. The primary function of this tier is the catalogue service, supporting the creation, search, browsing, updating, and deletion of catalogues. The client tier involves an integrated client that provides end users with interfaces to search, browse, and download data or to create a catalogue and upload data.
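
    The middle-tier catalogue service described in the abstract — creating, searching, browsing, updating, and deleting metadata records that point at files held on distributed data servers — can be outlined schematically. The field names, record IDs, and server name below are hypothetical illustrations, not the system's actual schema:

    ```python
    # Minimal in-memory stand-in for a centralized metadata catalogue:
    # each record describes a data file held on a distributed data server.
    catalog = {}

    def create(record_id, title, data_server, fmt, keywords):
        """Register a metadata record for a file on some data server."""
        catalog[record_id] = {
            "title": title,
            "data_server": data_server,  # which institute's server holds it
            "format": fmt,               # vector, raster, document, ...
            "keywords": set(keywords),
        }

    def search(keyword):
        """Return IDs of records tagged with the given keyword."""
        return [rid for rid, rec in catalog.items()
                if keyword in rec["keywords"]]

    # Hypothetical example records (server name is made up):
    create("cas-001", "Land cover 2005", "data1.example", "raster",
           ["land-cover", "remote-sensing"])
    create("cas-002", "River network", "data1.example", "vector",
           ["hydrology"])

    print(search("hydrology"))  # ['cas-002']
    ```

    The real system adds the pieces this sketch omits: a persistent metadata store, security control on each operation, and retrieval of the actual file from the data server named in the record.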

  5. Integrating science and business models of sustainability for environmentally-challenging industries such as secondary lead smelters: a systematic review and analysis of findings.

    Science.gov (United States)

    Genaidy, A M; Sequeira, R; Tolaymat, T; Kohler, J; Wallace, S; Rinder, M

    2010-09-01

    Secondary lead smelters (SLS) represent an environmentally challenging industry: they deal with toxic substances posing potential threats to both human and environmental health and consequently operate under strict government regulations. Such challenges have resulted in a significant reduction in the number of SLS plants over the last three decades. In addition, the domestic recycling of lead has been in steep decline for the past 10 years, as the amount of lead recovered has remained virtually unchanged while consumption has increased. One may therefore wonder whether sustainable development can be achieved among SLS. The primary objective of this study was to determine whether a roadmap for sustainable development can be established for SLS. The following aims were established in support of the study objective: (1) to conduct a systematic review and analysis of models of sustainable systems, with a particular emphasis on SLS; (2) to document the challenges for the U.S. secondary lead smelting industry; and (3) to explore practices and concepts that act as vehicles for SLS on the road to sustainable development. An evidence-based methodology was adopted to achieve the study objective. A comprehensive electronic search was conducted to implement the aforementioned specific aims. Inclusion criteria were established to filter out irrelevant scientific papers and reports. The relevant articles were closely scrutinized and appraised to extract the information and data required for the possible development of a sustainable roadmap. The search process yielded a number of research articles, which were utilized in the systematic review. Two types of models emerged: management/business models and science/mathematical models. Whereas the management/business models explored actions to achieve sustainable growth in the industrial enterprise, the science/mathematical models attempted to explain sustainable behaviors and properties, aimed predominantly at ecosystem management. As such

  6. EUDAT and EPOS moving towards the efficient management of scientific data sets

    Science.gov (United States)

    Fiameni, Giuseppe; Bailo, Daniele; Cacciari, Claudio

    2016-04-01

    This abstract presents the collaboration between the European Collaborative Data Infrastructure (EUDAT) and the pan-European infrastructure for solid Earth science (EPOS) which draws on the management of scientific data sets through a reciprocal support agreement. EUDAT is a Consortium of European Data Centers and Scientific Communities whose focus is the development and realisation of the Collaborative Data Infrastructure (CDI), a common model for managing data spanning all European research data centres and data repositories and providing an interoperable layer of common data services. The EUDAT Service Suite is a set of a) implementations of the CDI model and b) standards, developed and offered by members of the EUDAT Consortium. These EUDAT Services include a baseline of CDI-compliant interface and API services - a "CDI Gateway" - plus a number of web-based GUIs and command-line client tools. On the other hand, the EPOS initiative aims at creating a pan-European infrastructure for solid Earth science to support a safe and sustainable society. In accordance with this scientific vision, the mission of EPOS is to integrate the diverse and advanced European Research Infrastructures for solid Earth Science relying on new e-science opportunities to monitor and unravel the dynamic and complex Earth System. EPOS will enable innovative multidisciplinary research for a better understanding of the Earth's physical and chemical processes that control earthquakes, volcanic eruptions, ground instability and tsunami as well as the processes driving tectonics and Earth's surface dynamics. Through the integration of data, models and facilities EPOS will allow the Earth Science community to make a step change in developing new concepts and tools for key answers to scientific and socio-economic questions concerning geo-hazards and geo-resources as well as Earth sciences applications to the environment and to human welfare. To achieve this integration challenge and the

  7. An Ethically Ambitious Higher Education Data Science

    Science.gov (United States)

    Stevens, Mitchell L.

    2014-01-01

    The new data sciences of education bring substantial legal, political, and ethical questions about the management of information about learners. This piece provides a synoptic view of recent scholarly discussion in this domain and calls for a proactive approach to the ethics of learning research.

  8. Science Communication Through Art: Objectives, Challenges, and Outcomes.

    Science.gov (United States)

    Lesen, Amy E; Rogan, Ama; Blum, Michael J

    2016-09-01

    The arts are becoming a favored medium for conveying science to the public. Tracking trending approaches, such as community-engaged learning, alongside challenges and goals can help establish metrics to achieve more impactful outcomes, and to determine the effectiveness of arts-based science communication for raising awareness or shaping public policy. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Data Preservation, Information Preservation, and Lifecyle of Information Management at NASA GES DISC

    Science.gov (United States)

    Khayat, Mo; Kempler, Steve; Deshong, Barbara; Johnson, James; Gerasimov, Irina; Esfandiari, Ed; Berganski, Michael; Wei, Jennifer

    2014-01-01

    Data lifecycle management awareness is common today; planners are more likely to consider lifecycle issues at mission start. NASA remote sensing missions are typically subject to life cycle management plans of the Distributed Active Archive Center (DAAC), and NASA invests in these national centers for the long-term safeguarding and benefit of future generations. As stewards of older missions, it is incumbent upon us to ensure that a comprehensive enough set of information is being preserved to reduce the risk of information loss. This risk is greater when the original data experts have moved on or are no longer available. Items such as documentation of processing algorithms, pre-flight calibration data, and input-output configuration parameters used in product generation are examples of digital artifacts that are sometimes not fully preserved. This is the grey area of information preservation; the importance of these items is not always clear and requires careful consideration. Missing important metadata about intermediate steps used to derive a product could lead to serious challenges in the reproducibility of results or conclusions. Organizations are rapidly recognizing that the focus of life-cycle preservation needs to be enlarged from the strict raw data to the more encompassing arena of information lifecycle management. By understanding what constitutes information, and the complexities involved, we are better equipped to deliver longer lasting value about the original data and derived knowledge (information) from them. The NASA Earth Science Data Preservation Content Specification is an attempt to define the content necessary for long-term preservation. It requires a new lifecycle infrastructure approach, along with content repositories to accommodate artifacts other than just raw data. The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) set up an open-source Preservation System capable of long-term archive of

  10. Scientific and technical challenges of radioactive waste management

    International Nuclear Information System (INIS)

    Vira, J.

    1996-01-01

    In spite of considerable spending on research and technical development, the management of nuclear wastes continues to be a difficult issue in public decision making. The nuclear industry says that it has safe solutions for the ultimate disposal of nuclear wastes, but the message has not really got through to the public at large. Although communications problems reflect the general stigmatization of nuclear power, there are obvious issues in safety and performance assessment of nuclear waste disposal which evade scientific resolution. Any scientist concerned for his personal credibility must respect the rules and limits of scientific practice, but the intriguing question is whether he would not do better to address the layman's worries about radioactive substances. The discussion in this paper points out the intricacies of the distinction between scientific proof and judgement, with emphasis on safety assessment for nuclear waste disposal. Who are the final arbitrators? In a democratic society it is probably those who vote. Building confidence in expert judgements is a challenge for waste managers and scientists. The media may create their own 'experts', whose only necessary credential is the trust of their audience, but scientific judgements must stand the test of time. 'Confidence building' is currently a key word on the whole nuclear waste management scene, and confidence in science and scientists is certainly needed for any progress towards practical implementation of plans. The means for building confidence in the decision-making process are probably different from those applied for science and scientists. (author)

  11. A Review of Forensic Science Management Literature.

    Science.gov (United States)

    Houck, M M; McAndrew, W P; Porter, M; Davies, B

    2015-01-01

    The science in forensic science has received increased scrutiny in recent years, but interest in how forensic science is managed is a relatively new line of research. This paper summarizes the literature in forensic science management generally from 2009 to 2013, with some recent additions, to provide an overview of the growth of topics, results, and improvements in the management of forensic services in the public and private sectors. This review covers only the last three years or so and a version of this paper was originally produced for the 2013 Interpol Forensic Science Managers Symposium and is available at interpol.int. Copyright © 2015 Central Police University.

  12. Sensor Web Technology Challenges and Advancements for the Earth Science Decadal Survey Era

    Science.gov (United States)

    Norton, Charles D.; Moe, Karen

    2011-01-01

    This paper examines the Earth science decadal survey era and the role ESTO developed sensor web technologies can contribute to the scientific observations. This includes hardware and software technology advances for in-situ and in-space measurements. Also discussed are emerging areas of importance such as the potential of small satellites for sensor web based observations as well as advances in data fusion critical to the science and societal benefits of future missions, and the challenges ahead.

  13. CUAHSI Data Services: Tools and Cyberinfrastructure for Water Data Discovery, Research and Collaboration

    Science.gov (United States)

    Seul, M.; Brazil, L.; Castronova, A. M.

    2017-12-01

    Enabling research surrounding interdisciplinary topics often requires a combination of finding, managing, and analyzing large data sets and models from multiple sources. This challenge has led the National Science Foundation to make strategic investments in developing community data tools and cyberinfrastructure that focus on water data, as it is a central need for many of these research topics. CUAHSI (The Consortium of Universities for the Advancement of Hydrologic Science, Inc.) is a non-profit organization funded by the National Science Foundation to aid students, researchers, and educators in using and managing data and models to support research and education in the water sciences. This presentation will focus on open-source CUAHSI-supported tools that enable enhanced data discovery online using advanced searching capabilities and computational analysis run in virtual environments pre-designed for educators and scientists so they can focus their efforts on data analysis rather than IT set-up.

  14. Virtualized cloud data center networks issues in resource management

    CERN Document Server

    Tsai, Linjiun

    2016-01-01

    This book discusses the characteristics of virtualized cloud networking, identifies the requirements of cloud network management, and illustrates the challenges in deploying virtual clusters in multi-tenant cloud data centers. The book also introduces network partitioning techniques to provide contention-free allocation, topology-invariant reallocation, and highly efficient resource utilization, based on the Fat-tree network structure. Managing cloud data center resources without considering resource contentions among different cloud services and dynamic resource demands adversely affects the performance of cloud services and reduces the resource utilization of cloud data centers. These challenges are mainly due to strict cluster topology requirements, resource contentions between uncooperative cloud services, and spatial/temporal data center resource fragmentation. Cloud data center network resource allocation/reallocation that copes well with such challenges will allow cloud services to be provisioned with ...

  15. Functional Land Management: Bridging the Think-Do-Gap using a multi-stakeholder science policy interface.

    Science.gov (United States)

    O'Sullivan, Lilian; Wall, David; Creamer, Rachel; Bampa, Francesca; Schulte, Rogier P O

    2018-03-01

    Functional Land Management (FLM) is proposed as an integrator for sustainability policies and assesses the functional capacity of the soil and land to deliver primary productivity, water purification and regulation, carbon cycling and storage, habitat for biodiversity and recycling of nutrients. This paper presents the catchment challenge as a method to bridge the gap between science, stakeholders and policy for the effective management of soils to deliver these functions. Two challenges were completed by a wide range of stakeholders focused around a physical catchment model: (1) design an optimised catchment based on soil function targets; (2) identify gaps to implementation of the proposed design. In challenge 1, a high level of consensus emerged between different stakeholders on the soil and management measures to be implemented to achieve soil function targets. In challenge 2, key gaps were identified, including knowledge gaps and the need for a mix of market and voluntary incentives and mandatory measures.

  16. From Mars to Media: The Phoenix Mars Mission and the Challenges of Real-Time, Multimedia Science Communication and Public Education

    Science.gov (United States)

    Buxner, S.; Bitter, C.

    2008-12-01

    Although the Mars Exploration Rovers, Mars Reconnaissance Orbiter, and Mars Odyssey Missions set the standard for science communication and public education about Mars, the Phoenix Mission was presented with robust new communication challenges and opportunities. The new frontier includes Web 2.0, international forums, internal and external blogs, social networking sites, as well as the traditional media and education outlets for communicating science and information. We will explore the highlights and difficulties of managing the 'message from Mars' in our current multimedia saturated world while balancing authentic science discoveries, public expectations, and communication demands. Our goal is to create a more science savvy public and a more communication oriented science community for the future. The key issues are helping the public and our scientists distinguish between information and knowledge and managing the content that connects the two.

  17. Architectures Toward Reusable Science Data Systems

    Science.gov (United States)

    Moses, John

    2015-01-01

    Science Data Systems (SDS) comprise an important class of data processing systems that support product generation from remote sensors and in-situ observations. These systems enable research into new science data products, replication of experiments and verification of results. NASA has been building systems for satellite data processing since the first Earth observing satellites launched and is continuing development of systems to support NASA science research and NOAA's Earth observing satellite operations. The basic data processing workflows and scenarios continue to be valid for remote sensor observations research as well as for the complex multi-instrument operational satellite data systems being built today. System functions such as ingest, product generation and distribution need to be configured and performed in a consistent and repeatable way with an emphasis on scalability. This paper will examine the key architectural elements of several NASA satellite data processing systems currently in operation and under development that make them suitable for scaling and reuse. Examples of architectural elements that have become attractive include virtual machine environments, standard data product formats, metadata content and file naming, workflow and job management frameworks, data acquisition, search, and distribution protocols. By highlighting key elements and implementation experience we expect to find architectures that will outlast their original application and be readily adaptable for new applications. Concepts and principles are explored that lead to sound guidance for SDS developers and strategists.

  18. The backstage work of data sharing

    Energy Technology Data Exchange (ETDEWEB)

    Kervin, Karina E. [Univ. of Michigan, Ann Arbor, MI (United States); Cook, Robert B. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Michener, William K. [Univ. of New Mexico, Albuquerque, NM (United States)

    2014-11-09

    Conventional wisdom suggests that there are benefits to the creation of shared repositories of scientific data. Funding agencies require that the data from sponsored projects be shared publicly, but individual researchers often see little personal benefit to offset the work of creating easily sharable data. These conflicting forces have led to the emergence of a new role to support researchers: data managers. This paper identifies key differences between the socio-technical context of data managers and other "human infrastructure" roles articulated previously in Computer Supported Cooperative Work (CSCW) literature and summarizes the challenges that data managers face when accepting data for archival and reuse. Finally, while data managers' work is critical for advancing science and science policy, their work is often invisible and under-appreciated since it takes place behind the scenes.

  19. Big data and data science

    OpenAIRE

    Cavique, Luís

    2014-01-01

    This article presented the basic concepts of Big Data and the new field to which it gave rise, Data Science. Within Data Science, the notion of reducing the dimensionality of data was discussed and illustrated with examples.

  20. Challenges to the Indicators on Science, Technology and Innovation Development

    OpenAIRE

    Chobanova, Rossitsa

    2006-01-01

    The paper attempts to define the challenges to indicators on science, technology and innovation development that result from the contemporary dynamics of the global knowledge-based economy and from the challenge of identifying specific national priority dimensions for the public funding of research and innovation projects, in the case of Bulgaria. It is argued that the most widespread recent methodologies for positioning science, technology and innovation indicators do not ...

  1. Challenges encountered by critical care unit managers in the large intensive care units.

    Science.gov (United States)

    Matlakala, Mokgadi C; Bezuidenhout, Martie C; Botha, Annali D H

    2014-04-04

    Nurses in intensive care units (ICUs) are exposed regularly to huge demands in terms of fulfilling the many roles that are placed upon them. Unit managers, in particular, are responsible for the efficient management of the units and have the responsibilities of planning, organising, leading and controlling the daily activities in order to facilitate the achievement of the unit objectives. The objective of this study was to explore and present the challenges encountered by ICU managers in the management of large ICUs. A qualitative, exploratory and descriptive study was conducted at five hospital ICUs in Gauteng province, South Africa. Data were collected through individual interviews with purposively selected critical care unit managers, then analysed using thematic coding. Five themes emerged from the data: challenges related to the layout and structure of the unit, human resources provision and staffing, provision of material resources, stressors in the unit and visitors in the ICU. Unit managers in large ICUs face multifaceted challenges which include the demand for efficient and sufficient specialised nurses; lack of or inadequate equipment that goes along with technology in ICU and supplies; and stressors in the ICU that limit the efficiency to plan, organise, lead and control the daily activities in the unit. The challenges identified call for multiple strategies to assist in the efficient management of large ICUs.

  2. Challenges encountered by critical care unit managers in the large intensive care units

    Directory of Open Access Journals (Sweden)

    Mokgadi C. Matlakala

    2014-04-01

    Full Text Available Background: Nurses in intensive care units (ICUs are exposed regularly to huge demands in terms of fulfilling the many roles that are placed upon them. Unit managers, in particular, are responsible for the efficient management of the units and have the responsibilities of planning, organising, leading and controlling the daily activities in order to facilitate the achievement of the unit objectives. Objectives: The objective of this study was to explore and present the challenges encountered by ICU managers in the management of large ICUs. Method: A qualitative, exploratory and descriptive study was conducted at five hospital ICUs in Gauteng province, South Africa. Data were collected through individual interviews with purposively selected critical care unit managers, then analysed using thematic coding. Results: Five themes emerged from the data: challenges related to the layout and structure of the unit, human resources provision and staffing, provision of material resources, stressors in the unit and visitors in the ICU. Conclusion: Unit managers in large ICUs face multifaceted challenges which include the demand for efficient and sufficient specialised nurses; lack of or inadequate equipment that goes along with technology in ICU and supplies; and stressors in the ICU that limit the efficiency to plan, organise, lead and control the daily activities in the unit. The challenges identified call for multiple strategies to assist in the efficient management of large ICUs.

  3. Globus Platform-as-a-Service for Collaborative Science Applications.

    Science.gov (United States)

    Ananthakrishnan, Rachana; Chard, Kyle; Foster, Ian; Tuecke, Steven

    2015-02-01

    Globus, developed as Software-as-a-Service (SaaS) for research data management, also provides APIs that constitute a flexible and powerful Platform-as-a-Service (PaaS) to which developers can outsource data management activities such as transfer and sharing, as well as identity, profile and group management. By providing these frequently important but always challenging capabilities as a service, accessible over the network, Globus PaaS streamlines web application development and makes it easy for individuals, teams, and institutions to create collaborative applications such as science gateways for science communities. We introduce the capabilities of this platform and review representative applications.

  4. Management of science policy, sociology of science policy and economics of science policy

    CERN Document Server

    Ruivo, Beatriz

    2017-01-01

    'Management of science policy, sociology of science policy and economics of science policy' is a theoretical essay on the scientific foundation of science policy (formulation, implementation, instruments and procedures). It can also be used as a textbook.

  5. Dynamic and adaptive data-management in ATLAS

    CERN Document Server

    Lassnig, M; Branco, M; Molfetas, A

    2010-01-01

    Distributed data-management on the grid is subject to huge uncertainties, yet static policies govern its usage. Owing to the unpredictability of user behaviour and the high latency and heterogeneous nature of the environment, distributed data-management on the grid is challenging. In this paper we present the first steps towards a future dynamic data-management system that adapts to the changing conditions and environment. Such a system would reduce the number of manual interventions and remove unnecessary software layers, thereby providing a higher quality of service to the collaboration.

  6. Clinical Parameters and Challenges of Managing Cervicofacial ...

    African Journals Online (AJOL)

    Introduction: Necrotizing fasciitis is a severe soft tissue infection. In our environment, patients presenting with this infection are usually financially incapacitated and, therefore, their management can be challenging. This paper aimed to document the pattern and challenges encountered in the management of cervicofacial ...

  7. Targeted learning in data science causal inference for complex longitudinal studies

    CERN Document Server

    van der Laan, Mark J

    2018-01-01

    This textbook for graduate students in statistics, data science, and public health deals with the practical challenges that come with big, complex, and dynamic data. It presents a scientific roadmap to translate real-world data science applications into formal statistical estimation problems by using the general template of targeted maximum likelihood estimators. These targeted machine learning algorithms estimate quantities of interest while still providing valid inference. Targeted learning methods within data science are a critical component for solving scientific problems in the modern age. The techniques can answer complex questions including optimal rules for assigning treatment based on longitudinal data with time-dependent confounding, as well as other estimands in dependent data structures, such as networks. Included in Targeted Learning in Data Science are demonstrations with software packages and real data sets that present a case that targeted learning is crucial for the next generatio...

  8. CITIESData: a smart city data management framework

    DEFF Research Database (Denmark)

    Liu, Xiufeng; Heller, Alfred; Nielsen, Per Sieverts

    2017-01-01

    and publishing challenging. In this paper, we propose a framework to streamline smart city data management, including data collection, cleansing, anonymization, and publishing. The paper classifies smart city data into sensitive, quasi-sensitive, and open/public levels and then suggests different strategies...
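    The sensitivity-level routing described in the abstract can be sketched as follows. This is a hypothetical illustration, not the CITIESData framework's actual API; the field names, levels, and coarsening rule are all assumptions.

    ```python
    # Hypothetical sketch of routing smart-city records by sensitivity level:
    # sensitive fields are dropped, quasi-identifiers are coarsened, and
    # open data passes through unchanged. All names are illustrative.

    SENSITIVE_FIELDS = {"meter_id", "household_address"}
    QUASI_SENSITIVE_FIELDS = {"postcode", "hourly_consumption"}

    def classify(record):
        """Return the highest sensitivity level among the record's fields."""
        fields = set(record)
        if fields & SENSITIVE_FIELDS:
            return "sensitive"
        if fields & QUASI_SENSITIVE_FIELDS:
            return "quasi-sensitive"
        return "open"

    def publish(record):
        """Apply the publishing strategy for the record's level."""
        level = classify(record)
        if level == "sensitive":
            # Drop directly identifying fields before any release.
            record = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
        if level in ("sensitive", "quasi-sensitive"):
            # Coarsen a quasi-identifier (here: truncate the postcode).
            if "postcode" in record:
                record["postcode"] = record["postcode"][:3] + "XX"
        return level, record

    level, cleaned = publish({"postcode": "2800K", "hourly_consumption": 1.2})
    ```

    A real pipeline would add proper anonymization (e.g. k-anonymity checks) at the quasi-sensitive level rather than simple truncation.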

  9. Technical Challenges in Developing Software to Collect Twitter Data

    Directory of Open Access Journals (Sweden)

    Daniel Chudnov

    2014-10-01

    Full Text Available Over the past two years, George Washington University Libraries developed Social Feed Manager (SFM, a Python and Django-based application for collecting social media data from Twitter. Expanding the project from a research prototype to a more widely useful application has presented a number of technical challenges, including changes in the Twitter API, supervision of simultaneous streaming processes, management, storage, and organization of collected data, meeting researcher needs for groups or sets of data, and improving documentation to facilitate other institutions’ installation and use of SFM. This article will describe how the Social Feed Manager project addressed these issues, use of supervisord to manage processes, and other technical decisions made in the course of this project through late summer 2014. This article is targeted towards librarians and archivists who are interested in building collections around web archives and social media data, and have a particular interest in the technical work involved in applying software to the problem of building a sustainable collection management program around these sources.
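    The supervisord arrangement mentioned in the abstract can be sketched as a configuration fragment like the one below. The program name, paths, and management command are assumptions for illustration, not SFM's actual configuration.

    ```ini
    ; Hypothetical supervisord entry for one Twitter streaming collector.
    ; supervisord keeps the long-running stream process alive and restarts
    ; it if the connection to the Twitter API drops.
    [program:sfm-stream-sample]
    command=/opt/sfm/env/bin/python manage.py filterstream --collection 1
    directory=/opt/sfm
    autostart=true
    autorestart=true          ; restart the collector on crash or disconnect
    stopwaitsecs=30           ; allow time to flush collected tweets on stop
    stdout_logfile=/var/log/sfm/stream-sample.log
    ```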

  10. A comprehensive data acquisition and management system for an ecosystem-scale peatland warming and elevated CO2 experiment

    Science.gov (United States)

    Krassovski, M. B.; Riggs, J. S.; Hook, L. A.; Nettles, W. R.; Hanson, P. J.; Boden, T. A.

    2015-11-01

    Ecosystem-scale manipulation experiments represent large science investments that require well-designed data acquisition and management systems to provide reliable, accurate information to project participants and third party users. The SPRUCE project (Spruce and Peatland Responses Under Climatic and Environmental Change, http://mnspruce.ornl.gov) is such an experiment funded by the Department of Energy's (DOE) Office of Science, Terrestrial Ecosystem Science (TES) Program. The SPRUCE experimental mission is to assess ecosystem-level biological responses of vulnerable, high carbon terrestrial ecosystems to a range of climate warming manipulations and an elevated CO2 atmosphere. SPRUCE provides a platform for testing mechanisms controlling the vulnerability of organisms, biogeochemical processes, and ecosystems to climatic change (e.g., thresholds for organism decline or mortality, limitations to regeneration, biogeochemical limitations to productivity, and the cycling and release of CO2 and CH4 to the atmosphere). The SPRUCE experiment will generate a wide range of continuous and discrete measurements. To successfully manage SPRUCE data collection, achieve SPRUCE science objectives, and support broader climate change research, the research staff has designed a flexible data system using proven network technologies and software components. The primary SPRUCE data system components are the following: 1. data acquisition and control system - set of hardware and software to retrieve biological and engineering data from sensors, collect sensor status information, and distribute feedback to control components; 2. data collection system - set of hardware and software to deliver data to a central depository for storage and further processing; 3. data management plan - set of plans, policies, and practices to control consistency, protect data integrity, and deliver data. This publication presents our approach to meeting the challenges of designing and constructing an

  11. The AmeriFlux data activity and data system: an evolving collection of data management techniques, tools, products and services

    Directory of Open Access Journals (Sweden)

    T. A. Boden

    2013-06-01

    Full Text Available The Carbon Dioxide Information Analysis Center (CDIAC at Oak Ridge National Laboratory (ORNL, USA has provided scientific data management support for the US Department of Energy and international climate change science since 1982. Among the many data archived and available from CDIAC are collections from long-term measurement projects. One current example is the AmeriFlux measurement network. AmeriFlux provides continuous measurements from forests, grasslands, wetlands, and croplands in North, Central, and South America and offers important insight about carbon cycling in terrestrial ecosystems. To successfully manage AmeriFlux data and support climate change research, CDIAC has designed flexible data systems using proven technologies and standards blended with new, evolving technologies and standards. The AmeriFlux data system, comprised primarily of a relational database, a PHP-based data interface and a FTP server, offers a broad suite of AmeriFlux data. The data interface allows users to query the AmeriFlux collection in a variety of ways and then subset, visualize and download the data. From the perspective of data stewardship, on the other hand, this system is designed for CDIAC to easily control database content, automate data movement, track data provenance, manage metadata content, and handle frequent additions and corrections. CDIAC and researchers in the flux community developed data submission guidelines to enhance the AmeriFlux data collection, enable automated data processing, and promote standardization across regional networks. Both continuous flux and meteorological data and irregular biological data collected at AmeriFlux sites are carefully scrutinized by CDIAC using established quality-control algorithms before the data are ingested into the AmeriFlux data system. Other tasks at CDIAC include reformatting and standardizing the diverse and heterogeneous datasets received from individual sites into a uniform and consistent
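    The kind of range-check screening the abstract mentions can be sketched in a few lines. The variable names, plausible ranges, and missing-value code below are assumptions for illustration, not CDIAC's actual quality-control algorithms.

    ```python
    # Illustrative range-check QC: each value is screened against plausible
    # physical bounds before ingest; out-of-range or missing values are
    # rejected. Limits and variable names are assumed, not CDIAC's.

    PLAUSIBLE_RANGES = {
        "TA": (-50.0, 50.0),     # air temperature, deg C
        "FC": (-50.0, 50.0),     # CO2 flux, umol m-2 s-1
        "RH": (0.0, 100.0),      # relative humidity, %
    }
    MISSING = -9999.0            # common flux-network missing-value code

    def qc_record(record):
        """Return (clean_record, flags); rejected values become None."""
        clean, flags = {}, {}
        for var, value in record.items():
            lo, hi = PLAUSIBLE_RANGES.get(var, (float("-inf"), float("inf")))
            if value == MISSING or not (lo <= value <= hi):
                clean[var], flags[var] = None, "rejected"
            else:
                clean[var], flags[var] = value, "ok"
        return clean, flags

    clean, flags = qc_record({"TA": 21.4, "RH": 128.0, "FC": -9999.0})
    ```

    Production QC would layer further tests (spike detection, cross-variable consistency) on top of simple range checks before data enter the archive.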

  12. Application of Bayesian Classification to Content-Based Data Management

    Science.gov (United States)

    Lynnes, Christopher; Berrick, S.; Gopalan, A.; Hua, X.; Shen, S.; Smith, P.; Yang, K-Y.; Wheeler, K.; Curry, C.

    2004-01-01

    The high volume of Earth Observing System data has proven to be challenging to manage for data centers and users alike. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), about 1 TB of new data are archived each day. Distribution to users is also about 1 TB/day. A substantial portion of this distribution is MODIS calibrated radiance data, which has a wide variety of uses. However, much of the data is not useful for a particular user's needs: for example, ocean color users typically need oceanic pixels that are free of cloud and sun-glint. The GES DAAC is using a simple Bayesian classification scheme to rapidly classify each pixel in the scene in order to support several experimental content-based data services for near-real-time MODIS calibrated radiance products (from Direct Readout stations). Content-based subsetting would allow distribution of, say, only clear pixels to the user if desired. Content-based subscriptions would distribute data to users only when they fit the user's usability criteria in their area of interest within the scene. Content-based cache management would retain more useful data on disk for easy online access. The classification may even be exploited in an automated quality assessment of the geolocation product. Though initially to be demonstrated at the GES DAAC, these techniques have applicability in other resource-limited environments, such as spaceborne data systems.
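    A simple Bayesian pixel classifier of the kind the abstract describes can be sketched with Gaussian per-band likelihoods. The class statistics, priors, and band values below are invented for illustration; a real system would train them from labeled MODIS radiances.

    ```python
    # Minimal Gaussian naive-Bayes pixel classifier in the spirit of the
    # content-based subsetting described above. Statistics are made up.
    import math

    # Per-class (mean, stddev) of reflectance in two hypothetical bands.
    CLASS_STATS = {
        "clear_ocean": [(0.02, 0.01), (0.05, 0.02)],
        "cloud":       [(0.60, 0.15), (0.55, 0.15)],
    }
    PRIORS = {"clear_ocean": 0.7, "cloud": 0.3}

    def log_gauss(x, mu, sigma):
        """Log of the Gaussian density N(mu, sigma) at x."""
        return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

    def classify_pixel(bands):
        """Pick the class maximizing log prior + sum of per-band log likelihoods."""
        def score(cls):
            return math.log(PRIORS[cls]) + sum(
                log_gauss(x, mu, s) for x, (mu, s) in zip(bands, CLASS_STATS[cls]))
        return max(CLASS_STATS, key=score)

    label = classify_pixel([0.03, 0.06])   # low reflectance in both bands
    ```

    With per-pixel labels in hand, content-based subsetting reduces to filtering the scene on the predicted class (e.g. keeping only "clear_ocean" pixels for ocean-color users).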

  13. Science Data Management for the E-ELT: usecase MICADO

    NARCIS (Netherlands)

    Verdoes Kleijn, Gijs

    2015-01-01

    The E-ELT first-light instrument MICADO will explore new parameter space in terms of precision astrometry, photometry, and spectroscopy. This poses challenges for data handling and reduction to ensure MICADO takes the observational capabilities of the AO-assisted E-ELT to their limits. Our

  14. From big data to deep insight in developmental science.

    Science.gov (United States)

    Gilmore, Rick O

    2016-01-01

    The use of the term 'big data' has grown substantially over the past several decades and is now widespread. In this review, I ask what makes data 'big' and what implications the size, density, or complexity of datasets have for the science of human development. A survey of existing datasets illustrates how existing large, complex, multilevel, and multimeasure data can reveal the complexities of developmental processes. At the same time, significant technical, policy, ethics, transparency, cultural, and conceptual issues associated with the use of big data must be addressed. Most big developmental science data are currently hard to find and cumbersome to access, the field lacks a culture of data sharing, and there is no consensus about who owns or should control research data. But, these barriers are dissolving. Developmental researchers are finding new ways to collect, manage, store, share, and enable others to reuse data. This promises a future in which big data can lead to deeper insights about some of the most profound questions in behavioral science. © 2016 The Authors. WIREs Cognitive Science published by Wiley Periodicals, Inc.

  15. Prospects for the Development of Administration and Challenges for the Management of Organizations

    Directory of Open Access Journals (Sweden)

    Guido Angello Castro Ríos

    2011-06-01

    Full Text Available Administration, as an incipient science, presents a series of challenges to the organization’s director and to academia itself. Adaptation, strategy, communication skills, and even the ability to radiate a sense of accomplishment throughout the organization are approaches that will shape the development of the organization, and of management as a body of knowledge that should facilitate the sustainability of the organization as a complex system.

  16. Knowledge Representation and Management: a Linked Data Perspective.

    Science.gov (United States)

    Barros, M; Couto, F M

    2016-11-10

    Biomedical research is increasingly becoming a data-intensive science in several areas, where prodigious amounts of data are being generated that must be stored, integrated, shared, and analyzed. In an effort to improve the accessibility of data and knowledge, the Linked Data initiative proposed a well-defined set of recommendations for exposing, sharing, and integrating data, information, and knowledge using semantic web technologies. The main goal of this paper is to identify the current status and future trends of knowledge representation and management in the Life and Health Sciences, mostly with regard to linked data technologies. We selected three prominent linked data studies, namely Bio2RDF, Open PHACTS, and the EBI RDF platform, and 14 studies published in or after 2014 that cited any of the three. We manually analyzed these 14 papers with respect to how they use linked data techniques. The analyses show a growing tendency to use linked data techniques in the Life and Health Sciences, and even if some studies do not follow all of the recommendations, many of them already represent and manage their knowledge using RDF and biomedical ontologies. RDF and biomedical ontologies are having a strong impact on how knowledge is generated from biomedical data, by making data elements increasingly connected and by providing a better description of their semantics. As health institutes become more data centric, we believe that the adoption of linked data techniques will continue to grow and be an effective solution to knowledge representation and management.
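
    The linked data model the paper surveys reduces to subject-predicate-object statements. A minimal, library-free sketch of the idea follows; the entities, predicates, and the two-hop query are invented for illustration, and a real system would use an RDF store queried with SPARQL rather than Python tuples.

    ```python
    # A minimal in-memory triple store illustrating the RDF idea of
    # subject-predicate-object statements (entities below are hypothetical).
    triples = {
        ("bio:GeneA", "bio:associatedWith", "bio:Disease1"),
        ("bio:GeneA", "rdfs:label", "Hypothetical gene A"),
        ("bio:Drug1", "bio:targets", "bio:GeneA"),
    }

    def match(s=None, p=None, o=None):
        """Pattern query: None acts as a wildcard, like a SPARQL variable."""
        return [(ts, tp, to) for (ts, tp, to) in triples
                if s in (None, ts) and p in (None, tp) and o in (None, to)]

    # Which drugs target genes associated with Disease1? (a two-hop join)
    genes = {s for (s, _, _) in match(p="bio:associatedWith", o="bio:Disease1")}
    drugs = {s for g in genes for (s, _, _) in match(p="bio:targets", o=g)}
    ```

    The "increasingly connected data elements" the paper describes are exactly such joins: once statements share identifiers, multi-hop questions become simple graph traversals.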

  17. The Management of Managers: Challenges in a Small Economy

    Science.gov (United States)

    Gilbert, John; Boxall, Peter

    2009-01-01

    Purpose: The purpose of this paper is to discuss the findings of a study of the management of senior managers. The aim is to describe the ways in which firms in a small economy, such as New Zealand, manage their managers and analyse how they deal with the strategic challenges that are involved. Design/methodology/approach: The study applies the…

  18. A new approach to environmental education: environment-challenge for science, technology and society

    International Nuclear Information System (INIS)

    Popovic, D.

    2002-01-01

    The paper presents a new approach to environmental education within the project Environment: Challenge for Science, Technology and Education, realized on the Alternative Academic Education Network (AAEN) in Belgrade. The project is designed for graduate or advanced undergraduate students of science, medicine, engineering, biotechnology, political science, and law. It is a multidisciplinary and interdisciplinary project aimed at supporting students' interest in different areas of the environmental sciences through a strong interconnection between modern scientific ideas, technological achievements, and society. The project contains four basic courses (Living in the Environment; Physical and Chemical Processes in the Environment; Industrial Ecology and Sustainable Development; Environmental Philosophy and Ethics) and a number of elective courses dealing with environmental biology, adaptation processes, global eco-politics, environmental ethics, scientific and public policy, environmental consequences of warfare, environmental pollution control, energy management, environmental impact assessment, etc. The standard ex cathedra teaching is replaced with an active student-teacher communication method enabling students to participate actively in the subject through seminars, workshops, short essays, and individual research projects

  19. Support for global science: Remote sensing's challenge

    Science.gov (United States)

    Estes, J. E.; Star, J. L.

    1986-01-01

    Remote sensing uses a wide variety of techniques and methods. Resulting data are analyzed by man and machine, using both analog and digital technology. The newest and most important initiatives in the U.S. civilian space program currently revolve around the space station complex, which includes the core station as well as co-orbiting and polar satellite platforms. This proposed suite of platforms and support systems offers a unique potential for facilitating long-term, multidisciplinary scientific investigations on a truly global scale. Unlike previous generations of satellites, designed for relatively limited constituencies, the space station offers the potential to provide an integrated source of information which recognizes the scientific interest in investigating the dynamic coupling between the oceans, land surface, and atmosphere. Earth scientists already face problems that are truly global in extent. Problems such as the global carbon balance, regional deforestation, and desertification require new approaches that combine multidisciplinary, multinational research teams employing advanced technologies to produce a type, quantity, and quality of data not previously available. The challenge before the international scientific community is to continue to develop both the infrastructure and expertise to, on the one hand, advance the science and technology of remote sensing, and, on the other, develop an integrated understanding of global life support systems and work toward a quantitative science of the biosphere.

  20. Trends and frontiers for the science and management of the oceans.

    Science.gov (United States)

    Mumby, Peter J

    2017-06-05

    People have an enduring fascination with the biology of the oceans. When the BBC's 'Blue Planet' series first aired on British television, almost a quarter of the nation tuned in. As the diversity of science in this special issue of Current Biology attests, the ocean presents a challenging environment for study while also exhibiting some of the most profound and disruptive symptoms of global change. Marine science has made major advances in the past few decades, primarily made possible through important technological innovations. This progress notwithstanding, there are persistent challenges in achieving an understanding of marine processes at appropriate scales and delivering meaningful insights to guide ocean policy and management. Naturally, the examples chosen below betray my ecological leanings, but I hope that many of the issues raised resonate with readers in many different disciplines. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Internet of Things, Challenges for Demand Side Management

    Directory of Open Access Journals (Sweden)

    Simona-Vasilica OPREA

    2017-01-01

    Full Text Available The adoption of any new product brings new issues and challenges, and this is especially true of mass adoption. The advent of Internet of Things (IoT) devices will be, in the opinion of this paper's authors, the largest and fastest product adoption yet seen: several early sources predicted a volume of 50 billion active IoT devices by 2020 [1][2]. While later forecasts reduced the predicted amount to about 20-30 billion devices [3], even at such a “reduced” number, demand side management issues are foreseeable, for the potential economic impact of IoT applications in 2025 is estimated at between $3.9 trillion and $11.1 trillion [4]. Not only will new patterns emerge in energy consumption and Internet traffic, but we predict that the sheer amount of data produced by this quantity of IoT devices will give birth to a new sort of demand side management: the demand side management of IoT data. How this will work is yet to be seen but, at the current moment, one can at least identify the bits and pieces that will constitute it. This paper is intended to serve as a short guide to the possible challenges raised by the adoption of IoT devices. The data types and structures, lifecycle, and patterns are briefly discussed throughout the following article.

  2. Science Planning and Orbit Classification for Solar Probe Plus

    Science.gov (United States)

    Kusterer, M. B.; Fox, N. J.; Rodgers, D. J.; Turner, F. S.

    2016-12-01

    There are a number of challenges for the Science Planning Team (SPT) of the Solar Probe Plus (SPP) mission. Since SPP uses a decoupled payload operations approach, tight coordination between the mission operations and payload teams will be required. The payload teams must manage the volume of data that they write to the spacecraft solid-state recorders (SSR) for their individual instruments for downlink to the ground. Making this process more difficult, the geometry of the celestial bodies and the spacecraft during some of the SPP mission orbits allows only limited uplink and downlink opportunities. The payload teams will also be required to coordinate power-on opportunities, command uplink opportunities, and data transfers from instrument memory to the spacecraft SSR with the operations team. The SPT also intends to coordinate observations with other spacecraft and ground-based systems. To meet these challenges, detailed orbit activity planning is required in advance for each orbit. An orbit planning process is being created to facilitate the coordination of spacecraft and payload activities for each orbit. An interactive Science Planning Tool is being designed to integrate the payload data volume and priority allocations, spacecraft ephemeris, attitude, downlink and uplink schedules, and spacecraft and payload activities. It will be used during science planning to select the instrument data priorities and data volumes that satisfy the orbit data volume constraints and the power-on, command uplink, and data transfer time periods. To aid the initial stages of science planning we have created an orbit classification scheme based on downlink availability and significant science events. Different types of challenges arise in the management of science data, driven by orbital geometry and operational constraints, and this scheme attempts to identify the patterns that emerge.
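
    The data-volume selection problem described above, choosing which instrument data to downlink within an orbit's budget, can be sketched as a simple priority-ordered packing. The SPP instrument names are real, but the priorities, volumes, and budget below are invented, and the actual Science Planning Tool is far more elaborate than this greedy pass:

    ```python
    def plan_downlink(requests, budget_gbit):
        """Greedy selection of instrument data products for one orbit:
        take items in descending science priority until the per-orbit
        downlink volume budget on the SSR is exhausted."""
        selected, used = [], 0.0
        for name, priority, volume in sorted(requests, key=lambda r: -r[1]):
            if used + volume <= budget_gbit:
                selected.append(name)
                used += volume
        return selected, used

    requests = [  # (product, priority, volume in Gbit) -- illustrative values only
        ("FIELDS_burst", 9, 40.0),
        ("SWEAP_survey", 7, 25.0),
        ("WISPR_images", 8, 50.0),
        ("ISOIS_rates",  5, 10.0),
    ]
    plan, used = plan_downlink(requests, budget_gbit=100.0)
    ```

    With a 100 Gbit budget, the 25 Gbit item is skipped once it no longer fits, while the smaller low-priority item still rides along; a real planner would also fold in power-on windows and uplink timing constraints.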

  3. Machine learning and data science in soft materials engineering

    Science.gov (United States)

    Ferguson, Andrew L.

    2018-01-01

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by ‘de-jargonizing’ data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
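
    Of the tools surveyed, principal component analysis is perhaps the most widely used. A self-contained two-dimensional sketch of the idea, centering the data and extracting the dominant eigenvector of the covariance matrix in closed form (an illustration of the method, not any particular library's implementation):

    ```python
    import math

    def pca_2d(points):
        """First principal component of 2-D data, done by hand:
        center the data, form the 2x2 covariance matrix, and take the
        eigenvector of its largest eigenvalue as the principal axis."""
        n = len(points)
        mx = sum(x for x, _ in points) / n
        my = sum(y for _, y in points) / n
        cxx = sum((x - mx) ** 2 for x, _ in points) / n
        cyy = sum((y - my) ** 2 for _, y in points) / n
        cxy = sum((x - mx) * (y - my) for x, y in points) / n
        # Eigenvalues of [[cxx, cxy], [cxy, cyy]] in closed form.
        tr, det = cxx + cyy, cxx * cyy - cxy ** 2
        lam = tr / 2 + math.sqrt(tr ** 2 / 4 - det)  # largest eigenvalue
        # Eigenvector for lam (handle the axis-aligned case cxy == 0).
        if abs(cxy) > 1e-12:
            vx, vy = lam - cyy, cxy
        else:
            vx, vy = (1.0, 0.0) if cxx >= cyy else (0.0, 1.0)
        norm = math.hypot(vx, vy)
        return (vx / norm, vy / norm), lam
    ```

    For points lying on the line y = 2x, the routine recovers the axis (1, 2)/√5, i.e. the direction of maximal variance; higher-dimensional PCA generalizes this by diagonalizing the full covariance matrix.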

  4. Machine learning and data science in soft materials engineering.

    Science.gov (United States)

    Ferguson, Andrew L

    2018-01-31

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  5. Ex Machina: Analytical platforms, Law and the Challenges of Computational Legal Science

    Directory of Open Access Journals (Sweden)

    Nicola Lettieri

    2018-04-01

    Full Text Available Over the years, computation has become a fundamental part of scientific practice in several research fields, reaching far beyond the boundaries of the natural sciences. Data mining, machine learning, simulations, and other computational methods lie today at the heart of the scientific endeavour in a growing number of social research areas, from anthropology to economics. In this scenario, an increasingly important role is played by analytical platforms: integrated environments allowing researchers to experiment with cutting-edge data-driven and computation-intensive analyses. The paper discusses the appearance of such tools in the emerging field of computational legal science. After a general introduction to the impact of computational methods on both the natural and social sciences, we describe the concept and features of an analytical platform exploring innovative cross-methodological approaches to the academic and investigative study of crime. Stemming from an ongoing project involving researchers from law, computer science, and bioinformatics, the initiative is presented and discussed as an opportunity to open a debate about the future of legal scholarship and, within it, about the challenges of computational legal science.

  6. Accreditation status of hospital pharmacies and their challenges of medication management: A case of south Iranian largest university.

    Science.gov (United States)

    Barati, Omid; Dorosti, Hesam; Talebzadeh, Alireza; Bastani, Peivand

    2016-01-01

    Considering the importance of accreditation for hospital pharmacies, this study aimed to determine the challenges of medication management in hospital pharmacies affiliated with Shiraz University of Medical Sciences, Iran. The study was a mixed-methods research project conducted in qualitative and quantitative phases during the years 2014-2015 in Shiraz, Iran. The National Accreditation Standard checklist for hospitals was used for data collection in the first phase, and the Delphi method was applied in three rounds to identify the main challenges of medication management and the related solutions. Results indicated a medium status of accreditation for all three dimensions in the above hospital pharmacies (3.53, 42.15 and 7, respectively). Lack of clinical pharmacists, nonparticipation of the pharmacy director in annual budgeting, lack of access to patient information, discontinuity of pharmaceutical care for discharged patients, defects in pharmacy staff training, lack of legislation in support of pharmacists, lack of adequate access to physicians' prescriptions, shortfalls in reporting medication errors, and lack of evidence related to microbial contamination are the main challenges extracted from the second phase. It seems that the studied hospital pharmacies encounter numerous problems regarding accreditation, pharmaceutical care, and appropriate medication management and supply chain. Attempts to solve these problems can play an important role in improving the efficiency and effectiveness of pharmacies in Iran.

  7. Accreditation status of hospital pharmacies and their challenges of medication management: A case of south Iranian largest university

    Directory of Open Access Journals (Sweden)

    Omid Barati

    2016-01-01

    Full Text Available Considering the importance of accreditation for hospital pharmacies, this study aimed to determine the challenges of medication management in hospital pharmacies affiliated with Shiraz University of Medical Sciences, Iran. The study was a mixed-methods research project conducted in qualitative and quantitative phases during the years 2014–2015 in Shiraz, Iran. The National Accreditation Standard checklist for hospitals was used for data collection in the first phase, and the Delphi method was applied in three rounds to identify the main challenges of medication management and the related solutions. Results indicated a medium status of accreditation for all three dimensions in the above hospital pharmacies (3.53, 42.15 and 7, respectively. Lack of clinical pharmacists, nonparticipation of the pharmacy director in annual budgeting, lack of access to patient information, discontinuity of pharmaceutical care for discharged patients, defects in pharmacy staff training, lack of legislation in support of pharmacists, lack of adequate access to physicians' prescriptions, shortfalls in reporting medication errors, and lack of evidence related to microbial contamination are the main challenges extracted from the second phase. It seems that the studied hospital pharmacies encounter numerous problems regarding accreditation, pharmaceutical care, and appropriate medication management and supply chain. Attempts to solve these problems can play an important role in improving the efficiency and effectiveness of pharmacies in Iran.

  8. Using Twitter for Demographic and Social Science Research: Tools for Data Collection and Processing.

    Science.gov (United States)

    McCormick, Tyler H; Lee, Hedwig; Cesare, Nina; Shojaie, Ali; Spiro, Emma S

    2017-08-01

    Despite recent and growing interest in using Twitter to examine human behavior and attitudes, there is still significant room for growth regarding the ability to leverage Twitter data for social science research. In particular, gleaning demographic information about Twitter users, a key component of much social science research, remains a challenge. This article develops an accurate and reliable data processing approach for social science researchers interested in using Twitter data to examine behaviors and attitudes, as well as the demographic characteristics of the populations expressing or engaging in them. Using information gathered from Twitter users who state an intention not to vote in the 2012 presidential election, we describe and evaluate a method for processing data to retrieve demographic information reported by users that is not encoded as text (e.g., details of images) and evaluate the reliability of these techniques. We end by assessing the challenges of this data collection strategy and discussing how large-scale social media data may benefit demographic researchers.
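
    The published pipeline also codes information that is not encoded as text, such as profile images. As a much simpler stand-in, the flavor of extracting self-reported attributes from profile text can be sketched with pattern matching; the patterns, attribute set, and example bios below are invented for illustration and are not the authors' method:

    ```python
    import re

    # Hypothetical patterns for self-reported attributes in profile bios;
    # the published pipeline is far more involved and also codes images.
    PATTERNS = {
        "age": re.compile(r"\b(\d{2})\s*(?:yo|y/o|years old)\b", re.I),
        "student": re.compile(r"\bstudent\b", re.I),
    }

    def tag_bio(bio):
        """Return a dict of attributes a user self-reports in a bio string."""
        tags = {}
        m = PATTERNS["age"].search(bio)
        if m:
            tags["age"] = int(m.group(1))
        if PATTERNS["student"].search(bio):
            tags["student"] = True
        return tags
    ```

    Text-only heuristics like this are exactly where reliability questions arise, which is why the article pairs automated processing with an evaluation of how trustworthy the recovered demographics are.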

  9. Frameworks Coordinate Scientific Data Management

    Science.gov (United States)

    2012-01-01

    Jet Propulsion Laboratory computer scientists developed a unique software framework to help NASA manage its massive amounts of science data. Through a partnership with the Apache Software Foundation of Forest Hill, Maryland, the technology is now available as an open-source solution and is in use by cancer researchers and pediatric hospitals.

  10. Managing scientific information and research data

    CERN Document Server

    Baykoucheva, Svetla

    2015-01-01

    Innovative technologies are changing the way research is performed, preserved, and communicated. Managing Scientific Information and Research Data explores how these technologies are used and provides detailed analysis of the approaches and tools developed to manage scientific information and data. Following an introduction, the book is then divided into 15 chapters discussing the changes in scientific communication; new models of publishing and peer review; ethics in scientific communication; preservation of data; discovery tools; discipline-specific practices of researchers for gathering and using scientific information; academic social networks; bibliographic management tools; information literacy and the information needs of students and researchers; the involvement of academic libraries in eScience and the new opportunities it presents to librarians; and interviews with experts in scientific information and publishing.

  11. Managing change: Case study: HAMK University of Applied Sciences, Valkeakoski

    OpenAIRE

    Chau Thi Tra, Mi

    2012-01-01

    In response to changes imposed by the Finnish government on the Universities of Applied Sciences system in the near future, HAMK has proactively adopted several programmes to prepare for future challenges and reinforce the organization’s competitiveness. However, organizational change has never been an easy, straightforward issue and how to manage change effectively has become an interest to the organization. The study aims at providing suggestions for a more successful change implementation...

  12. Applying science and mathematics to big data for smarter buildings.

    Science.gov (United States)

    Lee, Young M; An, Lianjun; Liu, Fei; Horesh, Raya; Chae, Young Tae; Zhang, Rui

    2013-08-01

    Many buildings are now collecting a large amount of data on operations, energy consumption, and activities through systems such as a building management system (BMS), sensors, and meters (e.g., submeters and smart meters). However, the majority of data are not utilized and are thrown away. Science and mathematics can play an important role in utilizing these big data and accurately assessing how energy is consumed in buildings and what can be done to save energy, make buildings energy efficient, and reduce greenhouse gas (GHG) emissions. This paper discusses an analytical tool that has been developed to assist building owners, facility managers, operators, and tenants of buildings in assessing, benchmarking, diagnosing, tracking, forecasting, and simulating energy consumption in building portfolios. © 2013 New York Academy of Sciences.
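
    One common building-analytics assessment of the kind described is a weather-normalized baseline: regress daily consumption on a weather driver, then benchmark, diagnose, or forecast against the fit. A minimal sketch with ordinary least squares follows; the meter readings and degree-day values are invented, and the paper's actual tool is considerably richer.

    ```python
    def fit_linear(xs, ys):
        """Ordinary least squares fit y = a + b*x, e.g. daily energy use
        (from submeters) vs. cooling degree-days (from weather data)."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
        a = my - b * mx
        return a, b

    # Hypothetical daily submeter readings (kWh) vs. cooling degree-days.
    cdd = [0, 2, 4, 6, 8, 10]
    kwh = [120, 131, 139, 151, 160, 169]
    a, b = fit_linear(cdd, kwh)
    forecast = a + b * 12  # expected consumption on a hotter, 12-CDD day
    ```

    Days whose actual consumption sits far above the fitted baseline are candidates for diagnosis (faulty equipment, schedule drift), which is the sense in which simple mathematics turns raw BMS data into actionable energy insight.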

  13. Data and Science: GES DISC Users' Data Usage and Science Exploration

    Science.gov (United States)

    Shie, C. L.; Greene, M.; Acker, J. G.; Lei, G. D.; Al-Jazrawi, A. F.; Meyer, D. J.

    2017-12-01

    Motivation: Recall what is arguably the most renowned anecdote in the history of science: the young Isaac Newton was struck on the head by a falling apple (the data!) as he sat in his garden, which inspired his brilliant insight and, eventually, his understanding and demonstration of gravitational force (the science!). This well-known "coupling" of "data" and "science" can be considered the trigger for this study (as well as its title). The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) has provided massive amounts of Earth science data, information, and services to diverse research communities and the general public for decades. How much the data products from different missions or projects have been used by diverse user communities, and how they have been used by our various user categories (such as research scientists, applications scientists, the general public, and students) for different science research and/or applications, are the primary focus of this study. We performed an integrated analysis of "data usage" vs. "science research/application" by investigating three different, yet related, groups of records: user Help Tickets (questions and feedback from users), user publications (information acquired especially via users' acknowledgments of Giovanni, our powerful in-house visualization tool, in their papers), and user metrics (collected information on data and service usage) in recent years (2013-2017). For example, precipitation, hydrology, and atmospheric chemistry have emerged as frequently applied science variables and/or science areas in the user tickets analyzed so far. With regard to Giovanni, a significant minority of the users are applications users (air quality, water quality, agriculture, natural disasters, etc.) in contrast to the majority of basic research users. More users employ Giovanni as an adjunct data

  14. LHCb Data Management: consistency, integrity and coherence of data

    CERN Document Server

    Bargiotti, Marianne

    2007-01-01

    The Large Hadron Collider (LHC) at CERN will start operating in 2007. The LHCb experiment is preparing for real data handling and analysis via a series of data challenges and production exercises. The aim of these activities is to demonstrate the readiness of the computing infrastructure based on WLCG (Worldwide LHC Computing Grid) technologies, to validate the computing model, and to provide useful samples of data for detector and physics studies. DIRAC (Distributed Infrastructure with Remote Agent Control) is the gateway to WLCG. The DIRAC Data Management System (DMS) relies on both WLCG Data Management services (LCG File Catalogues, Storage Resource Managers and the File Transfer Service) and LHCb-specific components (the Bookkeeping Metadata File Catalogue). Although the DIRAC DMS has been extensively used over the past years and has proved to achieve a high degree of maturity and reliability, the complexity of both the DMS and its interactions with numerous WLCG components as well as the instability of facilit...

  15. Ecoinformatics: supporting ecology as a data-intensive science

    OpenAIRE

    Michener, William H.; Jones, Matthew B.

    2012-01-01

    Ecology is evolving rapidly and increasingly changing into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming the data into information and knowledge. Here, we review the state-of-the...

  16. FUSION ENERGY SCIENCES WORKSHOP ON PLASMA MATERIALS INTERACTIONS: Report on Science Challenges and Research Opportunities in Plasma Materials Interactions

    Energy Technology Data Exchange (ETDEWEB)

    Maingi, Rajesh [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Zinkle, Steven J. [University of Tennessee – Knoxville; Foster, Mark S. [U.S. Department of Energy

    2015-05-01

    The realization of controlled thermonuclear fusion as an energy source would transform society, providing a nearly limitless energy source with renewable fuel. Under the auspices of the U.S. Department of Energy, the Fusion Energy Sciences (FES) program management recently launched a series of technical workshops to “seek community engagement and input for future program planning activities” in the targeted areas of (1) Integrated Simulation for Magnetic Fusion Energy Sciences, (2) Control of Transients, (3) Plasma Science Frontiers, and (4) Plasma-Materials Interactions, a.k.a. the Plasma-Materials Interface (PMI). Over the past decade, a number of strategic planning activities [1-6] have highlighted PMI and plasma facing components as a major knowledge gap, which should be a priority for fusion research towards ITER and future demonstration fusion energy systems. There is a strong international consensus that new PMI solutions are required in order for fusion to advance beyond ITER. The goal of the 2015 PMI community workshop was to review recent innovations and improvements in understanding the challenging PMI issues, identify high-priority scientific challenges in PMI, and to discuss potential options to address those challenges. The community response to the PMI research assessment was enthusiastic, with over 80 participants involved in the open workshop held at Princeton Plasma Physics Laboratory on May 4-7, 2015. The workshop provided a useful forum for the scientific community to review progress in scientific understanding achieved during the past decade, and to openly discuss high-priority unresolved research questions. One of the key outcomes of the workshop was a focused set of community-initiated Priority Research Directions (PRDs) for PMI. Five PRDs were identified, labeled A-E, which represent community consensus on the most urgent near-term PMI scientific issues. For each PRD, an assessment was made of the scientific challenges, as well as a set of actions

  17. Animal health surveillance applications: The interaction of science and management.

    Science.gov (United States)

    Willeberg, Preben

    2012-08-01

    Animal health surveillance is an ever-evolving activity, since health- and risk-related policy and management decisions need to be backed by the best available scientific evidence and methodology. International organizations, trade partners, politicians, media, and the public expect fast, understandable, up-to-date presentation and valid interpretation of animal disease data to support and document proper animal health management, in crises as well as in routine control applications. The delivery and application of surveillance information need to be further developed and optimized, and epidemiologists, risk managers, administrators, and policy makers need to work together to secure progress. Promising new developments in areas such as risk-based surveillance, spatial presentation and analysis, and genomic epidemiology will be mentioned. Limitations and areas in need of further progress will be underlined, such as the general lack of a wide and open exchange of international animal disease surveillance data. During my more than 30-year career as a professor of Veterinary Epidemiology I had the good fortune of working in challenging environments with eminent colleagues in different countries on a variety of animal health surveillance issues. My career change from professor to Chief Veterinary Officer (CVO), "from science to application", was driven by my desire to see for myself if and how well epidemiology would actually work to solve real-life problems, as I had been telling my students for years that it would. Fortunately, it worked for me! The job of a CVO is not that different from that of a professor of Veterinary Epidemiology; the underlying professional principles are the same. Every day I had to work from science and base decisions and discussions on documented evidence, although sometimes the evidence was incomplete or data were simply lacking. A basic understanding of surveillance methodology is very useful for a CVO, since it provides

  18. The Data Issue: Opportunities and Challenges for Scientific Publishers

    Science.gov (United States)

    Murphy, F.; Irving, D. H.

    2011-12-01

Using the recent report for the 'Opportunities in Data Exchange' Project produced by - and for - researchers, libraries/data centres and publishers (and which is based on a broad range of studies, questionnaires and evidence) we have defined current practices and expectations, and the gaps and dilemmas involved in producing data and datasets, and then analysed their relationship to formal publications. As a result, we identified potential opportunities to make scientific outputs more useful and reusable, with consequent implications for custodianship and long-term data management. We also defined a number of key incentives and barriers towards achieving these objectives. As a case study, the earth and environmental sciences have come under particularly close scrutiny with respect to data-ownership and -sharing arrangements, sometimes with damaging results to the discipline's reputation. These issues, along with considerable technological challenges, have to be handled effectively in order to best support all the users along the data chain. To that end, we show that key stakeholders - among them scientific publishers - need to have a clear idea of how to progress data-intensive derived information, which we demonstrate is often not the case. Towards bridging this knowledge gap, we have compiled a roadmap of next steps and key issues to be acknowledged and addressed by the scientific publishing community. These include: engaging directly with researchers, policy-makers, funding bodies and direct competitors to build innovative partnerships and enhance impact; providing technological and training investment; and developing, alongside the emerging discipline of the 'data scientist', that of the 'data publisher'. This individual/company will need to combine a close understanding of researchers' priorities with market, legal and technical opportunities and restrictions.

  19. Public Access to NASA's Earth Science Data

    Science.gov (United States)

    Behnke, J.; James, N.

    2013-12-01

    Many steps have been taken over the past 20 years to make NASA's Earth Science data more accessible to the public. The data collected by NASA represent a significant public investment in research. NASA holds these data in a public trust to promote comprehensive, long-term Earth science research. Consequently, NASA developed a free, open and non-discriminatory policy consistent with existing international policies to maximize access to data and to keep user costs as low as possible. These policies apply to all data archived, maintained, distributed or produced by NASA data systems. The Earth Observing System Data and Information System (EOSDIS) is a major core capability within NASA Earth Science Data System Program. EOSDIS is designed to ingest, process, archive, and distribute data from approximately 90 instruments. Today over 6800 data products are available to the public through the EOSDIS. Last year, EOSDIS distributed over 636 million science data products to the user community, serving over 1.5 million distinct users. The system supports a variety of science disciplines including polar processes, land cover change, radiation budget, and most especially global climate change. A core philosophy of EOSDIS is that the general user is best served by providing discipline specific support for the data. To this end, EOSDIS has collocated NASA Earth science data with centers of science discipline expertise, called Distributed Active Archive Centers (DAACs). DAACs are responsible for data management, archive and distribution of data products. There are currently twelve DAACs in the EOSDIS system. The centralized entrance point to the NASA Earth Science data collection can be found at http://earthdata.nasa.gov. Over the years, we have developed several methods for determining needs of the user community including use of the American Customer Satisfaction Index survey and a broad metrics program. 
Annually, we work with an independent organization (CFI Group) to send this

  20. Ethical challenges within Veterans Administration healthcare facilities: perspectives of managers, clinicians, patients, and ethics committee chairpersons.

    Science.gov (United States)

    Foglia, Mary Beth; Pearlman, Robert A; Bottrell, Melissa; Altemose, Jane K; Fox, Ellen

    2009-04-01

    To promote ethical practices, healthcare managers must understand the ethical challenges encountered by key stakeholders. To characterize ethical challenges in Veterans Administration (VA) facilities from the perspectives of managers, clinicians, patients, and ethics consultants. We conducted focus groups with patients (n = 32) and managers (n = 38); semi-structured interviews with managers (n = 31), clinicians (n = 55), and ethics committee chairpersons (n = 21). Data were analyzed using content analysis. Managers reported that the greatest ethical challenge was fairly distributing resources across programs and services, whereas clinicians identified the effect of resource constraints on patient care. Ethics committee chairpersons identified end-of-life care as the greatest ethical challenge, whereas patients identified obtaining fair, respectful, and caring treatment. Perspectives on ethical challenges varied depending on the respondent's role. Understanding these differences can help managers take practical steps to address these challenges. Further, ethics committees seemingly, are not addressing the range of ethical challenges within their institutions.

  1. Major Challenges for the Modern Chemistry in Particular and Science in General.

    Science.gov (United States)

    Uskoković, Vuk

    2010-11-01

    In the past few hundred years, science has exerted an enormous influence on the way the world appears to human observers. Despite its phenomenal accomplishments, science nowadays faces numerous challenges that threaten its continued success. As scientific inventions become embedded within human societies, the challenges are further multiplied. In this critical review, some of the key challenges for the field of modern chemistry are discussed, including: (a) interlinking theoretical knowledge and experimental approaches; (b) implementing the principles of sustainability at the roots of the chemical design; (c) defining science from a philosophical perspective that acknowledges both pragmatic and realistic aspects thereof; (d) instigating interdisciplinary research; (e) learning to recognize and appreciate the aesthetic aspects of scientific knowledge and methodology, and promoting truly inspiring education in chemistry. In the conclusion, I recapitulate that the evolution of human knowledge inherently depends upon our ability to adopt creative problem-solving attitudes, and that challenges will always be present within the scope of scientific interests.

  2. The evolution, approval and implementation of the U.S. Geological Survey Science Data Lifecycle Model

    Science.gov (United States)

    Faundeen, John L.; Hutchison, Vivian

    2017-01-01

    This paper details how the United States Geological Survey (USGS) Community for Data Integration (CDI) Data Management Working Group developed a Science Data Lifecycle Model, and the role the Model plays in shaping agency-wide policies. Starting with an extensive literature review of existing data lifecycle models, representatives from various backgrounds in USGS attended a two-day meeting where the basic elements of the Science Data Lifecycle Model were determined. Refinements and reviews spanned two years, leading to finalization of the Model and its documentation in a formal agency publication. The Model serves as a critical framework for data management policy, instructional resources, and tools. The Model helps the USGS address both the Office of Science and Technology Policy (OSTP) requirements for increased public access to federally funded research and the Office of Management and Budget (OMB) 2013 Open Data directives, as the foundation for a series of agency policies related to data management planning, metadata development, data release procedures, and the long-term preservation of data. Additionally, the agency website devoted to data management instruction and best practices (www2.usgs.gov/datamanagement) is designed around the Model’s structure and concepts. This paper also illustrates how the Model is being used to develop tools for supporting USGS research and data management processes.

  3. Data management for early hearing detection and intervention in South Africa

    Directory of Open Access Journals (Sweden)

    Selvarani Moodley

    2017-06-01

    Introduction: Internationally, newborn hearing screening is becoming part of standard neonatal healthcare service guidelines for the implementation of early hearing detection and intervention (EHDI) initiatives, including screening, diagnosis, data management and intervention. Data management includes the processes of data collection and storage thereof, as well as analysis and interpretation of data to guide the future planning, implementation and evaluation of EHDI programmes. There have been limited studies on data management in the South African EHDI context. Methods: The aim of this study was to determine the type of data management systems in use in South Africa and whether they allow for cross-disciplinary sharing and evaluation of the EHDI processes. A survey instrument on the management of EHDI data was developed and sent to HI HOPES referral agents in both public and private sectors. Results: A return rate of 80% was achieved, with 19 (59%) public sector and 13 (41%) private sector audiologists participating in the study. The data revealed that there was no uniform data management system in use nationally, and no consistent shared system within the public or private sectors. The majority of respondents (44%) used a paper-based system for data recording. No institutions were using data management systems that enabled sharing of information with other medical professionals. Conclusion: Data management and tracking of the pathway from screening to diagnosis to intervention is necessary to ensure quality care and outcomes for children identified with hearing loss. International studies reveal the importance of effective implementation of data management systems; however, to date these have focussed on developed country contexts. Data management challenges identified in this study reflect international challenges as well as challenges unique to a developing country context.

  4. The Kepler Science Data Processing Pipeline Source Code Road Map

    Science.gov (United States)

    Wohler, Bill; Jenkins, Jon M.; Twicken, Joseph D.; Bryson, Stephen T.; Clarke, Bruce Donald; Middour, Christopher K.; Quintana, Elisa Victoria; Sanderfer, Jesse Thomas; Uddin, Akm Kamal; Sabale, Anima; hide

    2016-01-01

    We give an overview of the operational concepts and architecture of the Kepler Science Processing Pipeline. Designed, developed, operated, and maintained by the Kepler Science Operations Center (SOC) at NASA Ames Research Center, the Science Processing Pipeline is a central element of the Kepler Ground Data System. The SOC consists of an office at Ames Research Center, software development and operations departments, and a data center which hosts the computers required to perform data analysis. The SOC's charter is to analyze stellar photometric data from the Kepler spacecraft and report results to the Kepler Science Office for further analysis. We describe how this is accomplished via the Kepler Science Processing Pipeline, including, the software algorithms. We present the high-performance, parallel computing software modules of the pipeline that perform transit photometry, pixel-level calibration, systematic error correction, attitude determination, stellar target management, and instrument characterization.

  5. Challenges for current University management

    Directory of Open Access Journals (Sweden)

    Pedro Rodríguez Vargas

    2016-03-01

    The Ecuadorian university is going through a change of era, a complex path imminently imposed by globalization. The aim of this paper is to present a review of the literature on the main aspects of management generated by the interaction between the university and its context. The review was performed on secondary sources and grounded in scientific data abstraction. University education is an information and training process that enables the scientific, technological, economic, political, social and cultural development of a region or country; however, phenomena such as globalization, the technological revolution and multiculturalism are key to it, and can be considered either problems or challenges.

  6. Personalized, Shareable Geoscience Dataspaces For Simplifying Data Management and Improving Reproducibility

    Science.gov (United States)

    Malik, T.; Foster, I.; Goodall, J. L.; Peckham, S. D.; Baker, J. B. H.; Gurnis, M.

    2015-12-01

    Research activities are iterative, collaborative, and now data- and compute-intensive. Such research activities mean that even the many researchers who work in small laboratories must often create, acquire, manage, and manipulate large volumes of diverse data and keep track of complex software. They face difficult data and software management challenges, and data sharing and reproducibility are neglected. There is significant federal investment in powerful cyberinfrastructure, in part to lessen the burden associated with modern data- and compute-intensive research. Similarly, geoscience communities are establishing research repositories to facilitate data preservation. Yet we observe that a large fraction of the geoscience community continues to struggle with data and software management. The reason, studies suggest, is not lack of awareness but rather that tools do not adequately support time-consuming data life cycle activities. Through the NSF/EarthCube-funded GeoDataspace project, we are building personalized, shareable dataspaces that help scientists connect their individual or research group efforts with the community at large. The dataspaces provide a lightweight, multiplatform research data management system with tools for recording research activities in what we call geounits, so that a geoscientist can at any time snapshot and preserve, both for their own use and to share with the community, all data and code required to understand and reproduce a study. A software-as-a-service (SaaS) deployment model enhances the usability of core components and their integration with widely used software systems. In this talk we will present the open-source GeoDataspace project and demonstrate how it is enabling reproducibility across the geoscience domains of hydrology, space science, and modeling toolkits.
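
    The "geounit" idea in this abstract (snapshotting all the data and code needed to reproduce a study) can be illustrated with a content-addressed manifest. The sketch below is a minimal illustration using only Python's standard library, not GeoDataspace's actual implementation; the file names and manifest schema are invented for the example.

    ```python
    import hashlib
    import json
    import os
    import tempfile
    import time

    def snapshot_manifest(paths, study_id):
        """Record a SHA-256 hash and size for every file in a study, so the
        exact data and code used can later be verified, shared, or re-run."""
        entries = []
        for path in sorted(paths):
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)  # hash incrementally; works for large files
            entries.append({"path": os.path.basename(path),
                            "sha256": h.hexdigest(),
                            "bytes": os.path.getsize(path)})
        return {"study": study_id,
                "created": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
                "files": entries}

    # Example: snapshot two small files from a hypothetical study.
    tmp = tempfile.mkdtemp()
    for name, text in [("model.py", "print('hello')\n"), ("obs.csv", "t,v\n0,1\n")]:
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)
    manifest = snapshot_manifest(
        [os.path.join(tmp, n) for n in ("model.py", "obs.csv")], "demo-study")
    print(json.dumps(manifest["files"], indent=2))
    ```

    Re-hashing the files later and comparing against the manifest detects any drift between the published study and the preserved snapshot.
    
    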

  7. Assessment of and Response to Data Needs of Clinical and Translational Science Researchers and Beyond

    Directory of Open Access Journals (Sweden)

    Hannah F. Norton

    2016-04-01

    Objective and Setting: As universities and libraries grapple with data management and “big data,” the need for data management solutions across disciplines is particularly relevant in clinical and translational science (CTS) research, which is designed to traverse disciplinary and institutional boundaries. At the University of Florida Health Science Center Library, a team of librarians undertook an assessment of the research data management needs of CTS researchers, including an online assessment and follow-up one-on-one interviews. Design and Methods: The 20-question online assessment was distributed to all investigators affiliated with UF’s Clinical and Translational Science Institute (CTSI), and 59 investigators responded. Follow-up in-depth interviews were conducted with nine faculty and staff members. Results: Results indicate that UF’s CTS researchers have diverse data management needs that are often specific to their discipline or current research project and span the data lifecycle. A common theme in responses was the need for consistent data management training, particularly for graduate students; this led to localized training within the Health Science Center and CTSI, as well as campus-wide training. Another campus-wide outcome was the creation of an action-oriented Data Management/Curation Task Force, led by the libraries and with participation from Research Computing and the Office of Research. Conclusions: Initiating conversations with affected stakeholders and campus leadership about best practices in data management and implications for institutional policy shows the library’s proactive leadership and furthers our goal to provide concrete guidance to our users in this area.

  8. Tools and data for meeting America's conservation challenges

    Science.gov (United States)

    Gergely, Kevin J.; McKerrow, Alexa

    2013-01-01

    The Gap Analysis Program (GAP) produces data and tools that help meet critical national challenges such as biodiversity conservation, renewable energy development, climate change adaptation, and infrastructure investment. The GAP is managed by the U.S. Geological Survey, Department of the Interior. GAP supports a wide range of national, State, and local agencies as well as nongovernmental organizations and businesses with scientific tools and data. GAP uses a collaborative approach to do research, analysis, and data development, resulting in a history of cooperation with more than 500 agencies and organizations nationally.

  9. Community-based management of environmental challenges in Latin America and the Caribbean

    Directory of Open Access Journals (Sweden)

    Maria del Mar Delgado-Serrano

    2017-03-01

    This Special Feature gathers the results of five research projects funded by the 7th Research Framework Program of the European Union and aims to identify successful cases of community-based management of environmental challenges in Latin America. The funding scheme, Research for the benefit of Civil Society Organizations, fostered innovative research approaches between civil society and research organizations. More than 20 field sites have been explored, and issues such as trade-offs between conservation and development, scientific versus local knowledge, social learning, ecosystem services, community-owned solutions, scaling-up and scaling-out strategies, the influence of context and actors in effective environmental management and governance, and the conflicts of interests around natural resources have been addressed. Based on our experiences as project coordinators, in this editorial we reflect on some of the important lessons gained for research praxis and impact, focusing on knowledge of governance models and their scaling-out and scaling-up, and on methods and tools to enable action research at the science-civil society interface. The results highlight the richness of community-based management experiences that exist in Latin America and the diversity of approaches to encourage the sustainable community-based management of environmental challenges.

  10. Managing voluntary turnover through challenging assignments

    NARCIS (Netherlands)

    Preenen, P.T.Y.; de Pater, I.E.; van Vianen, A.E.M.; Keijzer, L.

    2011-01-01

    This study examines employees’ challenging assignments as manageable means to reduce turnover intentions, job search behaviors, and voluntary turnover. Results indicate that challenging assignments are negatively related to turnover intentions and job search behaviors and that these relationships

  11. Managing voluntary turnover through challenging assignments

    NARCIS (Netherlands)

    Preenen, P.T.Y.; Pater, I.E. de; Vianen, A.E.M. van; Keijzer, L.

    2011-01-01

    This study examines employees' challenging assignments as manageable means to reduce turnover intentions, job search behaviors, and voluntary turnover. Results indicate that challenging assignments are negatively related to turnover intentions and job search behaviors and that these relationships

  12. Health Policy and Management: in praise of political science. Comment on "On Health Policy and Management (HPAM): mind the theory-policy-practice gap".

    Science.gov (United States)

    Hunter, David J

    2015-03-12

    Health systems have entered a third era embracing whole systems thinking and posing complex policy and management challenges. Understanding how such systems work and agreeing what needs to be put in place to enable them to undergo effective and sustainable change are more pressing issues than ever for policy-makers. The theory-policy-practice gap and its four dimensions, as articulated by Chinitz and Rodwin, are acknowledged. It is suggested that insights derived from political science can both enrich our understanding of the gap and suggest what changes are needed to tackle the complex challenges facing health systems. © 2015 by Kerman University of Medical Sciences.

  13. Data Science in Supply Chain Management: Data-Related Influences on Demand Planning

    Science.gov (United States)

    Jin, Yao

    2013-01-01

    Data-driven decisions have become an important aspect of supply chain management. Demand planners are tasked with analyzing volumes of data that are being collected at a torrential pace from myriad sources in order to translate them into actionable business intelligence. In particular, demand volatilities and planning are vital for effective and…

  14. Meeting the challenge of interacting threats in freshwater ecosystems: A call to scientists and managers

    Directory of Open Access Journals (Sweden)

    Laura S. Craig

    2017-12-01

    Human activities create threats that have consequences for freshwater ecosystems and, in most watersheds, observed ecological responses are the result of complex interactions among multiple threats and their associated ecological alterations. Here we discuss the value of considering multiple threats in research and management, offer suggestions for filling knowledge gaps, and provide guidance for addressing the urgent management challenges posed by multiple threats in freshwater ecosystems. There is a growing literature assessing responses to multiple alterations, and we build off this background to identify three areas that require greater attention: linking observed alterations to threats, understanding when and where threats overlap, and choosing metrics that best quantify the effects of multiple threats. Advancing science in these areas will help us understand existing ecosystem conditions and predict future risk from multiple threats. Because addressing the complex issues and novel ecosystems that arise from the interaction of multiple threats in freshwater ecosystems represents a significant management challenge, and the risks of management failure include loss of biodiversity, ecological goods, and ecosystem services, we also identify actions that could improve decision-making and management outcomes. These actions include drawing insights from management of individual threats, using threat attributes (e.g., causes and spatio-temporal dynamics) to identify suitable management approaches, testing management strategies that are likely to be successful despite uncertainties about the nature of interactions among threats, avoiding unintended consequences, and maximizing conservation benefits.
We also acknowledge the broadly applicable challenges of decision-making within a socio-political and economic framework, and suggest that multidisciplinary teams will be needed to innovate solutions to meet the current and future challenge of interacting

  15. Meeting the challenge of interacting threats in freshwater ecosystems: A call to scientists and managers

    Science.gov (United States)

    Craig, Laura S.; Olden, Julian D.; Arthington, Angela; Entrekin, Sally; Hawkins, Charles P.; Kelly, John J.; Kennedy, Theodore A.; Maitland, Bryan M.; Rosi, Emma J.; Roy, Allison; Strayer, David L.; Tank, Jennifer L.; West, Amie O.; Wooten, Matthew S.

    2017-01-01

    Human activities create threats that have consequences for freshwater ecosystems and, in most watersheds, observed ecological responses are the result of complex interactions among multiple threats and their associated ecological alterations. Here we discuss the value of considering multiple threats in research and management, offer suggestions for filling knowledge gaps, and provide guidance for addressing the urgent management challenges posed by multiple threats in freshwater ecosystems. There is a growing literature assessing responses to multiple alterations, and we build off this background to identify three areas that require greater attention: linking observed alterations to threats, understanding when and where threats overlap, and choosing metrics that best quantify the effects of multiple threats. Advancing science in these areas will help us understand existing ecosystem conditions and predict future risk from multiple threats. Because addressing the complex issues and novel ecosystems that arise from the interaction of multiple threats in freshwater ecosystems represents a significant management challenge, and the risks of management failure include loss of biodiversity, ecological goods, and ecosystem services, we also identify actions that could improve decision-making and management outcomes. These actions include drawing insights from management of individual threats, using threat attributes (e.g., causes and spatio-temporal dynamics) to identify suitable management approaches, testing management strategies that are likely to be successful despite uncertainties about the nature of interactions among threats, avoiding unintended consequences, and maximizing conservation benefits. We also acknowledge the broadly applicable challenges of decision-making within a socio-political and economic framework, and suggest that multidisciplinary teams will be needed to innovate solutions to meet the current and future challenge of interacting threats in

  16. The IRIS DMC: Perspectives on Real-Time Data Management and Open Access From a Large Seismological Archive: Challenges, Tools, and Quality Assurance

    Science.gov (United States)

    Benson, R. B.

    2007-05-01

    The IRIS Data Management Center, located in Seattle, WA, is the largest openly accessible geophysical archive in the world, and has a unique perspective on data management and on the operational practices that get the most out of a network. Networks span broad domains in time and space, from finite needs such as monitoring bridges and dams to national and international networks like the GSN and the FDSN that establish a baseline for global monitoring and research; the requirements that go into creating a well-tuned DMC archive treat these the same, building a collaborative network of networks that generations of users rely on and that adds value to the data. Funded by the National Science Foundation through the Division of Earth Sciences, IRIS is operated through member universities and in cooperation with the USGS, and the DMS facility is a bridge between a globally distributed collaboration of seismic networks and an equally distributed network of users who demand a high standard for data quality, completeness, and ease of access. I will describe the role that a perpetual archive has in the life cycle of data, and how hosting real-time data performs a dual role: being a hub for continuous data from approximately 59 real-time networks and distributing these (along with other data from the 40-year library of available time-series data) to researchers, while simultaneously providing shared data back to networks in real time, which benefits monitoring activities. I will describe aspects of our quality-assurance framework that are both passively and actively performed on 1100 seismic stations generating over 6,000 channels of regularly sampled data arriving daily, which data providers can use as aids in operating their networks, and which users can likewise use when requesting suitable data for research purposes. The goal of the DMC is to eliminate bottlenecks in data discovery and to shorten the steps leading to analysis. This includes many challenges, including keeping metadata
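
    One passive quality-assurance metric commonly computed over a continuous-waveform archive is per-channel data completeness: the fraction of expected samples actually received. The sketch below is a hypothetical illustration, not the DMC's actual framework; the channel codes, sample rate, and 99% threshold are assumptions for the example.

    ```python
    def completeness(expected_samples, received_samples):
        """Percentage of expected samples actually archived for a channel,
        clamped at 100% in case of duplicated or overlapping packets."""
        if expected_samples <= 0:
            raise ValueError("expected_samples must be positive")
        return 100.0 * min(received_samples, expected_samples) / expected_samples

    # Hypothetical daily report: channel code -> (expected, received) counts
    # for one day of 40 Hz data (40 * 86400 = 3,456,000 samples per channel).
    day = 40 * 86400
    report = {
        "IU.ANMO.00.BHZ": (day, day),         # complete day
        "IU.ANMO.00.BHN": (day, day - 4000),  # short telemetry gap
        "IU.ANMO.00.BHE": (day, 0),           # channel down
    }

    for channel, (expected, received) in sorted(report.items()):
        pct = completeness(expected, received)
        flag = "OK" if pct >= 99.0 else "CHECK"
        print(f"{channel}: {pct:6.2f}% {flag}")
    ```

    Flagged channels would then be followed up with the network operator, which is the kind of feedback loop the abstract describes between the archive and data providers.
    
    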

  17. Next Generation Space Telescope Integrated Science Module Data System

    Science.gov (United States)

    Schnurr, Richard G.; Greenhouse, Matthew A.; Jurotich, Matthew M.; Whitley, Raymond; Kalinowski, Keith J.; Love, Bruce W.; Travis, Jeffrey W.; Long, Knox S.

    1999-01-01

    The data system for the Next Generation Space Telescope (NGST) Integrated Science Module (ISIM) is the primary data interface between the spacecraft, telescope, and science instrument systems. This poster includes block diagrams of the ISIM data system and its components derived during the pre-phase A Yardstick feasibility study. The poster details the hardware and software components used to acquire and process science data for the Yardstick instrument complement, and depicts the baseline external interfaces to science instruments and other systems. This baseline data system is a fully redundant, high-performance computing system. Each redundant computer contains three 150 MHz PowerPC processors. All processors execute a commercially available real-time multi-tasking operating system supporting preemptive multi-tasking, file management, and network interfaces. The six processors in the system are networked together. The spacecraft interface baseline is an extension of the network that links the six processors. The final selection of processor buses, processor chips, network interfaces, and high-speed data interfaces will be made during mid-2002.

  18. Facing the Challenges of Accessing, Managing, and Integrating Large Observational Datasets in Ecology: Enabling and Enriching the Use of NEON's Observational Data

    Science.gov (United States)

    Thibault, K. M.

    2013-12-01

    As the construction of NEON and its transition to operations progresses, more and more data will become available to the scientific community, both from NEON directly and from the concomitant growth of existing data repositories. Many of these datasets include ecological observations of a diversity of taxa in both aquatic and terrestrial environments. Although observational data have been collected and used throughout the history of organismal biology, the field has not yet fully developed a culture of data management, documentation, standardization, sharing and discoverability to facilitate the integration and synthesis of datasets. Moreover, the tools required to accomplish these goals, namely database design, implementation, and management, and automation and parallelization of analytical tasks through computational techniques, have not historically been included in biology curricula, at either the undergraduate or graduate levels. To ensure the success of data-generating projects like NEON in advancing organismal ecology and to increase transparency and reproducibility of scientific analyses, an acceleration of the cultural shift to open science practices, the development and adoption of data standards, such as the DarwinCore standard for taxonomic data, and increased training in computational approaches for biologists need to be realized. Here I highlight several initiatives that are intended to increase access to and discoverability of publicly available datasets and to equip biologists and other scientists with the skills that are needed to manage, integrate, and analyze data from multiple large-scale projects. The EcoData Retriever (ecodataretriever.org) is a tool that downloads publicly available datasets, re-formats the data into an efficient relational database structure, and then automatically imports the data tables into the database tool of the user's choice on the user's local drive.
The automation of these tasks results in nearly instantaneous execution
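
    The reformat-and-import step that a retriever-style tool automates can be sketched with Python's standard library: parse a CSV, infer a table schema from the header row, and load the rows into a relational database. This is a simplified illustration under stated assumptions (all columns stored as TEXT, a made-up species-survey dataset), not the EcoData Retriever's actual code.

    ```python
    import csv
    import io
    import sqlite3

    def load_csv(conn, table, csv_text):
        """Parse CSV text and load it into a relational table, inferring
        column names from the header row (all columns stored as TEXT)."""
        rows = list(csv.reader(io.StringIO(csv_text)))
        header, data = rows[0], rows[1:]
        cols = ", ".join(f'"{c}" TEXT' for c in header)
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
        conn.commit()
        return len(data)

    # Hypothetical observational dataset: species counts at survey sites.
    raw = "site,species,count\nA1,Peromyscus,12\nA1,Dipodomys,3\nB2,Peromyscus,7\n"
    conn = sqlite3.connect(":memory:")
    n = load_csv(conn, "surveys", raw)
    total = conn.execute(
        'SELECT SUM(CAST("count" AS INTEGER)) FROM "surveys"').fetchone()[0]
    print(n, total)  # 3 rows loaded; 22 individuals in total
    ```

    Once the data sit in a relational table, the integration and synthesis steps the abstract calls for reduce to ordinary SQL queries rather than ad hoc file parsing.
    
    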

  19. Big data analytics a management perspective

    CERN Document Server

    Corea, Francesco

    2016-01-01

    This book is about innovation, big data, and data science seen from a business perspective. Big data is a buzzword nowadays, and there is a growing necessity among practitioners to understand the phenomenon better, starting from a clearly stated definition. This book aims to be a starting reading for executives who want (and need) to keep pace with the technological breakthrough introduced by new analytical techniques and piles of data. Common myths about big data will be explained, and a series of different strategic approaches will be provided. By browsing the book, it will be possible to learn how to implement a big data strategy and how to use a maturity framework to monitor the progress of the data science team, as well as how to move forward from one stage to the next. Crucial challenges related to big data will be discussed, where some of them are more general - such as ethics, privacy, and ownership - while others concern more specific business situations (e.g., initial public offering, growth st...

  20. Using secondary data sets in health care management : opportunities and challenges

    OpenAIRE

    Buttigieg, Sandra; Annual Meeting of the American Academy of Management

    2013-01-01

    The importance of secondary data sets within the medical services and management sector is discussed. Secondary data sets, which are readily available and thus reduce costs considerably, can also provide accurate, valid and reliable evidence.