WorldWideScience

Sample records for big volume modeling

  1. HARNESSING BIG DATA VOLUMES

    Directory of Open Access Journals (Sweden)

    Bogdan DINU

    2014-04-01

    Big Data can revolutionize humanity. Hidden within the huge amounts and variety of the data we are creating, we may find information, facts, social insights and benchmarks that were once virtually impossible to find or simply did not exist. Large volumes of data allow organizations to tap, in real time, the full potential of all the internal and external information they possess. Big data calls for quick decisions and innovative ways to assist customers and society as a whole. Big data platforms and product portfolios will help customers harness the full value of big data volumes. This paper deals with technical and technological issues related to handling big data volumes in the Big Data environment.

  2. Volume and Value of Big Healthcare Data.

    Science.gov (United States)

    Dinov, Ivo D

    Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require collection, processing and interpretation of vast amounts of complex data. The Moore's and Kryder's laws of exponential increase of computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions such as: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions.

  3. Characterizing Big Data Management

    OpenAIRE

    Rogério Rossi; Kechi Hirama

    2015-01-01

    Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: technology, people and processes.

  4. Turning big bang into big bounce: II. Quantum dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Malkiewicz, Przemyslaw; Piechocki, Wlodzimierz, E-mail: pmalk@fuw.edu.pl, E-mail: piech@fuw.edu.pl [Theoretical Physics Department, Institute for Nuclear Studies, Hoza 69, 00-681 Warsaw (Poland)

    2010-11-21

    We analyze the big bounce transition of the quantum Friedmann-Robertson-Walker model in the setting of the nonstandard loop quantum cosmology (LQC). Elementary observables are used to quantize composite observables. The spectrum of the energy density operator is bounded and continuous. The spectrum of the volume operator is bounded from below and discrete, with equally spaced levels defining a quantum of the volume. The discreteness may imply a foamy structure of spacetime at a semiclassical level, which may be detected in astronomical and cosmological observations. The nonstandard LQC method has a free parameter that should be fixed in some way to specify the big bounce transition.
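
    Schematically, the discreteness result admits a one-line worked equation (a sketch only; the symbols v_min and Δ are illustrative and not taken from the paper). Assuming a spectrum that is bounded from below and equally spaced,

      \mathrm{spec}(\hat{V}) = \{\, v_n = v_{\min} + n\,\Delta \;:\; n = 0, 1, 2, \dots \,\}, \qquad v_{n+1} - v_n = \Delta,

    so the constant spacing Δ plays the role of the elementary quantum of volume described above.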

  5. Big Data: A Management Revolution: The emerging role of big data in businesses

    OpenAIRE

    Blasiak, Kevin

    2014-01-01

    Big data is a term that was coined in 2012 and has since emerged as one of the top trends in business and technology. Big data is an agglomeration of different technologies resulting in data processing capabilities that were previously out of reach. Big data is generally characterized by factors such as volume, velocity and variety, which distinguish it from traditional data use. The possibilities to utilize this technology are vast. Big data technology has touch points in differ...

  6. Big Data and Chemical Education

    Science.gov (United States)

    Pence, Harry E.; Williams, Antony J.

    2016-01-01

    The amount of computerized information that organizations collect and process is growing so large that the term Big Data is commonly being used to describe the situation. Accordingly, Big Data is defined by a combination of the Volume, Variety, Velocity, and Veracity of the data being processed. Big Data tools are already having an impact in…

  7. Characterizing Big Data Management

    Directory of Open Access Journals (Sweden)

    Rogério Rossi

    2015-06-01

    Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: technology, people and processes. Hence, this article discusses these dimensions: the technological dimension, related to storage, analytics and visualization of big data; the human aspects of big data; and the process management dimension, which covers the aspects of big data management from both a technological and a business perspective.

  8. Big Data's Role in Precision Public Health.

    Science.gov (United States)

    Dolley, Shawn

    2018-01-01

    Precision public health is an emerging practice to more granularly predict and understand public health risks and customize treatments for more specific and homogeneous subpopulations, often using new data, technologies, and methods. Big data is one element that has consistently helped to achieve these goals, through its ability to deliver to practitioners a volume and variety of structured or unstructured data not previously possible. Big data has enabled more widespread and specific research and trials of stratifying and segmenting populations at risk for a variety of health problems. Examples of success using big data are surveyed in surveillance and signal detection, predicting future risk, targeted interventions, and understanding disease. Using novel big data or big data approaches has risks that remain to be resolved. The continued growth in volume and variety of available data, decreased costs of data capture, and emerging computational methods mean big data success will likely be a required pillar of precision public health into the future. This review article aims to identify the precision public health use cases where big data has added value, identify classes of value that big data may bring, and outline the risks inherent in using big data in precision public health efforts.

  9. Big data uncertainties.

    Science.gov (United States)

    Maugis, Pierre-André G

    2018-07-01

    Big data, the idea that an ever-larger volume of information is constantly being recorded, suggests that new problems can now be subjected to scientific scrutiny. However, can classical statistical methods be used directly on big data? We analyze the problem by looking at two known pitfalls of big datasets. First, that they are biased, in the sense that they do not offer a complete view of the populations under consideration. Second, that they present a weak but pervasive level of dependence between all their components. In both cases we observe that the uncertainty of the conclusions obtained by statistical methods is increased when they are used on big data, either because of a systematic error (bias) or because of a larger degree of randomness (increased variance). We argue that the key challenge raised by big data is not only how to use big data to tackle new problems, but how to develop tools and methods able to rigorously articulate the new risks therein. Copyright © 2016. Published by Elsevier Ltd.
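
    To make the dependence claim concrete, consider the textbook equicorrelated model (an illustrative sketch, not taken from the paper): for n observations with common variance sigma^2 and pairwise correlation rho, the variance of the sample mean is sigma^2 * (1 + (n - 1) * rho) / n, which tends to rho * sigma^2 rather than to 0 as n grows. A minimal Python check under this assumed model:

      # Illustrative sketch (assumed equicorrelated model, not the paper's code):
      # with weak but pervasive dependence, the sample-mean variance stops
      # shrinking and plateaus near rho * sigma^2.
      def var_of_mean(n: int, sigma2: float = 1.0, rho: float = 0.0) -> float:
          """Variance of the mean of n equicorrelated observations."""
          return sigma2 * (1.0 + (n - 1) * rho) / n

      n = 10**6
      print(var_of_mean(n))             # i.i.d. case: 1e-06, vanishes with n
      print(var_of_mean(n, rho=0.001))  # weak dependence: ~0.001, does not vanish

    A pervasive correlation of only 0.001 thus leaves a million observations with roughly the uncertainty of a thousand independent ones, which is exactly the "larger degree of randomness" the abstract describes.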

  10. Main Issues in Big Data Security

    Directory of Open Access Journals (Sweden)

    Julio Moreno

    2016-09-01

    Data is currently one of the most important assets for companies in every field. The continuous growth in the importance and volume of data has created a new problem: it cannot be handled by traditional analysis techniques. This problem was, therefore, addressed through the creation of a new paradigm: Big Data. However, Big Data gave rise to new issues related not only to the volume or variety of the data, but also to data security and privacy. In order to obtain a full perspective of the problem, we decided to carry out an investigation with the objective of highlighting the main issues regarding Big Data security, and also the solutions proposed by the scientific community to solve them. In this paper, we explain the results obtained after applying a systematic mapping study to security in the Big Data ecosystem. It is almost impossible to carry out detailed research into the entire topic of security, and the outcome of this research is, therefore, a big picture of the main problems related to security in a Big Data system, along with the principal solutions to them proposed by the research community.

  11. Nursing Needs Big Data and Big Data Needs Nursing.

    Science.gov (United States)

    Brennan, Patricia Flatley; Bakken, Suzanne

    2015-09-01

    Contemporary big data initiatives in health care will benefit from greater integration with nursing science and nursing practice; in turn, nursing science and nursing practice have much to gain from the data science initiatives. Big data arises secondary to scholarly inquiry (e.g., -omics) and everyday observations like cardiac flow sensors or Twitter feeds. Emerging data science methods ensure that these data can be leveraged to improve patient care. Big data encompasses data that exceed human comprehension, that exist at a volume unmanageable by standard computer systems, that arrive at a velocity not under the control of the investigator, and that possess a level of imprecision not found in traditional inquiry. Data science methods are emerging to manage and gain insights from big data. The primary methods included investigation of emerging federal big data initiatives and exploration of exemplars from nursing informatics research to benchmark where nursing is already poised to participate in the big data revolution. We provide observations and reflections on experiences in the emerging big data initiatives. Existing approaches to large data set analysis provide a necessary but not sufficient foundation for nursing to participate in the big data revolution. Nursing's Social Policy Statement guides a principled, ethical perspective on big data and data science. There are implications for basic and advanced practice clinical nurses, for the nurse scientist who collaborates with data scientists, and for the nurse data scientist. Big data and data science have the potential to provide greater richness in understanding patient phenomena and in tailoring interventional strategies that are personalized to the patient. © 2015 Sigma Theta Tau International.

  12. Big Data Analytics in Medicine and Healthcare.

    Science.gov (United States)

    Ristevski, Blagoj; Chen, Ming

    2018-05-10

    This paper surveys big data, highlighting big data analytics in medicine and healthcare. The big data characteristics value, volume, velocity, variety, veracity and variability are described. Big data analytics in medicine and healthcare covers the integration and analysis of large amounts of complex heterogeneous data, such as various omics data (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenomics, diseasomics), biomedical data and electronic health records data. We underline the challenging issues of big data privacy and security. Regarding big data characteristics, some directions for using suitable and promising open-source distributed data processing software platforms are given.

  13. Big data challenges

    DEFF Research Database (Denmark)

    Bachlechner, Daniel; Leimbach, Timo

    2016-01-01

    Although reports on big data success stories have been accumulating in the media, most organizations dealing with high-volume, high-velocity and high-variety information assets still face challenges. Only a thorough understanding of these challenges puts organizations into a position in which they can make an informed decision for or against big data, and, if the decision is positive, overcome the challenges smoothly. The combination of a series of interviews with leading experts from enterprises, associations and research institutions, and focused literature reviews allowed not only … framework are also relevant. For large enterprises and startups specialized in big data, it is typically easier to overcome the challenges than it is for other enterprises and public administration bodies.

  14. Big Data’s Role in Precision Public Health

    Science.gov (United States)

    Dolley, Shawn

    2018-01-01

    Precision public health is an emerging practice to more granularly predict and understand public health risks and customize treatments for more specific and homogeneous subpopulations, often using new data, technologies, and methods. Big data is one element that has consistently helped to achieve these goals, through its ability to deliver to practitioners a volume and variety of structured or unstructured data not previously possible. Big data has enabled more widespread and specific research and trials of stratifying and segmenting populations at risk for a variety of health problems. Examples of success using big data are surveyed in surveillance and signal detection, predicting future risk, targeted interventions, and understanding disease. Using novel big data or big data approaches has risks that remain to be resolved. The continued growth in volume and variety of available data, decreased costs of data capture, and emerging computational methods mean big data success will likely be a required pillar of precision public health into the future. This review article aims to identify the precision public health use cases where big data has added value, identify classes of value that big data may bring, and outline the risks inherent in using big data in precision public health efforts. PMID:29594091

  15. Addressing big data issues in Scientific Data Infrastructure

    NARCIS (Netherlands)

    Demchenko, Y.; Membrey, P.; Grosso, P.; de Laat, C.; Smari, W.W.; Fox, G.C.

    2013-01-01

    Big Data are becoming a new technology focus both in science and in industry. This paper discusses the challenges that are imposed by Big Data on the modern and future Scientific Data Infrastructure (SDI). The paper discusses the nature and definition of Big Data, which include such features as Volume,

  16. Big Data Analytics in Healthcare.

    Science.gov (United States)

    Belle, Ashwin; Thiagarajan, Raghuram; Soroushmehr, S M Reza; Navidi, Fatemeh; Beard, Daniel A; Najarian, Kayvan

    2015-01-01

    The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has recently been applied towards aiding the process of care delivery and disease exploration. However, the adoption rate and research development in this space are still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image-, signal-, and genomics-based analytics. Recent research which targets utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field which have the ability to provide meaningful impact on healthcare delivery are also examined.

  17. Exploring complex and big data

    Directory of Open Access Journals (Sweden)

    Stefanowski Jerzy

    2017-12-01

    This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start by defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture. Difficulties in managing data quality and provenance are also highlighted. The characteristics of big data also imply specific requirements and challenges for data mining algorithms, which we address as well. The links with related areas, including data streams and deep learning, are discussed. The common theme that naturally emerges from this characterization is complexity. All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data volume.

  18. Big data analytics: methods and applications

    CERN Document Server

    Rao, BLS; Rao, SB

    2016-01-01

    This book is a collection of articles written by Big Data experts describing some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data, such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; benchmarking of systems; and others. In addition to different applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.

  19. Big data governance: an emerging imperative

    CERN Document Server

    Soares, Sunil

    2012-01-01

    Written by a leading expert in the field, this guide focuses on the convergence of two major trends in information management, big data and information governance, by taking a strategic approach oriented around business cases and industry imperatives. With the advent of new technologies, enterprises are expanding and handling very large volumes of data; this book, nontechnical in nature and geared toward business audiences, encourages the practice of establishing appropriate governance over big data initiatives and addresses how to manage and govern big data, highlighting the relevant processes,

  20. Big data: survey, technologies, opportunities, and challenges.

    Science.gov (United States)

    Khan, Nawsher; Yaqoob, Ibrar; Hashem, Ibrahim Abaker Targio; Inayat, Zakira; Ali, Waleed Kamaleldin Mahmoud; Alam, Muhammad; Shiraz, Muhammad; Gani, Abdullah

    2014-01-01

    Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at the speed of light over optical fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data.

  1. Big Data: Survey, Technologies, Opportunities, and Challenges

    Science.gov (United States)

    Khan, Nawsher; Yaqoob, Ibrar; Hashem, Ibrahim Abaker Targio; Inayat, Zakira; Mahmoud Ali, Waleed Kamaleldin; Alam, Muhammad; Shiraz, Muhammad; Gani, Abdullah

    2014-01-01

    Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at the speed of light over optical fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data. PMID:25136682

  2. Big Data Analytics

    Indian Academy of Sciences (India)

    The volume and variety of data being generated using computers is doubling every two years. It is estimated that in 2015, 8 Zettabytes (Zetta = 10^21) were generated, which consisted mostly of unstructured data such as emails, blogs, Twitter, Facebook posts, images, and videos. This is called big data. It is possible to analyse ...
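
    A toy projection of the growth rate quoted above (a sketch assuming a clean two-year doubling from the 8 ZB figure given for 2015; not an authoritative forecast):

      # Toy projection: data volume doubles every two years from an assumed
      # 2015 baseline of 8 zettabytes (illustrative only).
      def projected_zettabytes(year: int, base_year: int = 2015,
                               base_zb: float = 8.0,
                               doubling_years: float = 2.0) -> float:
          return base_zb * 2 ** ((year - base_year) / doubling_years)

      for year in (2015, 2017, 2019, 2021):
          print(year, projected_zettabytes(year), "ZB")  # 8.0, 16.0, 32.0, 64.0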

  3. Big Data - Smart Health Strategies

    Science.gov (United States)

    2014-01-01

    Summary. Objectives: To select the best papers published in 2013 in the field of big data and smart health strategies, and to summarize outstanding research efforts. Methods: A systematic search was performed using two major bibliographic databases for relevant journal papers. The references obtained were reviewed in a two-stage process, starting with a blinded review performed by the two section editors, and followed by a peer review process operated by external reviewers recognized as experts in the field. Results: The complete review process selected four best papers, illustrating various aspects of the special theme, among them: (a) using large volumes of unstructured data and, specifically, clinical notes from Electronic Health Records (EHRs) for pharmacovigilance; (b) knowledge discovery via querying large volumes of complex (both structured and unstructured) biological data using big data technologies and relevant tools; (c) methodologies for applying cloud computing and big data technologies in the field of genomics; and (d) system architectures enabling high-performance access to and processing of large datasets extracted from EHRs. Conclusions: The potential of big data in biomedicine has been pinpointed in various viewpoint papers and editorials. The review of the current scientific literature illustrated a variety of interesting methods and applications in the field, but still the promises exceed the current outcomes. As we get closer to a solid foundation with respect to a common understanding of the relevant concepts and technical aspects, and the use of standardized technologies and tools, we can anticipate reaching the potential that big data offer for personalized medicine and smart health strategies in the near future. PMID:25123721

  4. A Knowledge-Driven Geospatially Enabled Framework for Geological Big Data

    OpenAIRE

    Liang Wu; Lei Xue; Chaoling Li; Xia Lv; Zhanlong Chen; Baode Jiang; Mingqiang Guo; Zhong Xie

    2017-01-01

    Geologic survey procedures accumulate large volumes of structured and unstructured data. Fully exploiting the knowledge and information that are included in geological big data and improving the accessibility of large volumes of data are important endeavors. In this paper, which is based on the architecture of the geological survey information cloud-computing platform (GSICCP) and big-data-related technologies, we split geologic unstructured data into fragments and extract multi-dimensional f...

  5. Big Data in the Aerospace Industry

    Directory of Open Access Journals (Sweden)

    Victor Emmanuell BADEA

    2018-01-01

    This paper presents approaches related to the need for large-volume data analysis, Big Data, and the information that the beneficiaries of this analysis can interpret. Aerospace companies understand the challenges of Big Data better than most other industries. We also describe a novel analytical system that enables query processing and predictive analytics over streams of large aviation data.

  6. Big Data: Survey, Technologies, Opportunities, and Challenges

    Directory of Open Access Journals (Sweden)

    Nawsher Khan

    2014-01-01

    Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at the speed of light over optical fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data.

  7. Will Organization Design Be Affected By Big Data?

    Directory of Open Access Journals (Sweden)

    Giles Slinger

    2014-12-01

    Computing power and analytical methods allow us to create, collate, and analyze more data than ever before. When datasets are unusually large in volume, velocity, and variety, they are referred to as "big data." Some observers have suggested that in order to cope with big data (a) organizational structures will need to change and (b) the processes used to design organizations will be different. In this article, we differentiate big data from relatively slow-moving, linked people data. We argue that big data will change organizational structures as organizations pursue the opportunities presented by big data. The processes by which organizations are designed, however, will be relatively unaffected by big data. Instead, organization design processes will be more affected by the complex links found in people data.

  8. Modeling and Management of Big Data: Challenges and opportunities

    OpenAIRE

    Gil, David; Song, Il-Yeol

    2016-01-01

    The term Big Data denotes huge-volume, complex, rapidly growing datasets with numerous, autonomous and independent sources. In these new circumstances Big Data brings many attractive opportunities; however, good opportunities are always followed by challenges, such as modelling, new paradigms, and novel architectures that require original approaches to address data complexities. The purpose of this special issue on Modeling and Management of Big Data is to discuss research and experience in modellin...

  9. Research on Implementing Big Data: Technology, People, & Processes

    Science.gov (United States)

    Rankin, Jenny Grant; Johnson, Margie; Dennis, Randall

    2015-01-01

    When many people hear the term "big data", they primarily think of a technology tool for the collection and reporting of data of high variety, volume, and velocity. However, the complexity of big data lies not only in the technology, but also in the supporting processes, policies, and people. This paper was written by three experts to…

  10. On the Performance Evaluation of Big Data Systems

    OpenAIRE

    Pirzadeh, Pouria

    2015-01-01

    Big Data is turning out to be a key basis for competition and growth among various businesses. The emerging need to store and process huge volumes of data has resulted in the appearance of different Big Data serving systems with fundamental differences. Big Data benchmarking is a means to assist users in picking the correct system to fulfill their applications' needs. It can also help the developers of these systems make the correct decisions in building and extending them. While there have b...

  11. Automated Big Traffic Analytics for Cyber Security

    OpenAIRE

    Miao, Yuantian; Ruan, Zichan; Pan, Lei; Wang, Yu; Zhang, Jun; Xiang, Yang

    2018-01-01

    Network traffic analytics technology is a cornerstone for cyber security systems. We demonstrate its use through three popular and contemporary cyber security applications: intrusion detection, malware analysis and botnet detection. However, automated traffic analytics faces the challenges raised by big traffic data. In terms of big data's three characteristics (volume, variety and velocity), we review three state-of-the-art techniques to mitigate the key challenges, including real-time tr...

  12. The big data-big model (BDBM) challenges in ecological research

    Science.gov (United States)

    Luo, Y.

    2015-12-01

    The field of ecology has become a big-data science in the past decades due to the development of new sensors used in numerous studies in the ecological community. Many sensor networks have been established to collect data. For example, satellites such as Terra and OCO-2, among others, have collected data relevant to the global carbon cycle. Thousands of field manipulative experiments have been conducted to examine the feedback of the terrestrial carbon cycle to global changes. Networks of observations, such as FLUXNET, have measured land processes. In particular, the implementation of the National Ecological Observatory Network (NEON), which is designed to network different kinds of sensors at many locations over the nation, will generate large volumes of ecological data every day. The raw data from sensors from those networks offer an unprecedented opportunity for accelerating advances in our knowledge of ecological processes, educating teachers and students, supporting decision-making, testing ecological theory, and forecasting changes in ecosystem services. Currently, ecologists do not have the infrastructure in place to synthesize massive yet heterogeneous data into resources for decision support. It is urgent to develop an ecological forecasting system that can make the best use of multiple sources of data to assess long-term biosphere change and anticipate future states of ecosystem services at regional and continental scales. Forecasting relies on big models that describe the major processes that underlie complex system dynamics. Ecological system models, despite greatly simplifying the real systems, are still complex enough to address real-world problems. For example, the Community Land Model (CLM) incorporates thousands of processes related to energy balance, hydrology, and biogeochemistry. Integration of massive data from multiple big data sources with complex models has to tackle Big Data-Big Model (BDBM) challenges. Those challenges include interoperability of multiple

  13. Principles of big data: preparing, sharing, and analyzing complex information

    CERN Document Server

    Berman, Jules J

    2013-01-01

    Principles of Big Data helps readers avoid the common mistakes that endanger all Big Data projects. By stressing simple, fundamental concepts, this book teaches readers how to organize large volumes of complex data, and how to achieve data permanence when the content of the data is constantly changing. General methods for data verification and validation, as specifically applied to Big Data resources, are stressed throughout the book. The book demonstrates how adept analysts can find relationships among data objects held in disparate Big Data resources, when the data objects are endo

  14. Big Data for personalized healthcare

    NARCIS (Netherlands)

    Siemons, Liseth; Sieverink, Floor; Vollenbroek, Wouter; van de Wijngaert, Lidwien; Braakman-Jansen, Annemarie; van Gemert-Pijnen, Lisette

    2016-01-01

    Big Data, often defined according to the 5V model (volume, velocity, variety, veracity and value), is seen as the key towards personalized healthcare. However, it also confronts us with new technological and ethical challenges that require more sophisticated data management tools and data analysis

  15. Big Data: Concept, Potentialities and Vulnerabilities

    Directory of Open Access Journals (Sweden)

    Fernando Almeida

    2018-03-01

    The evolution of information systems and the growth in the use of the Internet and social networks have caused an explosion in the amount of available data relevant to the activities of companies. Therefore, the treatment of these available data is vital to support operational, tactical and strategic decisions. This paper aims to present the concept of big data and the main technologies that support the analysis of large data volumes. The potential of big data is explored across nine sectors of activity: financial, retail, healthcare, transports, agriculture, energy, manufacturing, public, and media and entertainment. In addition, the main current opportunities, vulnerabilities and privacy challenges of big data are discussed. It was possible to conclude that, despite the potential for big data to grow in the previously identified areas, there are still some challenges that need to be considered and mitigated, namely the privacy of information, the existence of qualified human resources to work with Big Data and the promotion of a data-driven organizational culture.

  16. The Shadow of Big Data: Data-Citizenship and Exclusion

    DEFF Research Database (Denmark)

    Rossi, Luca; Hjelholt, Morten; Neumayer, Christina

    2016-01-01

    The shadow of Big Data: data-citizenship and exclusion. Big data are understood as being able to provide insights on human behaviour at an individual as well as at an aggregated societal level (Manyika et al. 2011). These insights are expected to be more detailed and precise than anything before, thanks to the large volume of digital data and to the unobtrusive nature of the data collection (Fishleigh 2014). Within this perspective, these two dimensions (volume and unobtrusiveness) define contemporary big data techniques as a socio-technical offering to society, a live representation of itself… this process "data-citizenship" emerges. Data-citizenship assumes that citizens will be visible to the state through the data they produce. On a general level data-citizenship shifts citizenship from an intrinsic status of a group of people to a status achieved through action. This approach assumes equal

  17. Big Data Provenance: Challenges, State of the Art and Opportunities

    OpenAIRE

    Wang, Jianwu; Crawl, Daniel; Purawat, Shweta; Nguyen, Mai; Altintas, Ilkay

    2015-01-01

    Ability to track provenance is a key feature of scientific workflows to support data lineage and reproducibility. The challenges that are introduced by the volume, variety and velocity of Big Data also pose related challenges for provenance and quality of Big Data, defined as veracity. The increasing size and variety of distributed Big Data provenance information bring new technical challenges and opportunities throughout the provenance lifecycle, including recording, querying, sharing and utilization.

  18. Big Data Provenance: Challenges, State of the Art and Opportunities.

    Science.gov (United States)

    Wang, Jianwu; Crawl, Daniel; Purawat, Shweta; Nguyen, Mai; Altintas, Ilkay

    2015-01-01

    Ability to track provenance is a key feature of scientific workflows to support data lineage and reproducibility. The challenges that are introduced by the volume, variety and velocity of Big Data also pose related challenges for provenance and quality of Big Data, defined as veracity. The increasing size and variety of distributed Big Data provenance information bring new technical challenges and opportunities throughout the provenance lifecycle, including recording, querying, sharing and utilization. This paper discusses the challenges and opportunities of Big Data provenance related to the veracity of the datasets themselves and the provenance of the analytical processes that analyze these datasets. It also explains our current efforts towards tracking and utilizing Big Data provenance using workflows as a programming model to analyze Big Data.

  19. Epidemiology in the Era of Big Data

    Science.gov (United States)

    Mooney, Stephen J; Westreich, Daniel J; El-Sayed, Abdulrahman M

    2015-01-01

    Big Data has increasingly been promoted as a revolutionary development in the future of science, including epidemiology. However, the definition and implications of Big Data for epidemiology remain unclear. We here provide a working definition of Big Data predicated on the so-called ‘3 Vs’: variety, volume, and velocity. From this definition, we argue that Big Data has evolutionary and revolutionary implications for identifying and intervening on the determinants of population health. We suggest that as more sources of diverse data become publicly available, the ability to combine and refine these data to yield valid answers to epidemiologic questions will be invaluable. We conclude that, while epidemiology as practiced today will continue to be practiced in the Big Data future, a component of our field’s future value lies in integrating subject matter knowledge with increased technical savvy. Our training programs and our visions for future public health interventions should reflect this future. PMID:25756221

  20. Techniques and environments for big data analysis: parallel, cloud, and grid computing

    CERN Document Server

    Dehuri, Satchidananda; Kim, Euiwhan; Wang, Gi-Name

    2016-01-01

    This volume aims at a wide range of readers and researchers in the area of Big Data, presenting recent advances in the field of Big Data analysis as well as the techniques and tools used to analyze it. The book includes 10 distinct chapters providing a concise introduction to Big Data analysis and to recent techniques and environments for Big Data analysis. It gives insight into how the expensive fitness evaluation of evolutionary learning can play a vital role in big data analysis by adopting Parallel, Grid, and Cloud computing environments.

  1. Medical big data: promise and challenges.

    Science.gov (United States)

    Lee, Choong Ho; Yoon, Hyung-Jin

    2017-03-01

    The concept of big data, commonly characterized by volume, variety, velocity, and veracity, goes far beyond the data type and includes aspects of data analysis, such as being hypothesis-generating rather than hypothesis-testing. Big data focuses on the temporal stability of an association rather than on causal relationships, and underlying probability distribution assumptions are frequently not required. Medical big data, as material to be analyzed, has various features that are not only distinct from big data of other disciplines but also distinct from traditional clinical epidemiology. Big data technology has many areas of application in healthcare, such as predictive modeling and clinical decision support, disease or safety surveillance, public health, and research. Big data analytics frequently exploits analytic methods developed in data mining, including classification, clustering, and regression. Medical big data analyses are complicated by many technical issues, such as missing values, the curse of dimensionality, and bias control, and share the inherent limitations of observational studies, namely the inability to test causality resulting from residual confounding and reverse causation. Recently, propensity score analysis and instrumental variable analysis have been introduced to overcome these limitations, and they have accomplished a great deal. Many challenges, such as the absence of evidence of the practical benefits of big data, methodological issues including legal and ethical issues, and clinical integration and utility issues, must be overcome to realize the promise of medical big data as the fuel of a continuous learning healthcare system that will improve patient outcomes and reduce waste in areas including nephrology.
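
    As a minimal sketch of the propensity-score idea mentioned above (the synthetic dataset and the inverse-probability-weighting estimator below are illustrative choices, not the paper's method):

      # Propensity-score sketch on synthetic data (illustrative only).
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      n = 5000
      x = rng.normal(size=(n, 3))                      # measured confounders
      t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))  # confounded treatment
      y = 2.0 * t + x @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

      ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]  # propensity scores
      w = np.where(t == 1, 1 / ps, 1 / (1 - ps))       # inverse probability weights
      ate = (np.average(y[t == 1], weights=w[t == 1])
             - np.average(y[t == 0], weights=w[t == 0]))
      print(f"weighted effect estimate: {ate:.2f}")    # close to the true value 2.0

    The weighting rebalances treated and untreated groups on the measured confounders; as the abstract notes, no such adjustment can remove residual confounding from unmeasured variables.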

  2. Medical big data: promise and challenges

    Directory of Open Access Journals (Sweden)

    Choong Ho Lee

    2017-03-01

    The concept of big data, commonly characterized by volume, variety, velocity, and veracity, goes far beyond the data type and includes aspects of data analysis, such as being hypothesis-generating rather than hypothesis-testing. Big data focuses on the temporal stability of an association rather than on causal relationships, and underlying probability distribution assumptions are frequently not required. Medical big data, as material to be analyzed, has various features that are not only distinct from big data of other disciplines but also distinct from traditional clinical epidemiology. Big data technology has many areas of application in healthcare, such as predictive modeling and clinical decision support, disease or safety surveillance, public health, and research. Big data analytics frequently exploits analytic methods developed in data mining, including classification, clustering, and regression. Medical big data analyses are complicated by many technical issues, such as missing values, the curse of dimensionality, and bias control, and share the inherent limitations of observational studies, namely the inability to test causality resulting from residual confounding and reverse causation. Recently, propensity score analysis and instrumental variable analysis have been introduced to overcome these limitations, and they have accomplished a great deal. Many challenges, such as the absence of evidence of the practical benefits of big data, methodological issues including legal and ethical issues, and clinical integration and utility issues, must be overcome to realize the promise of medical big data as the fuel of a continuous learning healthcare system that will improve patient outcomes and reduce waste in areas including nephrology.

  3. Sonar atlas of caverns comprising the U.S. Strategic Petroleum Reserve. Volume 2, Big Hill Site, Texas.

    Energy Technology Data Exchange (ETDEWEB)

    Rautman, Christopher Arthur; Lord, Anna Snider

    2007-08-01

    Downhole sonar surveys from the four active U.S. Strategic Petroleum Reserve sites have been modeled and used to generate a four-volume sonar atlas showing the three-dimensional geometry of each cavern. Volume 2 focuses on the Big Hill SPR site, located in southeastern Texas. Volumes 1, 3, and 4, respectively, present images for the Bayou Choctaw SPR site, Louisiana, the Bryan Mound SPR site, Texas, and the West Hackberry SPR site, Louisiana. The atlas uses a consistent presentation format throughout. The basic geometric measurements provided by the down-cavern surveys have also been used to generate a number of geometric attributes, the values of which have been mapped onto the geometric form of each cavern using a color-shading scheme. The intent of the various geometric attributes is to highlight deviations of the cavern shape from the idealized cylindrical form of a carefully leached underground storage cavern in salt. The atlas format does not allow interpretation of such geometric deviations and anomalies. However, significant geometric anomalies not directly related to the leaching history of the cavern may provide insight into the internal structure of the relevant salt dome.

  4. Big Spectrum Data: The New Resource for Cognitive Wireless Networking

    OpenAIRE

    Ding, Guoru; Wu, Qihui; Wang, Jinlong; Yao, Yu-Dong

    2014-01-01

    The era of Big Data is here now, which has brought both unprecedented opportunities and critical challenges. In this article, from a perspective of cognitive wireless networking, we start with a definition of Big Spectrum Data by analyzing its characteristics in terms of six Vs, i.e., volume, variety, velocity, veracity, viability, and value. We then present a high-level tutorial on research frontiers in Big Spectrum Data analytics to guide the development of practical algorithms. We also hig...

  5. Soft computing in big data processing

    CERN Document Server

    Park, Seung-Jong; Lee, Jee-Hyong

    2014-01-01

    Big data is an essential key to building a smart world, meaning the streaming, continuous integration of large-volume, high-velocity data from all sources to final destinations. Big data work ranges from data mining to data analysis and decision making, drawing statistical rules and mathematical patterns through systematic or automated reasoning. Big data helps serve our lives better, clarify our future and deliver greater value. We can discover how to capture and analyze data. Readers will be guided through processing-system integrity and implementing intelligent systems. With intelligent systems, we deal with the fundamental data management and visualization challenges in effective management of dynamic and large-scale data, and efficient processing of real-time and spatio-temporal data. Advanced intelligent systems have led to managing data monitoring, data processing and decision-making in a realistic and effective way. Considering the big size of data, variety of data and frequent chan...

  6. Energy scale of the Big Bounce

    International Nuclear Information System (INIS)

    Malkiewicz, Przemyslaw; Piechocki, Wlodzimierz

    2009-01-01

    We examine the nature of the cosmological Big Bounce transition within the loop geometry underlying loop quantum cosmology at classical and quantum levels. Our canonical quantization method is an alternative to the standard loop quantum cosmology. An evolution parameter we use has a clear interpretation. Our method opens the door for analyses of spectra of physical observables like the energy density and the volume operator. We find that one cannot determine the energy scale specific to the Big Bounce by making use of the loop geometry without an extra input from observational cosmology.

  7. Big Data in the Earth Observing System Data and Information System

    Science.gov (United States)

    Lynnes, Chris; Baynes, Katie; McInerney, Mark

    2016-01-01

    Approaches being pursued for the Earth Observing System Data and Information System (EOSDIS) to address the challenges of Big Data were presented to the NASA Big Data Task Force. Cloud prototypes are underway to tackle the volume challenge of Big Data. However, advances in computer hardware or the cloud won't help (much) with variety. Rather, interoperability standards, conventions, and community engagement are the key to addressing variety.

  8. GEOSS: Addressing Big Data Challenges

    Science.gov (United States)

    Nativi, S.; Craglia, M.; Ochiai, O.

    2014-12-01

    In the sector of Earth Observation, the explosion of data is due to many factors including: new satellite constellations, the increased capabilities of sensor technologies, social media, crowdsourcing, and the need for multidisciplinary and collaborative research to face Global Changes. In this area, there are many expectations and concerns about Big Data. Vendors have attempted to use this term for their commercial purposes. It is necessary to understand whether Big Data is a radical shift or an incremental change for the existing digital infrastructures. This presentation tries to explore and discuss the impact of Big Data challenges and new capabilities on the Global Earth Observation System of Systems (GEOSS) and particularly on its common digital infrastructure called GCI. GEOSS is a global and flexible network of content providers allowing decision makers to access an extraordinary range of data and information at their desk. The impact of the Big Data dimensionalities (commonly known as 'V' axes: volume, variety, velocity, veracity, visualization) on GEOSS is discussed. The main solutions and experimentation developed by GEOSS along these axes are introduced and analyzed. GEOSS is a pioneering framework for global and multidisciplinary data sharing in the Earth Observation realm; its experience on Big Data is valuable for the many lessons learned.

  9. Toward a Literature-Driven Definition of Big Data in Healthcare.

    Science.gov (United States)

    Baro, Emilie; Degoul, Samuel; Beuscart, Régis; Chazard, Emmanuel

    2015-01-01

    The aim of this study was to provide a definition of big data in healthcare. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. A total of 196 papers were included. Big data can be defined as datasets with log(n × p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Conversely, data can be reused without necessarily being big, for example, secondary use of Electronic Medical Records (EMR) data.
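
    Taken at face value, the proposed criterion is mechanical to check (a trivial sketch; base-10 logarithms are assumed, consistent with the magnitudes involved):

      # Check the proposed criterion log10(n * p) >= 7 (base 10 assumed).
      import math

      def is_big_data(n_individuals: int, p_variables: int) -> bool:
          return math.log10(n_individuals * p_variables) >= 7

      print(is_big_data(10_000, 50))    # log10(5e5) ~ 5.7 -> False
      print(is_big_data(200_000, 500))  # log10(1e8) = 8.0 -> True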

  10. Big Data for Infectious Disease Surveillance and Modeling

    OpenAIRE

    Bansal, Shweta; Chowell, Gerardo; Simonsen, Lone; Vespignani, Alessandro; Viboud, Cécile

    2016-01-01

    We devote a special issue of the Journal of Infectious Diseases to review the recent advances of big data in strengthening disease surveillance, monitoring medical adverse events, informing transmission models, and tracking patient sentiments and mobility. We consider a broad definition of big data for public health, one encompassing patient information gathered from high-volume electronic health records and participatory surveillance systems, as well as mining of digital traces such as socia...

  11. Big data analytics: turning big data into big money

    CERN Document Server

    Ohlhorst, Frank J

    2012-01-01

    Unique insights to implement big data analytics and reap big returns to your bottom line Focusing on the business and financial value of big data analytics, respected technology journalist Frank J. Ohlhorst shares his insights on the newly emerging field of big data analytics in Big Data Analytics. This breakthrough book demonstrates the importance of analytics, defines the processes, highlights the tangible and intangible values and discusses how you can turn a business liability into actionable material that can be used to redefine markets, improve profits and identify new business opportuni

  12. [Big data and their perspectives in radiation therapy].

    Science.gov (United States)

    Guihard, Sébastien; Thariat, Juliette; Clavier, Jean-Baptiste

    2017-02-01

    The concept of big data indicates a change of scale in the use of data and in data aggregation into large databases through improved computer technology. One of the current challenges in the creation of big data in the context of radiation therapy is the transformation of routine care items into dark data, i.e. data not yet collected, and the fusion of databases collecting different types of information (dose-volume histograms and toxicity data, for example). Processes and infrastructures devoted to big data collection should not negatively impact the doctor-patient relationship, the general process of care or the quality of the data collected. The use of big data requires a collective effort by physicians, physicists, software manufacturers and health authorities to create, organize and exploit big data in radiotherapy and, beyond, in oncology. Big data involves a new culture to build an appropriate infrastructure legally and ethically. Processes and issues are discussed in this article. Copyright © 2016 Société Française du Cancer. Published by Elsevier Masson SAS. All rights reserved.

  13. Commentary: Epidemiology in the era of big data.

    Science.gov (United States)

    Mooney, Stephen J; Westreich, Daniel J; El-Sayed, Abdulrahman M

    2015-05-01

    Big Data has increasingly been promoted as a revolutionary development in the future of science, including epidemiology. However, the definition and implications of Big Data for epidemiology remain unclear. We here provide a working definition of Big Data predicated on the so-called "three V's": variety, volume, and velocity. From this definition, we argue that Big Data has evolutionary and revolutionary implications for identifying and intervening on the determinants of population health. We suggest that as more sources of diverse data become publicly available, the ability to combine and refine these data to yield valid answers to epidemiologic questions will be invaluable. We conclude that while epidemiology as practiced today will continue to be practiced in the Big Data future, a component of our field's future value lies in integrating subject matter knowledge with increased technical savvy. Our training programs and our visions for future public health interventions should reflect this future.

  14. Market research & the ethics of big data

    OpenAIRE

    Nunan, Daniel; Di Domenico, M.

    2013-01-01

    The term ‘big data’ has recently emerged to describe a range of technological and commercial trends enabling the storage and analysis of huge amounts of customer data, such as that generated by social networks and mobile devices. Much of the commercial promise of big data is in the ability to generate valuable insights from collecting new types and volumes of data in ways that were not previously economically viable. At the same time a number of questions have been raised about...

  15. From big data to smart data

    CERN Document Server

    Iafrate, Fernando

    2015-01-01

    A pragmatic approach to Big Data, taking the reader on a journey between Big Data (what it is) and Smart Data (what it is for). Today's decision making can be reached via information (related to the data), knowledge (related to people and processes), and timing (the capacity to decide, act and react at the right time). The huge increase in the volume of data traffic and its format (unstructured data such as blogs, logs, and video) generated by the "digitalization" of our world radically modifies our relationship to the space (in motion) and time dimensions and, by capillarity, to the enterprise

  16. Big Data and Regional Science: Opportunities, Challenges, and Directions for Future Research

    OpenAIRE

    Schintler, Laurie A.; Fischer, Manfred M.

    2018-01-01

    Recent technological, social, and economic trends and transformations are contributing to the production of what is usually referred to as Big Data. Big Data, which is typically defined by four dimensions -- Volume, Velocity, Veracity, and Variety -- changes the methods and tactics for using, analyzing, and interpreting data, requiring new approaches for data provenance, data processing, data analysis and modeling, and knowledge representation. The use and analysis of Big Data involves severa...

  17. Big Data and HPC: A Happy Marriage

    KAUST Repository

    Mehmood, Rashid

    2016-01-01

    International Data Corporation (IDC) defines Big Data technologies as "a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data produced every day, by enabling high-velocity capture, discovery, and/or analysis."

  18. Big Opportunities and Big Concerns of Big Data in Education

    Science.gov (United States)

    Wang, Yinying

    2016-01-01

    Against the backdrop of the ever-increasing influx of big data, this article examines the opportunities and concerns over big data in education. Specifically, this article first introduces big data, followed by delineating the potential opportunities of using big data in education in two areas: learning analytics and educational policy. Then, the…

  19. Simulation Experiments: Better data, not just big data

    OpenAIRE

    Sanchez, Susan M.

    2014-01-01

    Proceedings of the 2014 Winter Simulation Conference (WSC '14), pages 805-816. Data mining tools have been around for several decades, but the term “big data” has only recently captured widespread attention. Numerous success stories have been promulgated as organizations have sifted through massive volumes of data to find interesting patterns that are, in turn, transformed into actionable information. Yet a key drawback to the big data paradigm is that it relies on obser...

  20. Information granularity, big data, and computational intelligence

    CERN Document Server

    Chen, Shyi-Ming

    2015-01-01

    Recent pursuits in the realm of big data processing, interpretation, collection and organization have emerged in numerous sectors, including business, industry, and government organizations. Data sets such as customer transactions for a mega-retailer, weather monitoring, and intelligence gathering quickly outpace the capacities of traditional techniques and tools of data analysis. The 3V (volume, variability and velocity) challenges led to the emergence of new techniques and tools in data visualization, acquisition, and serialization. Soft Computing, regarded as a plethora of technologies of fuzzy sets (or Granular Computing), neurocomputing and evolutionary optimization, brings forward a number of unique features that might be instrumental to the development of concepts and algorithms to deal with big data. This carefully edited volume provides the reader with an updated, in-depth material on the emerging principles, conceptual underpinnings, algorithms and practice of Computational Intelligenc...

  1. Toward a Literature-Driven Definition of Big Data in Healthcare

    Directory of Open Access Journals (Sweden)

    Emilie Baro

    2015-01-01

    Full Text Available Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with log(n*p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data.

  2. Toward a Literature-Driven Definition of Big Data in Healthcare

    Science.gov (United States)

    Baro, Emilie; Degoul, Samuel; Beuscart, Régis; Chazard, Emmanuel

    2015-01-01

    Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with log(n∗p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data. PMID:26137488
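
    As a quick check of the definition above, the threshold can be evaluated directly; this is a minimal sketch, assuming a base-10 logarithm (the base is not stated in the record), with made-up dataset sizes:

```python
import math

# The paper's criterion: a dataset with n statistical individuals and
# p variables counts as "big data" when log10(n * p) >= 7.
def is_big_data(n: int, p: int, threshold: float = 7.0) -> bool:
    return math.log10(n * p) >= threshold

print(is_big_data(n=10_000, p=50))       # log10(5e5) ~= 5.7 -> False
print(is_big_data(n=500_000, p=2_000))   # log10(1e9) == 9.0 -> True
```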

  3. Big Data: Evaluating business value and firm performance

    OpenAIRE

    Vitari , Claudio; Raguseo , Elisabetta

    2016-01-01

    This report is an output of a research project co-financed by Grenoble Ecole de Management and the Auvergne-Rhône-Alpes French region. This study was conducted with the aim of understanding how Information Technology (IT) provides new opportunities to firms, specifically focusing on the role of Big Data in creating value for companies. Gartner defines Big Data as "high volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation."

  4. A Survey of Scholarly Data: From Big Data Perspective

    DEFF Research Database (Denmark)

    Khan, Samiya; Liu, Xiufeng; Shakil, Kashish A.

    2017-01-01

    Recently, there has been a shifting focus of organizations and governments towards digitization of academic and technical documents, adding a new facet to the concept of digital libraries. The volume, variety and velocity of this generated data satisfies the big data definition, as a result of which this scholarly reserve is popularly referred to as big scholarly data. In order to facilitate data analytics for big scholarly data, architectures and services for the same need to be developed. The evolving nature of research problems has made them essentially interdisciplinary. As a result, there is a growing demand for scholarly applications like collaborator discovery, expert finding and research recommendation systems, in addition to several others. This research paper investigates the current trends and identifies the existing challenges in the development of a big scholarly data platform.

  5. Big data in forensic science and medicine.

    Science.gov (United States)

    Lefèvre, Thomas

    2018-07-01

    In less than a decade, big data in medicine has become quite a phenomenon, and many biomedical disciplines have their own tribune on the topic. Perspectives and debates are flourishing while a consensual definition of big data is still lacking. The 3Vs paradigm is frequently evoked to define the big data principles and stands for Volume, Variety and Velocity. Even according to this paradigm, genuine big data studies are still scarce in medicine and may not meet all expectations. On the one hand, techniques usually presented as specific to big data, such as machine learning techniques, are supposed to support the ambition of personalized, predictive and preventive medicine. These techniques are mostly far from new; the most ancient are more than 50 years old. On the other hand, several issues closely related to the properties of big data and inherited from other scientific fields such as artificial intelligence are often underestimated, if not ignored. Besides, a few papers temper the almost unanimous big data enthusiasm and are worth attention, since they delineate what is at stake. In this context, forensic science is still awaiting its position papers as well as a comprehensive outline of what kind of contribution big data could bring to the field. The present situation calls for definitions and actions to rationally guide research and practice in big data. It is an opportunity for grounding a truly interdisciplinary, evidence-based approach in forensic science and medicine. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  6. Big Data Analytics in Chemical Engineering.

    Science.gov (United States)

    Chiang, Leo; Lu, Bo; Castillo, Ivan

    2017-06-07

    Big data analytics is the journey to turn data into insights for more informed business and operational decisions. As the chemical engineering community is collecting more data (volume) from different sources (variety), this journey becomes more challenging in terms of using the right data and the right tools (analytics) to make the right decisions in real time (velocity). This article highlights recent big data advancements in five industries, including chemicals, energy, semiconductors, pharmaceuticals, and food, and then discusses technical, platform, and culture challenges. To reach the next milestone in multiplying successes to the enterprise level, government, academia, and industry need to collaboratively focus on workforce development and innovation.

  7. [Utilization of Big Data in Medicine and Future Outlook].

    Science.gov (United States)

    Kinosada, Yasutomi; Uematsu, Machiko; Fujiwara, Takuya

    2016-03-01

    "Big data" is a new buzzword. The point is not to be dazzled by the volume of data, but rather to analyze it, and convert it into insights, innovations, and business value. There are also real differences between conventional analytics and big data. In this article, we show some results of big data analysis using open DPC (Diagnosis Procedure Combination) data in areas of the central part of JAPAN: Toyama, Ishikawa, Fukui, Nagano, Gifu, Aichi, Shizuoka, and Mie Prefectures. These 8 prefectures contain 51 medical administration areas called the second medical area. By applying big data analysis techniques such as k-means, hierarchical clustering, and self-organizing maps to DPC data, we can visualize the disease structure and detect similarities or variations among the 51 second medical areas. The combination of a big data analysis technique and open DPC data is a very powerful method to depict real figures on patient distribution in Japan.

  8. Big data analysis new algorithms for a new society

    CERN Document Server

    Stefanowski, Jerzy

    2016-01-01

    This edited volume is devoted to Big Data Analysis from a Machine Learning standpoint as presented by some of the most eminent researchers in this area. It demonstrates that Big Data Analysis opens up new research problems which were either never considered before, or were only considered within a limited range. In addition to providing methodological discussions on the principles of mining Big Data and the difference between traditional statistical data analysis and newer computing frameworks, this book presents recently developed algorithms affecting such areas as business, financial forecasting, human mobility, the Internet of Things, information networks, bioinformatics, medical systems and life science. It explores, through a number of specific examples, how the study of Big Data Analysis has evolved and how it has started and will most likely continue to affect society. While the benefits brought about by Big Data Analysis are underlined, the book also discusses some of the warnings that have been issued...

  9. Frameworks for management, storage and preparation of large data volumes Big Data

    Directory of Open Access Journals (Sweden)

    Marco Antonio Almeida Pamiño

    2017-05-01

    Full Text Available Weather systems like the World Meteorological Organization's Global Information System need to store different kinds of images, data and files. Big Data and its 3V paradigm can provide a suitable solution to this problem. This tutorial presents some concepts around the Hadoop framework, the de facto standard implementation of Big Data, and shows how to store semi-structured data generated by automatic weather stations using this framework. Finally, a formal method to generate weather reports using frameworks from Hadoop's ecosystem is presented.
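
    The record names Hadoop as the storage layer; below is a minimal sketch of one common ingestion pattern, here via PySpark, in which the paths and fields are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("aws-ingest").getOrCreate()

# Semi-structured automatic-weather-station readings, one JSON record per
# line, e.g. {"station": "X1", "ts": "2017-05-01T12:00", "temp_c": 21.4}.
raw = spark.read.json("hdfs:///raw/aws/2017/05/*.json")

(raw.dropDuplicates(["station", "ts"])   # drop retransmitted readings
    .write.mode("append")
    .partitionBy("station")              # one HDFS directory per station
    .parquet("hdfs:///warehouse/aws_readings"))
```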

  10. Big data challenges: Impact, potential responses and research needs

    OpenAIRE

    Bachlechner, Daniel; Leimbach, Timo

    2016-01-01

    Although reports on big data success stories have been accumulating in the media, most organizations dealing with high-volume, high-velocity and high-variety information assets still face challenges. Only a thorough understanding of these challenges puts organizations into a position in which they can make an informed decision for or against big data, and, if the decision is positive, overcome the challenges smoothly. The combination of a series of interviews with leading experts from enterpr...

  11. On 'light' fermions and proton stability in 'big divisor' D3/D7 large volume compactifications

    International Nuclear Information System (INIS)

    Misra, Aalok; Shukla, Pramod

    2011-01-01

    Building on our earlier work (Misra and Shukla, Nucl. Phys. B 827:112, 2010; Phys. Lett. B 685:347-352, 2010), we show the possibility of generating "light" fermion mass scales of MeV-GeV range (possibly related to the first two generations of quarks/leptons) as well as eV (possibly related to first two generations of neutrinos) in type IIB string theory compactified on Swiss-Cheese orientifolds in the presence of a mobile space-time filling D3-brane restricted to (in principle) stacks of fluxed D7-branes wrapping the "big" divisor Σ_B. This part of the paper is an expanded version of the latter half of Sect. 3 of a published short invited review (Misra, Mod. Phys. Lett. A 26:1, 2011) written by one of the authors [AM]. Further, we also show that there are no SUSY GUT-type dimension-five operators corresponding to proton decay, and we estimate the proton lifetime from a SUSY GUT-type four-fermion dimension-six operator to be 10⁶¹ years. Based on GLSM calculations in (Misra and Shukla, Nucl. Phys. B 827:112, 2010) for obtaining the geometric Kaehler potential for the "big divisor," further using Donaldson's algorithm, we also briefly discuss in the first of the two appendices the metric for the Swiss-Cheese Calabi-Yau used, which we obtain and which becomes Ricci flat in the large-volume limit. (orig.)

  12. On `light' fermions and proton stability in `big divisor' D3/ D7 large volume compactifications

    Science.gov (United States)

    Misra, Aalok; Shukla, Pramod

    2011-06-01

    Building on our earlier work (Misra and Shukla, Nucl. Phys. B 827:112, 2010; Phys. Lett. B 685:347-352, 2010), we show the possibility of generating "light" fermion mass scales of MeV-GeV range (possibly related to the first two generations of quarks/leptons) as well as eV (possibly related to first two generations of neutrinos) in type IIB string theory compactified on Swiss-Cheese orientifolds in the presence of a mobile space-time filling D3-brane restricted to (in principle) stacks of fluxed D7-branes wrapping the "big" divisor Σ_B. This part of the paper is an expanded version of the latter half of Sect. 3 of a published short invited review (Misra, Mod. Phys. Lett. A 26:1, 2011) written by one of the authors [AM]. Further, we also show that there are no SUSY GUT-type dimension-five operators corresponding to proton decay, and we estimate the proton lifetime from a SUSY GUT-type four-fermion dimension-six operator to be 10⁶¹ years. Based on GLSM calculations in (Misra and Shukla, Nucl. Phys. B 827:112, 2010) for obtaining the geometric Kähler potential for the "big divisor," further using Donaldson's algorithm, we also briefly discuss in the first of the two appendices the metric for the Swiss-Cheese Calabi-Yau used, which we obtain and which becomes Ricci flat in the large-volume limit.

  13. Big data and biomedical informatics: a challenging opportunity.

    Science.gov (United States)

    Bellazzi, R

    2014-05-22

    Big data are receiving increasing attention in biomedicine and healthcare. It is therefore important to understand the reason why big data are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery. Therefore, it is first necessary to deeply understand the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and their meaning in practice. Then, it is mandatory to understand where big data are present, and where they can be beneficially collected. There are research fields, such as translational bioinformatics, which need to rely on big data technologies to withstand the shock wave of data that is generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from the exploitation of the large amounts of data that are nowadays available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications in terms of reproducibility of research studies and management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and concept drift machine learning algorithms, which will not only contribute to big data research, but may be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions or over-enthusiasms, to fully exploit the available technologies, and to improve data processing and data management regulations.

  14. Big data for space situation awareness

    Science.gov (United States)

    Blasch, Erik; Pugh, Mark; Sheaff, Carolyn; Raquepas, Joe; Rocci, Peter

    2017-05-01

    Recent advances in big data (BD) have focused research on the volume, velocity, veracity, and variety of data. These developments enable new opportunities in information management, visualization, machine learning, and information fusion that have potential implications for space situational awareness (SSA). In this paper, we explore some of these BD trends as applicable to SSA, towards enhancing the space operating picture. The BD developments could increase measures of performance and measures of effectiveness for future management of the space environment. The global SSA influences include resident space object (RSO) tracking and characterization, cyber protection, remote sensing, and information management. Local satellite awareness can benefit from space weather, health monitoring, and spectrum management for situation space understanding. One area of big data of importance to SSA is value - getting the correct data/information at the right time, which corresponds to SSA visualization for the operator. An SSA big data example is presented supporting disaster relief for space situation awareness, assessment, and understanding.

  15. Entering the 'big data' era in medicinal chemistry: molecular promiscuity analysis revisited.

    Science.gov (United States)

    Hu, Ye; Bajorath, Jürgen

    2017-06-01

    The 'big data' concept plays an increasingly important role in many scientific fields. Big data involves more than unprecedentedly large volumes of data that become available. Different criteria characterizing big data must be carefully considered in computational data mining, as we discuss herein focusing on medicinal chemistry. This is a scientific discipline where big data is beginning to emerge and provide new opportunities. For example, the ability of many drugs to specifically interact with multiple targets, termed promiscuity, forms the molecular basis of polypharmacology, a hot topic in drug discovery. Compound promiscuity analysis is an area that is much influenced by big data phenomena. Different results are obtained depending on chosen data selection and confidence criteria, as we also demonstrate.

  16. Education Policy Research in the Big Data Era: Methodological Frontiers, Misconceptions, and Challenges

    Science.gov (United States)

    Wang, Yinying

    2017-01-01

    Despite abundant data and increasing data availability brought by technological advances, there has been very limited education policy studies that have capitalized on big data--characterized by large volume, wide variety, and high velocity. Drawing on the recent progress of using big data in public policy and computational social science…

  17. Implementing the “Big Data” Concept in Official Statistics

    Directory of Open Access Journals (Sweden)

    О. V.

    2017-02-01

    Full Text Available Big data is a huge resource that needs to be used at all levels of economic planning. The article is devoted to the study of the development of the concept of “Big Data” in the world and its impact on the transformation of statistical simulation of economic processes. Statistics at the current stage should take into account the complex system of international economic relations, which functions in the conditions of globalization and brings new forms of economic development in small open economies. Statistical science should take into account such phenomena as gig-economy, common economy, institutional factors, etc. The concept of “Big Data” and open data are analyzed, problems of implementation of “Big Data” in the official statistics are shown. The ways of implementation of “Big Data” in the official statistics of Ukraine through active use of technological opportunities of mobile operators, navigation systems, surveillance cameras, social networks, etc. are presented. The possibilities of using “Big Data” in different sectors of the economy, also on the level of companies are shown. The problems of storage of large volumes of data are highlighted. The study shows that “Big Data” is a huge resource that should be used across the Ukrainian economy.

  18. Big Data Analytics for Prostate Radiotherapy.

    Science.gov (United States)

    Coates, James; Souhami, Luis; El Naqa, Issam

    2016-01-01

    Radiation therapy is a first-line treatment option for localized prostate cancer, and radiation-induced normal tissue damage is often the main limiting factor for modern radiotherapy regimens. Conversely, under-dosing of target volumes in an attempt to spare adjacent healthy tissues limits the likelihood of achieving local, long-term control. Thus, the ability to generate personalized data-driven risk profiles for radiotherapy outcomes would provide valuable prognostic information to help guide both clinicians and patients alike. Big data applied to radiation oncology promises to deliver better understanding of outcomes by harvesting and integrating heterogeneous data types, including patient-specific clinical parameters, treatment-related dose-volume metrics, and biological risk factors. When taken together, such variables make up the basis for a multi-dimensional space (the "RadoncSpace") in which the presented modeling techniques search in order to identify significant predictors. Herein, we review outcome modeling and big data-mining techniques for both tumor control and radiotherapy-induced normal tissue effects. We apply many of the presented modeling approaches onto a cohort of hypofractionated prostate cancer patients, taking into account different data types and a large heterogeneous mix of physical and biological parameters. Cross-validation techniques are also reviewed for the refinement of the proposed framework architecture and checking individual model performance. We conclude by considering advanced modeling techniques that borrow concepts from big data analytics, such as machine learning and artificial intelligence, before discussing the potential future impact of systems radiobiology approaches.
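
    A minimal sketch of the outcome-modeling and cross-validation workflow the review surveys, with synthetic stand-ins for the dose-volume and biological features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Hypothetical cohort: two dose-volume metrics and one biological risk
# factor per patient, with a binary toxicity outcome.
X = rng.normal(size=(200, 3))
logit = X @ np.array([0.8, 0.5, 0.3]) + rng.normal(scale=1.0, size=200)
y = (logit > 0).astype(int)

aucs = cross_val_score(LogisticRegression(), X, y, cv=5, scoring="roc_auc")
print(f"5-fold AUC: {aucs.mean():.2f} +/- {aucs.std():.2f}")
```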

  19. Big Data is a powerful tool for environmental improvements in the construction business

    Science.gov (United States)

    Konikov, Aleksandr; Konikov, Gregory

    2017-10-01

    The work investigates the possibility of applying the Big Data method as a tool to implement environmental improvements in the construction business. The method is recognized as effective in analyzing big volumes of heterogeneous data. It is noted that all preconditions exist for this method to be successfully used for resolution of environmental issues in the construction business. It is proven that the principal Big Data techniques (cluster analysis, crowd sourcing, data mixing and integration) can be applied in the sphere in question. It is concluded that Big Data is a truly powerful tool to implement environmental improvements in the construction business.

  20. A Big Data Decision-making Mechanism for Food Supply Chain

    Directory of Open Access Journals (Sweden)

    Ji Guojun

    2017-01-01

    Full Text Available Many companies have captured and analyzed huge volumes of data to improve the decision mechanisms of their supply chains. This paper presents a big data harvest model that uses big data as inputs to make more informed decisions in the food supply chain. By introducing a method of Bayesian networks, this paper integrates sample data and finds cause-and-effect relationships in the data to predict market demand. Then a deduction graph model that translates food demand into processes and divides processes into tasks and assets is presented, along with an example of how big data in the food supply chain can be combined with the Bayesian network and deduction graph model to guide production decisions. Our conclusions indicate that the decision-making mechanism has vast potential for extracting value from big data.
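
    A minimal sketch of the Bayesian-network idea behind the demand prediction, reduced to a two-node network with hand-set conditional probability tables; all numbers are illustrative, not figures from the paper:

```python
# P(Season) and P(Demand | Season): a two-node Bayesian network.
P_season = {"peak": 0.4, "off": 0.6}
P_demand = {"peak": {"high": 0.7, "low": 0.3},
            "off":  {"high": 0.2, "low": 0.8}}

# Marginal demand: P(high) = sum_s P(high | s) * P(s) = 0.28 + 0.12 = 0.40
p_high = sum(P_demand[s]["high"] * P_season[s] for s in P_season)
print(f"P(demand=high) = {p_high:.2f}")

# Diagnosis by Bayes' rule: P(peak | high) = 0.28 / 0.40 = 0.70
p_peak = P_demand["peak"]["high"] * P_season["peak"] / p_high
print(f"P(season=peak | demand=high) = {p_peak:.2f}")
```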

  1. Creating value in health care through big data: opportunities and policy implications.

    Science.gov (United States)

    Roski, Joachim; Bo-Linn, George W; Andrews, Timothy A

    2014-07-01

    Big data has the potential to create significant value in health care by improving outcomes while lowering costs. Big data's defining features include the ability to handle massive data volume and variety at high velocity. New, flexible, and easily expandable information technology (IT) infrastructure, including so-called data lakes and cloud data storage and management solutions, make big-data analytics possible. However, most health IT systems still rely on data warehouse structures. Without the right IT infrastructure, analytic tools, visualization approaches, work flows, and interfaces, the insights provided by big data are likely to be limited. Big data's success in creating value in the health care sector may require changes in current polices to balance the potential societal benefits of big-data approaches and the protection of patients' confidentiality. Other policy implications of using big data are that many current practices and policies related to data use, access, sharing, privacy, and stewardship need to be revised. Project HOPE—The People-to-People Health Foundation, Inc.

  2. Seasonal shifts in the diet of the big brown bat (Eptesicus fuscus), Fort Collins, Colorado

    Science.gov (United States)

    Valdez, Ernest W.; O'Shea, Thomas J.

    2014-01-01

    Recent analyses suggest that the big brown bat (Eptesicus fuscus) may be less of a beetle specialist (Coleoptera) in the western United States than previously thought, and that its diet might also vary with temperature. We tested the hypothesis that big brown bats might opportunistically prey on moths by analyzing insect fragments in guano pellets from 30 individual bats (27 females and 3 males) captured while foraging in Fort Collins, Colorado, during May, late July–early August, and late September 2002. We found that bats sampled 17–20 May (n = 12 bats) had a high (81–83%) percentage of volume of lepidopterans in guano, with the remainder (17–19% volume) dipterans and no coleopterans. From 28 May–9 August (n = 17 bats) coleopterans dominated (74–98% volume). On 20 September (n = 1 bat) lepidopterans were 99% of volume in guano. Migratory miller moths (Euxoa auxiliaris) were unusually abundant in Fort Collins in spring and autumn of 2002 and are known agricultural pests as larvae (army cutworms), suggesting that seasonal dietary flexibility in big brown bats has economic benefits.

  3. A practical guide to big data research in psychology.

    Science.gov (United States)

    Chen, Eric Evan; Wojcik, Sean P

    2016-12-01

    The massive volume of data that now covers a wide variety of human behaviors offers researchers in psychology an unprecedented opportunity to conduct innovative theory- and data-driven field research. This article is a practical guide to conducting big data research, covering data management, acquisition, processing, and analytics (including key supervised and unsupervised learning data mining methods). It is accompanied by walkthrough tutorials on data acquisition, text analysis with latent Dirichlet allocation topic modeling, and classification with support vector machines. Big data practitioners in academia, industry, and the community have built a comprehensive base of tools and knowledge that makes big data research accessible to researchers in a broad range of fields. However, big data research does require knowledge of software programming and a different analytical mindset. For those willing to acquire the requisite skills, innovative analyses of unexpected or previously untapped data sources can offer fresh ways to develop, test, and extend theories. When conducted with care and respect, big data research can become an essential complement to traditional research. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
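
    The two walkthroughs the article mentions map to a few lines of scikit-learn; here is a minimal sketch on toy documents (the texts and class labels are invented):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

docs = ["the team won the match", "a great game for the home team",
        "stock prices fell sharply", "markets rallied on strong earnings"]
labels = ["sports", "sports", "finance", "finance"]

X = CountVectorizer().fit_transform(docs)   # bag-of-words counts

# Unsupervised: discover 2 latent topics with LDA.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X).round(2))            # per-document topic mixtures

# Supervised: linear SVM classifier on the same features.
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```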

  4. 3rd International Symposium on Big Data and Cloud Computing Challenges

    CERN Document Server

    Neelanarayanan, V

    2016-01-01

    This proceedings volume contains selected papers that were presented at the 3rd International Symposium on Big Data and Cloud Computing Challenges, 2016, held at VIT University, India on March 10 and 11. New research issues, challenges and opportunities shaping the future agenda in the field of Big Data and Cloud Computing are identified and presented throughout the book, which is intended for researchers, scholars, students, software developers and practitioners working at the forefront in their field. This book acts as a platform for exchanging ideas, setting questions for discussion, and sharing the experience in the Big Data and Cloud Computing domain.

  5. Big Data and Biomedical Informatics: A Challenging Opportunity

    Science.gov (United States)

    2014-01-01

    Summary Big data are receiving increasing attention in biomedicine and healthcare. It is therefore important to understand the reason why big data are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery. Therefore, it is first necessary to deeply understand the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and their meaning in practice. Then, it is mandatory to understand where big data are present, and where they can be beneficially collected. There are research fields, such as translational bioinformatics, which need to rely on big data technologies to withstand the shock wave of data that is generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from the exploitation of the large amounts of data that are nowadays available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications in terms of reproducibility of research studies and management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and concept drift machine learning algorithms, which will not only contribute to big data research, but may be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions or over-enthusiasms, to fully exploit the available technologies, and to improve data processing and data management regulations. PMID:24853034

  6. A Brief Review on Leading Big Data Models

    Directory of Open Access Journals (Sweden)

    Sugam Sharma

    2014-11-01

    Full Text Available Today, science is passing through an era of transformation, where the inundation of data, dubbed the data deluge, is influencing the decision-making process. Science is now driven by data and is being termed data science. In this internet age, the volume of data has grown to petabytes, and this large, complex, structured or unstructured, heterogeneous data in the form of "Big Data" has gained significant attention. The rapid pace of data growth through various disparate sources, especially social media such as Facebook, has seriously challenged the data analytic capabilities of traditional relational databases. The velocity of the expansion of the amount of data gives rise to a complete paradigm shift in how new-age data is processed. Confidence in the data engineering of existing data processing systems is gradually fading, whereas the capabilities of the new techniques for capturing, storing, visualizing, and analyzing data are evolving. In this review paper, we discuss some of the modern Big Data models that are leading contributors in the NoSQL era and claim to address Big Data challenges in reliable and efficient ways. Also, we take the potential of Big Data into consideration and try to reshape the original operation-oriented definition of "Big Science" (Furner, 2003) into a new data-driven definition and rephrase it as "The science that deals with Big Data is Big Science."

  7. How Big Are "Martin's Big Words"? Thinking Big about the Future.

    Science.gov (United States)

    Gardner, Traci

    "Martin's Big Words: The Life of Dr. Martin Luther King, Jr." tells of King's childhood determination to use "big words" through biographical information and quotations. In this lesson, students in grades 3 to 5 explore information on Dr. King to think about his "big" words, then they write about their own…

  8. Big Data in the Industry - Overview of Selected Issues

    Science.gov (United States)

    Gierej, Sylwia

    2017-12-01

    This article reviews selected issues related to the use of Big Data in industry. The aim is to define the potential scope and forms of using large data sets in manufacturing companies. By systematically reviewing scientific and professional literature, selected issues related to the use of mass data analytics in production were analyzed. A definition of Big Data was presented, detailing its main attributes. The importance of mass data processing technology in the development of the Industry 4.0 concept has been highlighted. Subsequently, attention was paid to issues such as production process optimization, decision making and mass production individualization, and the potential of large data volumes in these areas was indicated. As a result, conclusions were drawn regarding the potential of using Big Data in industry.

  9. Entering the ‘big data’ era in medicinal chemistry: molecular promiscuity analysis revisited

    Science.gov (United States)

    Hu, Ye; Bajorath, Jürgen

    2017-01-01

    The ‘big data’ concept plays an increasingly important role in many scientific fields. Big data involves more than unprecedentedly large volumes of data that become available. Different criteria characterizing big data must be carefully considered in computational data mining, as we discuss herein focusing on medicinal chemistry. This is a scientific discipline where big data is beginning to emerge and provide new opportunities. For example, the ability of many drugs to specifically interact with multiple targets, termed promiscuity, forms the molecular basis of polypharmacology, a hot topic in drug discovery. Compound promiscuity analysis is an area that is much influenced by big data phenomena. Different results are obtained depending on chosen data selection and confidence criteria, as we also demonstrate. PMID:28670471

  10. Sideloading - Ingestion of Large Point Clouds Into the Apache Spark Big Data Engine

    Science.gov (United States)

    Boehm, J.; Liu, K.; Alis, C.

    2016-06-01

    In the geospatial domain we have now reached the point where data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true in the area of point cloud processing. It is therefore naturally lucrative to explore established big data frameworks for big geospatial data. The very first hurdle is the import of geospatial data into big data frameworks, commonly referred to as data ingestion. Geospatial data is typically encoded in specialised binary file formats, which are not naturally supported by the existing big data frameworks. Instead such file formats are supported by software libraries that are restricted to single CPU execution. We present an approach that allows the use of existing point cloud file format libraries on the Apache Spark big data framework. We demonstrate the ingestion of large volumes of point cloud data into a compute cluster. The approach uses a map function to distribute the data ingestion across the nodes of a cluster. We test the capabilities of the proposed method to load billions of points into a commodity hardware compute cluster and we discuss the implications on scalability and performance. The performance is benchmarked against an existing native Apache Spark data import implementation.
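
    A minimal sketch of the pattern described (running an existing single-CPU point-cloud reader inside a Spark map function), assuming PySpark, the laspy LAS reader, and files on storage visible to every worker:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pc-ingest").getOrCreate()
sc = spark.sparkContext

def read_points(path):
    import laspy                        # single-CPU LAS/LAZ format library
    las = laspy.read(path)              # executes on the worker, per file
    return zip(las.x, las.y, las.z)     # yields (x, y, z) tuples

# Hypothetical tile paths on a shared filesystem reachable by all workers.
paths = [f"/data/lidar/tile_{i:03d}.las" for i in range(100)]

# One file per task: the map distributes ingestion across the cluster.
points = sc.parallelize(paths, numSlices=len(paths)).flatMap(read_points)
print(points.count())                   # total number of ingested points
```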

  11. Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives

    Science.gov (United States)

    Miron-Shatz, T.; Lau, A. Y. S.; Paton, C.

    2014-01-01

    Summary Objectives As technology continues to evolve and rise in various industries, such as healthcare, science, education, and gaming, a sophisticated concept known as Big Data is surfacing. The concept of analytics aims to understand data. We set out to portray and discuss perspectives of the evolving use of Big Data in science and healthcare and, to examine some of the opportunities and challenges. Methods A literature review was conducted to highlight the implications associated with the use of Big Data in scientific research and healthcare innovations, both on a large and small scale. Results Scientists and health-care providers may learn from one another when it comes to understanding the value of Big Data and analytics. Small data, derived by patients and consumers, also requires analytics to become actionable. Connectivism provides a framework for the use of Big Data and analytics in the areas of science and healthcare. This theory assists individuals to recognize and synthesize how human connections are driving the increase in data. Despite the volume and velocity of Big Data, it is truly about technology connecting humans and assisting them to construct knowledge in new ways. Concluding Thoughts The concept of Big Data and associated analytics are to be taken seriously when approaching the use of vast volumes of both structured and unstructured data in science and health-care. Future exploration of issues surrounding data privacy, confidentiality, and education are needed. A greater focus on data from social media, the quantified self-movement, and the application of analytics to “small data” would also be useful. PMID:25123717

  12. Curating Big Data Made Simple: Perspectives from Scientific Communities.

    Science.gov (United States)

    Sowe, Sulayman K; Zettsu, Koji

    2014-03-01

    The digital universe is exponentially producing an unprecedented volume of data that has brought benefits as well as fundamental challenges for enterprises and scientific communities alike. This trend is inherently exciting for the development and deployment of cloud platforms to support scientific communities curating big data. The excitement stems from the fact that scientists can now access and extract value from the big data corpus, establish relationships between bits and pieces of information from many types of data, and collaborate with a diverse community of researchers from various domains. However, despite these perceived benefits, to date, little attention is focused on the people or communities who are both beneficiaries and, at the same time, producers of big data. The technical challenges posed by big data are as big as understanding the dynamics of communities working with big data, whether scientific or otherwise. Furthermore, the big data era also means that big data platforms for data-intensive research must be designed in such a way that research scientists can easily search and find data for their research, upload and download datasets for onsite/offsite use, perform computations and analysis, share their findings and research experience, and seamlessly collaborate with their colleagues. In this article, we present the architecture and design of a cloud platform that meets some of these requirements, and a big data curation model that describes how a community of earth and environmental scientists is using the platform to curate data. Motivation for developing the platform, lessons learnt in overcoming some challenges associated with supporting scientists to curate big data, and future research directions are also presented.

  13. Big Data in Plant Science: Resources and Data Mining Tools for Plant Genomics and Proteomics.

    Science.gov (United States)

    Popescu, George V; Noutsos, Christos; Popescu, Sorina C

    2016-01-01

    In modern plant biology, progress is increasingly defined by the scientists' ability to gather and analyze data sets of high volume and complexity, otherwise known as "big data". Arguably, the largest increase in the volume of plant data sets over the last decade is a consequence of the application of the next-generation sequencing and mass-spectrometry technologies to the study of experimental model and crop plants. The increase in quantity and complexity of biological data brings challenges, mostly associated with data acquisition, processing, and sharing within the scientific community. Nonetheless, big data in plant science create unique opportunities in advancing our understanding of complex biological processes at a level of accuracy without precedence, and establish a base for the plant systems biology. In this chapter, we summarize the major drivers of big data in plant science and big data initiatives in life sciences with a focus on the scope and impact of iPlant, a representative cyberinfrastructure platform for plant science.

  14. Semantic Web technologies for the big data in life sciences.

    Science.gov (United States)

    Wu, Hongyan; Yamaguchi, Atsuko

    2014-08-01

    The life sciences field is entering an era of big data with the breakthroughs of science and technology. More and more big data-related projects and activities are being performed in the world. Life sciences data generated by new technologies are continuing to grow in not only size but also variety and complexity, with great speed. To ensure that big data has a major influence in the life sciences, comprehensive data analysis across multiple data sources and even across disciplines is indispensable. The increasing volume of data and the heterogeneous, complex varieties of data are two principal issues mainly discussed in life science informatics. The ever-evolving next-generation Web, characterized as the Semantic Web, is an extension of the current Web, aiming to provide information for not only humans but also computers to semantically process large-scale data. The paper presents a survey of big data in life sciences, big data related projects and Semantic Web technologies. The paper introduces the main Semantic Web technologies and their current situation, and provides a detailed analysis of how Semantic Web technologies address the heterogeneous variety of life sciences big data. The paper helps to understand the role of Semantic Web technologies in the big data era and how they provide a promising solution for the big data in life sciences.
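
    A minimal sketch of the integration pattern the survey highlights: two heterogeneous sources expressed as RDF merge on shared URIs and become queryable with SPARQL (using rdflib; the tiny vocabularies are invented):

```python
from rdflib import Graph

genes = """@prefix ex: <http://example.org/> .
ex:BRCA1 ex:encodes ex:P38398 ."""

diseases = """@prefix ex: <http://example.org/> .
ex:BRCA1 ex:associatedWith ex:BreastCancer ."""

g = Graph()
g.parse(data=genes, format="turtle")
g.parse(data=diseases, format="turtle")   # graphs merge on the shared URI

query = """PREFIX ex: <http://example.org/>
SELECT ?protein ?disease
WHERE { ?gene ex:encodes ?protein ; ex:associatedWith ?disease . }"""

for row in g.query(query):
    print(row.protein, row.disease)       # cross-source join via SPARQL
```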

  15. A Big Data Decision-making Mechanism for Food Supply Chain

    OpenAIRE

    Ji Guojun; Tan KimHua

    2017-01-01

    Many companies have captured and analyzed huge volumes of data to improve the decision mechanisms of their supply chains. This paper presents a big data harvest model that uses big data as inputs to make more informed decisions in the food supply chain. By introducing a method of Bayesian networks, this paper integrates sample data and finds cause-and-effect relationships in the data to predict market demand. Then a deduction graph model that translates food demand into processes and divides processes into ta...

  16. Statistical methods and computing for big data

    Science.gov (United States)

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay. PMID:27695593
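
    Of the three method classes summarized, online updating is the easiest to sketch: below, a logistic regression is updated chunk by chunk with scikit-learn's partial_fit, so no more than one chunk is ever held in memory (the stream is synthetic, and the loss name follows recent scikit-learn releases):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="log_loss")          # logistic loss, SGD updates
classes = np.array([0, 1])

for _ in range(100):                          # 100 chunks of a data stream
    X = rng.normal(size=(1_000, 5))           # hypothetical features
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1_000) > 0).astype(int)
    clf.partial_fit(X, y, classes=classes)    # online update, one chunk

print(clf.coef_.round(2))                     # estimates after the stream
```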

  17. Statistical methods and computing for big data.

    Science.gov (United States)

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing; Yan, Jun

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay.

  18. Leveraging cloud based big data analytics in knowledge management for enhanced decision making in organizations

    OpenAIRE

    Shorfuzzaman, Mohammad

    2017-01-01

    In the recent past, big data opportunities have gained much momentum to enhance knowledge management in organizations. However, big data, due to its various properties like high volume, variety, and velocity, can no longer be effectively stored and analyzed with traditional data management techniques to generate values for knowledge development. Hence, new technologies and architectures are required to store and analyze this big data through advanced data analytics and in turn generate vital real-t...

  19. Big Data, Big Problems: A Healthcare Perspective.

    Science.gov (United States)

    Househ, Mowafa S; Aldosari, Bakheet; Alanazi, Abdullah; Kushniruk, Andre W; Borycki, Elizabeth M

    2017-01-01

    Much has been written on the benefits of big data for healthcare, such as improving patient outcomes, public health surveillance, and healthcare policy decisions. Over the past five years, Big Data, and the data sciences field in general, has been hyped as the "Holy Grail" for the healthcare industry, promising a more efficient healthcare system with improved healthcare outcomes. However, more recently, healthcare researchers are exposing the potentially harmful effects Big Data can have on patient care, associating it with increased medical costs, patient mortality, and misguided decision making by clinicians and healthcare policy makers. In this paper, we review the current Big Data trends with a specific focus on the inadvertent negative impacts that Big Data could have on healthcare in general and, specifically, as it relates to patient and clinical care. Our study results show that although Big Data is built up to be the "Holy Grail" for healthcare, small data techniques using traditional statistical methods are, in many cases, more accurate and can lead to more improved healthcare outcomes than Big Data methods. In sum, Big Data for healthcare may cause more problems for the healthcare industry than solutions, and in short, when it comes to the use of data in healthcare, "size isn't everything."

  20. Geospatial Big Data Handling Theory and Methods: A Review and Research Challenges

    DEFF Research Database (Denmark)

    Li, Songnian; Dragicevic, Suzana; Anton, François

    2016-01-01

    Big data has now become a strong focus of global interest that is increasingly attracting the attention of academia, industry, government and other organizations. Big data can be situated in the disciplinary area of traditional geospatial data handling theory and methods. The increasing volume ... The International Society for Photogrammetry and Remote Sensing (ISPRS) Technical Commission II (TC II) revisits the existing geospatial data handling methods and theories to determine if they are still capable of handling emerging geospatial big data. Further, the paper synthesises problems, major issues and challenges with current developments, as well as recommending what needs to be developed further in the near future.

  1. Research in Big Data Warehousing using Hadoop

    Directory of Open Access Journals (Sweden)

    Abderrazak Sebaa

    2017-05-01

    Full Text Available Traditional data warehouses have played a key role in decision support systems until the recent past. However, the rapid growth of data generation by current applications requires new data warehousing systems that can handle the volume and format of collected datasets, data source variety, integration of unstructured data and powerful analytical processing. In the age of Big Data, it is important to follow this pace and adapt existing warehouse systems to overcome the new issues and challenges. In this paper, we focus on data warehousing over big data. We discuss the limitations of traditional data warehouses and present alternative technologies and related future work for data warehousing.
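
    A minimal sketch of the kind of Hadoop-era warehousing the paper discusses, using Spark SQL: land raw semi-structured events and expose them as a partitioned, queryable table (the paths, fields, and table name are hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bd-warehouse").getOrCreate()

# Raw, variably-structured events; Spark infers a schema from the JSON.
events = spark.read.json("hdfs:///raw/events/*.json")

(events.write.mode("overwrite")
       .partitionBy("event_date")       # warehouse-style partitioning
       .saveAsTable("fact_events"))     # registered, queryable table

spark.sql("""
    SELECT event_date, COUNT(*) AS n_events
    FROM fact_events
    GROUP BY event_date
    ORDER BY event_date
""").show()
```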

  2. Visualizing big energy data

    DEFF Research Database (Denmark)

    Hyndman, Rob J.; Liu, Xueqin Amy; Pinson, Pierre

    2018-01-01

    Visualization is a crucial component of data analysis. It is always a good idea to plot the data before fitting models, making predictions, or drawing conclusions. As sensors of the electric grid are collecting large volumes of data from various sources, power industry professionals are facing the challenge of visualizing such data in a timely fashion. In this article, we demonstrate several data-visualization solutions for big energy data through three case studies involving smart-meter data, phasor measurement unit (PMU) data, and probabilistic forecasts, respectively.
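
    As one concrete example of the smart-meter visualizations such case studies use, the sketch below draws a day-by-hour consumption heatmap with matplotlib; the readings are synthetic:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic half-year of hourly smart-meter readings (kWh): 182 days x 24 h.
load = rng.gamma(shape=2.0, scale=0.5, size=(182, 24))
load[:, 18:22] *= 1.8                     # fabricate an evening peak

fig, ax = plt.subplots()
im = ax.imshow(load, aspect="auto", origin="lower")
ax.set_xlabel("hour of day")
ax.set_ylabel("day of year")
fig.colorbar(im, ax=ax, label="kWh")
plt.show()
```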

  3. Teoria del Big Bang e buchi neri

    CERN Document Server

    Wald, Robert M

    1980-01-01

    In this volume, a young American physicist clearly outlines the current conceptions of space, time and gravitation as they have taken shape after the theoretical innovations opened up by Einstein. These involve fascinating problems, such as the big bang theory, from which the universe is thought to have originated, and the enigma of black holes.

  4. Big Surveys, Big Data Centres

    Science.gov (United States)

    Schade, D.

    2016-06-01

    Well-designed astronomical surveys are powerful and have consistently been keystones of scientific progress. The Byurakan Surveys using a Schmidt telescope with an objective prism produced a list of about 3000 UV-excess Markarian galaxies but these objects have stimulated an enormous amount of further study and appear in over 16,000 publications. The CFHT Legacy Surveys used a wide-field imager to cover thousands of square degrees and those surveys are mentioned in over 1100 publications since 2002. Both ground and space-based astronomy have been increasing their investments in survey work. Survey instrumentation strives toward fair samples and large sky coverage and therefore strives to produce massive datasets. Thus we are faced with the "big data" problem in astronomy. Survey datasets require specialized approaches to data management. Big data places additional challenging requirements for data management. If the term "big data" is defined as data collections that are too large to move then there are profound implications for the infrastructure that supports big data science. The current model of data centres is obsolete. In the era of big data the central problem is how to create architectures that effectively manage the relationship between data collections, networks, processing capabilities, and software, given the science requirements of the projects that need to be executed. A stand alone data silo cannot support big data science. I'll describe the current efforts of the Canadian community to deal with this situation and our successes and failures. I'll talk about how we are planning in the next decade to try to create a workable and adaptable solution to support big data science.

  5. Recht voor big data, big data voor recht

    NARCIS (Netherlands)

    Lafarre, Anne

    Big data is a phenomenon in our society that can no longer be ignored. It is past the hype cycle, and the first implementations of big data techniques are being carried out. But what exactly is big data? What do the five V's that are so often mentioned in relation to big data entail? As an introduction to

  6. A Grey Theory Based Approach to Big Data Risk Management Using FMEA

    Directory of Open Access Journals (Sweden)

    Maisa Mendonça Silva

    2016-01-01

    Big data is the term used to denote enormous sets of data that differ from other classic databases in four main ways: (1) huge volume, (2) high velocity, (3) much greater variety, and (4) big value. In general, data are stored in a distributed fashion and on computing nodes, as a result of which big data may be more susceptible to attacks by hackers. This paper presents a risk model for big data, which comprises Failure Mode and Effects Analysis (FMEA) and Grey Theory, more precisely grey relational analysis. This approach has several advantages: it provides a structured approach to incorporating the impact of big data risk factors; it facilitates the assessment of risk by breaking down the overall risk to big data; and its efficient evaluation criteria can help enterprises reduce the risks associated with big data. In order to illustrate the applicability of our proposal in practice, a numerical example, with realistic data based on expert knowledge, was developed. The numerical example analyzes four dimensions, that is, managing identification and access, registering the device and application, managing the infrastructure, and data governance, together with 20 failure modes concerning the vulnerabilities of big data. The results show that the most important aspect of risk to big data relates to data governance.
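
    To make the grey relational analysis step concrete, the following is a minimal sketch with made-up scores, not the paper's data: failure modes are ranked by their grey relational grade against a worst-case reference series, with the usual distinguishing coefficient 0.5 and equal factor weights assumed.

        import numpy as np

        # Rows: failure modes; columns: risk factors (e.g., severity, occurrence, detection).
        scores = np.array([[7, 4, 6],
                           [5, 8, 3],
                           [9, 6, 7],
                           [4, 3, 5]], dtype=float)

        # Normalize each factor to [0, 1] (larger-is-riskier convention assumed).
        norm = (scores - scores.min(0)) / (scores.max(0) - scores.min(0))

        ref = norm.max(axis=0)            # reference series: riskiest value per factor
        delta = np.abs(norm - ref)        # deviation sequences
        zeta = 0.5                        # distinguishing coefficient
        coeff = (delta.min() + zeta * delta.max()) / (delta + zeta * delta.max())

        grade = coeff.mean(axis=1)        # grey relational grade, equal weights
        print("risk ranking (riskiest first):", np.argsort(-grade))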

  7. SIDELOADING – INGESTION OF LARGE POINT CLOUDS INTO THE APACHE SPARK BIG DATA ENGINE

    Directory of Open Access Journals (Sweden)

    J. Boehm

    2016-06-01

    In the geospatial domain we have now reached the point where the data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true in the area of point cloud processing. It is therefore natural to explore established big data frameworks for big geospatial data. The very first hurdle is the import of geospatial data into big data frameworks, commonly referred to as data ingestion. Geospatial data is typically encoded in specialised binary file formats, which are not natively supported by the existing big data frameworks. Instead, such file formats are supported by software libraries that are restricted to single-CPU execution. We present an approach that allows the use of existing point cloud file format libraries on the Apache Spark big data framework. We demonstrate the ingestion of large volumes of point cloud data into a compute cluster. The approach uses a map function to distribute the data ingestion across the nodes of a cluster. We test the capabilities of the proposed method to load billions of points into a commodity hardware compute cluster and we discuss the implications for scalability and performance. The performance is benchmarked against an existing native Apache Spark data import implementation.
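
    The core idea, distributing a single-threaded point cloud reader across cluster nodes with a map function, can be sketched as follows; this is an illustrative reconstruction assuming PySpark and the laspy LAS-format library, with hypothetical file paths, not the authors' code.

        from pyspark import SparkContext
        import laspy  # single-threaded LAS reader, used locally on each worker

        def read_points(path):
            # Executed on a worker node: the format library needs no cluster awareness.
            las = laspy.read(path)
            return list(zip(las.x, las.y, las.z))

        sc = SparkContext(appName="PointCloudIngestion")
        tiles = ["/data/clouds/tile_%03d.las" % i for i in range(100)]  # hypothetical paths
        points = sc.parallelize(tiles).flatMap(read_points)
        print("points ingested:", points.count())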

  8. Perspectives on Policy and the Value of Nursing Science in a Big Data Era.

    Science.gov (United States)

    Gephart, Sheila M; Davis, Mary; Shea, Kimberly

    2018-01-01

    As data volume explodes, nurse scientists grapple with ways to adapt to the big data movement without jeopardizing the epistemic values and theoretical focus that celebrate the authority and unity of nursing's body of knowledge. In this article, the authors describe big data and emphasize ways that nursing science brings value to its study. Collective nursing voices that call for more nursing engagement in the big data era are answered with ways to adapt and integrate theoretical and domain expertise from nursing into data science.

  9. Big data analytics in support of virtual network topology adaptability

    OpenAIRE

    Gifre Renom, Lluís; Contreras, Luis Miguel; Lopez Alvarez, Victor; Velasco Esteban, Luis Domingo

    2016-01-01

    ABNO's OAM Handler is extended with big data analytics capabilities to anticipate traffic changes in volume and direction. Predicted traffic is used as input for VNT re-optimization. Experimental assessment is realized on UPC's SYNERGY testbed.

  10. Modeling and Analysis in Marine Big Data: Advances and Challenges

    Directory of Open Access Journals (Sweden)

    Dongmei Huang

    2015-01-01

    It is widely recognized that big data has gathered tremendous attention from academic research institutes, governments, and enterprises in all aspects of information sciences. With the development of a diversity of marine data acquisition techniques, marine data have grown exponentially in the last decade, forming marine big data. As an innovation, marine big data is a double-edged sword. On the one hand, there are many potential and highly useful values hidden in the huge volume of marine data, which are widely used in marine-related fields, such as tsunami and red-tide warning, prevention, and forecasting, disaster inversion, and visualization modeling after disasters. There is no doubt that future competition in marine sciences and technologies will converge on the exploration of marine data. On the other hand, marine big data also brings about many new challenges in data management, such as the difficulties in data capture, storage, analysis, and application, as well as data quality control and data security. To highlight theoretical methodologies and practical applications of marine big data, this paper illustrates a broad view of marine big data and its management, surveys key methods and models, introduces an engineering instance that demonstrates the management architecture, and discusses the existing challenges.

  11. Reducing Racial Disparities in Breast Cancer Care: The Role of 'Big Data'.

    Science.gov (United States)

    Reeder-Hayes, Katherine E; Troester, Melissa A; Meyer, Anne-Marie

    2017-10-15

    Advances in a wide array of scientific technologies have brought data of unprecedented volume and complexity into the oncology research space. These novel big data resources are applied across a variety of contexts-from health services research using data from insurance claims, cancer registries, and electronic health records, to deeper and broader genomic characterizations of disease. Several forms of big data show promise for improving our understanding of racial disparities in breast cancer, and for powering more intelligent and far-reaching interventions to close the racial gap in breast cancer survival. In this article we introduce several major types of big data used in breast cancer disparities research, highlight important findings to date, and discuss how big data may transform breast cancer disparities research in ways that lead to meaningful, lifesaving changes in breast cancer screening and treatment. We also discuss key challenges that may hinder progress in using big data for cancer disparities research and quality improvement.

  12. Big Data Challenges

    Directory of Open Access Journals (Sweden)

    Alexandru Adrian TOLE

    2013-10-01

    The amount of data traveling across the internet today is not only large but complex as well. Companies, institutions, healthcare systems, and others all use piles of data which are further used for creating reports in order to ensure continuity regarding the services that they have to offer. The process behind the results that these entities request represents a challenge for software developers and companies that provide IT infrastructure. The challenge is how to manipulate an impressive volume of data that has to be securely delivered through the internet and reach its destination intact. This paper treats the challenges that Big Data creates.

  13. Big data in complex systems challenges and opportunities

    CERN Document Server

    Azar, Ahmad; Snasael, Vaclav; Kacprzyk, Janusz; Abawajy, Jemal

    2015-01-01

    This volume provides updated, in-depth material on the application of Big Data to complex systems, in order to find solutions for the challenges and problems facing big data sets applications. Much data today is not natively in structured format; for example, tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search. Therefore transforming such content into a structured format for later analysis is a major challenge. Data analysis, organization, retrieval, and modeling are other foundational challenges treated in this book. The material of this book will be useful for researchers and practitioners in the field of big data as well as advanced undergraduate and graduate students. Each of the 17 chapters in the book opens with a chapter abstract and key terms list. The chapters are organized along the lines of problem description, related works, and analysis of the results and ...

  14. Technology for Mining the Big Data of MOOCs

    Science.gov (United States)

    O'Reilly, Una-May; Veeramachaneni, Kalyan

    2014-01-01

    Because MOOCs bring big data to the forefront, they confront learning science with technology challenges. We describe an agenda for developing technology that enables MOOC analytics. Such an agenda needs to efficiently address the detailed, low-level, high-volume nature of MOOC data. It also needs to help exploit the data's capacity to reveal, in…

  15. A hybrid ARIMA and neural network model applied to forecast catch volumes of Selar crumenophthalmus

    Science.gov (United States)

    Aquino, Ronald L.; Alcantara, Nialle Loui Mar T.; Addawe, Rizavel C.

    2017-11-01

    Selar crumenophthalmus, known in English as the big-eyed scad and locally as matang-baka, is one of the fishes commonly caught along the waters of La Union, Philippines. The study deals with the forecasting of catch volumes of big-eyed scad fish for commercial consumption. The data used are quarterly catch volumes of big-eyed scad fish from 2002 to the first quarter of 2017. These actual data are available from the open statistics database published by the Philippine Statistics Authority (PSA), whose task is to collect, compile, analyze and publish information concerning different aspects of the Philippine setting. Autoregressive Integrated Moving Average (ARIMA) models, an Artificial Neural Network (ANN) model, and a hybrid model consisting of ARIMA and ANN were developed to forecast catch volumes of big-eyed scad fish. Statistical errors such as Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were computed and compared to choose the most suitable model for forecasting the catch volume for the next few quarters. A comparison of the results of each model and the corresponding statistical errors reveals that the hybrid model, ARIMA-ANN (2,1,2)(6:3:1), is the most suitable model to forecast the catch volumes of the big-eyed scad fish for the next few quarters.
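
    A minimal sketch of the hybrid scheme on a synthetic series is shown below: an ARIMA(2,1,2) model captures the linear structure and a small neural network is trained on lagged residuals; the paper's seasonal terms and (6:3:1) network layout are not reproduced.

        import numpy as np
        from statsmodels.tsa.arima.model import ARIMA
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(1)
        y = np.cumsum(rng.normal(0.5, 1.0, 80))  # synthetic quarterly catch volumes

        # Step 1: linear component with ARIMA(2,1,2).
        arima = ARIMA(y, order=(2, 1, 2)).fit()
        resid = arima.resid                      # nonlinear structure left over

        # Step 2: model the residuals with an ANN on lagged residuals.
        lags = 4
        X = np.column_stack([resid[i:len(resid) - lags + i] for i in range(lags)])
        t = resid[lags:]
        ann = MLPRegressor(hidden_layer_sizes=(3,), max_iter=5000, random_state=0).fit(X, t)

        # Step 3: hybrid forecast = ARIMA forecast + ANN residual correction.
        next_resid = ann.predict(resid[-lags:].reshape(1, -1))[0]
        print("hybrid one-step forecast:", arima.forecast(1)[0] + next_resid)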

  16. MapFactory - Towards a mapping design pattern for big geospatial data

    Science.gov (United States)

    Rautenbach, Victoria; Coetzee, Serena

    2018-05-01

    With big geospatial data emerging, cartographers and geographic information scientists have to find new ways of dealing with the volume, variety, velocity, and veracity (4Vs) of the data. This requires the development of tools that allow processing, filtering, analysing, and visualising of big data through multidisciplinary collaboration. In this paper, we present the MapFactory design pattern that will be used for the creation of different maps according to the (input) design specification for big geospatial data. The design specification is based on elements from ISO19115-1:2014 Geographic information - Metadata - Part 1: Fundamentals that would guide the design and development of the map or set of maps to be produced. The results of the exploratory research suggest that the MapFactory design pattern will help with software reuse and communication. The MapFactory design pattern will aid software developers to build the tools that are required to automate map making with big geospatial data. The resulting maps would assist cartographers and others to make sense of big geospatial data.

  17. Astronomy in the Big Data Era

    Directory of Open Access Journals (Sweden)

    Yanxia Zhang

    2015-05-01

    The fields of Astrostatistics and Astroinformatics are vital for dealing with the big data issues now faced by astronomy. Like other disciplines in the big data era, astronomy exhibits the many-V characteristics of big data. In this paper, we list the different data mining algorithms used in astronomy, along with data mining software and tools related to astronomical applications. We present SDSS, a project often referred to by other astronomical projects, as the most successful sky survey in the history of astronomy and describe the factors influencing its success. We also discuss the success of Astrostatistics and Astroinformatics organizations and the conferences and summer schools on these issues that are held annually. All the above indicates that astronomers and scientists from other areas are ready to face the challenges and opportunities provided by massive data volumes.

  18. Big Data in Drug Discovery.

    Science.gov (United States)

    Brown, Nathan; Cambruzzi, Jean; Cox, Peter J; Davies, Mark; Dunbar, James; Plumbley, Dean; Sellwood, Matthew A; Sim, Aaron; Williams-Jones, Bryn I; Zwierzyna, Magdalena; Sheppard, David W

    2018-01-01

    Interpretation of Big Data in the drug discovery community should enhance project timelines and reduce clinical attrition through improved early decision making. The issues we encounter start with the sheer volume of data and how we first ingest it, before building an infrastructure to house it and make use of the data in an efficient and productive way. There are many problems associated with the data itself, including general reproducibility, but often it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate the context. On the back of natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data from the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.

  19. BigOP: Generating Comprehensive Big Data Workloads as a Benchmarking Framework

    OpenAIRE

    Zhu, Yuqing; Zhan, Jianfeng; Weng, Chuliang; Nambiar, Raghunath; Zhang, Jinchao; Chen, Xingzhen; Wang, Lei

    2014-01-01

    Big Data is considered a proprietary asset of companies, organizations, and even nations. Turning big data into real treasure requires the support of big data systems. A variety of commercial and open source products have been unleashed for big data storage and processing. While big data users are facing the choice of which system best suits their needs, big data system developers are facing the question of how to evaluate their systems with regard to general big data processing needs. System b...

  20. Big data - smart health strategies. Findings from the yearbook 2014 special theme.

    Science.gov (United States)

    Koutkias, V; Thiessard, F

    2014-08-15

    To select the best papers published in 2013 in the field of big data and smart health strategies, and to summarize outstanding research efforts, a systematic search was performed using two major bibliographic databases for relevant journal papers. The references obtained were reviewed in a two-stage process, starting with a blinded review performed by the two section editors, followed by a peer review process operated by external reviewers recognized as experts in the field. The complete review process selected four best papers illustrating various aspects of the special theme, among them: (a) using large volumes of unstructured data and, specifically, clinical notes from Electronic Health Records (EHRs) for pharmacovigilance; (b) knowledge discovery via querying large volumes of complex (both structured and unstructured) biological data using big data technologies and relevant tools; (c) methodologies for applying cloud computing and big data technologies in the field of genomics; and (d) system architectures enabling high-performance access to and processing of large datasets extracted from EHRs. The potential of big data in biomedicine has been pinpointed in various viewpoint papers and editorials. The review of the current scientific literature illustrated a variety of interesting methods and applications in the field, but the promises still exceed the current outcomes. As we get closer to a solid foundation with respect to a common understanding of relevant concepts and technical aspects, and the use of standardized technologies and tools, we can anticipate reaching the potential that big data offer for personalized medicine and smart health strategies in the near future.

  1. NASA EOSDIS Evolution in the BigData Era

    Science.gov (United States)

    Lynnes, Christopher

    2015-01-01

    NASA's EOSDIS system faces several challenges in the Big Data Era. Although volumes are large (but not unmanageably so), the variety of different data collections is daunting. That variety also brings with it a large and diverse user community. One key evolution EOSDIS is working toward is to enable more science analysis to be performed close to the data.

  2. How Big Is Too Big?

    Science.gov (United States)

    Cibes, Margaret; Greenwood, James

    2016-01-01

    Media Clips appears in every issue of Mathematics Teacher, offering readers contemporary, authentic applications of quantitative reasoning based on print or electronic media. This issue features "How Big is Too Big?" (Margaret Cibes and James Greenwood) in which students are asked to analyze the data and tables provided and answer a…

  3. Big data mining: In-database Oracle data mining over hadoop

    Science.gov (United States)

    Kovacheva, Zlatinka; Naydenova, Ina; Kaloyanova, Kalinka; Markov, Krasimir

    2017-07-01

    Big data challenges different aspects of storing, processing and managing data, as well as analyzing and using data for business purposes. Applying Data Mining methods over Big Data is another challenge because of the huge data volumes, the variety of information, and the dynamics of the sources. Different applications have been built in this area, but their successful usage depends on understanding many specific parameters. In this paper we present several opportunities for using Data Mining techniques provided by the analytical engine of RDBMS Oracle over data stored in the Hadoop Distributed File System (HDFS). Some experimental results are given and discussed.

  4. [Big Data: the great opportunities and challenges to microbiome and other biomedical research].

    Science.gov (United States)

    Xu, Zhenjiang

    2015-02-01

    With the development of high-throughput technologies, biomedical data have been increasing exponentially in an explosive manner. This brings enormous opportunities and challenges to biomedical researchers regarding how to effectively utilize big data. Big data differ from traditional data in many ways, often described as the 3Vs - volume, variety and velocity. From the perspective of biomedical research, here I introduce the characteristics of big data, such as its messiness, re-usage and openness. Focusing on meta-analysis in microbiome research, I discuss the prospective principles in data collection, the challenges of privacy protection in data management, and the scalable tools in data analysis, with examples from real life.

  5. Big data: the management revolution.

    Science.gov (United States)

    McAfee, Andrew; Brynjolfsson, Erik

    2012-10-01

    Big data, the authors write, is far more powerful than the analytics of the past. Executives can measure and therefore manage more precisely than ever before. They can make better predictions and smarter decisions. They can target more-effective interventions in areas that so far have been dominated by gut and intuition rather than by data and rigor. The differences between big data and analytics are a matter of volume, velocity, and variety: More data now cross the internet every second than were stored in the entire internet 20 years ago. Nearly real-time information makes it possible for a company to be much more agile than its competitors. And that information can come from social networks, images, sensors, the web, or other unstructured sources. The managerial challenges, however, are very real. Senior decision makers have to learn to ask the right questions and embrace evidence-based decision making. Organizations must hire scientists who can find patterns in very large data sets and translate them into useful business information. IT departments have to work hard to integrate all the relevant internal and external sources of data. The authors offer two success stories to illustrate how companies are using big data: PASSUR Aerospace enables airlines to match their actual and estimated arrival times. Sears Holdings directly analyzes its incoming store data to make promotions much more precise and faster.

  6. Challenges and potential solutions for big data implementations in developing countries.

    Science.gov (United States)

    Luna, D; Mayan, J C; García, M J; Almerares, A A; Househ, M

    2014-08-15

    The volume of data, the velocity with which they are generated, and their variety and lack of structure hinder their use. This creates the need to change the way information is captured, stored, processed, and analyzed, leading to the paradigm shift called Big Data. To describe the challenges and possible solutions for developing countries when implementing Big Data projects in the health sector. A non-systematic review of the literature was performed in PubMed and Google Scholar. The following keywords were used: "big data", "developing countries", "data mining", "health information systems", and "computing methodologies". A thematic review of selected articles was performed. There are challenges when implementing any Big Data program including exponential growth of data, special infrastructure needs, need for a trained workforce, need to agree on interoperability standards, privacy and security issues, and the need to include people, processes, and policies to ensure their adoption. Developing countries have particular characteristics that hinder further development of these projects. The advent of Big Data promises great opportunities for the healthcare field. In this article, we attempt to describe the challenges developing countries would face and enumerate the options to be used to achieve successful implementations of Big Data programs.

  7. Optimizing Hadoop Performance for Big Data Analytics in Smart Grid

    Directory of Open Access Journals (Sweden)

    Mukhtaj Khan

    2017-01-01

    The rapid deployment of Phasor Measurement Units (PMUs) in power systems globally is leading to Big Data challenges. New high-performance computing techniques are now required to process an ever-increasing volume of data from PMUs. To that end, the Hadoop framework, an open source implementation of the MapReduce computing model, is gaining momentum for Big Data analytics in smart grid applications. However, Hadoop has over 190 configuration parameters, which can have a significant impact on the performance of the Hadoop framework. This paper presents an Enhanced Parallel Detrended Fluctuation Analysis (EPDFA) algorithm for scalable analytics on massive volumes of PMU data. The novel EPDFA algorithm builds on an enhanced Hadoop platform whose configuration parameters are optimized by Gene Expression Programming. Experimental results show that the EPDFA is 29 times faster than the sequential DFA in processing PMU data and 1.87 times faster than a parallel DFA that utilizes the default Hadoop configuration settings.
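
    For reference, here is a minimal single-machine sketch of the underlying detrended fluctuation analysis, the computation that EPDFA parallelizes, run on a synthetic signal; this is the textbook algorithm, not the authors' Hadoop implementation.

        import numpy as np

        def dfa(signal, window_sizes):
            profile = np.cumsum(signal - np.mean(signal))   # integrated series
            flucts = []
            for n in window_sizes:
                f2 = []
                for w in range(len(profile) // n):
                    seg = profile[w * n:(w + 1) * n]
                    x = np.arange(n)
                    trend = np.polyval(np.polyfit(x, seg, 1), x)  # local linear detrend
                    f2.append(np.mean((seg - trend) ** 2))
                flucts.append(np.sqrt(np.mean(f2)))
            # Scaling exponent alpha: slope of log F(n) against log n.
            return np.polyfit(np.log(window_sizes), np.log(flucts), 1)[0]

        rng = np.random.default_rng(2)
        noise = rng.normal(size=4096)
        print("alpha for white noise (expected ~0.5):", dfa(noise, [16, 32, 64, 128, 256]))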

  8. BIG Data - BIG Gains? Understanding the Link Between Big Data Analytics and Innovation

    OpenAIRE

    Niebel, Thomas; Rasel, Fabienne; Viete, Steffen

    2017-01-01

    This paper analyzes the relationship between firms' use of big data analytics and their innovative performance for product innovations. Since big data technologies provide new data information practices, they create new decision-making possibilities, which firms can use to realize innovations. Applying German firm-level data, we find suggestive evidence that big data analytics matters for the likelihood of becoming a product innovator as well as the market success of the firms' product innovat...

  9. Networking for big data

    CERN Document Server

    Yu, Shui; Misic, Jelena; Shen, Xuemin (Sherman)

    2015-01-01

    Networking for Big Data supplies an unprecedented look at cutting-edge research on the networking and communication aspects of Big Data. Starting with a comprehensive introduction to Big Data and its networking issues, it offers deep technical coverage of both theory and applications. The book is divided into four sections: introduction to Big Data, networking theory and design for Big Data, networking security for Big Data, and platforms and systems for Big Data applications. Focusing on key networking issues in Big Data, the book explains network design and implementation for Big Data. It exa

  10. Global fluctuation spectra in big-crunch-big-bang string vacua

    International Nuclear Information System (INIS)

    Craps, Ben; Ovrut, Burt A.

    2004-01-01

    We study big-crunch-big-bang cosmologies that correspond to exact world-sheet superconformal field theories of type II strings. The string theory spacetime contains a big crunch and a big bang cosmology, as well as additional 'whisker' asymptotic and intermediate regions. Within the context of free string theory, we compute, unambiguously, the scalar fluctuation spectrum in all regions of spacetime. Generically, the big crunch fluctuation spectrum is altered while passing through the bounce singularity. The change in the spectrum is characterized by a function Δ, which is momentum and time dependent. We compute Δ explicitly and demonstrate that it arises from the whisker regions. The whiskers are also shown to lead to 'entanglement' entropy in the big bang region. Finally, in the Milne orbifold limit of our superconformal vacua, we show that Δ→1 and, hence, the fluctuation spectrum is unaltered by the big-crunch-big-bang singularity. We comment on, but do not attempt to resolve, subtleties related to gravitational back reaction and light winding modes when interactions are taken into account

  11. BIG DATA IN SUPPLY CHAIN MANAGEMENT: AN EXPLORATORY STUDY

    Directory of Open Access Journals (Sweden)

    Gheorghe MILITARU

    2015-12-01

    The objective of this paper is to set out a framework for examining the conditions under which big data can create long-term profitability through developing dynamic operations and digital supply networks in the supply chain. We investigate the extent to which big data analytics has the power to change the competitive landscape of industries and could offer operational, strategic and competitive advantages. This paper is based upon a qualitative study of the convergence of predictive analytics and big data in the field of supply chain management. Our findings indicate a need for manufacturers to introduce analytics tools, real-time data, and more flexible production techniques to improve their productivity in line with the new business model. By gathering and analysing vast volumes of data, analytics tools help companies allocate resources and capital spend more effectively, based on risk assessment. Finally, implications and directions for future research are discussed.

  12. The Big Five of Personality and structural imaging revisited: a VBM - DARTEL study.

    Science.gov (United States)

    Liu, Wei-Yin; Weber, Bernd; Reuter, Martin; Markett, Sebastian; Chu, Woei-Chyn; Montag, Christian

    2013-05-08

    The present study focuses on the neurostructural foundations of the human personality. In a large sample of 227 healthy human individuals (168 women and 59 men), we used MRI to examine the relationship between personality traits and both regional gray and white matter volume, while controlling for age and sex. Personality was assessed using the German version of the NEO Five-Factor Inventory that measures individual differences in the 'Big Five of Personality': extraversion, neuroticism, agreeableness, conscientiousness, and openness to experience. In contrast to most previous studies on neural correlates of the Big Five, we used improved processing strategies: white and gray matter were independently assessed by segmentation steps before data analysis. In addition, customized sex-specific diffeomorphic anatomical registration using exponentiated Lie algebra (DARTEL) templates was used. Our results did not show significant correlations between any dimension of the Big Five and regional gray matter volume. However, among others, higher conscientiousness scores correlated significantly with reductions in regional white matter volume in different brain areas, including the right insula, putamen, caudate, and left fusiform gyrus. These correlations were driven by the female subsample. The present study suggests that many results from the literature on the neurostructural basis of personality should be reviewed carefully against findings obtained when the sample size is larger, imaging methods are rigorously applied, and sex-related and age-related effects are controlled.

  13. Big Argumentation?

    Directory of Open Access Journals (Sweden)

    Daniel Faltesek

    2013-08-01

    Big Data is nothing new. Public concern regarding the mass diffusion of data has appeared repeatedly with computing innovations; in the formation before Big Data, it was most recently referred to as the information explosion. In this essay, I argue that the appeal of Big Data is not a function of computational power, but of a synergistic relationship between aesthetic order and a politics evacuated of meaningful public deliberation. Understanding, and challenging, Big Data requires attention to the aesthetics of data visualization and the ways in which those aesthetics would seem to depoliticize information. The conclusion proposes an alternative argumentative aesthetic as the appropriate response to the depoliticization posed by the popular imaginary of Big Data.

  14. [Big data in imaging].

    Science.gov (United States)

    Sewerin, Philipp; Ostendorf, Benedikt; Hueber, Axel J; Kleyer, Arnd

    2018-04-01

    Until now, most major medical advancements have been achieved through hypothesis-driven research within the scope of clinical trials. However, due to a multitude of variables, only a certain number of research questions could be addressed during a single study, thus rendering these studies expensive and time consuming. Big data acquisition enables a new data-based approach in which large volumes of data can be used to investigate all variables, thus opening new horizons. Due to universal digitalization of the data as well as ever-improving hard- and software solutions, imaging would appear to be predestined for such analyses. Several small studies have already demonstrated that automated analysis algorithms and artificial intelligence can identify pathologies with high precision. Such automated systems would also seem well suited for rheumatology imaging, since a method for individualized risk stratification has long been sought for these patients. However, despite all the promising options, the heterogeneity of the data and highly complex regulations covering data protection in Germany would still render a big data solution for imaging difficult today. Overcoming these boundaries is challenging, but the enormous potential advances in clinical management and science render pursuit of this goal worthwhile.

  15. Big Data in Health: a Literature Review from the Year 2005.

    Science.gov (United States)

    de la Torre Díez, Isabel; Cosgaya, Héctor Merino; Garcia-Zapirain, Begoña; López-Coronado, Miguel

    2016-09-01

    The information stored in healthcare systems has increased over the last ten years, leading it to be considered Big Data. There is a wealth of health information ready to be analysed. However, the sheer volume raises a challenge for traditional methods. The aim of this article is to conduct a cutting-edge study on Big Data in healthcare from 2005 to the present. This literature review will help researchers to know how Big Data has developed in the health industry and open up new avenues for research. Information searches have been made on various scientific databases such as Pubmed, Science Direct, Scopus and Web of Science for Big Data in healthcare. The search criteria were "Big Data" and "health" with a date range from 2005 to the present. A total of 9724 articles were found on the databases. 9515 articles were discarded as duplicates or for not having a title of interest to the study. 209 articles were read, with the resulting decision that 46 were useful for this study. 52.6 % of the articles used were found in Science Direct, 23.7 % in Pubmed, 22.1 % through Scopus and the remaining 2.6 % through the Web of Science. Big Data has undergone extremely high growth since 2011 and its use is becoming compulsory in developed nations and in an increasing number of developing nations. Big Data is a step forward and a cost reducer for public and private healthcare.

  16. Does Implementation of Big Data Analytics Improve Firms’ Market Value? Investors’ Reaction in Stock Market

    Directory of Open Access Journals (Sweden)

    Hansol Lee

    2017-06-01

    Recently, due to the development of social media, multimedia, and the Internet of Things (IoT), various types of data have increased. As existing data analytics tools cannot cover this huge volume of data, big data analytics has become one of the emerging technologies for business today. Considering that big data analytics is a relatively new technology, in the present study we investigated the impact of implementing big data analytics from a short-term perspective. We used an event study methodology to investigate the changes in stock price caused by announcements of big data analytics solution investments. A total of 54 investment announcements of firms publicly traded on NASDAQ and NYSE from 2010 to 2015 were collected. Our results empirically demonstrate that the announcement of a firm's investment in a big data solution leads to positive stock market reactions. In addition, we found that investments in small vendors' solutions with industry-oriented functions tend to result in higher abnormal returns than those in big vendors' solutions with general functions. Finally, our results also suggest that stock market investors evaluate the big data analytics investments of big firms more highly than those of small firms.
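
    A minimal sketch of the event-study logic on synthetic returns is given below: a market model is estimated over a pre-announcement window, and abnormal returns are cumulated over the event window; all numbers and window lengths are illustrative, not the study's.

        import numpy as np

        rng = np.random.default_rng(3)
        market = rng.normal(0.0004, 0.01, 260)             # daily market returns
        stock = 0.0002 + 1.1 * market + rng.normal(0, 0.012, 260)
        stock[250:255] += 0.004                            # announcement effect, days 250-254

        # Estimation window: days 0-249; event window: days 250-254.
        beta, alpha = np.polyfit(market[:250], stock[:250], 1)   # market-model fit

        abnormal = stock[250:255] - (alpha + beta * market[250:255])
        print("cumulative abnormal return (CAR):", abnormal.sum())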

  17. The Big Data Tools Impact on Development of Simulation-Concerned Academic Disciplines

    Directory of Open Access Journals (Sweden)

    A. A. Sukhobokov

    2015-01-01

    The article gives a definition of Big Data on the basis of the 5Vs (Volume, Variety, Velocity, Veracity, Value) and shows examples of tasks that require using Big Data tools in a diversity of areas, namely: health, education, financial services, industry, agriculture, logistics, retail, information technology, telecommunications and others. An overview of Big Data tools is delivered, including open source products and the IBM Bluemix and SAP HANA platforms. Examples of architecture of corporate data processing and management systems using Big Data tools are shown for big Internet companies and for enterprises in traditional industries. Within the overview, a classification of Big Data tools is proposed that fills gaps in previously developed similar classifications. The new classification contains 19 classes and embraces several hundred existing and emerging products. The rise and use of Big Data tools, in addition to solving practical problems, affects the development of scientific disciplines concerned with the simulation of technical, natural or socioeconomic systems and the solution of practical problems based on the developed models. New schools arise in these disciplines. These new schools address the tasks peculiar to each discipline, but for systems with a much larger number of internal elements and connections between them. The characteristics of the problems to be solved under the new schools do not always meet the criteria for Big Data. It is suggested that Big Data itself be identified as a part of the theory of sorting and searching algorithms. In other disciplines the new schools are named by analogy with Big Data: Big Calculation in numerical methods, Big Simulation in simulation modeling, Big Management in the management of socio-economic systems, Big Optimal Control in optimal control theory. The paper shows examples of tasks and methods to be developed within the new schools. The observed tendency is not limited to the considered disciplines: there are

  18. A glossary for big data in population and public health: discussion and commentary on terminology and research methods.

    Science.gov (United States)

    Fuller, Daniel; Buote, Richard; Stanley, Kevin

    2017-11-01

    The volume and velocity of data are growing rapidly and big data analytics are being applied to these data in many fields. Population and public health researchers may be unfamiliar with the terminology and statistical methods used in big data. This creates a barrier to the application of big data analytics. The purpose of this glossary is to define terms used in big data and big data analytics and to contextualise these terms. We define the five Vs of big data and provide definitions and distinctions for data mining, machine learning and deep learning, among other terms. We provide key distinctions between big data and statistical analysis methods applied to big data. We contextualise the glossary by providing examples where big data analysis methods have been applied to population and public health research problems and provide brief guidance on how to learn big data analysis methods.

  19. Transforming Healthcare Delivery: Integrating Dynamic Simulation Modelling and Big Data in Health Economics and Outcomes Research.

    Science.gov (United States)

    Marshall, Deborah A; Burgos-Liz, Lina; Pasupathy, Kalyan S; Padula, William V; IJzerman, Maarten J; Wong, Peter K; Higashi, Mitchell K; Engbers, Jordan; Wiebe, Samuel; Crown, William; Osgood, Nathaniel D

    2016-02-01

    In the era of the Information Age and personalized medicine, healthcare delivery systems need to be efficient and patient-centred. The health system must be responsive to individual patient choices and preferences about their care, while considering the system consequences. While dynamic simulation modelling (DSM) and big data share characteristics, they present distinct and complementary value in healthcare. Big data and DSM are synergistic: big data offer support to enhance the application of dynamic models, but DSM can also greatly enhance the value conferred by big data. Big data can inform patient-centred care with its high velocity, volume, and variety (the three Vs) over traditional data analytics; however, big data are not sufficient to extract meaningful insights to inform approaches to improve healthcare delivery. DSM can serve as a natural bridge between the wealth of evidence offered by big data and informed decision making, as a means of faster, deeper, more consistent learning from that evidence. We discuss the synergies between big data and DSM, practical considerations and challenges, and how integrating big data and DSM can be useful to decision makers in addressing complex, systemic health economics and outcomes questions and in transforming healthcare delivery.

  20. A study on decision-making of food supply chain based on big data

    OpenAIRE

    Ji, Guojun; Hu, Limei; Tan, Kim Hua

    2017-01-01

    As more and more companies capture and analyze huge volumes of data to improve the performance of their supply chains, this paper develops a big data harvest model that uses big data as input to make more informed production decisions in the food supply chain. By introducing the method of Bayesian networks, this paper integrates sample data and finds cause-and-effect relations between data to predict market demand. Then the deduction graph model that translates product demand into processes and divide...
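
    As an illustration of the Bayesian idea, here is a minimal sketch with made-up categorical data, not the paper's model: a conditional probability table for demand is learned from samples by counting and then queried.

        import pandas as pd

        data = pd.DataFrame({
            "promotion": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
            "weather":   ["hot", "hot", "hot", "cold", "cold", "hot", "hot", "cold"],
            "demand":    ["high", "high", "low", "low", "high", "low", "high", "low"],
        })

        # Conditional probability table learned by counting (maximum likelihood).
        cpt = (data.groupby(["promotion", "weather"])["demand"]
                   .value_counts(normalize=True))
        print(cpt)
        print("P(demand=high | promotion=yes, weather=hot):",
              cpt.loc[("yes", "hot", "high")])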

  1. An optimal big data workflow for biomedical image analysis

    Directory of Open Access Journals (Sweden)

    Aurelle Tchagna Kouanou

    Background and objective: In the medical field, data volume is increasingly growing, and traditional methods cannot manage it efficiently. In biomedical computation, the continuous challenges are the management, analysis, and storage of biomedical data. Nowadays, big data technology plays a significant role in the management, organization, and analysis of data, using machine learning and artificial intelligence techniques. It also allows quick access to data using NoSQL databases. Thus, big data technologies include new frameworks to process medical data such as biomedical images. It becomes very important to develop methods and/or architectures based on big data technologies for the complete processing of biomedical image data. Method: This paper describes big data analytics for biomedical images, shows examples reported in the literature, briefly discusses new methods used in processing, and offers conclusions. We argue for adapting and extending related work methods in the field of big data software, using the Hadoop and Spark frameworks. These provide an optimal and efficient architecture for biomedical image analysis. This paper thus gives a broad overview of big data analytics to automate biomedical image diagnosis. A workflow with optimal methods and algorithms for each step is proposed. Results: Two architectures for image classification are suggested. We use the Hadoop framework to design the first, and the Spark framework for the second. The proposed Spark architecture allows us to develop appropriate and efficient methods to leverage a large number of images for classification, which can be customized with respect to each other. Conclusions: The proposed architectures are more complete, easier, and adaptable in all of the steps from conception. The obtained Spark architecture is the most complete, because it facilitates the implementation of algorithms with its embedded libraries. Keywords: Biomedical images, Big
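
    A minimal sketch of a Spark-based image classification step in the spirit of the proposed architecture is shown below, assuming PySpark, Pillow, and NumPy; the paths, labeling rule, and raw-pixel features are hypothetical placeholders rather than the paper's pipeline.

        import io
        import numpy as np
        from PIL import Image
        from pyspark.sql import SparkSession
        from pyspark.ml.classification import LogisticRegression
        from pyspark.ml.linalg import Vectors

        spark = SparkSession.builder.appName("BiomedImages").getOrCreate()

        def to_row(path_bytes):
            path, raw = path_bytes
            img = Image.open(io.BytesIO(raw)).convert("L").resize((32, 32))
            feats = np.asarray(img, dtype=float).ravel() / 255.0   # crude pixel features
            label = 1.0 if "tumor" in path else 0.0                # hypothetical labeling
            return (label, Vectors.dense(feats))

        rdd = spark.sparkContext.binaryFiles("/data/biomed/*.png")  # hypothetical path
        df = rdd.map(to_row).toDF(["label", "features"])
        model = LogisticRegression(maxIter=50).fit(df)
        print("training accuracy:", model.summary.accuracy)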

  2. The Big Mac and Teaching about Japan. Footnotes. Volume 8, Number 5

    Science.gov (United States)

    Ellington, Lucien

    2003-01-01

    The Big Mac can be an effective tool in helping students achieve a better understanding of Japan. It can defeat Orientalist stereotypes about the Japanese--and also challenge young people who might have oversimplified notions of what exactly occurs when U.S. fast food chains take root in another culture. Many deride McDonald's as a villain…

  3. Big data

    DEFF Research Database (Denmark)

    Madsen, Anders Koed; Flyverbom, Mikkel; Hilbert, Martin

    2016-01-01

    The claim that big data can revolutionize strategy and governance in the context of international relations is increasingly hard to ignore. Scholars of international political sociology have mainly discussed this development through the themes of security and surveillance. The aim of this paper is to outline a research agenda that can be used to raise a broader set of sociological and practice-oriented questions about the increasing datafication of international relations and politics. First, it proposes a way of conceptualizing big data that is broad enough to open fruitful investigations into the emerging use of big data in these contexts. This conceptualization includes the identification of three moments contained in any big data practice. Second, it suggests a research agenda built around a set of subthemes that each deserve dedicated scrutiny when studying the interplay between big data...

  4. Big data computing

    CERN Document Server

    Akerkar, Rajendra

    2013-01-01

    Due to market forces and technological evolution, Big Data computing is developing at an increasing rate. A wide variety of novel approaches and tools have emerged to tackle the challenges of Big Data, creating both more opportunities and more challenges for students and professionals in the field of data computation and analysis. Presenting a mix of industry cases and theory, Big Data Computing discusses the technical and practical issues related to Big Data in intelligent information management. Emphasizing the adoption and diffusion of Big Data tools and technologies in industry, the book i

  5. A peek into the future of radiology using big data applications

    Directory of Open Access Journals (Sweden)

    Amit T Kharat

    2017-01-01

    Big data is an extremely large amount of data which is available in the radiology department. Big data is identified by four Vs – Volume, Velocity, Variety, and Veracity. By applying different algorithmic tools and converting raw data to transformed data in such large datasets, there is a possibility of understanding and using radiology data for gaining new knowledge and insights. Big data analytics consists of 6Cs – Connection, Cloud, Cyber, Content, Community, and Customization. The global technological prowess and per-capita capacity to save digital information has roughly doubled every 40 months since the 1980s. By using big data, the planning and implementation of radiological procedures in radiology departments can be given a great boost. Potential applications of big data in the future are scheduling of scans, creating patient-specific personalized scanning protocols, radiologist decision support, emergency reporting, virtual quality assurance for the radiologist, etc. Targeted use of big data applications can be done for images by supporting the analytic process. Screening software tools designed on big data can be used to highlight a region of interest, such as subtle changes in parenchymal density, a solitary pulmonary nodule, or focal hepatic lesions, by plotting its multidimensional anatomy. Following this, we can run more complex applications such as three-dimensional multiplanar reconstructions (MPR), volumetric rendering (VR), and curved planar reconstruction, which consume higher system resources, on targeted data subsets rather than querying the complete cross-sectional imaging dataset. This pre-emptive selection of the dataset can substantially reduce system requirements such as system memory and server load, and provide prompt results. However, a word of caution: big data should not become “dump data” due to inadequate and poor analysis and non-structured, improperly stored data. In the near future, big data can ring in the

  6. A peek into the future of radiology using big data applications.

    Science.gov (United States)

    Kharat, Amit T; Singhal, Shubham

    2017-01-01

    Big data is an extremely large amount of data which is available in the radiology department. Big data is identified by four Vs - Volume, Velocity, Variety, and Veracity. By applying different algorithmic tools and converting raw data to transformed data in such large datasets, there is a possibility of understanding and using radiology data for gaining new knowledge and insights. Big data analytics consists of 6Cs - Connection, Cloud, Cyber, Content, Community, and Customization. The global technological prowess and per-capita capacity to save digital information has roughly doubled every 40 months since the 1980s. By using big data, the planning and implementation of radiological procedures in radiology departments can be given a great boost. Potential applications of big data in the future are scheduling of scans, creating patient-specific personalized scanning protocols, radiologist decision support, emergency reporting, virtual quality assurance for the radiologist, etc. Targeted use of big data applications can be done for images by supporting the analytic process. Screening software tools designed on big data can be used to highlight a region of interest, such as subtle changes in parenchymal density, a solitary pulmonary nodule, or focal hepatic lesions, by plotting its multidimensional anatomy. Following this, we can run more complex applications such as three-dimensional multiplanar reconstructions (MPR), volumetric rendering (VR), and curved planar reconstruction, which consume higher system resources, on targeted data subsets rather than querying the complete cross-sectional imaging dataset. This pre-emptive selection of the dataset can substantially reduce system requirements such as system memory and server load, and provide prompt results. However, a word of caution: big data should not become "dump data" due to inadequate and poor analysis and non-structured, improperly stored data. In the near future, big data can ring in the era of personalized and

  7. A peek into the future of radiology using big data applications

    Science.gov (United States)

    Kharat, Amit T.; Singhal, Shubham

    2017-01-01

    Big data is an extremely large amount of data which is available in the radiology department. Big data is identified by four Vs – Volume, Velocity, Variety, and Veracity. By applying different algorithmic tools and converting raw data to transformed data in such large datasets, there is a possibility of understanding and using radiology data for gaining new knowledge and insights. Big data analytics consists of 6Cs – Connection, Cloud, Cyber, Content, Community, and Customization. The global technological prowess and per-capita capacity to save digital information has roughly doubled every 40 months since the 1980s. By using big data, the planning and implementation of radiological procedures in radiology departments can be given a great boost. Potential applications of big data in the future are scheduling of scans, creating patient-specific personalized scanning protocols, radiologist decision support, emergency reporting, virtual quality assurance for the radiologist, etc. Targeted use of big data applications can be done for images by supporting the analytic process. Screening software tools designed on big data can be used to highlight a region of interest, such as subtle changes in parenchymal density, a solitary pulmonary nodule, or focal hepatic lesions, by plotting its multidimensional anatomy. Following this, we can run more complex applications such as three-dimensional multiplanar reconstructions (MPR), volumetric rendering (VR), and curved planar reconstruction, which consume higher system resources, on targeted data subsets rather than querying the complete cross-sectional imaging dataset. This pre-emptive selection of the dataset can substantially reduce system requirements such as system memory and server load, and provide prompt results. However, a word of caution: big data should not become “dump data” due to inadequate and poor analysis and non-structured, improperly stored data. In the near future, big data can ring in the era of personalized

  8. The New Possibilities from "Big Data" to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis.

    Science.gov (United States)

    Kruse, Christian

    2018-06-01

    To review current practices and technologies within the scope of "Big Data" that can further our understanding of diabetes mellitus and osteoporosis from large volumes of data. "Big Data" techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples from the current literature. Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and to understand the relative importance of predictors in diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in the data linking diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can allow us to be less dependent on surrogate predictors and to use large volumes of images to classify diabetes-induced osteoporosis and predict future outcomes directly from images. "Big Data" techniques herald new possibilities for understanding diabetes-induced osteoporosis and ascertaining our current ability to classify, understand, and predict this condition.
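
    A minimal sketch of the supervised approach on synthetic data, where the predictors and effect sizes are invented purely for illustration: a random forest classifies osteoporosis status and exposes relative predictor importance.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(4)
        n = 500
        X = np.column_stack([
            rng.normal(7.0, 1.2, n),   # HbA1c (glucose control), hypothetical predictor
            rng.normal(60, 12, n),     # age
            rng.normal(25, 4, n),      # BMI
        ])
        # Synthetic outcome: risk rises with HbA1c and age (invented relationship).
        p = 1 / (1 + np.exp(-(0.8 * (X[:, 0] - 7) + 0.05 * (X[:, 1] - 60) - 0.5)))
        y = rng.random(n) < p

        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
        for name, imp in zip(["HbA1c", "age", "BMI"], clf.feature_importances_):
            print(name, round(imp, 3))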

  9. From big bang to big crunch and beyond

    International Nuclear Information System (INIS)

    Elitzur, Shmuel; Rabinovici, Eliezer; Giveon, Amit; Kutasov, David

    2002-01-01

    We study a quotient Conformal Field Theory, which describes a 3+1 dimensional cosmological spacetime. Part of this spacetime is the Nappi-Witten (NW) universe, which starts at a 'big bang' singularity, expands and then contracts to a 'big crunch' singularity at a finite time. The gauged WZW model contains a number of copies of the NW spacetime, with each copy connected to the preceding one and to the next one at the respective big bang/big crunch singularities. The sequence of NW spacetimes is further connected at the singularities to a series of non-compact static regions with closed timelike curves. These regions contain boundaries, on which the observables of the theory live. This suggests a holographic interpretation of the physics. (author)

  10. BIG data - BIG gains? Empirical evidence on the link between big data analytics and innovation

    OpenAIRE

    Niebel, Thomas; Rasel, Fabienne; Viete, Steffen

    2017-01-01

    This paper analyzes the relationship between firms' use of big data analytics and their innovative performance in terms of product innovations. Since big data technologies provide new data information practices, they create novel decision-making possibilities, which are widely believed to support firms' innovation process. Applying German firm-level data within a knowledge production function framework, we find suggestive evidence that big data analytics is a relevant determinant for the likel...

  11. Big Data technology in traffic: A case study of automatic counters

    Directory of Open Access Journals (Sweden)

    Janković Slađana R.

    2016-01-01

    Modern information and communication technologies, together with intelligent devices, provide a continuous inflow of large amounts of data that are used by traffic and transport systems. Collecting traffic data does not represent a challenge nowadays, but issues remain in storing and processing ever-increasing amounts of data. In this paper we investigate the possibilities of using Big Data technology to store and process data in the transport domain. The term Big Data refers to information resources of large volume, velocity and variety, far beyond the capabilities of commonly used software for storing, processing and managing data. In our case study, the Apache™ Hadoop® Big Data platform was used to process data collected from 10 automatic traffic counters set up in Novi Sad and its surroundings. Indicators of traffic load calculated on the Big Data platform were presented in tables and graphs using the Microsoft Office Excel tool. The visualization and geolocation of the obtained indicators were performed using Microsoft Business Intelligence (BI) tools such as Excel Power View and Excel Power Map. This case study showed that Big Data technologies combined with BI tools can be used as reliable support in the monitoring of traffic management systems.
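
    The kind of aggregation such a platform performs can be sketched as a small MapReduce job; the example below assumes the mrjob Python library and a hypothetical CSV input format, rather than the study's actual Hadoop jobs.

        from mrjob.job import MRJob

        class HourlyTrafficLoad(MRJob):
            def mapper(self, _, line):
                # Expected CSV: counter_id,hour,vehicle_count e.g. "C01,2015-06-01T08,42"
                counter_id, hour, count = line.split(",")
                yield (counter_id, hour), int(count)

            def reducer(self, key, counts):
                # Total vehicles per counter and hour: a basic traffic-load indicator.
                yield key, sum(counts)

        if __name__ == "__main__":
            HourlyTrafficLoad.run()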

  12. Research in Big Data Warehousing using Hadoop

    OpenAIRE

    Abderrazak Sebaa; Fatima Chikh; Amina Nouicer; Abdelkamel Tari

    2017-01-01

    Traditional data warehouses have played a key role in decision support systems until the recent past. However, the rapid growth of data generation by current applications demands new data warehousing systems: the volume and format of collected datasets, data source variety, integration of unstructured data and powerful analytical processing. In the age of Big Data, it is important to follow this pace and adapt the existing warehouse systems to overcome the new issues and challenges. ...

  13. Reflection on Quality Assurance System of Higher Vocational Education under Big Data Era

    Directory of Open Access Journals (Sweden)

    Jiang Xinlan

    2015-01-01

    Full Text Available Big data has the features of Volume, Variety, Value and Velocity. The big data era brings new opportunities and challenges for the construction of China's quality assurance system for higher vocational education. The current quality assurance system for higher vocational education has problems such as an incomplete set of responsible bodies, the lack of an integrated internal and external quality assurance system, unscientific assurance standards and insufficient investment in assurance. Constructing the system in the big data era therefore requires a change in thinking about quality assurance system construction, moving the educational quality assurance system toward multiple responsible bodies and multiple layers, and strengthening the construction of its information platform.

  14. Big Data for Infectious Disease Surveillance and Modeling.

    Science.gov (United States)

    Bansal, Shweta; Chowell, Gerardo; Simonsen, Lone; Vespignani, Alessandro; Viboud, Cécile

    2016-12-01

    We devote a special issue of the Journal of Infectious Diseases to review the recent advances of big data in strengthening disease surveillance, monitoring medical adverse events, informing transmission models, and tracking patient sentiments and mobility. We consider a broad definition of big data for public health, one encompassing patient information gathered from high-volume electronic health records and participatory surveillance systems, as well as mining of digital traces such as social media, Internet searches, and cell-phone logs. We introduce nine independent contributions to this special issue and highlight several cross-cutting areas that require further research, including representativeness, biases, volatility, and validation, and the need for robust statistical and hypothesis-driven analyses. Overall, we are optimistic that the big-data revolution will vastly improve the granularity and timeliness of available epidemiological information, with hybrid systems augmenting rather than supplanting traditional surveillance systems, and better prospects for accurate infectious disease models and forecasts. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  15. Data Management and Preservation Planning for Big Science

    Directory of Open Access Journals (Sweden)

    Juan Bicarregui

    2013-06-01

    Full Text Available ‘Big Science’ - that is, science which involves large collaborations with dedicated facilities, large data volumes and multinational investments - is often seen as different when it comes to data management and preservation planning. Big Science handles its data differently from other disciplines and has data management problems that are qualitatively different. In part, these differences arise from the quantities of data involved, but possibly more importantly from the cultural, organisational and technical distinctiveness of these academic cultures. Consequently, the data management systems are typically and rationally bespoke, but this means that the planning for data management and preservation (DMP) must also be bespoke. These differences are such that ‘just read and implement the OAIS specification’ is reasonable Data Management and Preservation (DMP) advice, but this bald prescription can and should be usefully supported by a methodological ‘toolkit’, including overviews, case studies and costing models, to provide guidance on developing best practice in DMP policy and infrastructure for these projects, as well as considering OAIS validation, audit and cost modelling. In this paper, we build on previous work with the LIGO collaboration to consider the role of DMP planning within these big science scenarios, and discuss how to apply current best practice. We discuss the results of the MaRDI-Gross project (Managing Research Data Infrastructures - Big Science), which has been developing a toolkit to provide guidelines on the application of best practice in DMP planning within big science projects. This is targeted primarily at projects’ engineering managers, but is also intended to help funders collaborate on DMP plans which satisfy the requirements imposed on them.

  16. Big data in pharmacy practice: current use, challenges, and the future.

    Science.gov (United States)

    Ma, Carolyn; Smith, Helen Wong; Chu, Cherie; Juarez, Deborah T

    2015-01-01

    Pharmacy informatics is defined as the use and integration of data, information, knowledge, technology, and automation in the medication-use process for the purpose of improving health outcomes. The term "big data" has been coined and is often defined by three V's: volume, velocity, and variety. This paper describes three major areas in which pharmacy utilizes big data, including: 1) informed decision making (clinical pathways and clinical practice guidelines); 2) improved care delivery in health care settings such as hospitals and community pharmacy practice settings; and 3) quality performance measurement for the Centers for Medicare and Medicaid and medication management activities such as tracking medication adherence and medication reconciliation.

  17. BIG GEO DATA MANAGEMENT: AN EXPLORATION WITH SOCIAL MEDIA AND TELECOMMUNICATIONS OPEN DATA

    Directory of Open Access Journals (Sweden)

    C. Arias Munoz

    2016-06-01

    Full Text Available The term Big Data has recently been used to define big, highly varied, complex data sets, which are created and updated at high speed and require faster processing, namely, a reduced time to filter and analyse relevant data. These data are also increasingly becoming Open Data (data that can be freely distributed) made public by governments, agencies, private enterprises and others. There are at least two issues that can obstruct the availability and use of Open Big Datasets: firstly, the gathering and geoprocessing of these datasets are very computationally intensive; hence, it is necessary to integrate high-performance solutions, preferably internet based, to achieve the goals. Secondly, the problems of heterogeneity and inconsistency in geospatial data are well known and affect the data integration process, but they are particularly problematic for Big Geo Data. Therefore, Big Geo Data integration will be one of the most challenging issues to solve. With these applications, we demonstrate that it is possible to provide processed Big Geo Data to common users, using open geospatial standards and technologies. NoSQL databases like MongoDB and frameworks like RASDAMAN could offer different functionalities that facilitate working with larger volumes and more heterogeneous geospatial data sources.

  18. Big Geo Data Management: AN Exploration with Social Media and Telecommunications Open Data

    Science.gov (United States)

    Arias Munoz, C.; Brovelli, M. A.; Corti, S.; Zamboni, G.

    2016-06-01

    The term Big Data has recently been used to define big, highly varied, complex data sets, which are created and updated at high speed and require faster processing, namely, a reduced time to filter and analyse relevant data. These data are also increasingly becoming Open Data (data that can be freely distributed) made public by governments, agencies, private enterprises and others. There are at least two issues that can obstruct the availability and use of Open Big Datasets: firstly, the gathering and geoprocessing of these datasets are very computationally intensive; hence, it is necessary to integrate high-performance solutions, preferably internet based, to achieve the goals. Secondly, the problems of heterogeneity and inconsistency in geospatial data are well known and affect the data integration process, but they are particularly problematic for Big Geo Data. Therefore, Big Geo Data integration will be one of the most challenging issues to solve. With these applications, we demonstrate that it is possible to provide processed Big Geo Data to common users, using open geospatial standards and technologies. NoSQL databases like MongoDB and frameworks like RASDAMAN could offer different functionalities that facilitate working with larger volumes and more heterogeneous geospatial data sources.
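
    A minimal sketch of the MongoDB functionality the authors point to, assuming a running MongoDB instance and a hypothetical collection of geolocated social-media records; the collection name, fields and coordinates are illustrative.

```python
# Sketch: geospatial indexing and a proximity query with pymongo.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
coll = client["geo_demo"]["tweets"]            # hypothetical collection
coll.create_index([("location", GEOSPHERE)])   # 2dsphere index

coll.insert_one({
    "text": "example record",
    "location": {"type": "Point", "coordinates": [9.19, 45.46]},  # lon, lat
})

# Records within 5 km of a chosen point (coordinates are illustrative)
nearby = coll.find({
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": [9.2277, 45.4786]},
            "$maxDistance": 5000,  # metres
        }
    }
})
for doc in nearby:
    print(doc["text"])
```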

  19. Benchmarking Big Data Systems and the BigData Top100 List.

    Science.gov (United States)

    Baru, Chaitanya; Bhandarkar, Milind; Nambiar, Raghunath; Poess, Meikel; Rabl, Tilmann

    2013-03-01

    "Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TCP), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.

  20. Big data, big knowledge: big data for personalized healthcare.

    Science.gov (United States)

    Viceconti, Marco; Hunter, Peter; Hose, Rod

    2015-07-01

    The idea that the purely phenomenological knowledge that we can extract by analyzing large amounts of data can be useful in healthcare seems to contradict the desire of VPH researchers to build detailed mechanistic models for individual patients. But in practice no model is ever entirely phenomenological or entirely mechanistic. We propose in this position paper that big data analytics can be successfully combined with VPH technologies to produce robust and effective in silico medicine solutions. In order to do this, big data technologies must be further developed to cope with some specific requirements that emerge from this application. Such requirements are: working with sensitive data; analytics of complex and heterogeneous data spaces, including nontextual information; distributed data management under security and performance constraints; specialized analytics to integrate bioinformatics and systems biology information with clinical observations at tissue, organ and organism scales; and specialized analytics to define the "physiological envelope" during the daily life of each patient. These domain-specific requirements suggest a need for targeted funding, in which big data technologies for in silico medicine becomes the research priority.

  1. BigDataBench: a Big Data Benchmark Suite from Internet Services

    OpenAIRE

    Wang, Lei; Zhan, Jianfeng; Luo, Chunjie; Zhu, Yuqing; Yang, Qiang; He, Yongqiang; Gao, Wanling; Jia, Zhen; Shi, Yingjie; Zhang, Shujie; Zheng, Chen; Lu, Gang; Zhan, Kent; Li, Xiaona; Qiu, Bizhu

    2014-01-01

    As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data systems, big data benchmarks must include diversity of data and workloads. Most of the state-of-the-art big data benchmarking efforts target evaluating specific types of applications or system software stacks, and hence they are not qualified for serving the purpo...

  2. Analysis of Big Data in Gait Biomechanics: Current Trends and Future Directions.

    Science.gov (United States)

    Phinyomark, Angkoon; Petri, Giovanni; Ibáñez-Marcelo, Esther; Osis, Sean T; Ferber, Reed

    2018-01-01

    The increasing amount of data in biomechanics research has greatly increased the importance of developing advanced multivariate analysis and machine learning techniques, which are better able to handle "big data". Consequently, advances in data science methods will expand the knowledge for testing new hypotheses about biomechanical risk factors associated with walking and running gait-related musculoskeletal injury. This paper begins with a brief introduction to an automated three-dimensional (3D) biomechanical gait data collection system, 3D GAIT, followed by a discussion of how studies in the field of gait biomechanics fit the 5 V's definition of big data: volume, velocity, variety, veracity, and value. Next, we provide a review of recent research and development in multivariate and machine learning methods-based gait analysis that can be applied to big data analytics. These modern biomechanical gait analysis methods comprise several main modules: initial input features, dimensionality reduction (feature selection and extraction), and learning algorithms (classification and clustering). Finally, a promising big data exploration tool called "topological data analysis" and directions for future research are outlined and discussed.
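
    A minimal sketch of the modular pipeline outlined above (dimensionality reduction followed by a learning algorithm), applied to synthetic stand-ins for 3D gait variables; the feature counts and labels are invented for illustration.

```python
# Sketch: feature reduction + classification pipeline on synthetic gait data.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 60))      # 120 subjects x 60 gait variables (synthetic)
y = rng.integers(0, 2, size=120)    # e.g. injured vs. healthy (synthetic)

pipe = Pipeline([
    ("reduce", PCA(n_components=10)),   # dimensionality reduction module
    ("classify", SVC(kernel="rbf")),    # learning algorithm module
])
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```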

  3. Conociendo Big Data

    Directory of Open Access Journals (Sweden)

    Juan José Camargo-Vega

    2014-12-01

    Full Text Available Given the importance the term Big Data has acquired, this research sought to study and exhaustively analyze the state of the art of Big Data; in addition, as a second objective, it analyzed the characteristics, tools, technologies, models and standards related to Big Data; finally, it sought to identify the most relevant characteristics of Big Data management, so that everything concerning the central topic of the research could be understood. The methodology used included reviewing the state of the art of Big Data and presenting its current situation; surveying Big Data technologies; introducing some of the NoSQL databases, which are those that make it possible to process data in unstructured formats; and showing the data models and the technologies for analyzing them, ending with some benefits of Big Data. The methodological design used for the research was non-experimental, since no variables are manipulated, and exploratory, because this research begins to explore the Big Data environment.

  4. An Industrial Perspective of CAM/ROB Fuzzy Integrated Postprocessing Implementation for Redundant Robotic Workcells Applicability for Big Volume Prototyping

    Science.gov (United States)

    Andrés, J.; Gracia, L.; Tornero, J.; García, J. A.; González, F.

    2009-11-01

    The implementation of a postprocessor for the NX™ platform (Siemens Corp.) is described in this paper. It focuses on a redundant robotic milling workcell consisting of one KUKA KR 15/2 manipulator (6 rotary joints, KRC2 controller) mounted on a linear axis and synchronized with a rotary table (i.e., two additional joints). Carrying out a milling task requires a choice among a set of possible configurations, taking into account the ability to avoid singular configurations by using both additional joints. Usually, the experience and knowledge of the operator allow efficient control in these cases, though it is a tedious job. To emulate this expert knowledge, a stand-alone fuzzy controller has been programmed with Matlab's Fuzzy Logic Toolbox (The MathWorks, Inc.). Two C++ programs complement the translation of the toolpath tracking (expressed in Cartesian space) from the NX™-CAM module into KRL (KUKA Robot Language). In order to avoid singularities or joint limits, the location of the robot and the workpiece during the execution of the task is adjusted after an inverse kinematics position analysis and a fuzzy inference (i.e., a fuzzy criterion in the joint space). Additionally, the applicability of robot arms to the manufacture of big volume prototypes with this technique is proven by means of a case study: a big orographic model to simulate floodways, return flows and retention storage of a reservoir on the Mijares river (Puebla de Arenoso, Spain). This article deals with the problem for a constant tool orientation milling process and sets the technological basis for future research on five-axis milling operations.
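
    A minimal, self-contained sketch of the fuzzy-inference idea described above: scoring a candidate robot configuration by its margin to joint limits and its distance from a kinematic singularity, so a planner can prefer safe configurations. The membership shapes and rule base are illustrative assumptions, not the paper's Matlab controller.

```python
# Sketch: fuzzy scoring of candidate configurations (illustrative rules).
import numpy as np

def falling(x, a, b):
    """Membership that is 1 below a and falls linearly to 0 at b."""
    return float(np.clip((b - x) / (b - a), 0.0, 1.0))

def rising(x, a, b):
    """Membership that is 0 below a and rises linearly to 1 at b."""
    return float(np.clip((x - a) / (b - a), 0.0, 1.0))

def configuration_score(joint_margin, singularity_dist):
    """Score a candidate configuration from two inputs normalized to [0, 1]."""
    margin_low, margin_high = falling(joint_margin, 0.2, 0.5), rising(joint_margin, 0.3, 0.7)
    sing_near, sing_far = falling(singularity_dist, 0.1, 0.4), rising(singularity_dist, 0.2, 0.6)
    # Rule base: penalize closeness to joint limits or singularities
    rules = [
        (min(margin_low, sing_near), 0.0),   # bad: near limits and singular
        (min(margin_high, sing_far), 1.0),   # good: comfortably feasible
        (max(min(margin_high, sing_near), min(margin_low, sing_far)), 0.5),
    ]
    # Weighted-average (Sugeno-style) defuzzification
    total = sum(strength for strength, _ in rules)
    return sum(s * out for s, out in rules) / total if total else 0.0

# A planner would evaluate each candidate configuration and keep the best:
print(configuration_score(0.8, 0.9))  # high score: keep this configuration
print(configuration_score(0.1, 0.1))  # low score: near limits/singularity
```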

  5. BigDansing

    KAUST Repository

    Khayyat, Zuhair

    2015-06-02

    Data cleansing approaches have usually focused on detecting and fixing errors with little attention to scaling to big datasets. This presents a serious impediment since data cleansing often involves costly computations such as enumerating pairs of tuples, handling inequality joins, and dealing with user-defined functions. In this paper, we present BigDansing, a Big Data Cleansing system to tackle efficiency, scalability, and ease-of-use issues in data cleansing. The system can run on top of most common general-purpose data processing platforms, ranging from DBMSs to MapReduce-like frameworks. A user-friendly programming interface allows users to express data quality rules both declaratively and procedurally, without requiring awareness of the underlying distributed platform. BigDansing translates these rules into a series of transformations that enable distributed computations and several optimizations, such as shared scans and specialized join operators. Experimental results on both synthetic and real datasets show that BigDansing outperforms existing baseline systems by up to more than two orders of magnitude without sacrificing the quality provided by the repair algorithms.
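
    BigDansing itself compiles rules into transformations for distributed platforms; the sketch below only illustrates the semantics of one declarative data quality rule, a functional dependency zip → city, checked on a toy table with pandas.

```python
# Sketch: detecting violations of the functional dependency zip -> city.
import pandas as pd

df = pd.DataFrame({
    "zip":  ["10001", "10001", "90210", "90210"],
    "city": ["New York", "New York", "Beverly Hills", "Los Angeles"],
})

# Tuples violate the FD if the same zip maps to more than one city
cities_per_zip = df.groupby("zip")["city"].nunique()
violations = df[df["zip"].isin(cities_per_zip[cities_per_zip > 1].index)]
print(violations)   # the two conflicting 90210 tuples
```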

  6. excess molar volumes, and refractive index of binary mixtures

    African Journals Online (AJOL)

    Preferred Customer

    because (a) water molecules have a hydroxyl group which can make stronger hydrogen bonds than methanol and (b) water molecules and glycerol have suitable kinetic energy for bulk volumes at high temperature. Thus, the mixture of glycerol + water has a bigger excess molar volume than that with methanol. The hydrogen bonding ...

  7. Modeling of the Nuclear Power Plant Life cycle for 'Big Data' Management System: A Systems Engineering Viewpoint

    International Nuclear Information System (INIS)

    Ha, Bui Hoang; Khanh, Tran Quang Diep; Shakirah, Wan; Kahar, Wan Abdul; Jung, Jae Cheon

    2012-01-01

    Together with the significant development of Internet and Web technologies, the rapid evolution of the 'Big Data' idea has been observed since it was first introduced in 1941 as an 'information explosion' (OED). Using the '3Vs' model proposed by Gartner, 'Big Data' can be defined as 'high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.' Big Data technologies and tools have been developed to address the way large quantities of data are stored, accessed and presented for manipulation or analysis. The idea also focuses on how users can easily access and extract the 'useful and right' data, information, or even knowledge from the 'Big Data'.

  8. Big science

    CERN Multimedia

    Nadis, S

    2003-01-01

    " "Big science" is moving into astronomy, bringing large experimental teams, multi-year research projects, and big budgets. If this is the wave of the future, why are some astronomers bucking the trend?" (2 pages).

  9. After the Big Bang: What's Next in Design Education? Time to Relax?

    Science.gov (United States)

    Fleischmann, Katja

    2015-01-01

    The article "Big Bang technology: What's next in design education, radical innovation or incremental change?" (Fleischmann, 2013) appeared in the "Journal of Learning Design" Volume 6, Issue 3 in 2013. Two years on, Associate Professor Fleischmann reflects upon her original article within this article. Although it has only been…

  10. Internet of Things and big data technologies for next generation healthcare

    CERN Document Server

    Dey, Nilanjan; Ashour, Amira

    2017-01-01

    This comprehensive book focuses on better big-data security for healthcare organizations. Following an extensive introduction to the Internet of Things (IoT) in healthcare including challenging topics and scenarios, it offers an in-depth analysis of medical body area networks with the 5th generation of IoT communication technology along with its nanotechnology. It also describes a novel strategic framework and computationally intelligent model to measure possible security vulnerabilities in the context of e-health. Moreover, the book addresses healthcare systems that handle large volumes of data driven by patients’ records and health/personal information, including big-data-based knowledge management systems to support clinical decisions. Several of the issues faced in storing/processing big data are presented along with the available tools, technologies and algorithms to deal with those problems as well as a case study in healthcare analytics. Addressing trust, privacy, and security issues as well as the I...

  11. Big bang and big crunch in matrix string theory

    OpenAIRE

    Bedford, J; Papageorgakis, C; Rodríguez-Gómez, D; Ward, J

    2007-01-01

    Following the holographic description of linear dilaton null Cosmologies with a Big Bang in terms of Matrix String Theory put forward by Craps, Sethi and Verlinde, we propose an extended background describing a Universe including both Big Bang and Big Crunch singularities. This belongs to a class of exact string backgrounds and is perturbative in the string coupling far away from the singularities, both of which can be resolved using Matrix String Theory. We provide a simple theory capable of...

  12. Observer variation in target volume delineation of lung cancer related to radiation oncologist-computer interaction: A 'Big Brother' evaluation

    International Nuclear Information System (INIS)

    Steenbakkers, Roel J.H.M.; Duppen, Joop C.; Fitton, Isabelle; Deurloo, Kirsten E.I.; Zijp, Lambert; Uitterhoeve, Apollonia L.J.; Rodrigus, Patrick T.R.; Kramer, Gijsbert W.P.; Bussink, Johan; Jaeger, Katrien De; Belderbos, Jose S.A.; Hart, Augustinus A.M.; Nowak, Peter J.C.M.; Herk, Marcel van; Rasch, Coen R.N.

    2005-01-01

    Background and purpose: To evaluate the process of target volume delineation in lung cancer for optimization of imaging, delineation protocol and delineation software. Patients and methods: Eleven radiation oncologists (observers) from five different institutions delineated the Gross Tumor Volume (GTV) including positive lymph nodes of 22 lung cancer patients (stages I-IIIB) on CT only. All radiation oncologist-computer interactions were recorded with a tool called 'Big Brother'. For each radiation oncologist and patient the following issues were analyzed: delineation time, number of delineated points and corrections, zoom levels, level and window (L/W) settings, CT slice changes, use of side windows (coronal and sagittal) and software button use. Results: The mean delineation time per GTV was 16 min (SD 10 min). The mean delineation time for lymph node positive patients was on average 3 min longer (P=0.02) than for lymph node negative patients. Many corrections (55%) were due to L/W change (e.g. delineating in mediastinum L/W and then correcting in lung L/W). For the lymph node region, a relatively large number of corrections was found (3.7 corr/cm²), indicating that it was difficult to delineate lymph nodes. For the tumor-atelectasis region, a relatively small number of corrections was found (1.0 corr/cm²), indicating that including or excluding atelectasis into the GTV was a clinical decision. Inappropriate use of L/W settings was frequently found (e.g. 46% of all delineated points in the tumor-lung region were delineated in mediastinum L/W settings). Despite a large observer variation in cranial and caudal direction of 0.72 cm (1 SD), the coronal and sagittal side windows were not used in 45 and 60% of the cases, respectively. For the more difficult cases, observer variation was smaller when the coronal and sagittal side windows were used. Conclusions: With the 'Big Brother' tool a method was developed to trace the delineation process. The differences between

  13. Bliver big data til big business?

    DEFF Research Database (Denmark)

    Ritter, Thomas

    2015-01-01

    Denmark has a digital infrastructure, a culture of record-keeping, and IT-competent employees and customers, which make a leading position possible, but only if companies get ready for the next big data wave.

  14. Big Data, jejich skladování a možnosti využití

    OpenAIRE

    Macek, Jáchym

    2013-01-01

    This bachelor's thesis analyzes working with data, especially large volumes of unstructured data, i.e. Big Data. The thesis is a literature and content survey based on questionnaires and interviews. The aim is to introduce the reader to the Big Data theme, its storage, the tools for its management and the opportunities for its exploitation, from both a technological and a business point of view. The objective of the practical part is a survey. The thesis is divided i...

  15. Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends.

    Science.gov (United States)

    Mohammed, Emad A; Far, Behrouz H; Naugler, Christopher

    2014-01-01

    The emergence of massive datasets in a clinical setting presents both challenges and opportunities in data storage and analysis. This so-called "big data" challenges traditional analytic tools and will increasingly require novel solutions adapted from other fields. Advances in information and communication technology present the most viable solutions to big data analysis in terms of efficiency and scalability. It is vital that big data solutions be multithreaded and that data access approaches be precisely tailored to large volumes of semi-structured/unstructured data. The MapReduce programming framework uses two tasks common in functional programming: Map and Reduce. MapReduce is a new parallel processing framework and Hadoop is its open-source implementation on a single computing node or on clusters. Compared with existing parallel processing paradigms (e.g. grid computing and graphical processing unit (GPU) computing), MapReduce and Hadoop have two advantages: 1) fault-tolerant storage resulting in reliable data processing by replicating the computing tasks and cloning the data chunks on different computing nodes across the computing cluster; 2) high-throughput data processing via a batch processing framework and the Hadoop distributed file system (HDFS). Data are stored in the HDFS and made available to the slave nodes for computation. In this paper, we review the existing applications of the MapReduce programming framework and its implementation platform Hadoop in clinical big data and related medical health informatics fields. The usage of MapReduce and Hadoop on a distributed system represents a significant advance in clinical big data processing and utilization, and opens up new opportunities in the emerging era of big data analytics. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools. This paper is concluded by
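
    A minimal, single-process sketch of the two tasks named above, Map and Reduce, counting diagnosis codes in toy clinical records; the records and codes are illustrative, and the in-memory "shuffle" stands in for what Hadoop does between the phases.

```python
# Sketch: Map and Reduce semantics on toy clinical records.
from collections import defaultdict

records = [
    {"patient": 1, "codes": ["E11", "I10"]},
    {"patient": 2, "codes": ["E11"]},
    {"patient": 3, "codes": ["I10", "E11"]},
]

def map_phase(record):
    for code in record["codes"]:
        yield code, 1                # emit (key, value) pairs

def reduce_phase(key, values):
    return key, sum(values)          # aggregate all values per key

# Shuffle: group mapped values by key (done by the framework in Hadoop)
groups = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        groups[key].append(value)

print(dict(reduce_phase(k, v) for k, v in groups.items()))  # {'E11': 3, 'I10': 2}
```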

  16. The Document Explosion in the World of Big Data--Curriculum Considerations

    Science.gov (United States)

    Liu, Michelle; Murphy, Diane

    2014-01-01

    Within the context of "big data", there is an increasing focus on the source of the large volumes of data now stored electronically. The greatest portion of this data is unstructured and comes from a variety of sources in a variety of formats, much of which does not conform to a consistent data model. As business and government…

  17. A Knowledge-Driven Geospatially Enabled Framework for Geological Big Data

    Directory of Open Access Journals (Sweden)

    Liang Wu

    2017-06-01

    Full Text Available Geologic survey procedures accumulate large volumes of structured and unstructured data. Fully exploiting the knowledge and information included in geological big data and improving the accessibility of large volumes of data are important endeavors. In this paper, based on the architecture of the geological survey information cloud-computing platform (GSICCP) and big-data-related technologies, we split geologic unstructured data into fragments and extract multi-dimensional features via geological domain ontology. These fragments are reorganized into a NoSQL (Not Only SQL) database, and then associations between the fragments are added. A specific class of geological questions was analyzed and transformed into workflow tasks according to predefined rules and the associations between fragments, in order to identify spatial information and unstructured content. We establish a knowledge-driven geologic survey information smart-service platform (GSISSP) based on previous work, and we detail a case study for our research. The case study shows that all content with known relationships or semantic associations can be mined with the assistance of multiple ontologies, thereby improving the accuracy and comprehensiveness of geological information discovery.
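
    A minimal sketch of the fragment-and-association idea described above: unstructured survey text split into fragments, tagged with ontology terms, and linked so related content can be found by traversal. The fragments, terms and associations are invented for illustration.

```python
# Sketch: tagged fragments with associations, queried by traversal.
fragments = {
    "f1": {"text": "Granite outcrop observed at site A", "terms": ["granite", "outcrop"]},
    "f2": {"text": "Site A lies on a stable craton", "terms": ["craton", "site A"]},
    "f3": {"text": "Borehole log, 120 m, near site A", "terms": ["borehole", "site A"]},
}

# Associations added between fragments sharing ontology terms
associations = {("f2", "f3"): "shared term: site A"}

def related(frag_id):
    """Return fragments associated with the given one, with the reason."""
    out = []
    for (a, b), why in associations.items():
        if frag_id == a:
            out.append((b, why))
        elif frag_id == b:
            out.append((a, why))
    return out

print(related("f2"))  # [('f3', 'shared term: site A')]
```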

  18. Marketing de Relacionamento: Agregando Valor ao Negócio com Big Data

    Directory of Open Access Journals (Sweden)

    Ana Cláudia Borges Coutrim dos Reis

    2016-12-01

    Full Text Available In recent years, constant technological innovation and the exponential increase in data volume, together with evolving consumption trends, have created a global scenario of great challenges for organizations. The use of Big Data technologies to analyze the growing volume of data available in the globalized virtual world offers a new view of the market and can contribute to the future of organizations by providing quality information that strongly supports decision making. In this highly competitive scenario, data analysis with Big Data technologies can broaden companies' view of market trends, helping them identify strategies better targeted at consumer behavior profiles and preferences. The objective of this research is to investigate how the use of Big Data technologies can expand organizations' relationship marketing capabilities, adding value to the business. The results show that organizations that use Big Data technologies can better define their customer relationship marketing strategies, obtaining longer-lasting relationships with the market and their customers and favorable returns for their businesses. In the near future, the use of Big Data to expand relationship marketing strategies will be a rule of survival for organizations. The differentiator will be how managers interpret the available information and act to achieve the desired results.

  19. Big Data and HPC: A Happy Marriage

    KAUST Repository

    Mehmood, Rashid

    2016-01-25

    International Data Corporation (IDC) defines Big Data technologies as “a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data produced every day, by enabling high velocity capture, discovery, and/or analysis”. High Performance Computing (HPC) most generally refers to “the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business”. Big data platforms are built primarily considering the economics and capacity of the system for dealing with the 4V characteristics of data. HPC traditionally has been more focussed on the speed of digesting (computing) the data. For these reasons, the two domains (HPC and Big Data) have developed their own paradigms and technologies. However, recently, these two have grown fond of each other. HPC technologies are needed by Big Data to deal with the ever increasing Vs of data in order to forecast and extract insights from existing and new domains, faster, and with greater accuracy. Increasingly more data is being produced by scientific experiments from areas such as bioscience, physics, and climate, and therefore, HPC needs to adopt data-driven paradigms. Moreover, there are synergies between them with unimaginable potential for developing new computing paradigms, solving long-standing grand challenges, and making new explorations and discoveries. Therefore, they must get married to each other. In this talk, we will trace the HPC and big data landscapes through time including their respective technologies, paradigms and major applications areas. Subsequently, we will present the factors that are driving the convergence of the two technologies, the synergies between them, as well as the benefits of their convergence to the biosciences field. The opportunities and challenges of the

  20. Big Data and Dementia: Charting the Route Ahead for Research, Ethics, and Policy.

    Science.gov (United States)

    Ienca, Marcello; Vayena, Effy; Blasimme, Alessandro

    2018-01-01

    Emerging trends in pervasive computing and medical informatics are creating the possibility for large-scale collection, sharing, aggregation and analysis of unprecedented volumes of data, a phenomenon commonly known as big data. In this contribution, we review the existing scientific literature on big data approaches to dementia, as well as commercially available mobile-based applications in this domain. Our analysis suggests that big data approaches to dementia research and care hold promise for improving current preventive and predictive models, casting light on the etiology of the disease, enabling earlier diagnosis, optimizing resource allocation, and delivering more tailored treatments to patients with specific disease trajectories. Such promissory outlook, however, has not materialized yet, and raises a number of technical, scientific, ethical, and regulatory challenges. This paper provides an assessment of these challenges and charts the route ahead for research, ethics, and policy.

  1. Big bang and big crunch in matrix string theory

    International Nuclear Information System (INIS)

    Bedford, J.; Ward, J.; Papageorgakis, C.; Rodriguez-Gomez, D.

    2007-01-01

    Following the holographic description of linear dilaton null cosmologies with a big bang in terms of matrix string theory put forward by Craps, Sethi, and Verlinde, we propose an extended background describing a universe including both big bang and big crunch singularities. This belongs to a class of exact string backgrounds and is perturbative in the string coupling far away from the singularities, both of which can be resolved using matrix string theory. We provide a simple theory capable of describing the complete evolution of this closed universe

  2. Big data a primer

    CERN Document Server

    Bhuyan, Prachet; Chenthati, Deepak

    2015-01-01

    This book is a collection of chapters written by experts on various aspects of big data. The book aims to explain what big data is and how it is stored and used. The book starts from  the fundamentals and builds up from there. It is intended to serve as a review of the state-of-the-practice in the field of big data handling. The traditional framework of relational databases can no longer provide appropriate solutions for handling big data and making it available and useful to users scattered around the globe. The study of big data covers a wide range of issues including management of heterogeneous data, big data frameworks, change management, finding patterns in data usage and evolution, data as a service, service-generated data, service management, privacy and security. All of these aspects are touched upon in this book. It also discusses big data applications in different domains. The book will prove useful to students, researchers, and practicing database and networking engineers.

  3. A study and analysis of recommendation systems for location-based social network (LBSN with big data

    Directory of Open Access Journals (Sweden)

    Murale Narayanan

    2016-03-01

    Full Text Available Recommender systems play an important role in our day-to-day life. A recommender system automatically suggests items that a user might be interested in. Small-scale datasets are commonly used to provide location-based recommendations, but in real settings the volume of data is large. We selected the Foursquare dataset to study the need for big data in recommendation systems for location-based social networks (LBSNs). A few quality parameters, such as parallel processing and multimodal interfaces, were selected to study the need for big data in recommender systems. This paper provides a study and analysis of the quality parameters of recommendation systems for LBSNs with big data.
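
    A minimal sketch of a location recommender over the kind of user-venue check-in counts an LBSN dataset such as Foursquare provides; the matrix here is a tiny synthetic stand-in, not the actual dataset.

```python
# Sketch: user-based collaborative filtering over check-in counts.
import numpy as np

# Rows = users, columns = venues; entries = check-in counts (synthetic)
checkins = np.array([
    [5, 0, 2, 0],
    [4, 1, 0, 0],
    [0, 3, 0, 4],
])

def recommend(user, k=2):
    norms = np.linalg.norm(checkins, axis=1)
    sims = (checkins @ checkins[user]) / (norms * norms[user] + 1e-9)
    sims[user] = 0.0                           # exclude the user themself
    scores = sims @ checkins                   # similarity-weighted venue scores
    scores[checkins[user] > 0] = -np.inf       # drop already-visited venues
    return np.argsort(scores)[::-1][:k]        # top-k venue indices

print(recommend(0))  # venues recommended for user 0
```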

  4. Infectious Disease Surveillance in the Big Data Era: Towards Faster and Locally Relevant Systems

    Science.gov (United States)

    Simonsen, Lone; Gog, Julia R.; Olson, Don; Viboud, Cécile

    2016-01-01

    While big data have proven immensely useful in fields such as marketing and earth sciences, public health is still relying on more traditional surveillance systems and awaiting the fruits of a big data revolution. A new generation of big data surveillance systems is needed to achieve rapid, flexible, and local tracking of infectious diseases, especially for emerging pathogens. In this opinion piece, we reflect on the long and distinguished history of disease surveillance and discuss recent developments related to use of big data. We start with a brief review of traditional systems relying on clinical and laboratory reports. We then examine how large-volume medical claims data can, with great spatiotemporal resolution, help elucidate local disease patterns. Finally, we review efforts to develop surveillance systems based on digital and social data streams, including the recent rise and fall of Google Flu Trends. We conclude by advocating for increased use of hybrid systems combining information from traditional surveillance and big data sources, which seems the most promising option moving forward. Throughout the article, we use influenza as an exemplar of an emerging and reemerging infection which has traditionally been considered a model system for surveillance and modeling. PMID:28830112

  5. How to Generate Economic and Sustainability Reports from Big Data? Qualifications of Process Industry

    Directory of Open Access Journals (Sweden)

    Esa Hämäläinen

    2017-11-01

    Full Text Available Big Data may introduce new opportunities, and for this reason it has become a mantra among most industries. This paper focuses on examining how to develop cost and sustainability reporting by utilizing Big Data that covers economic values, production volumes, and emission information. We strongly assume that this use supports cleaner production, while at the same time offering more information for revenue and profitability development. We argue that Big Data brings company-wide business benefits if data queries and interfaces are built to be interactive, intuitive, and user-friendly. The amount of information related to operations, costs, emissions, and the supply chain would increase enormously if Big Data were used in various manufacturing industries. It is essential to expose the relevant correlations between different attributes and data fields. Proper algorithm design and programming are key to making the most of Big Data. This paper introduces ideas on how to refine raw data into valuable information, which can serve many types of end users, decision makers, and even external auditors. Concrete examples are given through an industrial paper mill case, which covers environmental aspects, cost-efficiency management, and process design.
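
    A minimal sketch of the correlation-based reporting the paper argues for, relating production volume, unit cost and emissions in one report; the figures are invented stand-ins for a mill's process data.

```python
# Sketch: exposing correlations between economic and emission attributes.
import pandas as pd

mill = pd.DataFrame({
    "production_t":  [980, 1020, 870, 1100, 940],   # tonnes of paper per day
    "unit_cost_eur": [412, 405, 431, 398, 420],      # EUR per tonne
    "co2_t":         [505, 521, 460, 558, 490],      # tonnes CO2 per day
})

# Expose the relevant correlations between different attributes
print(mill.corr().round(2))

# A simple combined economic/sustainability indicator per reporting day
mill["co2_per_tonne"] = mill["co2_t"] / mill["production_t"]
print(mill[["unit_cost_eur", "co2_per_tonne"]])
```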

  6. Microsoft big data solutions

    CERN Document Server

    Jorgensen, Adam; Welch, John; Clark, Dan; Price, Christopher; Mitchell, Brian

    2014-01-01

    Tap the power of Big Data with Microsoft technologies Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies. Best of all,

  7. Summary big data

    CERN Document Server

    2014-01-01

    This work offers a summary of Cukier the book: "Big Data: A Revolution That Will Transform How we Live, Work, and Think" by Viktor Mayer-Schonberg and Kenneth. Summary of the ideas in Viktor Mayer-Schonberg's and Kenneth Cukier's book: " Big Data " explains that big data is where we use huge quantities of data to make better predictions based on the fact we identify patters in the data rather than trying to understand the underlying causes in more detail. This summary highlights that big data will be a source of new economic value and innovation in the future. Moreover, it shows that it will

  8. Addressing the Big-Earth-Data Variety Challenge with the Hierarchical Triangular Mesh

    Science.gov (United States)

    Rilee, Michael L.; Kuo, Kwo-Sen; Clune, Thomas; Oloso, Amidu; Brown, Paul G.; Yu, Honfeng

    2016-01-01

    We have implemented an updated Hierarchical Triangular Mesh (HTM) as the basis for a unified data model and an indexing scheme for geoscience data to address the variety challenge of Big Earth Data. We observe that, in the absence of variety, the volume challenge of Big Data is relatively easily addressable with parallel processing. The more important challenge in achieving optimal value with a Big Data solution for Earth Science (ES) data analysis, however, is being able to achieve good scalability with variety. With HTM unifying at least the three popular data models, i.e. Grid, Swath, and Point, used by current ES data products, data preparation time for integrative analysis of diverse datasets can be drastically reduced and better variety scaling can be achieved. In addition, since HTM is also an indexing scheme, when it is used to index all ES datasets, data placement alignment (or co-location) on the shared nothing architecture, which most Big Data systems are based on, is guaranteed and better performance is ensured. Moreover, our updated HTM encoding turns most geospatial set operations into integer interval operations, gaining further performance advantages.
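
    A minimal sketch of the key idea stated above: once datasets are indexed with hierarchical integer identifiers, spatial set operations reduce to integer-interval operations. The intervals below are illustrative, not a real HTM encoding.

```python
# Sketch: spatial intersection as integer-interval intersection.
def intersect(a, b):
    """Intersect two sorted, disjoint lists of [start, end] index intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append([lo, hi])
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

grid_coverage  = [[0, 90], [200, 340]]   # index ranges covered by a grid dataset
swath_coverage = [[50, 260], [300, 400]] # index ranges covered by a swath dataset
print(intersect(grid_coverage, swath_coverage))
# [[50, 90], [200, 260], [300, 340]] -- the spatially co-located data
```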

  9. Health level seven interoperability strategy: big data, incrementally structured.

    Science.gov (United States)

    Dolin, R H; Rogers, B; Jaffe, C

    2015-01-01

    Describe how the HL7 Clinical Document Architecture (CDA), a foundational standard in US Meaningful Use, contributes to a "big data, incrementally structured" interoperability strategy, whereby structuring data incrementally gets large amounts of data flowing faster. We present cases showing how this approach is leveraged for big data analysis. To support the assertion that semi-structured narrative in CDA format can be a useful adjunct in an overall big data analytic approach, we present two case studies. The first assesses an organization's ability to generate clinical quality reports using coded data alone vs. coded data supplemented by CDA narrative. The second leverages CDA to construct a network model for referral management, from which additional observations can be gleaned. The first case shows that coded data supplemented by CDA narrative resulted in significant variances in calculated performance scores. In the second case, we found that the constructed network model enables the identification of differences in patient characteristics among different referral work flows. The CDA approach goes after data indirectly, by focusing first on the flow of narrative, which is then incrementally structured. A quantitative assessment of whether this approach will lead to a greater flow of data and ultimately a greater flow of structured data vs. other approaches is planned as a future exercise. Along with growing adoption of CDA, we are now seeing the big data community explore the standard, particularly given its potential to supply analytic engines with volumes of data previously not possible.

  10. Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases.

    Science.gov (United States)

    Panahiazar, Maryam; Taslimitehrani, Vahid; Jadhav, Ashutosh; Pathak, Jyotishman

    2014-10-01

    In healthcare, big data tools and technologies have the potential to create significant value by improving outcomes while lowering costs for each individual patient. Diagnostic images, genetic test results and biometric information are increasingly generated and stored in electronic health records, presenting us with data that is by nature high in volume, variety and velocity, thereby necessitating novel ways to store, manage and process big data. This presents an urgent need to develop new, scalable and expandable big data infrastructure and analytical methods that can enable healthcare providers to access knowledge for the individual patient, yielding better decisions and outcomes. In this paper, we briefly discuss the nature of big data and the role of semantic web and data analysis for generating "smart data", which offers actionable information that supports better decisions for personalized medicine. In our view, the biggest challenge is to create a system that makes big data robust and smart for healthcare providers and patients, one that can lead to more effective clinical decision-making, improved health outcomes, and, ultimately, well-managed healthcare costs. We highlight some of the challenges in using big data and propose the need for a semantic data-driven environment to address them. We illustrate our vision with practical use cases, and discuss a path for empowering personalized medicine using big data and semantic web technology.
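
    A minimal sketch, using rdflib, of the semantic "smart data" idea above: linking a patient record to a genetic test result as triples and querying the links. The vocabulary URIs are hypothetical, not a real clinical ontology.

```python
# Sketch: semantic web triples for a hypothetical clinical vocabulary.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/clinic/")   # hypothetical vocabulary
g = Graph()

patient = EX["patient/42"]
g.add((patient, EX.hasGeneticVariant, EX["variant/BRCA1"]))
g.add((patient, EX.hasObservation, Literal("elevated HbA1c")))

# Query the graph: which patients carry the (hypothetical) variant?
q = """
SELECT ?p WHERE { ?p <http://example.org/clinic/hasGeneticVariant>
                     <http://example.org/clinic/variant/BRCA1> . }
"""
for row in g.query(q):
    print(row.p)
```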

  11. Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives. Contribution of the IMIA Social Media Working Group.

    Science.gov (United States)

    Hansen, M M; Miron-Shatz, T; Lau, A Y S; Paton, C

    2014-08-15

    As technology continues to evolve and rise in various industries, such as healthcare, science, education, and gaming, a sophisticated concept known as Big Data is surfacing. The concept of analytics aims to understand data. We set out to portray and discuss perspectives of the evolving use of Big Data in science and healthcare, and to examine some of the opportunities and challenges. A literature review was conducted to highlight the implications associated with the use of Big Data in scientific research and healthcare innovations, both on a large and small scale. Scientists and health-care providers may learn from one another when it comes to understanding the value of Big Data and analytics. Small data, derived by patients and consumers, also requires analytics to become actionable. Connectivism provides a framework for the use of Big Data and analytics in the areas of science and healthcare. This theory assists individuals to recognize and synthesize how human connections are driving the increase in data. Despite the volume and velocity of Big Data, it is truly about technology connecting humans and assisting them to construct knowledge in new ways. Concluding Thoughts: The concept of Big Data and associated analytics are to be taken seriously when approaching the use of vast volumes of both structured and unstructured data in science and health-care. Future exploration of issues surrounding data privacy, confidentiality, and education are needed. A greater focus on data from social media, the quantified self-movement, and the application of analytics to "small data" would also be useful.

  12. Constraints on pre-big-bang parameter space from CMBR anisotropies

    International Nuclear Information System (INIS)

    Bozza, V.; Gasperini, M.; Giovannini, M.; Veneziano, G.

    2003-01-01

    The so-called curvaton mechanism--a way to convert isocurvature perturbations into adiabatic ones--is investigated both analytically and numerically in a pre-big-bang scenario where the role of the curvaton is played by a sufficiently massive Kalb-Ramond axion of superstring theory. When combined with observations of CMBR anisotropies at large and moderate angular scales, the present analysis allows us to constrain quite considerably the parameter space of the model: in particular, the initial displacement of the axion from the minimum of its potential and the rate of evolution of the compactification volume during pre-big-bang inflation. The combination of theoretical and experimental constraints favors a slightly blue spectrum of scalar perturbations, and/or a value of the string scale in the vicinity of the SUSY GUT scale

  13. Constraints on pre-big bang parameter space from CMBR anisotropies

    CERN Document Server

    Bozza, Valerio; Giovannini, Massimo; Veneziano, Gabriele

    2003-01-01

    The so-called curvaton mechanism --a way to convert isocurvature perturbations into adiabatic ones-- is investigated both analytically and numerically in a pre-big bang scenario where the role of the curvaton is played by a sufficiently massive Kalb--Ramond axion of superstring theory. When combined with observations of CMBR anisotropies at large and moderate angular scales, the present analysis allows us to constrain quite considerably the parameter space of the model: in particular, the initial displacement of the axion from the minimum of its potential and the rate of evolution of the compactification volume during pre-big bang inflation. The combination of theoretical and experimental constraints favours a slightly blue spectrum of scalar perturbations, and/or a value of the string scale in the vicinity of the SUSY-GUT scale.

  14. Big Data en surveillance, deel 1 : Definities en discussies omtrent Big Data

    NARCIS (Netherlands)

    Timan, Tjerk

    2016-01-01

    Following a (fairly short) lecture on surveillance and Big Data, I was asked to go deeper into the theme, the definitions and the various questions related to big data. In this first part I will try to set out some points concerning Big Data theory and

  15. Big Data and Dementia: Charting the Route Ahead for Research, Ethics, and Policy

    Directory of Open Access Journals (Sweden)

    Marcello Ienca

    2018-02-01

    Full Text Available Emerging trends in pervasive computing and medical informatics are creating the possibility for large-scale collection, sharing, aggregation and analysis of unprecedented volumes of data, a phenomenon commonly known as big data. In this contribution, we review the existing scientific literature on big data approaches to dementia, as well as commercially available mobile-based applications in this domain. Our analysis suggests that big data approaches to dementia research and care hold promise for improving current preventive and predictive models, casting light on the etiology of the disease, enabling earlier diagnosis, optimizing resource allocation, and delivering more tailored treatments to patients with specific disease trajectories. Such promissory outlook, however, has not materialized yet, and raises a number of technical, scientific, ethical, and regulatory challenges. This paper provides an assessment of these challenges and charts the route ahead for research, ethics, and policy.

  16. Big Data and Knowledge Management: A Possible Course to Combine Them Together

    Science.gov (United States)

    Hijazi, Sam

    2017-01-01

    Big data (BD) is the buzz phrase these days. Everyone is talking about its potential, its volume, its variety, and its velocity. Knowledge management (KM) has been around since the mid-1990s. The goals of KM have been to collect, store, categorize, mine, and process data into knowledge. The methods of knowledge acquisition varied from…

  17. Big Data in der Cloud

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2014-01-01

    Technology assessment of big data, in particular cloud-based big data services, for the Office for Technology Assessment at the German federal parliament (Bundestag).

  18. An analysis of cross-sectional differences in big and non-big public accounting firms' audit programs

    NARCIS (Netherlands)

    Blokdijk, J.H. (Hans); Drieenhuizen, F.; Stein, M.T.; Simunic, D.A.

    2006-01-01

    A significant body of prior research has shown that audits by the Big 5 (now Big 4) public accounting firms are quality differentiated relative to non-Big 5 audits. This result can be derived analytically by assuming that Big 5 and non-Big 5 firms face different loss functions for "audit failures"

  19. Big Data is invading big places as CERN

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Big Data technologies are becoming more popular with the constant growth of data generation in different fields such as social networks, the internet of things and laboratories like CERN. How is CERN making use of such technologies? How is machine learning applied at CERN with Big Data technologies? How much data do we move and how is it analyzed? All these questions will be answered during the talk.

  20. The big bang

    International Nuclear Information System (INIS)

    Chown, Marcus.

    1987-01-01

    The paper concerns the 'Big Bang' theory of the creation of the Universe 15 thousand million years ago, and traces events which physicists predict occurred soon after the creation. The unified theory of the moment of creation, evidence of an expanding Universe, the X-boson (the particle produced very soon after the big bang, which vanished from the Universe one-hundredth of a second later) and the fate of the Universe are all discussed. (U.K.)

  1. Big Data challenges and solutions in building the Global Earth Observation System of Systems (GEOSS)

    Science.gov (United States)

    Mazzetti, Paolo; Nativi, Stefano; Santoro, Mattia; Boldrini, Enrico

    2014-05-01

    The Group on Earth Observation (GEO) is a voluntary partnership of governments and international organizations launched in response to calls for action by the 2002 World Summit on Sustainable Development and by the G8 (Group of Eight) leading industrialized countries. These high-level meetings recognized that international collaboration is essential for exploiting the growing potential of Earth observations to support decision making in an increasingly complex and environmentally stressed world. To this aim, GEO is constructing the Global Earth Observation System of Systems (GEOSS) on the basis of a 10-Year Implementation Plan for the period 2005 to 2015, when it will become operational. As a large-scale integrated system handling large datasets such as those provided by Earth Observation, GEOSS needs to face several challenges related to big data handling and big data infrastructure management. Referring to the traditional multiple-V characteristics of Big Data (volume, variety, velocity, veracity and visualization), it is evident that most of them can be found in data handled by GEOSS. In particular, concerning Volume, Earth Observation already generates a large amount of data, estimated in the range of Petabytes (10^15 bytes), with Exabytes (10^18 bytes) already targeted. Moreover, the challenge relates not only to data size, but also to the large number of datasets (not necessarily of big size) that systems need to manage. Variety is the other main challenge, since datasets coming from different sensors and processed for different use cases are published with highly heterogeneous metadata and data models, through different service interfaces. Innovative multidisciplinary applications need to access and use those datasets in a harmonized way. Moreover, Earth Observation data are growing in size and variety at an exceptionally fast rate, and new technologies and applications, including crowdsourcing, will further increase data volume and variety in the near future

  2. Small Big Data Congress 2017

    NARCIS (Netherlands)

    Doorn, J.

    2017-01-01

    TNO, in collaboration with the Big Data Value Center, presents the fourth Small Big Data Congress! Our congress aims at providing an overview of practical and innovative applications based on big data. Do you want to know what is happening in applied research with big data? And what can already be

  3. Big data opportunities and challenges

    CERN Document Server

    2014-01-01

    This ebook aims to give practical guidance for all those who want to understand big data better and learn how to make the most of it. Topics range from big data analysis, mobile big data and managing unstructured data to technologies, governance and intellectual property and security issues surrounding big data.

  4. Big Data and Neuroimaging.

    Science.gov (United States)

    Webb-Vargas, Yenny; Chen, Shaojie; Fisher, Aaron; Mejia, Amanda; Xu, Yuting; Crainiceanu, Ciprian; Caffo, Brian; Lindquist, Martin A

    2017-12-01

    Big Data are of increasing importance in a variety of areas, especially in the biosciences. There is an emerging critical need for Big Data tools and methods, because of the potential impact of advancements in these areas. Importantly, statisticians and statistical thinking have a major role to play in creating meaningful progress in this arena. We would like to emphasize this point in this special issue, as it highlights both the dramatic need for statistical input for Big Data analysis and for a greater number of statisticians working on Big Data problems. We use the field of statistical neuroimaging to demonstrate these points. As such, this paper covers several applications and novel methodological developments of Big Data tools applied to neuroimaging data.

  5. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  6. Cryptography for Big Data Security

    Science.gov (United States)

    2015-07-13

    Book chapter on cryptography for Big Data security, prepared for Big Data: Storage, Sharing, and Security (3S). Distribution A: Public Release. Authors include Ariel Hamlin et al. (contact: arkady@ll.mit.edu).

  7. Data: Big and Small.

    Science.gov (United States)

    Jones-Schenk, Jan

    2017-02-01

    Big data is a big topic in all leadership circles. Leaders in professional development must develop an understanding of what data are available across the organization that can inform effective planning for forecasting. Collaborating with others to integrate data sets can increase the power of prediction. Big data alone is insufficient to make big decisions. Leaders must find ways to access small data and triangulate multiple types of data to ensure the best decision making. J Contin Educ Nurs. 2017;48(2):60-61. Copyright 2017, SLACK Incorporated.

  8. Big Data Revisited

    DEFF Research Database (Denmark)

    Kallinikos, Jannis; Constantiou, Ioanna

    2015-01-01

    We elaborate on key issues of our paper New games, new rules: big data and the changing context of strategy as a means of addressing some of the concerns raised by the paper's commentators. We initially deal with the issue of social data and the role it plays in the current data revolution and the technological recording of facts. We further discuss the significance of the very mechanisms by which big data is produced, as distinct from the attributes of big data often discussed in the literature. In the final section of the paper, we qualify the alleged importance of algorithms and claim that the structures of data capture and the architectures in which data generation is embedded are fundamental to the phenomenon of big data.

  9. Big Data in industry

    Science.gov (United States)

    Latinović, T. S.; Preradović, D. M.; Barz, C. R.; Latinović, M. T.; Petrica, P. P.; Pop-Vadean, A.

    2016-08-01

    The amount of data at the global level has grown exponentially. Along with this phenomenon, we need new units of measure, such as the exabyte, zettabyte, and yottabyte, to express the amount of data. The growth of data creates a situation where classic systems for the collection, storage, processing, and visualization of data are losing the battle with the large amount, speed, and variety of data that is generated continuously. Much of this data is created by the Internet of Things (IoT): cameras, satellites, cars, GPS navigation, and so on. The challenge is to come up with new technologies and tools for the management and exploitation of these large amounts of data. Big Data has been a hot topic in IT circles in recent years, and it is increasingly recognized in the business world and in public administration. This paper proposes an ontology of big data analytics and examines how to enhance business intelligence through big data analytics as a service, by presenting a big data analytics service-oriented architecture. This paper also discusses the interrelationship between business intelligence and big data analytics. The proposed approach might facilitate the research and development of business analytics, big data analytics, and business intelligence, as well as intelligent agents.

  10. Advanced Research and Data Methods in Women's Health: Big Data Analytics, Adaptive Studies, and the Road Ahead.

    Science.gov (United States)

    Macedonia, Christian R; Johnson, Clark T; Rajapakse, Indika

    2017-02-01

    Technical advances in science have had broad implications in reproductive and women's health care. Recent innovations in population-level data collection and storage have made available an unprecedented amount of data for analysis while computational technology has evolved to permit processing of data previously thought too dense to study. "Big data" is a term used to describe data that are a combination of dramatically greater volume, complexity, and scale. The number of variables in typical big data research can readily be in the thousands, challenging the limits of traditional research methodologies. Regardless of what it is called, advanced data methods, predictive analytics, or big data, this unprecedented revolution in scientific exploration has the potential to dramatically assist research in obstetrics and gynecology broadly across subject matter. Before implementation of big data research methodologies, however, potential researchers and reviewers should be aware of strengths, strategies, study design methods, and potential pitfalls. Examination of big data research examples contained in this article provides insight into the potential and the limitations of this data science revolution and practical pathways for its useful implementation.

  11. Big Data Analytics An Overview

    Directory of Open Access Journals (Sweden)

    Jayshree Dwivedi

    2015-08-01

    Big data is data beyond storage capacity and beyond processing power. The term is used for data sets so large or complex that traditional data-processing tools cannot handle them. Big data size is a constantly moving target, ranging from a few dozen terabytes to many petabytes; the amount of data produced by people, for example on social networking sites, is growing rapidly every year. Big data is not only data; it has become a complete subject that includes various tools, techniques and frameworks. It encompasses the explosive growth and evolution of data, both structured and unstructured. Big data is a set of techniques and technologies that require new forms of integration to uncover large hidden values from large datasets that are diverse, complex, and of a massive scale. Such data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead massively parallel software running on tens, hundreds or even thousands of servers. A big data environment is used to acquire, organize and analyze the various types of data. In this paper we describe the applications, problems and tools of big data and give an overview of big data.

  12. How to use Big Data technologies to optimize operations in Upstream Petroleum Industry

    Directory of Open Access Journals (Sweden)

    Abdelkader Baaziz

    2013-12-01

    "Big Data is the oil of the new economy" has been the most famous quotation of the last three years; it was even adopted by the World Economic Forum in 2011. In fact, Big Data is like crude oil: it is valuable, but unrefined it cannot be used. It must be broken down and analyzed for it to have value. But what about Big Data generated by the Petroleum Industry, and particularly its upstream segment? Upstream is no stranger to Big Data. Understanding and leveraging data in the upstream segment enables firms to remain competitive throughout planning, exploration, delineation, and field development. Oil & Gas companies conduct advanced geophysical modeling and simulation to support operations, where 2D, 3D and 4D seismic surveys generate significant data during exploration phases. They closely monitor the performance of their operational assets. To do this, they use tens of thousands of data-collecting sensors in subsurface wells and surface facilities to provide continuous and real-time monitoring of assets and environmental conditions. Unfortunately, this information comes in various and increasingly complex forms, making it a challenge to collect, interpret, and leverage the disparate data. As an example, Chevron's internal IT traffic alone exceeds 1.5 terabytes a day. Big Data technologies integrate common and disparate data sets to deliver the right information at the appropriate time to the correct decision-maker. These capabilities help firms act on large volumes of data, transforming decision-making from reactive to proactive and optimizing all phases of exploration, development and production. Furthermore, Big Data offers multiple opportunities to ensure safer, more responsible operations. Another invaluable effect would be shared learning. The aim of this paper is to explain how to use Big Data technologies to optimize operations, and how Big Data can help experts make decisions that lead to the desired outcomes. Keywords: Big Data; Analytics

  13. Towards large volume big divisor D3/D7 " μ-split supersymmetry" and Ricci-flat Swiss-cheese metrics, and dimension-six neutrino mass operators

    Science.gov (United States)

    Dhuria, Mansi; Misra, Aalok

    2012-02-01

    We show that it is possible to realize a "μ-split SUSY" scenario (Cheng and Cheng, 2005) [1] in the context of the large volume limit of type IIB compactifications on Swiss-cheese Calabi-Yau orientifolds in the presence of a mobile space-time filling D3-brane and a (stack of) D7-brane(s) wrapping the "big" divisor. For this, we investigate the possibility of getting one Higgs to be light while the other is heavy, in addition to a heavy higgsino mass parameter. Further, we examine the existence of a long-lived gluino, which manifests one of the major consequences of the μ-split SUSY scenario, by computing its decay width as well as the lifetime corresponding to the three-body decays of the gluino into either a quark, a squark and a neutralino or a quark, a squark and a goldstino, as well as the two-body decays of the gluino into either a neutralino and a gluon or a goldstino and a gluon. Guided by the geometric Kähler potential for Σ obtained in Misra and Shukla (2010) [2] based on GLSM techniques, and Donaldson's algorithm (Barun et al., 2008) [3] for obtaining numerically a Ricci-flat metric, we give details of our calculation in Misra and Shukla (2011) [4] pertaining to our proposed metric for the full Swiss-cheese Calabi-Yau (the geometric Kähler potential needing to be included in the full moduli space Kähler potential in the presence of the mobile space-time filling D3-brane) but, for simplicity of calculation, close to the big divisor, which is Ricci-flat in the large volume limit. Also, as an application of the one-loop RG flow solution for the higgsino mass parameter, we show that the contribution to the neutrino masses at the EW scale from dimension-six operators arising from the Kähler potential is suppressed relative to the Weinberg-type dimension-five operators.

  14. Urbanising Big

    DEFF Research Database (Denmark)

    Ljungwall, Christer

    2013-01-01

    Development in China raises the question of how big a city can become, and at the same time be sustainable, writes Christer Ljungwall of the Swedish Agency for Growth Policy Analysis.

  15. Big bang nucleosynthesis

    International Nuclear Information System (INIS)

    Boyd, Richard N.

    2001-01-01

    The precision of measurements in modern cosmology has made huge strides in recent years, with measurements of the cosmic microwave background and the determination of the Hubble constant now rivaling the level of precision of the predictions of big bang nucleosynthesis. However, these results are not necessarily consistent with the predictions of the Standard Model of big bang nucleosynthesis. Reconciling these discrepancies may require extensions of the basic tenets of the model, and possibly of the reaction rates that determine the big bang abundances

  16. Study of LBS for characterization and analysis of big data benchmarks

    International Nuclear Information System (INIS)

    Chandio, A.A.; Zhang, F.; Memon, T.D.

    2014-01-01

    In the past few years, most organizations have been gradually diverting their applications and services to the Cloud, because the Cloud paradigm enables (a) on-demand access and (b) large-scale data processing for their applications and users anywhere in the world. The rapid growth of urbanization in developed and developing countries has led to a new emerging concept called Urban Computing, one of the application domains that is rapidly being deployed to the Cloud. More precisely, in the concept of Urban Computing, sensors, vehicles, devices, buildings, and roads are used as components to probe city dynamics. Their data representation is widely available, including GPS traces of vehicles. However, their applications are data-processing and storage hungry, because their data volumes are growing rapidly, from a few dozen terabytes (TB) to thousands of petabytes (PB) (i.e. Big Data). To advance the development and assessment of applications such as LBS (Location Based Services), a Big Data benchmark is urgently needed. This research is a novel study of LBS to characterize and analyze Big Data benchmarks. We focus on map-matching, which is used as a pre-processing step in many LBS applications. In this preliminary work, this paper also describes the current status of Big Data benchmarks and our future direction. (author)
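
    To make the map-matching step concrete, here is a minimal sketch in Python of its simplest variant: snapping each GPS fix to the nearest road segment. The road geometry, the brute-force search and all names are illustrative assumptions, not the benchmark's actual implementation.

```python
from math import hypot

def project_onto_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate segment
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return (ax + t * dx, ay + t * dy)

def map_match(gps_trace, road_segments):
    """Snap each GPS point to its nearest road segment (brute force)."""
    matched = []
    for p in gps_trace:
        best = min(
            (project_onto_segment(p, a, b) for a, b in road_segments),
            key=lambda q: hypot(q[0] - p[0], q[1] - p[1]),
        )
        matched.append(best)
    return matched

# Two road segments and a noisy two-point GPS trace (made-up coordinates).
roads = [((0, 0), (10, 0)), ((10, 0), (10, 10))]
trace = [(3.2, 0.4), (9.7, 5.1)]
print(map_match(trace, roads))  # -> [(3.2, 0.0), (10.0, 5.1)]
```

    Production systems replace the brute-force scan with a spatial index and exploit road topology (for example, hidden-Markov-model matching) to keep consecutive fixes on a coherent route, which is precisely why the step is worth benchmarking at Big Data scale.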

  17. Study on LBS for Characterization and Analysis of Big Data Benchmarks

    Directory of Open Access Journals (Sweden)

    Aftab Ahmed Chandio

    2014-10-01

    In the past few years, most organizations have been gradually diverting their applications and services to the Cloud, because the Cloud paradigm enables (a) on-demand access and (b) large-scale data processing for their applications and users anywhere in the world. The rapid growth of urbanization in developed and developing countries has led to a new emerging concept called Urban Computing, one of the application domains that is rapidly being deployed to the Cloud. More precisely, in the concept of Urban Computing, sensors, vehicles, devices, buildings, and roads are used as components to probe city dynamics. Their data representation is widely available, including GPS traces of vehicles. However, their applications are data-processing and storage hungry, because their data volumes are growing rapidly, from a few dozen terabytes (TB) to thousands of petabytes (PB) (i.e. Big Data). To advance the development and assessment of applications such as LBS (Location Based Services), a Big Data benchmark is urgently needed. This research is a novel study of LBS to characterize and analyze Big Data benchmarks. We focus on map-matching, which is used as a pre-processing step in many LBS applications. In this preliminary work, this paper also describes the current status of Big Data benchmarks and our future direction.

  18. How Big Data Fast Tracked Human Mobility Research and the Lessons for Animal Movement Ecology

    KAUST Repository

    Thums, Michele; Fernández-Gracia, Juan; Sequeira, Ana M. M.; Eguíluz, Víctor M.; Duarte, Carlos M.; Meekan, Mark G.

    2018-01-01

    The rise of the internet, coupled with technological innovations such as smartphones, has generated massive volumes of geo-referenced data (big data) on human mobility. This has allowed the number of studies of human mobility to rapidly overtake those of animal movement. Today, telemetry studies of animals are also approaching big data status. Here, we review recent advances in studies of human mobility and identify the opportunities they present for advancing our understanding of animal movement. We describe key analytical techniques, potential bottlenecks and a roadmap for progress toward a synthesis of movement patterns of wild animals.

  19. How Big Data Fast Tracked Human Mobility Research and the Lessons for Animal Movement Ecology

    Directory of Open Access Journals (Sweden)

    Michele Thums

    2018-02-01

    The rise of the internet, coupled with technological innovations such as smartphones, has generated massive volumes of geo-referenced data (big data) on human mobility. This has allowed the number of studies of human mobility to rapidly overtake those of animal movement. Today, telemetry studies of animals are also approaching big data status. Here, we review recent advances in studies of human mobility and identify the opportunities they present for advancing our understanding of animal movement. We describe key analytical techniques, potential bottlenecks and a roadmap for progress toward a synthesis of movement patterns of wild animals.

  20. How Big Data Fast Tracked Human Mobility Research and the Lessons for Animal Movement Ecology

    KAUST Repository

    Thums, Michele

    2018-02-13

    The rise of the internet, coupled with technological innovations such as smartphones, has generated massive volumes of geo-referenced data (big data) on human mobility. This has allowed the number of studies of human mobility to rapidly overtake those of animal movement. Today, telemetry studies of animals are also approaching big data status. Here, we review recent advances in studies of human mobility and identify the opportunities they present for advancing our understanding of animal movement. We describe key analytical techniques, potential bottlenecks and a roadmap for progress toward a synthesis of movement patterns of wild animals.

  1. Using Multiple Big Datasets and Machine Learning to Produce a New Global Particulate Dataset: A Technology Challenge Case Study

    Science.gov (United States)

    Lary, D. J.

    2013-12-01

    A BigData case study is described in which multiple datasets from several satellites, high-resolution global meteorological data, social media and in-situ observations are combined using machine learning on a distributed cluster using an automated workflow. The global particulate dataset is relevant to global public health studies and would not be possible to produce without the use of the multiple big datasets, in-situ data and machine learning. To greatly reduce the development time and enhance the functionality, a high-level language capable of parallel processing has been used (Matlab). A key consideration for the system is high-speed access due to the large data volume, persistence of the large data volumes and a precise process-time scheduling capability.
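
    The data-fusion and learning steps described above can be sketched in a few lines. The following Python sketch (using pandas and scikit-learn rather than the study's Matlab pipeline) merges two hypothetical feature tables on shared keys and fits a regressor against synthetic in-situ particulate observations; every column name and coefficient is an illustrative assumption.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Two hypothetical feature tables sharing (site, day) keys.
rng = np.random.default_rng(0)
idx = pd.MultiIndex.from_product(
    [range(40), range(12)], names=["site", "day"]
).to_frame(index=False)
satellite = idx.assign(aod=rng.random(len(idx)))       # satellite retrievals
weather = idx.assign(wind=rng.random(len(idx)),
                     humidity=rng.random(len(idx)))    # meteorology

# Fuse the sources on their shared keys, as the automated workflow would.
features = satellite.merge(weather, on=["site", "day"])
# Synthetic in-situ particulate observations to learn against.
features["pm25"] = (30 * features["aod"] + 5 * features["humidity"]
                    + rng.normal(0, 1, len(features)))

X = features[["aod", "wind", "humidity"]]
y = features["pm25"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("held-out R^2:", round(model.score(X_test, y_test), 3))
```

    In the real system the same pattern runs over far larger inputs on a distributed cluster, with the workflow automation handling scheduling, persistence and high-speed data access.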

  2. The ethics of big data in big agriculture

    OpenAIRE

    Carbonell (Isabelle M.)

    2016-01-01

    This paper examines the ethics of big data in agriculture, focusing on the power asymmetry between farmers and large agribusinesses like Monsanto. Following the recent purchase of Climate Corp., Monsanto is currently the most prominent biotech agribusiness to buy into big data. With wireless sensors on tractors monitoring or dictating every decision a farmer makes, Monsanto can now aggregate large quantities of previously proprietary farming data, enabling a privileged position with unique in...

  3. A Big Video Manifesto

    DEFF Research Database (Denmark)

    Mcilvenny, Paul Bruce; Davidsen, Jacob

    2017-01-01

    For the last few years, we have witnessed a hype about the potential results and insights that quantitative big data can bring to the social sciences. The wonder of big data has moved into education, traffic planning, and disease control with a promise of making things better with big numbers and beautiful visualisations. However, we also need to ask what the tools of big data can do both for the Humanities and for more interpretative approaches and methods. Thus, we prefer to explore how the power of computation, new sensor technologies and massive storage can also help with video-based qualitative research.

  4. Identifying Dwarfs Workloads in Big Data Analytics

    OpenAIRE

    Gao, Wanling; Luo, Chunjie; Zhan, Jianfeng; Ye, Hainan; He, Xiwen; Wang, Lei; Zhu, Yuqing; Tian, Xinhui

    2015-01-01

    Big data benchmarking is particularly important and provides applicable yardsticks for evaluating booming big data systems. However, the wide coverage and great complexity of big data computing impose big challenges on big data benchmarking. How can we construct a benchmark suite using a minimum set of units of computation to represent the diversity of big data analytics workloads? Big data dwarfs are abstractions that capture frequently appearing operations in big data computing. One dwarf represen...

  5. Applications of Big Data in Education

    OpenAIRE

    Faisal Kalota

    2015-01-01

    Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners' needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in educa...

  6. Big Data Semantics

    NARCIS (Netherlands)

    Ceravolo, Paolo; Azzini, Antonia; Angelini, Marco; Catarci, Tiziana; Cudré-Mauroux, Philippe; Damiani, Ernesto; Mazak, Alexandra; van Keulen, Maurice; Jarrar, Mustafa; Santucci, Giuseppe; Sattler, Kai-Uwe; Scannapieco, Monica; Wimmer, Manuel; Wrembel, Robert; Zaraket, Fadi

    2018-01-01

    Big Data technology has discarded traditional data modeling approaches as no longer applicable to distributed data processing. It is, however, largely recognized that Big Data impose novel challenges in data and infrastructure management. Indeed, multiple components and procedures must be

  7. Comparative validity of brief to medium-length Big Five and Big Six personality questionnaires

    NARCIS (Netherlands)

    Thalmayer, A.G.; Saucier, G.; Eigenhuis, A.

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five

  8. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness

    Science.gov (United States)

    Wong, Ho Ting; Chiang, Vico Chung Lim; Choi, Kup Sze; Loke, Alice Yuen

    2016-01-01

    The rapid development of technology has made enormous volumes of data available and accessible anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term “Big Data”, which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding of what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing. PMID:27763525

  9. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness

    Directory of Open Access Journals (Sweden)

    Ho Ting Wong

    2016-10-01

    The rapid development of technology has made enormous volumes of data available and accessible anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term “Big Data”, which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding of what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing.

  10. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness.

    Science.gov (United States)

    Wong, Ho Ting; Chiang, Vico Chung Lim; Choi, Kup Sze; Loke, Alice Yuen

    2016-10-17

    The rapid development of technology has made enormous volumes of data available and accessible anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term "Big Data", which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding of what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing.

  11. Forget the hype or reality. Big data presents new opportunities in Earth Science.

    Science.gov (United States)

    Lee, T. J.

    2015-12-01

    Earth science is arguably one of the most mature science disciplines; it constantly acquires, curates, and utilizes large volumes of diverse data. We dealt with big data before there was big data. For example, while developing the EOS program in the 1980s, the EOS Data and Information System (EOSDIS) was developed to manage the vast amount of data acquired by the EOS fleet of satellites. EOSDIS has continued to be a shining example of modern science data systems over the past two decades. With the explosion of the internet, the usage of social media, and the provision of sensors everywhere, the big data era has brought new challenges. First, Google developed its search algorithm and a distributed data management system. The open source communities quickly followed up and developed the Hadoop file system to facilitate MapReduce workloads. The internet continues to generate tens of petabytes of data every day. There is a significant shortage of algorithms and knowledgeable manpower to mine the data. In response, the federal government developed big data programs that fund research and development projects and training programs to tackle these new challenges. Meanwhile, compared to the internet data explosion, the Earth science big data problem has become quite small. Nevertheless, the big data era presents an opportunity for Earth science to evolve. We have learned about MapReduce algorithms, in-memory data mining, machine learning, graph analysis, and semantic web technologies. How do we apply these new technologies to our discipline and bring the hype down to Earth? In this talk, I will discuss how we might apply some of the big data technologies to our discipline and solve many of our challenging problems. More importantly, I will propose a new Earth science data system architecture to enable new types of scientific inquiry.

  12. Middleware for big data processing: test results

    Science.gov (United States)

    Gankevich, I.; Gaiduchok, V.; Korkhov, V.; Degtyarev, A.; Bogdanov, A.

    2017-12-01

    Dealing with large volumes of data is resource-consuming work which is more and more often delegated not only to a single computer but to a whole distributed computing system at once. As the number of computers in a distributed system increases, the amount of effort put into effective management of the system grows. When the system reaches some critical size, much effort should be put into improving its fault tolerance. It is difficult to estimate when a particular distributed system needs such facilities for a given workload, so instead they should be implemented in a middleware which works efficiently with a distributed system of any size. It is also difficult to estimate whether a volume of data is large or not, so the middleware should also work with data of any volume. In other words, the purpose of the middleware is to provide facilities that adapt a distributed computing system to a given workload. In this paper we introduce such a middleware appliance. Tests show that this middleware is well-suited for typical HPC and big data workloads and that its performance is comparable with well-known alternatives.

  13. Modeling of the Nuclear Power Plant Life cycle for 'Big Data' Management System: A Systems Engineering Viewpoint

    Energy Technology Data Exchange (ETDEWEB)

    Ha, Bui Hoang; Khanh, Tran Quang Diep; Shakirah, Wan; Kahar, Wan Abdul; Jung, Jae Cheon [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)

    2012-10-15

    Together with the significant development of Internet and Web technologies, a rapid evolution of the 'Big Data' idea has been observed since it was first introduced in 1941 as an 'information explosion' (OED). Using the '3Vs' model proposed by Gartner, 'Big Data' can be defined as 'high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.' Big Data technologies and tools have been developed to address the way large quantities of data are stored, accessed and presented for manipulation or analysis. The idea also focuses on how users can easily access and extract the 'useful and right' data, information, or even knowledge from the 'Big Data'.

  14. Big data need big theory too.

    Science.gov (United States)

    Coveney, Peter V; Dougherty, Edward R; Highfield, Roger R

    2016-11-13

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut to reveal correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches with particular focus on biology and medicine, which fail to provide conceptual accounts for the processes to which they are applied. No matter their 'depth' and the sophistication of data-driven methods, such as artificial neural nets, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote 'blind' big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare.This article is part of the themed issue 'Multiscale modelling at the physics-chemistry-biology interface'. © 2015 The Authors.

  15. Simulating Smoke Filling in Big Halls by Computational Fluid Dynamics

    Directory of Open Access Journals (Sweden)

    W. K. Chow

    2011-01-01

    Many tall halls of big space volume have been built, and are to be built, in many construction projects in the Far East, particularly Mainland China, Hong Kong, and Taiwan. Smoke is identified as the key hazard to handle. Consequently, smoke exhaust systems are specified in the fire codes in those areas. An update on applying Computational Fluid Dynamics (CFD) in smoke exhaust design is presented in this paper. Key points to note in CFD simulations of smoke filling due to a fire in a big hall are discussed. Mathematical aspects concerning the discretization of the partial differential equations and the algorithms for solving the velocity-pressure-linked equations are briefly outlined. Results predicted by CFD with different free boundary conditions are compared with those from room fire tests. Standards on grid size, relaxation factors, convergence criteria, and false diffusion should be set up for numerical experiments with CFD.
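
    For readers unfamiliar with the terminology, the "velocity-pressure-linked equations" in such buoyant smoke-flow simulations are normally the incompressible Navier-Stokes equations with a buoyancy source term; a generic textbook statement under the Boussinesq approximation (not necessarily the paper's exact formulation) is:

```latex
% Incompressible buoyant flow under the Boussinesq approximation;
% a generic statement, not the paper's exact model.
\begin{align}
  \nabla \cdot \mathbf{u} &= 0, \\
  \frac{\partial \mathbf{u}}{\partial t}
    + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
    &= -\frac{1}{\rho_0}\nabla p
       + \nu\,\nabla^{2}\mathbf{u}
       + \mathbf{g}\,\beta\,(T - T_{0}).
\end{align}
```

    Because pressure has no evolution equation of its own, segregated solvers of the SIMPLE family iterate between the discretized momentum equations and a pressure-correction equation, which is why the relaxation factors and convergence criteria mentioned above matter so much in practice.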

  16. Do container volume, site preparation, and field fertilization affect restoration potential of Wyoming big sagebrush?

    Science.gov (United States)

    Kayla R. Herriman; Anthony S. Davis; Kent G. Apostol; Olga. A. Kildisheva; Amy L. Ross-Davis; Kas Dumroese

    2016-01-01

    Land management practices, invasive species expansion, and changes in the fire regime greatly impact the distribution of native plants in natural areas. Wyoming big sagebrush (Artemisia tridentata ssp. wyomingensis), a keystone species in the Great Basin, has seen a 50% reduction in its distribution. For many dryland species, reestablishment efforts have...

  17. The Ethics of Biomedical Big Data: Brent Daniel Mittelstadt and Luciano Floridi, eds. 2016, Springer International Publishing (Cham, Switzerland, 978-3-319-33523-0, 480 pp.).

    Science.gov (United States)

    Mason, Paul H

    2017-12-01

    The availability of diverse sources of data related to health and illness from various types of modern communication technology presents the possibility of augmenting medical knowledge, clinical care, and the patient experience. New forms of data collection and analysis will undoubtedly transform epidemiology, public health, and clinical practice, but what ethical considerations come into play? With a view to analysing the ethical and regulatory dimensions of burgeoning forms of biomedical big data, Brent Daniel Mittelstadt and Luciano Floridi have brought together thirty scholars in an edited volume that forms part of Springer's Law, Governance and Technology book series, in a collection titled The Ethics of Biomedical Big Data. With eighteen chapters partitioned into six carefully devised sections, this volume engages with core theoretical, ethical, and regulatory challenges posed by biomedical big data.

  18. From Big Data to Smart Data for Pharmacovigilance: The Role of Healthcare Databases and Other Emerging Sources.

    Science.gov (United States)

    Trifirò, Gianluca; Sultana, Janet; Bate, Andrew

    2018-02-01

    In the last decade 'big data' has become a buzzword used in several industrial sectors, including but not limited to telephony, finance and healthcare. Despite its popularity, it is not always clear what big data refers to exactly. Big data has become a very popular topic in healthcare, where the term primarily refers to the vast and growing volumes of computerized medical information available in the form of electronic health records, administrative or health claims data, disease and drug monitoring registries and so on. This kind of data is generally collected routinely during administrative processes and clinical practice by different healthcare professionals: from doctors recording their patients' medical history, drug prescriptions or medical claims to pharmacists registering dispensed prescriptions. For a long time, this data accumulated without its value being fully recognized and leveraged. Today big data has an important place in healthcare, including in pharmacovigilance. The expanding role of big data in pharmacovigilance includes signal detection, substantiation and validation of drug or vaccine safety signals, and increasingly new sources of information such as social media are also being considered. The aim of the present paper is to discuss the uses of big data for drug safety post-marketing assessment.

  19. Big Data and medicine: a big deal?

    Science.gov (United States)

    Mayer-Schönberger, V; Ingelsson, E

    2018-05-01

    Big Data promises huge benefits for medical research. Looking beyond superficial increases in the amount of data collected, we identify three key areas where Big Data differs from conventional analyses of data samples: (i) data are captured more comprehensively relative to the phenomenon under study; this reduces some bias but surfaces important trade-offs, such as between data quantity and data quality; (ii) data are often analysed using machine learning tools, such as neural networks rather than conventional statistical methods resulting in systems that over time capture insights implicit in data, but remain black boxes, rarely revealing causal connections; and (iii) the purpose of the analyses of data is no longer simply answering existing questions, but hinting at novel ones and generating promising new hypotheses. As a consequence, when performed right, Big Data analyses can accelerate research. Because Big Data approaches differ so fundamentally from small data ones, research structures, processes and mindsets need to adjust. The latent value of data is being reaped through repeated reuse of data, which runs counter to existing practices not only regarding data privacy, but data management more generally. Consequently, we suggest a number of adjustments such as boards reviewing responsible data use, and incentives to facilitate comprehensive data sharing. As data's role changes to a resource of insight, we also need to acknowledge the importance of collecting and making data available as a crucial part of our research endeavours, and reassess our formal processes from career advancement to treatment approval. © 2017 The Association for the Publication of the Journal of Internal Medicine.

  20. DB2 11 the database for big data & analytics

    CERN Document Server

    Molaro, Cristian; Purcell, Terry

    2013-01-01

    The landscape of today's business is shaped by the mountains of data being produced, with rapid growth in the volume, variety, and velocity of data due to the explosion of smart devices, mobile applications, cloud computing, and social media. Much of this growth has been in unstructured data; however, by 2020, internet business transactions-business-to-business and business-to-consumer-are predicted to reach 450 billion per day. Smart organizations are seeking innovative ways to turn this explosion of data, called big data, i

  1. Assessing Big Data

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2015-01-01

    In recent years, big data has been one of the most controversially discussed technologies in terms of its possible positive and negative impact. Therefore, the need for technology assessments is obvious. This paper first provides, based on the results of a technology assessment study, an overview of the potential and challenges associated with big data, and then describes the problems experienced during the study as well as methods found helpful to address them. The paper concludes with reflections on how the insights from the technology assessment study may have an impact on the future governance of big data.

  2. The big data potential of epidemiological studies for criminology and forensics.

    Science.gov (United States)

    DeLisi, Matt

    2018-07-01

    Big data, the analysis of original datasets with large samples ranging from ∼30,000 to one million participants to mine unexplored data, has been under-utilized in criminology. However, there have been recent calls for greater synthesis between epidemiology and criminology, and a small number of scholars have utilized epidemiological studies that were designed to measure alcohol and substance use to harvest behavioral and psychiatric measures that relate to the study of crime. These studies have been helpful in producing knowledge about the most serious, violent, and chronic offenders, but applications to more pathological forensic populations are lagging. Unfortunately, big data relating to crime and justice are restricted to criminal justice purposes and not easily available to the research community. Thus, the study of criminal and forensic populations is limited in terms of data volume, velocity, and variety. Additional forays into epidemiology, increased use of available online judicial and correctional data, and unknown new frontiers are needed to bring criminology up to speed in the big data arena. Copyright © 2016 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  3. Big data, big responsibilities

    Directory of Open Access Journals (Sweden)

    Primavera De Filippi

    2014-01-01

    Big data refers to the collection and aggregation of large quantities of data produced by and about people, things or the interactions between them. With the advent of cloud computing, specialised data centres with powerful computational hardware and software resources can be used for processing and analysing a humongous amount of aggregated data coming from a variety of different sources. The analysis of such data is all the more valuable to the extent that it allows for specific patterns to be found and new correlations to be made between different datasets, so as to eventually deduce or infer new information, as well as to potentially predict behaviours or assess the likelihood for a certain event to occur. This article will focus specifically on the legal and moral obligations of online operators collecting and processing large amounts of data, to investigate the potential implications of big data analysis on the privacy of individual users and on society as a whole.

  4. Comparative validity of brief to medium-length Big Five and Big Six Personality Questionnaires.

    Science.gov (United States)

    Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie

    2011-12-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are faced with a variety of options as to inventory length. Furthermore, a 6-factor model has been proposed to extend and update the Big Five model, in part by adding a dimension of Honesty/Humility or Honesty/Propriety. In this study, 3 popular brief to medium-length Big Five measures (NEO Five Factor Inventory, Big Five Inventory [BFI], and International Personality Item Pool), and 3 six-factor measures (HEXACO Personality Inventory, Questionnaire Big Six Scales, and a 6-factor version of the BFI) were placed in competition to best predict important student life outcomes. The effect of test length was investigated by comparing brief versions of most measures (subsets of items) with original versions. Personality questionnaires were administered to undergraduate students (N = 227). Participants' college transcripts and student conduct records were obtained 6-9 months after data was collected. Six-factor inventories demonstrated better predictive ability for life outcomes than did some Big Five inventories. Additional behavioral observations made on participants, including their Facebook profiles and cell-phone text usage, were predicted similarly by Big Five and 6-factor measures. A brief version of the BFI performed surprisingly well; across inventory platforms, increasing test length had little effect on predictive validity. Comparative validity of the models and measures in terms of outcome prediction and parsimony is discussed.

  5. Big Machines and Big Science: 80 Years of Accelerators at Stanford

    Energy Technology Data Exchange (ETDEWEB)

    Loew, Gregory

    2008-12-16

    Longtime SLAC physicist Greg Loew will present a trip through SLAC's origins, highlighting its scientific achievements, and provide a glimpse of the lab's future in 'Big Machines and Big Science: 80 Years of Accelerators at Stanford.'

  6. Infectious Disease Surveillance in the Big Data Era: Towards Faster and Locally Relevant Systems.

    Science.gov (United States)

    Simonsen, Lone; Gog, Julia R; Olson, Don; Viboud, Cécile

    2016-12-01

    While big data have proven immensely useful in fields such as marketing and earth sciences, public health is still relying on more traditional surveillance systems and awaiting the fruits of a big data revolution. A new generation of big data surveillance systems is needed to achieve rapid, flexible, and local tracking of infectious diseases, especially for emerging pathogens. In this opinion piece, we reflect on the long and distinguished history of disease surveillance and discuss recent developments related to use of big data. We start with a brief review of traditional systems relying on clinical and laboratory reports. We then examine how large-volume medical claims data can, with great spatiotemporal resolution, help elucidate local disease patterns. Finally, we review efforts to develop surveillance systems based on digital and social data streams, including the recent rise and fall of Google Flu Trends. We conclude by advocating for increased use of hybrid systems combining information from traditional surveillance and big data sources, which seems the most promising option moving forward. Throughout the article, we use influenza as an exemplar of an emerging and reemerging infection which has traditionally been considered a model system for surveillance and modeling. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  7. Dual of big bang and big crunch

    International Nuclear Information System (INIS)

    Bak, Dongsu

    2007-01-01

    Starting from the Janus solution and its gauge theory dual, we obtain the dual gauge theory description of the cosmological solution by the procedure of double analytic continuation. The coupling is driven either to zero or to infinity at the big-bang and big-crunch singularities, which are shown to be related by the S-duality symmetry. In the dual Yang-Mills theory description, these are nonsingular as the coupling goes to zero in the N=4 super Yang-Mills theory. The cosmological singularities simply signal the failure of the supergravity description of the full type IIB superstring theory

  8. Comparative Validity of Brief to Medium-Length Big Five and Big Six Personality Questionnaires

    Science.gov (United States)

    Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are…

  9. Big data for health.

    Science.gov (United States)

    Andreu-Perez, Javier; Poon, Carmen C Y; Merrifield, Robert D; Wong, Stephen T C; Yang, Guang-Zhong

    2015-07-01

    This paper provides an overview of recent developments in big data in the context of biomedical and health informatics. It outlines the key characteristics of big data and how medical and health informatics, translational bioinformatics, sensor informatics, and imaging informatics will benefit from an integrated approach of piecing together different aspects of personalized information from a diverse range of data sources, both structured and unstructured, covering genomics, proteomics, metabolomics, as well as imaging, clinical diagnosis, and long-term continuous physiological sensing of an individual. It is expected that recent advances in big data will expand our knowledge for testing new hypotheses about disease management from diagnosis to prevention to personalized treatment. The rise of big data, however, also raises challenges in terms of privacy, security, data ownership, data stewardship, and governance. This paper discusses some of the existing activities and future opportunities related to big data for health, outlining some of the key underlying issues that need to be tackled.

  10. Big Data in radiation therapy: challenges and opportunities.

    Science.gov (United States)

    Lustberg, Tim; van Soest, Johan; Jochems, Arthur; Deist, Timo; van Wijk, Yvonka; Walsh, Sean; Lambin, Philippe; Dekker, Andre

    2017-01-01

    Data collected and generated by radiation oncology can be classified by the Volume, Variety, Velocity and Veracity (4Vs) of Big Data because they are spread across different care providers and not easily shared owing to patient privacy protection. The magnitude of the 4Vs is substantial in oncology, especially owing to imaging modalities and unclear data definitions. To create useful models ideally all data of all care providers are understood and learned from; however, this presents challenges in the guise of poor data quality, patient privacy concerns, geographical spread, interoperability and large volume. In radiation oncology, there are many efforts to collect data for research and innovation purposes. Clinical trials are the gold standard when proving any hypothesis that directly affects the patient. Collecting data in registries with strict predefined rules is also a common approach to find answers. A third approach is to develop data stores that can be used by modern machine learning techniques to provide new insights or answer hypotheses. We believe all three approaches have their strengths and weaknesses, but they should all strive to create Findable, Accessible, Interoperable, Reusable (FAIR) data. To learn from these data, we need distributed learning techniques, sending machine learning algorithms to FAIR data stores around the world, learning from trial data, registries and routine clinical data rather than trying to centralize all data. To improve and personalize medicine, rapid learning platforms must be able to process FAIR "Big Data" to evaluate current clinical practice and to guide further innovation.
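
    As a concrete illustration of the distributed-learning idea (the algorithm travels to the data, and only model parameters are shared), here is a minimal Python sketch using simple federated averaging of a linear model over synthetic "hospital" cohorts. The averaging scheme, the data and all names are illustrative assumptions, not a description of any existing platform.

```python
import numpy as np

def local_gradient_step(w, X, y, lr=0.1):
    """One gradient-descent step of linear least squares on one site's private data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

rng = np.random.default_rng(1)
true_w = np.array([0.5, -1.2, 2.0])

# Three "hospitals", each holding its own private cohort.
sites = []
for _ in range(3):
    X = rng.normal(size=(200, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=200)
    sites.append((X, y))

w = np.zeros(3)
for _round in range(100):                      # communication rounds
    local_models = [local_gradient_step(w, X, y) for X, y in sites]
    w = np.mean(local_models, axis=0)          # federated averaging
print("consensus model:", np.round(w, 2))      # close to true_w
```

    Each site only ever exposes its locally updated parameters; the patient-level data never leaves the site, which is what makes this style of learning compatible with the privacy constraints discussed above.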

  11. Big Data as a Driver for Clinical Decision Support Systems: A Learning Health Systems Perspective

    Directory of Open Access Journals (Sweden)

    Arianna Dagliati

    2018-05-01

    Big data technologies nowadays provide health care with powerful instruments to gather and analyze large volumes of heterogeneous data collected for different purposes, including clinical care, administration, and research. This makes it possible to design IT infrastructures that favor the implementation of the so-called “Learning Healthcare System Cycle,” where healthcare practice and research are part of a unique and synergic process. In this paper we highlight how “Big Data enabled” integrated data collections may support clinical decision-making together with biomedical research. Two effective implementations are reported, concerning decision support in Diabetes and in Inherited Arrhythmogenic Diseases.

  12. Big data in pharmacy practice: current use, challenges, and the future

    OpenAIRE

    Ma, Carolyn; Smith, Helen Wong; Chu, Cherie; Juarez, Deborah T

    2015-01-01

    Carolyn Ma, Helen Wong Smith, Cherie Chu, Deborah T Juarez. Department of Pharmacy Practice, The Daniel K Inouye College of Pharmacy, University of Hawai'i at Hilo, Hilo, HI, USA. Abstract: Pharmacy informatics is defined as the use and integration of data, information, knowledge, technology, and automation in the medication-use process for the purpose of improving health outcomes. The term “big data” has been coined and is often defined in three V's: volume, v...

  13. Big data analytics for the virtual network topology reconfiguration use case

    OpenAIRE

    Gifre Renom, Lluís; Morales Alcaide, Fernando; Velasco Esteban, Luis Domingo; Ruiz Ramírez, Marc

    2016-01-01

    ABNO's OAM Handler is extended with big data analytics capabilities to anticipate traffic changes in volume and direction. Predicted traffic is used to trigger virtual network topology re-optimization. When the virtual topology needs to be reconfigured, predicted and current traffic matrices are used to find the optimal topology. A heuristic algorithm to adapt the current virtual topology to meet both actual demands and the expected traffic matrix is proposed. Experimental assessment is carried ou...
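
    A minimal sketch of the trigger logic described above, under illustrative assumptions (a moving-average forecaster and a fixed headroom threshold; the real system uses more sophisticated analytics and optimization):

```python
import numpy as np

def forecast(history, window=3):
    """Predict the next traffic matrix as a moving average of recent samples."""
    return np.mean(history[-window:], axis=0)

def needs_reconfiguration(predicted, capacity, headroom=0.9):
    """Trigger re-optimization when any predicted demand exceeds 90% of capacity."""
    return bool(np.any(predicted > headroom * capacity))

# Three recent 2x2 traffic matrices (node-to-node demand, Gb/s) and the
# per-demand capacity offered by the current virtual topology.
history = [np.array([[0.0, 4.0], [3.0, 0.0]]),
           np.array([[0.0, 5.0], [3.0, 0.0]]),
           np.array([[0.0, 7.0], [4.0, 0.0]])]
capacity = np.array([[1.0, 5.0], [5.0, 1.0]])

predicted = forecast(history)
if needs_reconfiguration(predicted, capacity):
    print("re-optimize the virtual topology for:\n", predicted)
```

    The reconfiguration step itself would then solve for a new topology using both the current and the predicted matrices, as the abstract describes.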

  14. Big Data: Implications for Health System Pharmacy.

    Science.gov (United States)

    Stokes, Laura B; Rogers, Joseph W; Hertig, John B; Weber, Robert J

    2016-07-01

    Big Data refers to datasets that are so large and complex that traditional methods and hardware for collecting, sharing, and analyzing them are inadequate. Big Data that is accurate leads to more confident decision making, improved operational efficiency, and reduced costs. The rapid growth of health care information results in Big Data around health services, treatments, and outcomes, and Big Data can be used to analyze the benefit of health system pharmacy services. The goal of this article is to provide a perspective on how Big Data can be applied to health system pharmacy. It will define Big Data, describe the impact of Big Data on population health, review specific implications of Big Data in health system pharmacy, and describe an approach for pharmacy leaders to effectively use Big Data. Strategies for managing Big Data in health system pharmacy include identifying potential opportunities for Big Data, prioritizing those opportunities, addressing privacy concerns, promoting data transparency, and communicating outcomes. As health care information expands in its content and becomes more integrated, Big Data can enhance the development of patient-centered pharmacy services.

  15. Generalized formal model of Big Data

    OpenAIRE

    Shakhovska, N.; Veres, O.; Hirnyak, M.

    2016-01-01

    This article dwells on the basic characteristic features of Big Data technologies. It analyzes existing definitions of the term “big data.” The article proposes and describes the elements of a generalized formal model of big data, analyzes the peculiarities of applying the components of the proposed model, and describes the fundamental differences between Big Data technology and business analytics. Big Data is supported by the distributed file system Google File System ...

  16. BigWig and BigBed: enabling browsing of large distributed datasets.

    Science.gov (United States)

    Kent, W J; Zweig, A S; Barber, G; Hinrichs, A S; Karolchik, D

    2010-09-01

    BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols, Linux and UNIX operating system files, R trees, and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets. Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu.
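
    The indexed layout described above is what enables partial remote reads: a client fetches only the blocks covering the queried region rather than the whole file. As a hedged illustration, the snippet below uses pyBigWig, a third-party Python binding for these formats that the record does not mention; the URL is a placeholder, and remote access requires a pyBigWig build with curl support.

    import pyBigWig

    # hypothetical remote bigWig file; only the queried blocks are downloaded
    bw = pyBigWig.open("http://example.org/experiment.bw")
    print(bw.header())                     # whole-file summary from the index
    print(bw.stats("chr1", 0, 1000000))    # mean signal over a 1 Mb region
    print(bw.values("chr1", 0, 10))        # per-base values for a small window
    bw.close()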

  17. Stratification of mixtures in evaporating liquid films occurs only for a range of volume fractions of the smaller component

    Science.gov (United States)

    Sear, Richard P.

    2018-04-01

    I model the drying of a liquid film containing small and big colloid particles. Fortini et al. [Phys. Rev. Lett. 116, 118301 (2016)] studied these films with both computer simulation and experiment. They found that at the end of drying, the mixture had stratified with a layer of the smaller particles on top of the big particles. I develop a simple model for this process. The model has two ingredients: arrest of the diffusion of the particles at high density and diffusiophoretic motion of the big particles due to gradients in the volume fraction of the small particles. The model predicts that stratification only occurs over a range of initial volume fractions of the smaller colloidal species. Above and below this range, the downward diffusiophoretic motion of the big particles is too slow to remove the big particles from the top of the film, and so there is no stratification. In agreement with earlier work, the model also predicts that large Péclet numbers for drying are needed to see stratification.
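
    The Péclet number mentioned above compares how fast the interface descends with how fast a particle diffuses: Pe = H·v_ev/D, with the Stokes-Einstein estimate D = kT/(6πηR). Because the big particles have a smaller D, their Péclet number is larger by the size ratio. The following back-of-the-envelope calculation uses illustrative values (film thickness, evaporation speed, particle radii) that are assumptions, not numbers from the paper.

    import math

    kT  = 4.1e-21    # thermal energy at room temperature, J
    eta = 1.0e-3     # viscosity of water, Pa*s
    H   = 100e-6     # initial film thickness, m (assumed)
    v   = 100e-9     # descent speed of the evaporating interface, m/s (assumed)

    for name, R in [("small", 10e-9), ("big", 50e-9)]:
        D = kT / (6 * math.pi * eta * R)   # Stokes-Einstein diffusion coefficient
        Pe = H * v / D
        print(f"{name} particles: D = {D:.3g} m^2/s, Pe = {Pe:.3g}")

    With these illustrative numbers the small particles come out with Pe below 1 and the big particles with Pe above 1, the kind of drying regime in which the model's stratification mechanism can operate.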

  18. Big data-driven business how to use big data to win customers, beat competitors, and boost profits

    CERN Document Server

    Glass, Russell

    2014-01-01

    Get the expert perspective and practical advice on big data The Big Data-Driven Business: How to Use Big Data to Win Customers, Beat Competitors, and Boost Profits makes the case that big data is for real, and more than just big hype. The book uses real-life examples-from Nate Silver to Copernicus, and Apple to Blackberry-to demonstrate how the winners of the future will use big data to seek the truth. Written by a marketing journalist and the CEO of a multi-million-dollar B2B marketing platform that reaches more than 90% of the U.S. business population, this book is a comprehens

  19. Automatic Scaling Hadoop in the Cloud for Efficient Process of Big Geospatial Data

    Directory of Open Access Journals (Sweden)

    Zhenlong Li

    2016-09-01

    Full Text Available Efficient processing of big geospatial data is crucial for tackling global and regional challenges such as climate change and natural disasters, but it is challenging not only due to the massive data volume but also due to the intrinsic complexity and high dimensions of the geospatial datasets. While traditional computing infrastructure does not scale well with the rapidly increasing data volume, Hadoop has attracted increasing attention in geoscience communities for handling big geospatial data. Recently, many studies were carried out to investigate adopting Hadoop for processing big geospatial data, but how to adjust the computing resources to efficiently handle the dynamic geoprocessing workload was barely explored. To bridge this gap, we propose a novel framework to automatically scale the Hadoop cluster in the cloud environment to allocate the right amount of computing resources based on the dynamic geoprocessing workload. The framework and auto-scaling algorithms are introduced, and a prototype system was developed to demonstrate the feasibility and efficiency of the proposed scaling mechanism using Digital Elevation Model (DEM) interpolation as an example. Experimental results show that this auto-scaling framework could (1) significantly reduce computing resource utilization (by 80% in our example) while delivering similar performance to a full-powered cluster; and (2) effectively handle the spike processing workload by automatically increasing the computing resources to ensure the processing is finished within an acceptable time. Such an auto-scaling approach provides a valuable reference to optimize the performance of geospatial applications to address data- and computational-intensity challenges in GIScience in a more cost-efficient manner.
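
    The paper's auto-scaling algorithms are not reproduced in this record, but the core decision they automate can be sketched as a threshold rule: compare the queued geoprocessing workload with current capacity and resize the cluster accordingly. Everything below (function name, tasks-per-node capacity, bounds) is an illustrative assumption, not the paper's method.

    def scaling_decision(pending_tasks, running_nodes,
                         tasks_per_node=10, min_nodes=2, max_nodes=32):
        """Return how many worker nodes to add (positive) or release (negative)."""
        needed = max(min_nodes, -(-pending_tasks // tasks_per_node))  # ceil division
        target = min(max_nodes, needed)
        return target - running_nodes

    # a spike of 250 queued DEM-interpolation tasks hits a 4-node cluster
    print(scaling_decision(250, 4))   # -> 21, i.e. request 21 extra nodes
    # when the queue drains, the same rule scales back toward min_nodes
    print(scaling_decision(0, 25))    # -> -23, i.e. release idle nodes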

  20. Big Game Reporting Stations

    Data.gov (United States)

    Vermont Center for Geographic Information — Point locations of big game reporting stations. Big game reporting stations are places where hunters can legally report harvested deer, bear, or turkey. These are...

  1. Big Data and Health Economics: Strengths, Weaknesses, Opportunities and Threats.

    Science.gov (United States)

    Collins, Brendan

    2016-02-01

    'Big data' is the collective name for the increasing capacity of information systems to collect and store large volumes of data, which are often unstructured and time stamped, and to analyse these data by using regression and other statistical techniques. This is a review of the potential applications of big data in health economics, using a SWOT (strengths, weaknesses, opportunities, threats) approach. In health economics, large pseudonymized databases, such as the planned care.data programme in the UK, have the potential to increase understanding of how drugs work in the real world, taking into account adherence, co-morbidities, interactions and side effects. This 'real-world evidence' has applications in individualized medicine. More routine and larger-scale cost and outcomes data collection will make health economic analyses more disease specific and population specific but may require new skill sets. There is potential for biomonitoring and lifestyle data to inform health economic analyses and public health policy.

  2. Evaluating the Open Source Data Containers for Handling Big Geospatial Raster Data

    Directory of Open Access Journals (Sweden)

    Fei Hu

    2018-04-01

    Full Text Available Big geospatial raster data pose a grand challenge to data management technologies for effective big data query and processing. To address these challenges, various big data container solutions have been developed or enhanced to facilitate data storage, retrieval, and analysis. Data containers were also developed or enhanced to handle geospatial data. For example, Rasdaman was developed to handle raster data, and GeoSpark/SpatialHadoop were enhanced from Spark/Hadoop to handle vector data. However, there are few studies that systematically compare and evaluate the features and performance of these popular data containers. This paper provides a comprehensive evaluation of six popular data containers (i.e., Rasdaman, SciDB, Spark, ClimateSpark, Hive, and MongoDB) for handling multi-dimensional, array-based geospatial raster datasets. Their architectures, technologies, capabilities, and performance are compared and evaluated from two perspectives: (a) system design and architecture (distributed architecture, logical data model, physical data model, and data operations); and (b) practical use experience and performance (data preprocessing, data uploading, query speed, and resource consumption). Four major conclusions are offered: (1) no data container, except ClimateSpark, has good support for the HDF data format used in this paper, requiring time- and resource-consuming data preprocessing to load data; (2) SciDB, Rasdaman, and MongoDB handle queries over small/moderate volumes of data well, whereas Spark and ClimateSpark can handle large volumes of data with stable resource consumption; (3) SciDB and Rasdaman provide mature array-based data operation and analytical functions, while the others lack these functions for users; and (4) SciDB, Spark, and Hive have better support for user-defined functions (UDFs) to extend the system capability.
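
    Conclusion (4) singles out user-defined functions as the extension mechanism in SciDB, Spark, and Hive. For Spark, a UDF can be as small as the sketch below, a generic illustration assuming a local PySpark installation; the column names and the raster-style rescaling are invented for the example.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.appName("udf-demo").getOrCreate()
    df = spark.createDataFrame([(1, 250.0), (2, 1023.0)], ["cell_id", "raw_value"])

    # register a custom per-cell transformation as a UDF
    scale = udf(lambda v: v / 1023.0, DoubleType())
    df.withColumn("scaled", scale(df["raw_value"])).show()
    spark.stop()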

  3. Big Data Analytics for Flow-based Anomaly Detection in High-Speed Networks

    OpenAIRE

    Garofalo, Mauro

    2017-01-01

    The Cisco VNI Complete Forecast Highlights clearly states that Internet traffic is growing in three different directions, Volume, Velocity, and Variety, bringing computer networks into the big data era. At the same time, sophisticated network attacks are growing exponentially. Such growth is making existing signature-based security tools, like firewalls and traditional intrusion detection systems, ineffective against new kinds of attacks or variations of known attacks. In this dissertati...

  4. Stalin's Big Fleet Program

    National Research Council Canada - National Science Library

    Hauner, Milan

    2002-01-01

    Although Dr. Milan Hauner's study 'Stalin's Big Fleet program' has focused primarily on the formation of Big Fleets during the Tsarist and Soviet periods of Russia's naval history, there are important lessons...

  5. Five Big, Big Five Issues : Rationale, Content, Structure, Status, and Crosscultural Assessment

    NARCIS (Netherlands)

    De Raad, Boele

    1998-01-01

    This article discusses the rationale, content, structure, status, and crosscultural assessment of the Big Five trait factors, focusing on topics of dispute and misunderstanding. Taxonomic restrictions of the original Big Five forerunner, the "Norman Five," are discussed, and criticisms regarding the

  6. Big Data and HPC collocation: Using HPC idle resources for Big Data Analytics

    OpenAIRE

    MERCIER , Michael; Glesser , David; Georgiou , Yiannis; Richard , Olivier

    2017-01-01

    Executing Big Data workloads on High Performance Computing (HPC) infrastructures has become an attractive way to improve their performance. However, the collocation of HPC and Big Data workloads is not an easy task, mainly because of differences in their core concepts. This paper focuses on the challenges related to scheduling both Big Data and HPC workloads on the same computing platform. In classic HPC workloads, the rigidity of jobs tends to create holes in ...

  7. Big Data as Governmentality

    DEFF Research Database (Denmark)

    Flyverbom, Mikkel; Madsen, Anders Koed; Rasche, Andreas

    This paper conceptualizes how large-scale data and algorithms condition and reshape knowledge production when addressing international development challenges. The concept of governmentality and four dimensions of an analytics of government are proposed as a theoretical framework to examine how big data is constituted as an aspiration to improve the data and knowledge underpinning development efforts. Based on this framework, we argue that big data's impact on how relevant problems are governed is enabled by (1) new techniques of visualizing development issues, (2) linking aspects … The paper shows that big data problematizes selected aspects of traditional ways to collect and analyze data for development (e.g. via household surveys). We also demonstrate that using big data analyses to address development challenges raises a number of questions that can diminish its impact.

  8. Boarding to Big data

    Directory of Open Access Journals (Sweden)

    Oana Claudia BRATOSIN

    2016-05-01

    Full Text Available Today Big data is an emerging topic, as the quantity of information grows exponentially, laying the foundation for its main challenge: the value of the information. The value of information is defined not only by how quickly and optimally value can be extracted from huge data sets, but also by the value extracted from uncertain and inaccurate data, in an innovative manner, using Big data analytics. At this point, the main challenge for businesses that use Big data tools is to clearly define the scope and the necessary output of the business so that real value can be gained. This article aims to explain the Big data concept, its various classification criteria and architecture, as well as its impact on processes worldwide.

  9. Big data - a 21st century science Maginot Line? No-boundary thinking: shifting from the big data paradigm.

    Science.gov (United States)

    Huang, Xiuzhen; Jennings, Steven F; Bruce, Barry; Buchan, Alison; Cai, Liming; Chen, Pengyin; Cramer, Carole L; Guan, Weihua; Hilgert, Uwe Kk; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Donald F; Nanduri, Bindu; Perkins, Andy; Rekepalli, Bhanu; Salem, Saeed; Specker, Jennifer; Walker, Karl; Wunsch, Donald; Xiong, Donghai; Zhang, Shuzhong; Zhang, Yu; Zhao, Zhongming; Moore, Jason H

    2015-01-01

    Whether your interests lie in scientific arenas, the corporate world, or in government, you have certainly heard the praises of big data: Big data will give you new insights, allow you to become more efficient, and/or will solve your problems. While big data has had some outstanding successes, many are now beginning to see that it is not the Silver Bullet that it has been touted to be. Here our main concern is the overall impact of big data; the current manifestation of big data is constructing a Maginot Line in science in the 21st century. Big data is no longer simply "lots of data" as a phenomenon; the big data paradigm is putting the spirit of the Maginot Line into lots of data. Big data overall is disconnecting researchers from science challenges. We propose No-Boundary Thinking (NBT), applying no-boundary thinking to problem definition to address science challenges.

  10. Big Egos in Big Science

    DEFF Research Database (Denmark)

    Andersen, Kristina Vaarst; Jeppesen, Jacob

    In this paper we investigate the micro-mechanisms governing the structural evolution and performance of scientific collaboration. Scientific discovery tends not to be led by so-called lone "stars", or big egos, but instead by collaboration among groups of researchers, from a multitude of institutions

  11. Big Data and Big Science

    OpenAIRE

    Di Meglio, Alberto

    2014-01-01

    Brief introduction to the challenges of big data in scientific research based on the work done by the HEP community at CERN and how the CERN openlab promotes collaboration among research institutes and industrial IT companies. Presented at the FutureGov 2014 conference in Singapore.

  12. Challenges of Big Data Analysis.

    Science.gov (United States)

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
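
    The spurious-correlation point is easy to reproduce numerically: when the number of features far exceeds the sample size, some purely random feature always looks strongly correlated with the response. A small numpy demonstration, with sample sizes chosen arbitrarily:

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 50, 5000                  # few samples, many features
    X = rng.normal(size=(n, p))
    y = rng.normal(size=n)           # response independent of every feature

    corrs = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(p)]
    print(max(corrs))                # typically around 0.5-0.6 despite no real signal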

  13. Knowledge Discovery for Smart Grid Operation, Control, and Situation Awareness -- A Big Data Visualization Platform

    Energy Technology Data Exchange (ETDEWEB)

    Gu, Yi; Jiang, Huaiguang; Zhang, Yingchen; Zhang, Jun Jason; Gao, Tianlu; Muljadi, Eduard

    2016-11-21

    In this paper, a big data visualization platform is designed to discover the hidden useful knowledge for smart grid (SG) operation, control and situation awareness. The spawn of smart sensors at both the grid side and the customer side can provide large volumes of heterogeneous data that capture information across all time spectrums. Extracting useful knowledge from this big-data pool is still challenging. In this paper, the Apache Spark, an open source cluster computing framework, is used to process the big data to effectively discover the hidden knowledge. A high-speed communication architecture utilizing the Open System Interconnection (OSI) model is designed to transmit the data to a visualization platform. This visualization platform uses Google Earth, a global geographic information system (GIS), to link the geological information with the SG knowledge and visualize the information in a user-defined fashion. The University of Denver's campus grid is used as an SG test bench and several demonstrations are presented for the proposed platform.
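
    As an illustration of the processing stage described above, the fragment below aggregates raw smart-meter readings with Spark before they are handed to a GIS visualization layer. The file name and schema are assumptions for the example, not details from the paper.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sg-visualization").getOrCreate()
    # assumed columns: meter_id, timestamp, kwh, lat, lon
    readings = spark.read.csv("meters.csv", header=True, inferSchema=True)

    hourly = (readings
              .withColumn("hour", F.date_trunc("hour", "timestamp"))
              .groupBy("meter_id", "hour")
              .agg(F.sum("kwh").alias("kwh"),
                   F.first("lat").alias("lat"),
                   F.first("lon").alias("lon")))
    hourly.show(5)   # these geo-tagged aggregates would feed the visualization layer
    spark.stop()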

  14. Big data is not a monolith

    CERN Document Server

    Ekbia, Hamid R; Mattioli, Michael

    2016-01-01

    Big data is ubiquitous but heterogeneous. Big data can be used to tally clicks and traffic on web pages, find patterns in stock trades, track consumer preferences, and identify linguistic correlations in large corpora of texts. This book examines big data not as an undifferentiated whole but contextually, investigating the varied challenges posed by big data for health, science, law, commerce, and politics. Taken together, the chapters reveal a complex set of problems, practices, and policies. The advent of big data methodologies has challenged the theory-driven approach to scientific knowledge in favor of a data-driven one. Social media platforms and self-tracking tools change the way we see ourselves and others. The collection of data by corporations and government threatens privacy while promoting transparency. Meanwhile, politicians, policy makers, and ethicists are ill-prepared to deal with big data's ramifications. The contributors look at big data's effect on individuals as it exerts social control throu...

  15. Big universe, big data

    DEFF Research Database (Denmark)

    Kremer, Jan; Stensbo-Smidt, Kristoffer; Gieseke, Fabian Cristian

    2017-01-01

    … modern astronomy requires big data know-how; in particular, it demands highly efficient machine learning and image analysis algorithms. But scalability is not the only challenge: astronomy applications touch several current machine learning research questions, such as learning from biased data and dealing with … We highlight some recent methodological advancements in machine learning and image analysis triggered by astronomical applications.

  16. Real-time analysis of healthcare using big data analytics

    Science.gov (United States)

    Basco, J. Antony; Senthilkumar, N. C.

    2017-11-01

    Big Data Analytics (BDA) provides a tremendous advantage where revolutionary performance is needed in handling large amounts of data that exhibit the four characteristics of Volume, Velocity, Variety and Veracity. BDA has the ability to handle such dynamic data, providing operational effectiveness and exceptionally beneficial output in several day-to-day applications for various organizations. Healthcare is one of the sectors that generates data constantly, covering all four characteristics with outstanding growth. There are several challenges in processing patient records, which involve a variety of structured and unstructured formats. Introducing BDA into Healthcare (HBDA) means dealing with sensitive patient-driven information, mostly in unstructured formats comprising prescriptions, reports, data from imaging systems, etc.; big data will overcome these challenges with enhanced efficiency in fetching and storing data. In this project, datasets akin to Electronic Medical Records (EMR) produced from numerous medical devices and mobile applications will be ingested into MongoDB using the Hadoop framework with an improvised processing technique to improve the outcome of processing patient records.
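
    The ingestion step this abstract outlines, storing heterogeneous EMR-like documents in MongoDB, can be pictured with a few lines of pymongo. The database name, document shape, and query below are illustrative assumptions, and running the snippet requires a local MongoDB instance.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    records = client["hbda"]["emr"]

    # schema-flexible document mixing structured and unstructured fields
    records.insert_one({
        "patient_id": "P-0001",                        # hypothetical identifier
        "prescriptions": ["metformin 500mg"],
        "report_text": "Free-text clinical note ...",  # unstructured field
        "imaging": {"modality": "MRI", "study_uid": "1.2.3"},
    })

    # query across the nested structure: all records mentioning an MRI study
    for doc in records.find({"imaging.modality": "MRI"}):
        print(doc["patient_id"])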

  17. Poker Player Behavior After Big Wins and Big Losses

    OpenAIRE

    Gary Smith; Michael Levere; Robert Kurtzman

    2009-01-01

    We find that experienced poker players typically change their style of play after winning or losing a big pot--most notably, playing less cautiously after a big loss, evidently hoping for lucky cards that will erase their loss. This finding is consistent with Kahneman and Tversky's (Kahneman, D., A. Tversky. 1979. Prospect theory: An analysis of decision under risk. Econometrica 47(2) 263-292) break-even hypothesis and suggests that when investors incur a large loss, it might be time to take ...

  18. Earth Science Data Analysis in the Era of Big Data

    Science.gov (United States)

    Kuo, K.-S.; Clune, T. L.; Ramachandran, R.

    2014-01-01

    Anyone with even a cursory interest in information technology cannot help but recognize that "Big Data" is one of the most fashionable catchphrases of late. From accurate voice and facial recognition, language translation, and airfare prediction and comparison, to monitoring the real-time spread of flu, Big Data techniques have been applied to many seemingly intractable problems with spectacular successes. They appear to be a rewarding way to approach many currently unsolved problems. Few fields of research can claim a longer history with problems involving voluminous data than Earth science. The problems we are facing today with our Earth's future are more complex and carry potentially graver consequences than the examples given above. How has our climate changed? Beside natural variations, what is causing these changes? What are the processes involved and through what mechanisms are these connected? How will they impact life as we know it? In attempts to answer these questions, we have resorted to observations and numerical simulations with ever-finer resolutions, which continue to feed the "data deluge." Plausibly, many Earth scientists are wondering: How will Big Data technologies benefit Earth science research? As an example from the global water cycle, one subdomain among many in Earth science, how would these technologies accelerate the analysis of decades of global precipitation to ascertain the changes in its characteristics, to validate these changes in predictive climate models, and to infer the implications of these changes to ecosystems, economies, and public health? Earth science researchers need a viable way to harness the power of Big Data technologies to analyze large volumes and varieties of data with velocity and veracity. Beyond providing speedy data analysis capabilities, Big Data technologies can also play a crucial, albeit indirect, role in boosting scientific productivity by facilitating effective collaboration within an analysis environment

  19. Implementing Operational Analytics using Big Data Technologies to Detect and Predict Sensor Anomalies

    Science.gov (United States)

    Coughlin, J.; Mital, R.; Nittur, S.; SanNicolas, B.; Wolf, C.; Jusufi, R.

    2016-09-01

    Operational analytics, when combined with Big Data technologies and predictive techniques, has been shown to be valuable in detecting mission critical sensor anomalies that might be missed by conventional analytical techniques. Our approach helps analysts and leaders make informed and rapid decisions by analyzing large volumes of complex data in near real-time and presenting it in a manner that facilitates decision making. It provides cost savings by being able to alert and predict when sensor degradations pass a critical threshold and impact mission operations. Operational analytics, which uses Big Data tools and technologies, can process very large data sets containing a variety of data types to uncover hidden patterns, unknown correlations, and other relevant information. When combined with predictive techniques, it provides a mechanism to monitor and visualize these data sets and provide insight into degradations encountered in large sensor systems such as the space surveillance network. In this study, data from a notional sensor is simulated and we use big data technologies, predictive algorithms and operational analytics to process the data and predict sensor degradations. This study uses data products that would commonly be analyzed at a site. This study builds on a big data architecture that has previously been proven valuable in detecting anomalies. This paper outlines our methodology of implementing an operational analytic solution through data discovery, learning and training of data modeling and predictive techniques, and deployment. Through this methodology, we implement a functional architecture focused on exploring available big data sets and determine practical analytic, visualization, and predictive technologies.
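
    A minimal stand-in for the "alert when degradation passes a critical threshold" idea is a rolling z-score over simulated sensor readings, as below; the window size, threshold, and injected fault are illustrative assumptions, not the study's models.

    import numpy as np

    def degradation_alerts(readings, window=50, threshold=3.0):
        """Yield indices where a reading deviates more than `threshold`
        standard deviations from the trailing window's mean."""
        for i in range(window, len(readings)):
            hist = readings[i - window:i]
            mu, sigma = hist.mean(), hist.std()
            if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
                yield i

    rng = np.random.default_rng(7)
    signal = rng.normal(0.0, 1.0, 500)
    signal[400:] += 6.0                           # injected sensor degradation
    print(list(degradation_alerts(signal))[:5])   # first alerts appear near index 400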

  20. Anticipated Changes in Conducting Scientific Data-Analysis Research in the Big-Data Era

    Science.gov (United States)

    Kuo, Kwo-Sen; Seablom, Michael; Clune, Thomas; Ramachandran, Rahul

    2014-05-01

    A Big-Data environment is one that is capable of orchestrating quick-turnaround analyses involving large volumes of data for numerous simultaneous users. Based on our experiences with a prototype Big-Data analysis environment, we anticipate some important changes in research behaviors and processes while conducting scientific data-analysis research in the near future as such Big-Data environments become the mainstream. The first anticipated change will be the reduced effort and difficulty in most parts of the data management process. A Big-Data analysis environment is likely to house most of the data required for a particular research discipline along with appropriate analysis capabilities. This will reduce the need for researchers to download local copies of data. In turn, this also reduces the need for compute and storage procurement by individual researchers or groups, as well as associated maintenance and management afterwards. It is almost certain that Big-Data environments will require a different "programming language" to fully exploit the latent potential. In addition, the process of extending the environment to provide new analysis capabilities will likely be more involved than, say, compiling a piece of new or revised code. We thus anticipate that researchers will require support from dedicated organizations associated with the environment that are composed of professional software engineers and data scientists. A major benefit will likely be that such extensions are of higher-quality and broader applicability than ad hoc changes by physical scientists. Another anticipated significant change is improved collaboration among the researchers using the same environment. Since the environment is homogeneous within itself, many barriers to collaboration are minimized or eliminated. For example, data and analysis algorithms can be seamlessly shared, reused and re-purposed. In conclusion, we will be able to achieve a new level of scientific productivity in the

  2. Big data in Finnish financial services

    OpenAIRE

    Laurila, M. (Mikko)

    2017-01-01

    Abstract: This thesis aims to explore the concept of big data and to create an understanding of big data maturity in the Finnish financial services industry. The research questions of this thesis are “What kind of big data solutions are being implemented in the Finnish financial services sector?” and “Which factors impede faster implementation of big data solutions in the Finnish financial services sector?”. ...

  3. Big data in fashion industry

    Science.gov (United States)

    Jain, S.; Bruniaux, J.; Zeng, X.; Bruniaux, P.

    2017-10-01

    Significant work has been done in the field of big data in the last decade. The concept of big data involves analysing voluminous data to extract valuable information. In the fashion world, big data is increasingly playing a part in trend forecasting and in analysing consumer behaviour, preferences and emotions. The purpose of this paper is to introduce the term fashion data and explain why it can be considered big data. It also gives a broad classification of the types of fashion data and briefly defines them, and it briefly describes the methodology and working of a system that will use these data.

  4. Big Cat Coalitions: A Comparative Analysis of Regional Brain Volumes in Felidae.

    Science.gov (United States)

    Sakai, Sharleen T; Arsznov, Bradley M; Hristova, Ani E; Yoon, Elise J; Lundrigan, Barbara L

    2016-01-01

    Broad-based species comparisons across mammalian orders suggest a number of factors that might influence the evolution of large brains. However, the relationship between these factors and total and regional brain size remains unclear. This study investigated the relationship between relative brain size and regional brain volumes and sociality in 13 felid species in hopes of revealing relationships that are not detected in more inclusive comparative studies. In addition, a more detailed analysis was conducted of four focal species: lions (Panthera leo), leopards (Panthera pardus), cougars (Puma concolor), and cheetahs (Acinonyx jubatus). These species differ markedly in sociality and behavioral flexibility, factors hypothesized to contribute to increased relative brain size and/or frontal cortex size. Lions are the only truly social species, living in prides. Although cheetahs are largely solitary, males often form small groups. Both leopards and cougars are solitary. Of the four species, leopards exhibit the most behavioral flexibility, readily adapting to changing circumstances. Regional brain volumes were analyzed using computed tomography. Skulls (n = 75) were scanned to create three-dimensional virtual endocasts, and regional brain volumes were measured using either sulcal or bony landmarks obtained from the endocasts or skulls. Phylogenetic least squares regression analyses found that sociality does not correspond with larger relative brain size in these species. However, the sociality/solitary variable significantly predicted anterior cerebrum (AC) volume, a region that includes frontal cortex. This latter finding is despite the fact that the two social species in our sample, lions and cheetahs, possess the largest and smallest relative AC volumes, respectively. Additionally, an ANOVA comparing regional brain volumes in four focal species revealed that lions and leopards, while not significantly different from one another, have relatively larger AC

  5. Big Cat Coalitions: A comparative analysis of regional brain volumes in Felidae

    Directory of Open Access Journals (Sweden)

    Sharleen T Sakai

    2016-10-01

    Full Text Available Broad-based species comparisons across mammalian orders suggest a number of factors that might influence the evolution of large brains. However, the relationship between these factors and total and regional brain size remains unclear. This study investigated the relationship between relative brain size and regional brain volumes and sociality in 13 felid species in hopes of revealing relationships that are not detected in more inclusive comparative studies. In addition, a more detailed analysis was conducted of 4 focal species: lions (Panthera leo), leopards (Panthera pardus), cougars (Puma concolor), and cheetahs (Acinonyx jubatus). These species differ markedly in sociality and behavioral flexibility, factors hypothesized to contribute to increased relative brain size and/or frontal cortex size. Lions are the only truly social species, living in prides. Although cheetahs are largely solitary, males often form small groups. Both leopards and cougars are solitary. Of the four species, leopards exhibit the most behavioral flexibility, readily adapting to changing circumstances. Regional brain volumes were analyzed using computed tomography (CT). Skulls (n=75) were scanned to create three-dimensional virtual endocasts, and regional brain volumes were measured using either sulcal or bony landmarks obtained from the endocasts or skulls. Phylogenetic least squares (PGLS) regression analyses found that sociality does not correspond with larger relative brain size in these species. However, the sociality/solitary variable significantly predicted anterior cerebrum (AC) volume, a region that includes frontal cortex. This latter finding is despite the fact that the two social species in our sample, lions and cheetahs, possess the largest and smallest relative AC volumes, respectively. Additionally, an ANOVA comparing regional brain volumes in 4 focal species revealed that lions and leopards, while not significantly different from one another, have relatively

  6. Big data bioinformatics.

    Science.gov (United States)

    Greene, Casey S; Tan, Jie; Ung, Matthew; Moore, Jason H; Cheng, Chao

    2014-12-01

    Recent technological advances allow for high throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is leading us to the "big data" era. The availability of big data provides unprecedented opportunities but also raises new challenges for data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including both "machine learning" algorithms as well as "unsupervised" and "supervised" examples of each. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming based solutions, we review webservers that allow users with limited or no programming background to perform these analyses on large data compendia. © 2014 Wiley Periodicals, Inc.

  7. Changing the personality of a face: Perceived Big Two and Big Five personality factors modeled in real photographs.

    Science.gov (United States)

    Walker, Mirella; Vetter, Thomas

    2016-04-01

    General, spontaneous evaluations of strangers based on their faces have been shown to reflect judgments of these persons' intention and ability to harm. These evaluations can be mapped onto a 2D space defined by the dimensions trustworthiness (intention) and dominance (ability). Here we go beyond general evaluations and focus on more specific personality judgments derived from the Big Two and Big Five personality concepts. In particular, we investigate whether Big Two/Big Five personality judgments can be mapped onto the 2D space defined by the dimensions trustworthiness and dominance. Results indicate that judgments of the Big Two personality dimensions almost perfectly map onto the 2D space. In contrast, at least 3 of the Big Five dimensions (i.e., neuroticism, extraversion, and conscientiousness) go beyond the 2D space, indicating that additional dimensions are necessary to describe more specific face-based personality judgments accurately. Building on this evidence, we model the Big Two/Big Five personality dimensions in real facial photographs. Results from 2 validation studies show that the Big Two/Big Five are perceived reliably across different samples of faces and participants. Moreover, results reveal that participants differentiate reliably between the different Big Two/Big Five dimensions. Importantly, this high level of agreement and differentiation in personality judgments from faces likely creates a subjective reality which may have serious consequences for those being perceived-notably, these consequences ensue because the subjective reality is socially shared, irrespective of the judgments' validity. The methodological approach introduced here might prove useful in various psychological disciplines. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  8. The BigBOSS Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schlegel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna

    2011-01-01

    BigBOSS will obtain observational constraints that will bear on three of the four 'science frontier' questions identified by the Astro2010 Cosmology and Fundamental Physics Panel of the Decadal Survey: Why is the universe accelerating? What is dark matter? What are the properties of neutrinos? Indeed, the BigBOSS project was recommended for substantial immediate R and D support in the PASAG report. The second highest ground-based priority from the Astro2010 Decadal Survey was the creation of a funding line within the NSF to support a 'Mid-Scale Innovations' program, and it used BigBOSS as a 'compelling' example for support. This choice was the result of the Decadal Survey's Program Prioritization panels reviewing 29 mid-scale projects and recommending BigBOSS 'very highly'.

  9. Big game hunting practices, meanings, motivations and constraints: a survey of Oregon big game hunters

    Science.gov (United States)

    Suresh K. Shrestha; Robert C. Burns

    2012-01-01

    We conducted a self-administered mail survey in September 2009 with randomly selected Oregon hunters who had purchased big game hunting licenses/tags for the 2008 hunting season. Survey questions explored hunting practices, the meanings of and motivations for big game hunting, the constraints to big game hunting participation, and the effects of age, years of hunting...

  10. The challenge of big data in public health: an opportunity for visual analytics.

    Science.gov (United States)

    Ola, Oluwakemi; Sedig, Kamran

    2014-01-01

    Public health (PH) data can generally be characterized as big data. The efficient and effective use of this data determines the extent to which PH stakeholders can sufficiently address societal health concerns as they engage in a variety of work activities. As stakeholders interact with data, they engage in various cognitive activities such as analytical reasoning, decision-making, interpreting, and problem solving. Performing these activities with big data is a challenge for the unaided mind as stakeholders encounter obstacles relating to the data's volume, variety, velocity, and veracity. Such being the case, computer-based information tools are needed to support PH stakeholders. Unfortunately, while existing computational tools are beneficial in addressing certain work activities, they fall short in supporting cognitive activities that involve working with large, heterogeneous, and complex bodies of data. This paper presents visual analytics (VA) tools, a nascent category of computational tools that integrate data analytics with interactive visualizations, to facilitate the performance of cognitive activities involving big data. Historically, PH has lagged behind other sectors in embracing new computational technology. In this paper, we discuss the role that VA tools can play in addressing the challenges presented by big data. In doing so, we demonstrate the potential benefit of incorporating VA tools into PH practice, in addition to highlighting the need for further systematic and focused research.

  11. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit
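
    For readers unfamiliar with the tool the book covers: a BigQuery query from Python takes only a few lines with the official google-cloud-bigquery client. The sketch below runs against one of Google's public sample tables and assumes a configured Google Cloud project with billing enabled.

    from google.cloud import bigquery

    client = bigquery.Client()
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name
        ORDER BY total DESC
        LIMIT 5
    """
    # submit the query job and iterate over the result rows
    for row in client.query(query).result():
        print(row.name, row.total)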

  12. Big data for dummies

    CERN Document Server

    Hurwitz, Judith; Halper, Fern; Kaufman, Marcia

    2013-01-01

    Find the right big data solution for your business or organization Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it m

  13. BIGCHEM: Challenges and Opportunities for Big Data Analysis in Chemistry.

    Science.gov (United States)

    Tetko, Igor V; Engkvist, Ola; Koch, Uwe; Reymond, Jean-Louis; Chen, Hongming

    2016-12-01

    The increasing volume of biomedical data in chemistry and life sciences requires the development of new methods and approaches for their handling. Here, we briefly discuss some challenges and opportunities of this fast growing area of research with a focus on those to be addressed within the BIGCHEM project. The article starts with a brief description of some available resources for "Big Data" in chemistry and a discussion of the importance of data quality. We then discuss challenges with visualization of millions of compounds by combining chemical and biological data, the expectations from mining the "Big Data" using advanced machine-learning methods, and their applications in polypharmacology prediction and target de-convolution in phenotypic screening. We show that the efficient exploration of billions of molecules requires the development of smart strategies. We also address the issue of secure information sharing without disclosing chemical structures, which is critical to enable bi-party or multi-party data sharing. Data sharing is important in the context of the recent trend of "open innovation" in pharmaceutical industry, which has led to not only more information sharing among academics and pharma industries but also the so-called "precompetitive" collaboration between pharma companies. At the end we highlight the importance of education in "Big Data" for further progress of this area. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
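
    One standard building block for the structure-free sharing the abstract alludes to is exchanging hashed fingerprints instead of molecules: collaborators can run similarity searches on the bit vectors without receiving the structures themselves. The sketch uses RDKit, a third-party toolkit not named in the record, and an arbitrary example molecule.

    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin, as an example
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

    # the hashed bit vector can be shared for similarity screening; the exact
    # structure cannot be read back out of it directly
    print(fp.GetNumOnBits(), "bits set out of", fp.GetNumBits())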

  14. Was there a big bang

    International Nuclear Information System (INIS)

    Narlikar, J.

    1981-01-01

    In discussing the viability of the big-bang model of the Universe, relevant evidence is examined, including the discrepancies in the age of the big-bang Universe, the red shifts of quasars, the microwave background radiation, general theory of relativity aspects such as the change of the gravitational constant with time, and quantum theory considerations. It is felt that the arguments considered show that the big-bang picture is not as soundly established, either theoretically or observationally, as it is usually claimed to be; the cosmological problem is still wide open and alternatives to the standard big-bang picture should be seriously investigated. (U.K.)

  15. BIG DATA-DRIVEN MARKETING: AN ABSTRACT

    OpenAIRE

    Suoniemi, Samppa; Meyer-Waarden, Lars; Munzel, Andreas

    2017-01-01

    Customer information plays a key role in managing successful relationships with valuable customers. Big data customer analytics use (BD use), i.e., the extent to which customer information derived from big data analytics guides marketing decisions, helps firms better meet customer needs for competitive advantage. This study addresses three research questions: What are the key antecedents of big data customer analytics use? How, and to what extent, does big data customer an...

  16. The trashing of Big Green

    International Nuclear Information System (INIS)

    Felten, E.

    1990-01-01

    The Big Green initiative on California's ballot lost by a margin of 2-to-1. Green measures lost in five other states, shocking ecology-minded groups. According to the postmortem by environmentalists, Big Green was a victim of poor timing and big spending by the opposition. Now its supporters plan to break up the bill and try to pass some provisions in the Legislature

  17. The Big Bang Singularity

    Science.gov (United States)

    Ling, Eric

    The big bang theory is a model of the universe which makes the striking prediction that the universe began a finite amount of time in the past at the so called "Big Bang singularity." We explore the physical and mathematical justification of this surprising result. After laying down the framework of the universe as a spacetime manifold, we combine physical observations with global symmetrical assumptions to deduce the FRW cosmological models which predict a big bang singularity. Next we prove a couple theorems due to Stephen Hawking which show that the big bang singularity exists even if one removes the global symmetrical assumptions. Lastly, we investigate the conditions one needs to impose on a spacetime if one wishes to avoid a singularity. The ideas and concepts used here to study spacetimes are similar to those used to study Riemannian manifolds, therefore we compare and contrast the two geometries throughout.

  18. Reframing Open Big Data

    DEFF Research Database (Denmark)

    Marton, Attila; Avital, Michel; Jensen, Tina Blegind

    2013-01-01

    Recent developments in the techniques and technologies of collecting, sharing and analysing data are challenging the field of information systems (IS) research, let alone the boundaries of organizations and the established practices of decision-making. Coined 'open data' and 'big data', these developments introduce an unprecedented level of societal and organizational engagement with the potential of computational data to generate new insights and information. Based on the commonalities shared by open data and big data, we develop a research framework that we refer to as open big data (OBD) by employing the dimensions of 'order' and 'relationality'. We argue that these dimensions offer a viable approach for IS research on open and big data because they address one of the core value propositions of IS; i.e. how to support organizing with computational data. We contrast these dimensions with two...

  19. Práticas linguísticas em Big Data

    OpenAIRE

    Vinícius Vargas Vieira dos Santos

    2017-01-01

    ABSTRACT: This article aims to address possible relations between new digital media and certain conceptual aspects of language, such as meaning and performativity. Big data is the term that refers to the accumulation of digital data that has characterized mass communication media over the last two decades, and it is directly related to the current configuration of the Web 2.0 technology services platform. The scales of immense volume and variety of digital data and high rates of veloci...

  20. Big data in pharmacy practice: current use, challenges, and the future

    Directory of Open Access Journals (Sweden)

    Ma C

    2015-08-01

    Full Text Available Carolyn Ma, Helen Wong Smith, Cherie Chu, Deborah T Juarez. Department of Pharmacy Practice, The Daniel K Inouye College of Pharmacy, University of Hawai'i at Hilo, Hilo, HI, USA. Abstract: Pharmacy informatics is defined as the use and integration of data, information, knowledge, technology, and automation in the medication-use process for the purpose of improving health outcomes. The term “big data” has been coined and is often defined in three V's: volume, velocity, and variety. This paper describes three major areas in which pharmacy utilizes big data, including: (1) informed decision making (clinical pathways and clinical practice guidelines); (2) improved care delivery in health care settings such as hospitals and community pharmacy practice settings; and (3) quality performance measurement for the Centers for Medicare and Medicaid and medication management activities such as tracking medication adherence and medication reconciliation. Keywords: clinical pharmacy database, pharmacy informatics, patient outcomes

  1. What is beyond the big five?

    Science.gov (United States)

    Saucier, G; Goldberg, L R

    1998-08-01

    Previous investigators have proposed that various kinds of person-descriptive content--such as differences in attitudes or values, in sheer evaluation, in attractiveness, or in height and girth--are not adequately captured by the Big Five Model. We report on a rather exhaustive search for reliable sources of Big Five-independent variation in data from person-descriptive adjectives. Fifty-three candidate clusters were developed in a college sample using diverse approaches and sources. In a nonstudent adult sample, clusters were evaluated with respect to a minimax criterion: minimum multiple correlation with factors from Big Five markers and maximum reliability. The most clearly Big Five-independent clusters referred to Height, Girth, Religiousness, Employment Status, Youthfulness and Negative Valence (or low-base-rate attributes). Clusters referring to Fashionableness, Sensuality/Seductiveness, Beauty, Masculinity, Frugality, Humor, Wealth, Prejudice, Folksiness, Cunning, and Luck appeared to be potentially beyond the Big Five, although each of these clusters demonstrated Big Five multiple correlations of .30 to .45, and at least one correlation of .20 and over with a Big Five factor. Of all these content areas, Religiousness, Negative Valence, and the various aspects of Attractiveness were found to be represented by a substantial number of distinct, common adjectives. Results suggest directions for supplementing the Big Five when one wishes to extend variable selection outside the domain of personality traits as conventionally defined.

  2. Big Data Analytics and Its Applications

    Directory of Open Access Journals (Sweden)

    Mashooque A. Memon

    2017-10-01

    Full Text Available The term Big Data was coined to refer to the extensive volumes of data that cannot be managed by traditional data-handling methods or techniques. The field of Big Data plays an indispensable role in many areas, such as agriculture, banking, data mining, education, chemistry, finance, cloud computing, marketing, and health care. Big data analytics is the method of examining big data to reveal hidden patterns, previously unknown relationships, and other important information that can be used to make better decisions. There has been a perpetually expanding demand for big data because of its rapid growth and because it covers many areas of application. The Apache Hadoop open-source technology, written in Java and running on the Linux operating system, was used. The primary contribution of this research is to present an effective and free solution for big data applications in a distributed environment, with its advantages, and to show its ease of use. Looking ahead, there appears to be a need for an analytical review of new developments in big data technology. Healthcare is one of the foremost concerns of the world. Big data in healthcare refers to electronic health data sets that relate to patient healthcare and well-being. Data in the healthcare sector is growing beyond the managing capacity of healthcare organizations and is expected to increase significantly in the coming years.

  3. Measuring the Promise of Big Data Syllabi

    Science.gov (United States)

    Friedman, Alon

    2018-01-01

    Growing interest in Big Data is leading industries, academics and governments to accelerate Big Data research. However, how teachers should teach Big Data has not been fully examined. This article suggests criteria for redesigning Big Data syllabi in public and private degree-awarding higher education establishments. The author conducted a survey…

  4. 77 FR 27245 - Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN

    Science.gov (United States)

    2012-05-09

    ... DEPARTMENT OF THE INTERIOR Fish and Wildlife Service [FWS-R3-R-2012-N069; FXRS1265030000S3-123-FF03R06000] Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN AGENCY: Fish and... plan (CCP) and environmental assessment (EA) for Big Stone National Wildlife Refuge (Refuge, NWR) for...

  5. The BigBoss Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schelgel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; Bebek, C.; Becerril, S.; Blanton, M.; Bolton, A.; Bromley, B.; Cahn, R.; Carton, P.-H.; Cervanted-Cota, J.L.; Chu, Y.; Cortes, M.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna / /IAC, Mexico / / /Madrid, IFT /Marseille, Lab. Astrophys. / / /New York U. /Valencia U.

    2012-06-07

    BigBOSS is a Stage IV ground-based dark energy experiment to study baryon acoustic oscillations (BAO) and the growth of structure with a wide-area galaxy and quasar redshift survey over 14,000 square degrees. It has been conditionally accepted by NOAO in response to a call for major new instrumentation and a high-impact science program for the 4-m Mayall telescope at Kitt Peak. The BigBOSS instrument is a robotically-actuated, fiber-fed spectrograph capable of taking 5000 simultaneous spectra over a wavelength range from 340 nm to 1060 nm, with a resolution R = λ/Δλ = 3000-4800. Using data from imaging surveys that are already underway, spectroscopic targets are selected that trace the underlying dark matter distribution. In particular, targets include luminous red galaxies (LRGs) up to z = 1.0, extending the BOSS LRG survey in both redshift and survey area. To probe the universe out to even higher redshift, BigBOSS will target bright [OII] emission line galaxies (ELGs) up to z = 1.7. In total, 20 million galaxy redshifts are obtained to measure the BAO feature, trace the matter power spectrum at smaller scales, and detect redshift space distortions. BigBOSS will provide additional constraints on early dark energy and on the curvature of the universe by measuring the Ly-alpha forest in the spectra of over 600,000 2.2 < z < 3.5 quasars. BigBOSS galaxy BAO measurements combined with an analysis of the broadband power, including the Ly-alpha forest in BigBOSS quasar spectra, achieves a FOM of 395 with Planck plus Stage III priors. This FOM is based on conservative assumptions for the analysis of broad band power (k_max = 0.15), and could grow to over 600 if current work allows us to push the analysis to higher wave numbers (k_max = 0.3). BigBOSS will also place constraints on theories of modified gravity and inflation, and will measure the sum of neutrino masses to 0.024 eV accuracy.

  6. Achieving privacy-preserving big data aggregation with fault tolerance in smart grid

    OpenAIRE

    Zhitao Guan; Guanlin Si

    2017-01-01

    In a smart grid, a huge amount of data is collected for various applications, such as load monitoring and demand response. These data are used for analyzing the power state and formulating the optimal dispatching strategy. However, the volume, velocity and variety of these big energy data raise concerns over consumers’ privacy. For instance, in order to optimize energy utilization and support demand response, numerous smart meters are installed at a consumer's home to collect energy consu...

  7. Big data and educational research

    OpenAIRE

    Beneito-Montagut, Roser

    2017-01-01

    Big data and data analytics offer the promise to enhance teaching and learning, improve educational research and advance education governance. This chapter aims to contribute to the conceptual and methodological understanding of big data and analytics within educational research. It describes the opportunities and challenges that big data and analytics bring to education, and critically explores the perils of applying a data-driven approach to education. Despite the claimed value of the...

  8. Thick-Big Descriptions

    DEFF Research Database (Denmark)

    Lai, Signe Sophus

    The paper discusses the rewards and challenges of employing commercial audience measurement data – gathered by media industries for profit-making purposes – in ethnographic research on the Internet in everyday life. It questions claims to the objectivity of big data (Anderson 2008), the assumption … communication systems, language and behavior appear as texts, outputs, and discourses (data to be ‘found’) – big data then documents things that in earlier research required interviews and observations (data to be ‘made’) (Jensen 2014). However, web-measurement enterprises build audiences according to a commercial logic (boyd & Crawford 2011) and are as such directed by motives that call for specific types of sellable user data and specific segmentation strategies. In combining big data and ‘thick descriptions’ (Geertz 1973) scholars need to question how ethnographic fieldwork might map the ‘data not seen…

  9. A conceptual review of XBRL in relation to Big Data

    DEFF Research Database (Denmark)

    Krisko, Adam

    The technological developments of the last couple of decades have led to remarkable changes in the role of data within society. Data continues to grow, and the advances in the field of information technology (IT) further accelerate the process. This unfolding phenomenon has resulted in … the increasingly important role of Big Data within the accounting domain, by discussing the potential benefits and challenges that the expansion in the volume, velocity, and variety of accounting data carries. One of the possible responses to the potential changes Big Data might foster within accounting … is the utilisation of eXtensible Business Reporting Language (XBRL). XBRL is an open-standard, license-fee-free electronic language for communicating financial and business information. Storing data in XBRL format enables it to be machine-readable and standardises financial terms through the XBRL taxonomy…

  10. Big Data, Data Analyst, and Improving the Competence of Librarian

    Directory of Open Access Journals (Sweden)

    Albertus Pramukti Narendra

    2016-01-01

    Full Text Available The issue of Big Data was already raised by Fremont Rider, an American librarian from Wesleyan University, in 1944. He predicted that the volume of American university collections would reach 200 million copies by 2040. As a result, this brings to the fore multiple issues, such as big data users, storage capacity, and the need for data analysts. In Indonesia, data analyst is still a rare profession, and therefore urgently needed. One of its distinctive tasks is to conduct visual analyses of various data resources and to present the results visually as interesting knowledge. It becomes a science enlivened by interactive visualization. In response to this issue, librarians are already equipped with the basics of information management, so they can see the opportunity and improve themselves as data analysts. In developed countries, it is common for librarians to be regarded also as data analysts; they enhance themselves with the various skills required, such as cloud computing and smart computing. In the end, librarians with data-analyst competency are well placed to extract and present complex data resources as interesting and discernible knowledge.

  11. Big Data, indispensable today

    Directory of Open Access Journals (Sweden)

    Radu-Ioan ENACHE

    2015-10-01

    Full Text Available Big data is and will be used more in the future as a tool for everything that happens both online and offline. Online, especially, is a real habitat for Big Data: it is found throughout this medium, offering many advantages and being a real help for all consumers. In this paper we discuss Big Data as a plus in developing new applications, by gathering useful information about users and their behaviour. We also present the key aspects of real-time monitoring and the architecture principles of this technology. The most important benefit brought by this paper is presented in the cloud section.

  12. Antigravity and the big crunch/big bang transition

    Science.gov (United States)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-08-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  13. Antigravity and the big crunch/big bang transition

    Energy Technology Data Exchange (ETDEWEB)

    Bars, Itzhak [Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089-2535 (United States); Chen, Shih-Hung [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada); Department of Physics and School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287-1404 (United States); Steinhardt, Paul J., E-mail: steinh@princeton.edu [Department of Physics and Princeton Center for Theoretical Physics, Princeton University, Princeton, NJ 08544 (United States); Turok, Neil [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada)

    2012-08-29

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  14. Antigravity and the big crunch/big bang transition

    International Nuclear Information System (INIS)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-01-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  15. Big data: een zoektocht naar instituties

    NARCIS (Netherlands)

    van der Voort, H.G.; Crompvoets, J

    2016-01-01

    Big data is a well-known phenomenon, even a buzzword nowadays. It refers to an abundance of data and new possibilities to process and use them. Big data is the subject of many publications. Some pay attention to the many possibilities of big data, others warn us of its consequences. This special

  16. Data, Data, Data : Big, Linked & Open

    NARCIS (Netherlands)

    Folmer, E.J.A.; Krukkert, D.; Eckartz, S.M.

    2013-01-01

    The entire business and IT world is currently talking about Big Data, a trend that overtook Cloud Computing in mid-2013 (based on Google Trends). Policymakers are also actively engaged with Big Data. Neelie Kroes, vice-president of the European Commission, speaks of the ‘Big Data

  17. Megastore: structured storage for Big Data

    Directory of Open Access Journals (Sweden)

    Oswaldo Moscoso Zea

    2012-12-01

    Full Text Available Megastore is one of the main building blocks of Google's data infrastructure. It has enabled the storage and processing of huge volumes of data (Big Data) with high scalability, reliability and security. Companies and individuals using this technology benefit from a stable, highly available service. In this paper an analysis of Google's data infrastructure is made, starting with a review of the core components that have been implemented in recent years up to the creation of Megastore. An analysis is also presented of the most important technical aspects implemented in this storage system, which have allowed it to meet the objectives for which it was created.

  18. Rasdaman for Big Spatial Raster Data

    Science.gov (United States)

    Hu, F.; Huang, Q.; Scheele, C. J.; Yang, C. P.; Yu, M.; Liu, K.

    2015-12-01

    Spatial raster data have grown exponentially over the past decade. Recent advancements in data acquisition technology, such as remote sensing, have allowed us to collect massive observation data of various spatial resolutions and domain coverages. The volume, velocity, and variety of such spatial data, along with the computationally intensive nature of spatial queries, pose a grand challenge to storage technologies for effective big data management. While high-performance computing platforms (e.g., cloud computing) can be used to solve the computing-intensive issues in big data analysis, data has to be managed in a way that is suitable for distributed parallel processing. Recently, rasdaman (raster data manager) has emerged as a scalable and cost-effective database solution to store and retrieve massive multi-dimensional arrays, such as sensor, image, and statistics data. Within this paper, the pros and cons of using rasdaman to manage and query spatial raster data will be examined and compared with other common approaches, including file-based systems, relational databases (e.g., PostgreSQL/PostGIS), and NoSQL databases (e.g., MongoDB and Hive). Earth Observing System (EOS) data collected from NASA's Atmospheric Science Data Center (ASDC) will be used and stored in these selected database systems, and a set of spatial and non-spatial queries will be designed to benchmark their performance on retrieving large-scale, multi-dimensional arrays of EOS data. Lessons learnt from using rasdaman will be discussed as well.
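
    No queries are reproduced in the abstract; purely as a sketch of the workload such array databases serve, here is the in-memory analogue of a spatial subset plus aggregate over a multi-dimensional array (the cube's shape, index ranges and variable names are invented for illustration):

        import numpy as np

        # Hypothetical EOS-like datacube: (time, latitude, longitude)
        cube = np.random.rand(365, 180, 360).astype(np.float32)

        # "Spatial subset" query: one month over a lat/lon window
        subset = cube[0:31, 100:140, 200:260]

        # "Non-spatial" aggregate over the subset: mean value per time step
        daily_mean = subset.mean(axis=(1, 2))
        print(daily_mean.shape)   # (31,)

    An array DBMS such as rasdaman evaluates comparable slice-and-aggregate operations over tiled on-disk arrays rather than a single in-memory array, which is what makes the approach scale.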

  19. A Big Data Revolution in Health Care Sector: Opportunities, Challenges and Technological Advancements

    OpenAIRE

    Sanskruti Patel; Atul Patel

    2016-01-01

    The health care sector has grown tremendously in the last few decades. It has generated huge amounts of data of huge volume, enormous velocity and vast variety. These data also come from a variety of new sources, as hospitals now tend to implement electronic health record (EHR) systems. These sources have strained the capabilities of conventional relational database management systems. In such a scenario, Big data solutions offer to harness these massive, heterogen...

  20. Methods and tools for big data visualization

    OpenAIRE

    Zubova, Jelena; Kurasova, Olga

    2015-01-01

    In this paper, methods and tools for big data visualization have been investigated. Challenges faced in big data analysis and visualization have been identified. Technologies for big data analysis have been discussed. A review of methods and tools for big data visualization has been done. Functionalities of the tools have been demonstrated by examples in order to highlight their advantages and disadvantages.

  1. Ideal MHD β limits in the BIG DEE tokamak

    International Nuclear Information System (INIS)

    Helton, F.J.; Bernard, L.C.; Greene, J.M.

    1983-01-01

    Using D-D reactions, tokamak reactors become economically attractive when β (the ratio of volume-averaged pressure to magnetic pressure) exceeds 5 percent. Ideal MHD instabilities are of great concern because they have the potential to limit β below this range, and so extensive studies have been done to determine ideal MHD β limits. As β increases with inverse aspect ratio, elongation and triangularity, the Doublet III upgrade machine -- BIG DEE -- is particularly suited to studying the possibility of very high β. The authors have performed computations to determine ideal MHD β limits for various plasma shapes and elongations in BIG DEE. They have determined that for q at the plasma surface greater than 2, β is limited by the ballooning mode if the wall is reasonably close to the plasma surface (d/a < 1.5, where d and a are the wall and plasma radii respectively). On the other hand, for q at the plasma surface less than 2, the n=1 external kink is unstable even with a wall close by. Thus, relevant values of limiting β can be obtained by assuming that the external kink limits the value of q at the limiter to a value greater than 2 and that the ballooning modes limit β. Under this assumption, a relevant β limit for the BIG DEE would be over 18%. For such an equilibrium, the wall position necessary to stabilize the n=1 and n=2 modes is 2a, and the equilibrium is stable for n=3
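
    For reference, the figure of merit β used throughout this record has the standard tokamak definition (a textbook convention, not spelled out in the abstract):

        \beta = \frac{\langle p \rangle}{B^2 / 2\mu_0}

    where ⟨p⟩ is the volume-averaged plasma pressure and B²/2μ₀ is the magnetic pressure; β rises with inverse aspect ratio, elongation and triangularity, which is why the strongly shaped BIG DEE cross-section is a natural place to look for high β.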

  2. Visualization of big data security: a case study on the KDD99 cup data set

    Directory of Open Access Journals (Sweden)

    Zichan Ruan

    2017-11-01

    Full Text Available Cyber security has been thrust into the limelight in the modern technological era because of an array of attacks that often bypass untrained intrusion detection systems (IDSs). Therefore, greater attention has been directed at deciphering better methods for identifying attack types to train IDSs more effectively. Key cyber-attack insights exist in big data; however, an efficient approach is required to determine strong attack types to train IDSs to become more effective in key areas. Despite the rising growth in IDS research, there is a lack of studies involving big data visualization, which is key. The KDD99 data set has served as a strong benchmark since 1999; therefore, we utilized this data set in our experiment. In this study, we utilized a hash algorithm, a weight table, and a sampling method to deal with the inherent problems caused by analyzing big data: volume, variety, and velocity. By utilizing a visualization algorithm, we were able to gain insights into the KDD99 data set, with a clear identification of “normal” clusters and descriptions of distinct clusters of effective attacks.
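
    The abstract names a hash algorithm, a weight table and sampling without giving their exact form; one plausible combination is deterministic hash-based sampling with per-class keep rates, sketched below (the weight values and toy records are invented, not taken from the paper):

        import hashlib

        # Hypothetical per-label keep probabilities (the "weight table"):
        # rare attack classes are kept in full, dominant ones are thinned.
        WEIGHTS = {"normal": 0.05, "smurf": 0.01, "neptune": 0.02}  # default 1.0

        def keep(record_line: str, label: str) -> bool:
            """Deterministically sample a record by hashing its content."""
            w = WEIGHTS.get(label, 1.0)
            h = int(hashlib.md5(record_line.encode()).hexdigest(), 16)
            return (h % 10_000) / 10_000 < w

        records = [
            "0,tcp,http,SF,181,5450,normal",
            "0,icmp,ecr_i,SF,1032,0,smurf",
            "0,tcp,private,S0,0,0,neptune",
        ]  # toy stand-ins for KDD99 rows (last field = label)

        sample = [r for r in records if keep(r, r.split(",")[-1])]
        print(len(sample))

    Hashing the record instead of drawing a random number makes the sample reproducible across runs, which matters when the downstream visualization is compared between experiments.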

  3. Big Data Analytics for Discovering Electricity Consumption Patterns in Smart Cities

    Directory of Open Access Journals (Sweden)

    Rubén Pérez-Chacón

    2018-03-01

    Full Text Available New technologies such as sensor networks have been incorporated into the management of buildings for organizations and cities. Sensor networks have led to an exponential increase in the volume of data available in recent years, which can be used to extract consumption patterns for the purposes of energy and monetary savings. For this reason, new approaches and strategies are needed to analyze information in big data environments. This paper proposes a methodology to extract electric energy consumption patterns in big data time series, so that very valuable conclusions can be drawn for managers and governments. The methodology is based on the study of four clustering validity indices in their parallelized versions, along with the application of a clustering technique. In particular, this work uses a voting system to choose an optimal number of clusters from the results of the indices, as well as the application of the distributed version of the k-means algorithm included in Apache Spark’s Machine Learning Library. The results, using electricity consumption for the years 2011–2017 for eight buildings of a public university, are presented and discussed. In addition, the performance of the proposed methodology is evaluated using synthetic big data, which can represent thousands of buildings in a smart city. Finally, policies derived from the patterns discovered are proposed to optimize energy usage across the university campus.
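
    The pipeline lends itself to a compact sketch. The paper votes over four parallelized cluster-validity indices; stock PySpark ships only the silhouette index, so the sketch below substitutes it and therefore only approximates the described methodology (the file name and the hourly-feature column naming are assumptions):

        from pyspark.sql import SparkSession
        from pyspark.ml.feature import VectorAssembler
        from pyspark.ml.clustering import KMeans
        from pyspark.ml.evaluation import ClusteringEvaluator

        spark = SparkSession.builder.appName("consumption-patterns").getOrCreate()

        # Hypothetical CSV: one row per building-day, columns h0..h23 of kWh
        df = spark.read.csv("consumption.csv", header=True, inferSchema=True)
        features = VectorAssembler(
            inputCols=[c for c in df.columns if c.startswith("h")],
            outputCol="features").transform(df)

        evaluator = ClusteringEvaluator()   # silhouette, squared Euclidean
        scores = {}
        for k in range(2, 11):
            model = KMeans(k=k, seed=42).fit(features)
            scores[k] = evaluator.evaluate(model.transform(features))

        best_k = max(scores, key=scores.get)   # stand-in for the paper's vote
        print("chosen number of clusters:", best_k)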

  4. The Big bang and the Quantum

    Science.gov (United States)

    Ashtekar, Abhay

    2010-06-01

    General relativity predicts that space-time comes to an end and physics comes to a halt at the big-bang. Recent developments in loop quantum cosmology have shown that these predictions cannot be trusted. Quantum geometry effects can resolve singularities, thereby opening new vistas. Examples are: The big bang is replaced by a quantum bounce; the `horizon problem' disappears; immediately after the big bounce, there is a super-inflationary phase with its own phenomenological ramifications; and, in presence of a standard inflation potential, initial conditions are naturally set for a long, slow roll inflation independently of what happens in the pre-big bang branch. As in my talk at the conference, I will first discuss the foundational issues and then the implications of the new Planck scale physics near the Big Bang.

  5. Big Bang baryosynthesis

    International Nuclear Information System (INIS)

    Turner, M.S.; Chicago Univ., IL

    1983-01-01

    In these lectures I briefly review Big Bang baryosynthesis. In the first lecture I discuss the evidence which exists for the BAU, the failure of non-GUT symmetrical cosmologies, the qualitative picture of baryosynthesis, and numerical results of detailed baryosynthesis calculations. In the second lecture I discuss the requisite CP violation in some detail, as well as the statistical mechanics of baryosynthesis, possible complications to the simplest scenario, and one cosmological implication of Big Bang baryosynthesis. (orig./HSI)

  6. Exploiting big data for critical care research.

    Science.gov (United States)

    Docherty, Annemarie B; Lone, Nazir I

    2015-10-01

    Over recent years the digitalization, collection and storage of vast quantities of data, in combination with advances in data science, have opened up a new era of big data. In this review, we define big data, identify examples of critical care research using big data, discuss the limitations and ethical concerns of using these large datasets and finally consider scope for future research. Big data refers to datasets whose size, complexity and dynamic nature are beyond the scope of traditional data collection and analysis methods. The potential benefits to critical care are significant, with faster progress in improving health and better value for money. Although not replacing clinical trials, big data can improve their design and advance the field of precision medicine. However, there are limitations to analysing big data using observational methods. In addition, there are ethical concerns regarding maintaining confidentiality of patients who contribute to these datasets. Big data have the potential to improve medical care and reduce costs, both by individualizing medicine, and bringing together multiple sources of data about individual patients. As big data become increasingly mainstream, it will be important to maintain public confidence by safeguarding data security, governance and confidentiality.

  7. Performance Analysis of Two Big Data Technologies on a Cloud Distributed Architecture. Results for Non-Aggregate Queries on Medium-Sized Data

    Directory of Open Access Journals (Sweden)

    Fotache Marin

    2016-12-01

    Full Text Available Big Data systems manage and process huge volumes of data constantly generated by various technologies in a myriad of formats. Big Data advocates (and preachers) have claimed that, relative to classical relational/SQL database management systems, Big Data technologies such as NoSQL, Hadoop and in-memory data stores perform better. This paper compares the data processing performance of two systems belonging to the SQL (PostgreSQL/Postgres XL) and Big Data (Hadoop/Hive) camps on a distributed five-node cluster deployed in the cloud. Unlike the benchmarks in common use (YCSB, TPC), a series of R modules were devised for generating random non-aggregate queries on different subschemas (of increasing data size) of the TPC-H database. The overall performance of the two systems was compared. Subsequently, a number of models were developed relating performance to the system and to various query parameters, such as the number of attributes in the SELECT and WHERE clauses, the number of joins, the number of processed rows, etc.
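
    The authors generated their workload with R modules; to keep one language across the sketches in this collection, the same idea is shown in Python (the schema is a simplified TPC-H subset and the generator is illustrative, not the authors' code):

        import random

        # Simplified TPC-H-like schema: table -> columns (illustrative subset)
        SCHEMA = {
            "customer": ["c_custkey", "c_name", "c_acctbal", "c_nationkey"],
            "orders":   ["o_orderkey", "o_custkey", "o_totalprice", "o_orderdate"],
            "lineitem": ["l_orderkey", "l_quantity", "l_extendedprice", "l_discount"],
        }
        NUMERIC = {
            "customer": ["c_custkey", "c_acctbal"],
            "orders":   ["o_orderkey", "o_totalprice"],
            "lineitem": ["l_quantity", "l_extendedprice"],
        }
        JOINS = {("customer", "orders"): "c_custkey = o_custkey",
                 ("orders", "lineitem"): "o_orderkey = l_orderkey"}

        def random_query(n_tables=2, n_select=3, n_where=2):
            """One random non-aggregate SELECT over the toy schema."""
            tables = ["customer", "orders", "lineitem"][:n_tables]
            cols = random.sample([c for t in tables for c in SCHEMA[t]], n_select)
            join_preds = [JOINS[p] for p in JOINS if set(p) <= set(tables)]
            filters = [f"{random.choice(NUMERIC[t])} > {random.randint(1, 1000)}"
                       for t in random.sample(tables, min(n_where, len(tables)))]
            return (f"SELECT {', '.join(cols)} FROM {', '.join(tables)} "
                    f"WHERE {' AND '.join(join_preds + filters)}")

        print(random_query())

    Varying n_tables, n_select and n_where is what lets performance be modelled against the query parameters the abstract lists.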

  8. Empathy and the Big Five

    OpenAIRE

    Paulus, Christoph

    2016-01-01

    More than 10 years ago, Del Barrio et al. (2004) attempted to establish a direct relationship between empathy and the Big Five. On average, the women in their sample had higher scores on empathy and on the Big Five factors, with the exception of the factor Neuroticism. They found relationships with empathy in the domains of Openness, Agreeableness, Conscientiousness and Extraversion. In our data, women have significantly higher values on both empathy and the Big Five…

  9. Big domains are novel Ca²+-binding modules: evidences from big domains of Leptospira immunoglobulin-like (Lig) proteins.

    Science.gov (United States)

    Raman, Rajeev; Rajanikanth, V; Palaniappan, Raghavan U M; Lin, Yi-Pin; He, Hongxuan; McDonough, Sean P; Sharma, Yogendra; Chang, Yung-Fu

    2010-12-29

    Many bacterial surface-exposed proteins mediate the host-pathogen interaction more effectively in the presence of Ca²+. Leptospiral immunoglobulin-like (Lig) proteins, LigA and LigB, are surface-exposed proteins containing Bacterial immunoglobulin-like (Big) domains. The function of proteins that contain the Big fold is not known. Based on the possible similarities of the immunoglobulin and βγ-crystallin folds, we here explore the important question whether Ca²+ binds to Big domains, which would provide a novel functional role for proteins containing the Big fold. We selected six individual Big domains for this study (three from the conserved part of LigA and LigB, denoted as Lig A3, Lig A4, and LigBCon5; two from the variable region of LigA, i.e., the 9th (Lig A9) and 10th (Lig A10) repeats; and one from the variable region of LigB, i.e., LigBCen2). We have also studied the conserved region covering the three and six repeats (LigBCon1-3 and LigCon). All these proteins bind the calcium-mimic dye Stains-all. All four selected domains bind Ca²+ with dissociation constants of 2-4 µM. The Lig A9 and Lig A10 domains fold well with moderate thermal stability, have β-sheet conformation and form homodimers. Fluorescence spectra of Big domains show a specific doublet (at 317 and 330 nm), probably due to Trp interaction with a Phe residue. Equilibrium unfolding of the selected Big domains is similar and follows a two-state model, suggesting the similarity in their fold. We demonstrate that the Lig proteins are Ca²+-binding proteins, with Big domains harbouring the binding motif. We conclude that despite differences in sequence, a Big motif binds Ca²+. This work thus sets up a strong possibility for classifying proteins containing Big domains as a novel family of Ca²+-binding proteins. Since the Big domain is a part of many proteins in the bacterial kingdom, we suggest a possible function of these proteins via Ca²+ binding.

  10. Quest for Value in Big Earth Data

    Science.gov (United States)

    Kuo, Kwo-Sen; Oloso, Amidu O.; Rilee, Mike L.; Doan, Khoa; Clune, Thomas L.; Yu, Hongfeng

    2017-04-01

    Among all the V's of Big Data challenges, such as Volume, Variety, Velocity, Veracity, etc., we believe Value is the ultimate determinant, because a system delivering better value has a competitive edge over others. Although it is not straightforward to assess the value of scientific endeavors, we believe the ratio of scientific productivity increase to investment is a reasonable measure. Our research in Big Data approaches to data-intensive analysis for Earth Science has yielded some insights, as well as evidence, as to how optimal value might be attained. The first insight is that we should avoid, as much as possible, moving data through connections with relatively low bandwidth. That is, we recognize that moving data is expensive, albeit inevitable. Data must at least be moved from the storage device into computer main memory and then to CPU registers for computation. When data must be moved, it is better to move them via relatively high-bandwidth connections and avoid low-bandwidth ones. For this reason, a technology that can best exploit data locality will have an advantage over others. Data locality is easy to achieve and exploit with only one dataset. With multiple datasets, data co-location becomes important in addition to data locality. However, datasets can only be co-located for certain types of analyses; it is impossible for them to be co-located for all analyses. Therefore, our second insight is that we need to co-locate the datasets for the most commonly used analyses. In Earth Science, we believe the most common analysis requirement is "spatiotemporal coincidence". For example, when we analyze precipitation systems, we often would like to know the environment conditions "where and when" (i.e. at the same location and time) there is precipitation. This "where and when" indicates the "spatiotemporal coincidence" requirement. Thus, an associated insight is that datasets need to be partitioned per the physical dimensions, i.e. space
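
    One concrete reading of partitioning "per the physical dimensions" is to key every record of every dataset by the same (space cell, time bucket) pair, so spatiotemporally coincident records land in the same partition. A minimal sketch follows (the grid size, bucket width and toy records are arbitrary choices, not from the record):

        from collections import defaultdict

        def st_key(lat, lon, t, cell_deg=1.0, bucket_s=3600):
            """Shared spatiotemporal key: 1-degree cell and 1-hour bucket."""
            return (int(lat // cell_deg), int(lon // cell_deg), int(t // bucket_s))

        def colocate(precip, env):
            """Group records from two datasets under the same key."""
            parts = defaultdict(lambda: ([], []))
            for r in precip:
                parts[st_key(*r[:3])][0].append(r)
            for r in env:
                parts[st_key(*r[:3])][1].append(r)
            return parts

        # Toy records: (lat, lon, unix_time, value)
        precip = [(40.2, -75.5, 7200, 1.2)]
        env    = [(40.7, -75.1, 7500, 290.0)]
        for key, (p, e) in colocate(precip, env).items():
            if p and e:
                print(key, "coincident records:", len(p), len(e))

    Once both datasets share this key, the "where and when" question becomes a local lookup instead of a cross-partition data movement.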

  11. Semantic Web Technologies and Big Data Infrastructures: SPARQL Federated Querying of Heterogeneous Big Data Stores

    OpenAIRE

    Konstantopoulos, Stasinos; Charalambidis, Angelos; Mouchakis, Giannis; Troumpoukis, Antonis; Jakobitsch, Jürgen; Karkaletsis, Vangelis

    2016-01-01

    The ability to cross-link large scale data with each other and with structured Semantic Web data, and the ability to uniformly process Semantic Web and other data adds value to both the Semantic Web and to the Big Data community. This paper presents work in progress towards integrating Big Data infrastructures with Semantic Web technologies, allowing for the cross-linking and uniform retrieval of data stored in both Big Data infrastructures and Semantic Web data. The technical challenges invo...

  12. Quantum fields in a big-crunch-big-bang spacetime

    International Nuclear Information System (INIS)

    Tolley, Andrew J.; Turok, Neil

    2002-01-01

    We consider quantum field theory on a spacetime representing the big-crunch-big-bang transition postulated in ekpyrotic or cyclic cosmologies. We show via several independent methods that an essentially unique matching rule holds connecting the incoming state, in which a single extra dimension shrinks to zero, to the outgoing state in which it reexpands at the same rate. For free fields in our construction there is no particle production from the incoming adiabatic vacuum. When interactions are included the particle production for fixed external momentum is finite at the tree level. We discuss a formal correspondence between our construction and quantum field theory on de Sitter spacetime

  13. Scaling Big Data Cleansing

    KAUST Repository

    Khayyat, Zuhair

    2017-07-31

    Data cleansing approaches have usually focused on detecting and fixing errors with little attention to big data scaling. This presents a serious impediment since identifying and repairing dirty data often involves processing huge input datasets, handling sophisticated error discovery approaches and managing huge arbitrary errors. With large datasets, error detection becomes overly expensive and complicated especially when considering user-defined functions. Furthermore, a distinctive algorithm is desired to optimize inequality joins in sophisticated error discovery rather than naïvely parallelizing them. Also, when repairing large errors, their skewed distribution may obstruct effective error repairs. In this dissertation, I present solutions to overcome the above three problems in scaling data cleansing. First, I present BigDansing as a general system to tackle efficiency, scalability, and ease-of-use issues in data cleansing for Big Data. It automatically parallelizes the user’s code on top of general-purpose distributed platforms. Its programming interface allows users to express data quality rules independently from the requirements of parallel and distributed environments. Without sacrificing their quality, BigDansing also enables parallel execution of serial repair algorithms by exploiting the graph representation of discovered errors. The experimental results show that BigDansing outperforms existing baselines up to more than two orders of magnitude. Although BigDansing scales cleansing jobs, it still lacks the ability to handle sophisticated error discovery requiring inequality joins. Therefore, I developed IEJoin as an algorithm for fast inequality joins. It is based on sorted arrays and space efficient bit-arrays to reduce the problem's search space. By comparing IEJoin against well-known optimizations, I show that it is more scalable, and several orders of magnitude faster. BigDansing depends on vertex-centric graph systems, i.e., Pregel
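
    The abstract gives IEJoin only in outline; the sketch below reconstructs the core sorted-arrays-plus-bit-array idea for a self-join with the predicates r.x < s.x AND r.y > s.y, assuming distinct attribute values (the published algorithm adds a permutation array with offset handling for ties and bit-parallel scanning, which are omitted here):

        def ie_self_join(tuples):
            """Pairs (r, s) with r.x < s.x and r.y > s.y, IEJoin-style.

            tuples: list of (x, y) with distinct x values and distinct y values.
            """
            n = len(tuples)
            l1 = sorted(range(n), key=lambda i: tuples[i][0])       # x ascending
            pos_in_l1 = {t: p for p, t in enumerate(l1)}
            l2 = sorted(range(n), key=lambda i: -tuples[i][1])      # y descending
            bits = [0] * n    # marks L1 positions already visited in L2 order
            out = []
            for i in l2:      # tuples visited earlier have strictly larger y
                p = pos_in_l1[i]
                for j in range(p):          # set bits left of p satisfy r.x < s.x
                    if bits[j]:
                        out.append((tuples[l1[j]], tuples[i]))
                bits[p] = 1
            return out

        print(ie_self_join([(1, 3), (2, 1), (3, 2)]))
        # -> [((1, 3), (3, 2)), ((1, 3), (2, 1))]

    The scan over the bit array is what replaces the quadratic nested-loop comparison of tuples: the sort orders encode one predicate each, so a set bit to the left of the current position certifies both inequalities at once.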

  14. The ethics of big data in big agriculture

    Directory of Open Access Journals (Sweden)

    Isabelle M. Carbonell

    2016-03-01

    Full Text Available This paper examines the ethics of big data in agriculture, focusing on the power asymmetry between farmers and large agribusinesses like Monsanto. Following the recent purchase of Climate Corp., Monsanto is currently the most prominent biotech agribusiness to buy into big data. With wireless sensors on tractors monitoring or dictating every decision a farmer makes, Monsanto can now aggregate large quantities of previously proprietary farming data, enabling a privileged position with unique insights on a field-by-field basis into a third or more of the US farmland. This power asymmetry may be rebalanced through open-sourced data, and publicly-funded data analytic tools which rival Climate Corp. in complexity and innovation for use in the public domain.

  15. Homogeneous and isotropic big rips?

    CERN Document Server

    Giovannini, Massimo

    2005-01-01

    We investigate the way big rips are approached in a fully inhomogeneous description of the space-time geometry. If the pressure and energy densities are connected by a (supernegative) barotropic index, the spatial gradients and the anisotropic expansion decay as the big rip is approached. This behaviour is contrasted with the usual big-bang singularities. A similar analysis is performed in the case of sudden (quiescent) singularities and it is argued that the spatial gradients may well be non-negligible in the vicinity of pressure singularities.

  16. Rate Change Big Bang Theory

    Science.gov (United States)

    Strickland, Ken

    2013-04-01

    The Rate Change Big Bang Theory redefines the birth of the universe with a dramatic shift in energy direction and a new vision of the first moments. With rate change graph technology (RCGT) we can look back 13.7 billion years and experience every step of the big bang through geometrical intersection technology. The analysis of the Big Bang includes a visualization of the first objects, their properties, the astounding event that created space and time as well as a solution to the mystery of anti-matter.

  17. Intelligent Test Mechanism Design of Worn Big Gear

    Directory of Open Access Journals (Sweden)

    Hong-Yu LIU

    2014-10-01

    Full Text Available With the continuous development of the national economy, big gears are widely applied in the metallurgy and mining domains, where they play an important role. In practical production, big gear abrasion and breakage often occur, affecting normal production and causing unnecessary economic loss. An intelligent test method for worn big gears is put forward, aimed mainly at the constraints of high production cost, long production cycles and labor-intensive manual repair welding. The measurement equations of an involute spur gear are transformed from polar coordinates into rectangular coordinates. The measurement principle for big gear abrasion is introduced, a detection principle diagram is given, and the method for realizing the detection route is described. An OADM12 laser sensor was selected, and detection of the big gear abrasion area was realized by the detection mechanism. Measured data from unworn and worn gears were fed into a calculation program written in Visual Basic, from which the big gear abrasion quantity can be obtained. This provides a feasible method for intelligent testing and intelligent repair welding of worn big gears.

  18. [Big data in medicine and healthcare].

    Science.gov (United States)

    Rüping, Stefan

    2015-08-01

    Healthcare is one of the business fields with the highest Big Data potential. According to the prevailing definition, Big Data refers to the fact that data today is often too large and heterogeneous and changes too quickly to be stored, processed, and transformed into value by previous technologies. Technological trends drive Big Data: business processes are more and more executed electronically, consumers produce more and more data themselves - e.g. in social networks - and, finally, digitalization is ever increasing. Currently, several new trends towards new data sources and innovative data analysis appear in medicine and healthcare. From the research perspective, omics research is one clear Big Data topic. In practice, electronic health records, free open data and the "quantified self" offer new perspectives for data analytics. Regarding analytics, significant advances have been made in information extraction from text data, which unlocks a lot of data from clinical documentation for analytics purposes. At the same time, medicine and healthcare are lagging behind in the adoption of Big Data approaches. This can be traced to particular problems regarding data complexity and organizational, legal, and ethical challenges. The growing uptake of Big Data in general, and first best-practice examples in medicine and healthcare in particular, indicate that innovative solutions will be coming. This paper gives an overview of the potentials of Big Data in medicine and healthcare.

  19. Data mining and knowledge discovery for big data methodologies, challenge and opportunities

    CERN Document Server

    2014-01-01

    The field of data mining has made significant and far-reaching advances over the past three decades.  Because of its potential power for solving complex problems, data mining has been successfully applied to diverse areas such as business, engineering, social media, and biological science. Many of these applications search for patterns in complex structural information. In biomedicine for example, modeling complex biological systems requires linking knowledge across many levels of science, from genes to disease.  Further, the data characteristics of the problems have also grown from static to dynamic and spatiotemporal, complete to incomplete, and centralized to distributed, and grow in their scope and size (this is known as big data). The effective integration of big data for decision-making also requires privacy preservation. The contributions to this monograph summarize the advances of data mining in the respective fields. This volume consists of nine chapters that address subjects ranging from mining da...

  20. From Big Data to Big Business

    DEFF Research Database (Denmark)

    Lund Pedersen, Carsten

    2017-01-01

    Idea in Brief: Problem: There is an enormous profit potential for manufacturing firms in big data, but one of the key barriers to obtaining data-driven growth is the lack of knowledge about which capabilities are needed to extract value and profit from data. Solution: We (BDBB research group at C...

  1. Advancements in Big Data Processing

    CERN Document Server

    Vaniachine, A; The ATLAS collaboration

    2012-01-01

    The ever-increasing volumes of scientific data present new challenges for Distributed Computing and Grid technologies. The emerging Big Data revolution drives new discoveries in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. New initiatives are transforming data-driven scientific fields by pushing Big Data limits, enabling massive data analysis in new ways. In petascale data processing scientists deal with datasets, not individual files. As a result, a task (comprising many jobs) became the unit of petascale data processing on the Grid. Splitting a large data processing task into jobs enables fine-granularity checkpointing, analogous to the splitting of a large file into smaller TCP/IP packets during data transfers. Transferring large data in small packets achieves reliability through automatic re-sending of dropped TCP/IP packets. Similarly, transient job failures on the Grid can be recovered by automatic re-tries to achieve reliable Six Sigma produc...
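
    The packet analogy maps directly onto a small sketch: a task is split into jobs, and each transiently failing job is automatically re-sent a bounded number of times. Nothing below is ATLAS or Grid middleware API; it is generic Python mirroring only the mechanism described:

        import random

        def run_job(job_id):
            """Stand-in for a Grid job; fails transiently ~5% of the time."""
            if random.random() < 0.05:
                raise RuntimeError(f"transient failure in job {job_id}")
            return f"output-{job_id}"

        def run_task(n_jobs=100, max_retries=3):
            """A task = many jobs; each failed job is re-sent like a TCP packet."""
            results = {}
            for job_id in range(n_jobs):
                for attempt in range(max_retries + 1):
                    try:
                        results[job_id] = run_job(job_id)
                        break
                    except RuntimeError:
                        continue
                else:
                    raise RuntimeError(f"job {job_id} failed permanently")
            return results

        print(len(run_task()), "jobs completed")

    Bounding the retries is what distinguishes recoverable transient faults from permanent ones that must surface to the task level, just as TCP eventually gives up on a dead connection.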

  2. Making big sense from big data in toxicology by read-across.

    Science.gov (United States)

    Hartung, Thomas

    2016-01-01

    Modern information technologies have made big data available in safety sciences, i.e., extremely large data sets that may be analyzed only computationally to reveal patterns, trends and associations. This happens by (1) compilation of large sets of existing data, e.g., as a result of the European REACH regulation, (2) the use of omics technologies and (3) systematic robotized testing in a high-throughput manner. All three approaches and some other high-content technologies leave us with big data--the challenge is now to make big sense of these data. Read-across, i.e., the local similarity-based intrapolation of properties, is gaining momentum with increasing data availability and consensus on how to process and report it. It is predominantly applied to in vivo test data as a gap-filling approach, but can similarly complement other incomplete datasets. Big data are first of all repositories for finding similar substances and ensure that the available data is fully exploited. High-content and high-throughput approaches similarly require focusing on clusters, in this case formed by underlying mechanisms such as pathways of toxicity. The closely connected properties, i.e., structural and biological similarity, create the confidence needed for predictions of toxic properties. Here, a new web-based tool under development called REACH-across, which aims to support and automate structure-based read-across, is presented among others.
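
    Read-across as "local, similarity-based intrapolation" reduces to a nearest-neighbour scheme; below is a minimal sketch over hypothetical binary structural fingerprints with Tanimoto similarity (REACH-across itself is only named in the text, so nothing here reflects its actual interface):

        def tanimoto(a: set, b: set) -> float:
            """Similarity of two fingerprint bit sets."""
            return len(a & b) / len(a | b) if a | b else 0.0

        def read_across(target_fp, knowns, k=3):
            """Predict a property as the similarity-weighted average over the
            k most similar substances that have measured values."""
            ranked = sorted(knowns, key=lambda kv: -tanimoto(target_fp, kv[0]))[:k]
            num = sum(tanimoto(target_fp, fp) * y for fp, y in ranked)
            den = sum(tanimoto(target_fp, fp) for fp, y in ranked)
            return num / den if den else None

        # Toy fingerprints (sets of "on" bits) with a binary toxicity label
        knowns = [({1, 2, 3, 5}, 1.0), ({1, 2, 4}, 1.0), ({7, 8, 9}, 0.0)]
        print(read_across({1, 2, 3}, knowns))   # near 1.0: similar to toxic ones

    In practice the confidence of such a prediction rests on combining structural similarity with biological similarity (shared pathways of toxicity), exactly the pairing the article emphasises.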

  3. [Big data in official statistics].

    Science.gov (United States)

    Zwick, Markus

    2015-08-01

    The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany.

  4. Big-Leaf Mahogany on CITES Appendix II: Big Challenge, Big Opportunity

    Science.gov (United States)

    JAMES GROGAN; PAULO BARRETO

    2005-01-01

    On 15 November 2003, big-leaf mahogany (Swietenia macrophylla King, Meliaceae), the most valuable widely traded Neotropical timber tree, gained strengthened regulatory protection from its listing on Appendix II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). CITES is a United Nations-chartered agreement signed by 164...

  5. Big Data as Information Barrier

    Directory of Open Access Journals (Sweden)

    Victor Ya. Tsvetkov

    2014-07-01

    Full Text Available The article covers the analysis of ‘Big Data’, which has been discussed over the last 10 years. The reasons for and factors behind the issue are revealed. It is shown that the factors creating the ‘Big Data’ issue have existed for quite a long time and, from time to time, have caused informational barriers. Such barriers were successfully overcome through science and technology. The conducted analysis identifies the ‘Big Data’ issue as a form of information barrier. Framed this way, the issue may be solved correctly, and it encourages the development of scientific and computational methods.

  6. Big Data in Space Science

    OpenAIRE

    Barmby, Pauline

    2018-01-01

    It seems like “big data” is everywhere these days. In planetary science and astronomy, we’ve been dealing with large datasets for a long time. So how “big” is our data? How does it compare to the big data that a bank or an airline might have? What new tools do we need to analyze big datasets, and how can we make better use of existing tools? What kinds of science problems can we address with these? I’ll address these questions with examples including ESA’s Gaia mission, ...

  7. Big Data in Medicine is Driving Big Changes

    Science.gov (United States)

    Verspoor, K.

    2014-01-01

    Summary. Objectives: To summarise current research that takes advantage of “Big Data” in health and biomedical informatics applications. Methods: Survey of trends in this work, and exploration of literature describing how large-scale structured and unstructured data sources are being used to support applications from clinical decision making and health policy, to drug design and pharmacovigilance, and further to systems biology and genetics. Results: The survey highlights ongoing development of powerful new methods for turning that large-scale, and often complex, data into information that provides new insights into human health, in a range of different areas. Consideration of this body of work identifies several important paradigm shifts that are facilitated by Big Data resources and methods: in clinical and translational research, from hypothesis-driven research to data-driven research, and in medicine, from evidence-based practice to practice-based evidence. Conclusions: The increasing scale and availability of large quantities of health data require strategies for data management, data linkage, and data integration beyond the limits of many existing information systems, and substantial effort is underway to meet those needs. As our ability to make sense of that data improves, the value of the data will continue to increase. Health systems, genetics and genomics, population and public health; all areas of biomedicine stand to benefit from Big Data and the associated technologies. PMID:25123716

  8. Harnessing the Power of Big Data to Improve Graduate Medical Education: Big Idea or Bust?

    Science.gov (United States)

    Arora, Vineet M

    2018-06-01

    With the advent of electronic medical records (EMRs) fueling the rise of big data, the use of predictive analytics, machine learning, and artificial intelligence is touted as a transformational tool to improve clinical care. While major investments are being made in using big data to transform health care delivery, little effort has been directed toward exploiting big data to improve graduate medical education (GME). Because our current system relies on faculty observations of competence, it is not unreasonable to ask whether big data in the form of clinical EMRs and other novel data sources can answer questions of importance in GME, such as when a resident is ready for independent practice. The timing is ripe for such a transformation. A recent National Academy of Medicine report called for reforms to how GME is delivered and financed. While many agree on the need to ensure that GME meets our nation's health needs, there is little consensus on how to measure the performance of GME in meeting this goal. During a recent workshop at the National Academy of Medicine on GME outcomes and metrics in October 2017, a key theme emerged: Big data holds great promise to inform GME performance at individual, institutional, and national levels. In this Invited Commentary, several examples are presented, such as using big data to inform clinical experience and provide clinically meaningful data to trainees, and using novel data sources, including ambient data, to better measure the quality of GME training.

  9. A SWOT Analysis of Big Data

    Science.gov (United States)

    Ahmadi, Mohammad; Dileepan, Parthasarati; Wheatley, Kathleen K.

    2016-01-01

    This is the decade of data analytics and big data, but not everyone agrees with the definition of big data. Some researchers see it as the future of data analysis, while others consider it as hype and foresee its demise in the near future. No matter how it is defined, big data for the time being is having its glory moment. The most important…

  10. A survey of big data research

    Science.gov (United States)

    Fang, Hua; Zhang, Zhaoyang; Wang, Chanpaul Jin; Daneshmand, Mahmoud; Wang, Chonggang; Wang, Honggang

    2015-01-01

    Big data create values for business and research, but pose significant challenges in terms of networking, storage, management, analytics and ethics. Multidisciplinary collaborations from engineers, computer scientists, statisticians and social scientists are needed to tackle, discover and understand big data. This survey presents an overview of big data initiatives, technologies and research in industries and academia, and discusses challenges and potential solutions. PMID:26504265

  11. Big Data in Action for Government : Big Data Innovation in Public Services, Policy, and Engagement

    OpenAIRE

    World Bank

    2017-01-01

    Governments have an opportunity to harness big data solutions to improve productivity, performance and innovation in service delivery and policymaking processes. In developing countries, governments have an opportunity to adopt big data solutions and leapfrog traditional administrative approaches

  12. Evidence for Evolution as Support for Big Bang

    Science.gov (United States)

    Gopal-Krishna

    1997-12-01

    With the exception of ZERO, the concept of BIG BANG is by far the most bizarre creation of the human mind. Three classical pillars of the Big Bang model of the origin of the universe are generally thought to be: (i) The abundances of the light elements; (ii) the microwave back-ground radiation; and (iii) the change with cosmic epoch in the average properties of galaxies (both active and non-active types). Evidence is also mounting for redshift dependence of the intergalactic medium, as discussed elsewhere in this volume in detail. In this contribution, I endeavour to highlight a selection of recent advances pertaining to the third category. The widely different levels of confidence in the claimed observational constraints in the field of cosmology can be gauged from the following excerpts from two leading astrophysicists: "I would bet odds of 10 to 1 on the validity of the general 'hot Big Bang' concept as a description of how our universe has evolved since it was around 1 sec. old" -M. Rees (1995), in 'Perspectives in Astrophysical Cosmology' CUP. "With the much more sensitive observations available today, no astrophysical property shows evidence of evolution, such as was claimed in the 1950s to disprove the Steady State theory" -F. Hoyle (1987), in 'Fifty years in cosmology', B. M. Birla Memorial Lecture, Hyderabad, India. The burgeoning multi-wavelength culture in astronomy has provided a tremendous boost to observational cosmology in recent years. We now proceed to illustrate this with a sequence of examples which reinforce the picture of an evolving universe. Also provided are some relevant details of the data used in these studies so that their scope can be independently judged by the readers.

  13. Kazakhstan's Environment-Health system, a Big Data challenge

    Science.gov (United States)

    Vitolo, Claudia; Gazdiyeva, Bella; Tucker, Allan; Russell, Andrew; Ali, Maged; Althonayan, Abraham

    2016-04-01

    Kazakhstan has witnessed remarkable economic development in the past 15 years, becoming an upper-middle-income country. However, it is still widely regarded as a developing nation, partly because of its population's low life expectancy, which is 5 years below the average in similar economies. The environment is in a rather fragile state, affected by soil, water and air pollution, radioactive contamination and climate change. However, Kazakhstan's government is moving towards clean energy and environmental protection and is calling on scientists to help prioritise investments. The British Council-funded "Kazakhstan's Environment-Health Risk Analysis (KEHRA)" project is one of the recently launched initiatives to support Kazakhstan's healthier future. The underlying hypothesis of this research is that the above-mentioned factors (air/water/soil pollution, etc.) affecting public health almost certainly do not act independently but rather trigger and exacerbate each other. Exploring the environment-health links in a multi-dimensional framework is a typical Big Data problem, in which the volume and variety of the data needed pose technical as well as scientific challenges. In Kazakhstan, the complexities of managing and analysing Big Data are worsened by a number of obstacles at the data acquisition step: most of the data is not in digital form, spatial and temporal attributes are often ambiguous, and the re-use and re-purposing of the information is subject to restrictive licenses and other mechanisms of control. In this work, we document the first steps taken towards building an understanding of the complex environment-health system in Kazakhstan, using interactive visualisation tools to identify and compare hot-spots of pollution and poor health outcomes, and Big Data and web technologies to collect, manage and explore available information. In the future, the knowledge acquired will be modelled to develop evidence-based recommendation systems for decision makers in

  14. 78 FR 3911 - Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN; Final Comprehensive...

    Science.gov (United States)

    2013-01-17

    ... DEPARTMENT OF THE INTERIOR Fish and Wildlife Service [FWS-R3-R-2012-N259; FXRS1265030000-134-FF03R06000] Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN; Final Comprehensive... significant impact (FONSI) for the environmental assessment (EA) for Big Stone National Wildlife Refuge...

  15. Big domains are novel Ca²+-binding modules: evidences from big domains of Leptospira immunoglobulin-like (Lig) proteins.

    Directory of Open Access Journals (Sweden)

    Rajeev Raman

    Full Text Available BACKGROUND: Many bacterial surface-exposed proteins mediate the host-pathogen interaction more effectively in the presence of Ca²+. Leptospiral immunoglobulin-like (Lig) proteins, LigA and LigB, are surface-exposed proteins containing Bacterial immunoglobulin-like (Big) domains. The function of proteins which contain the Big fold is not known. Based on the possible similarities of immunoglobulin and βγ-crystallin folds, we here explore the important question whether Ca²+ binds to Big domains, which would provide a novel functional role for proteins containing the Big fold. PRINCIPAL FINDINGS: We selected six individual Big domains for this study (three from the conserved part of LigA and LigB, denoted as Lig A3, Lig A4, and LigBCon5; two from the variable region of LigA, i.e., the 9th (Lig A9) and 10th (Lig A10) repeats; and one from the variable region of LigB, i.e., LigBCen2). We have also studied the conserved region covering the three and six repeats (LigBCon1-3 and LigCon). All these proteins bind the calcium-mimic dye Stains-all. All four selected domains bind Ca²+ with dissociation constants of 2-4 µM. Lig A9 and Lig A10 domains fold well with moderate thermal stability, have β-sheet conformation and form homodimers. Fluorescence spectra of Big domains show a specific doublet (at 317 and 330 nm), probably due to Trp interaction with a Phe residue. Equilibrium unfolding of selected Big domains is similar and follows a two-state model, suggesting the similarity in their fold. CONCLUSIONS: We demonstrate that the Lig proteins are Ca²+-binding proteins, with Big domains harbouring the binding motif. We conclude that despite differences in sequence, a Big motif binds Ca²+. This work thus sets up a strong possibility for classifying the proteins containing Big domains as a novel family of Ca²+-binding proteins. Since the Big domain is a part of many proteins in the bacterial kingdom, we suggest a possible function for these proteins via Ca²+ binding.

  16. Big Data Management with Incremental K-Means Trees–GPU-Accelerated Construction and Visualization

    Directory of Open Access Journals (Sweden)

    Jun Wang

    2017-07-01

    Full Text Available While big data is revolutionizing scientific research, the tasks of data management and analytics are becoming more challenging than ever. One way to mitigate the difficulty is to obtain the multilevel hierarchy embedded in the data. Knowing the hierarchy not only reveals the nature of the data, it is also often the first step in big data analytics. However, current algorithms for learning the hierarchy are typically not scalable to large volumes of data with high dimensionality. To tackle this challenge, in this paper, we propose a new scalable approach for constructing the tree structure from data. Our method builds the tree in a bottom-up manner, with adapted incremental k-means. By referencing the distribution of point distances, one can flexibly control the height of the tree and the branching of each node. Dimension reduction is also conducted as a pre-process, to further boost the computing efficiency. The algorithm takes a parallel design and is implemented with CUDA (Compute Unified Device Architecture), so that it can be efficiently applied to big data. We test the algorithm with two real-world datasets, and the results are visualized with extended circular dendrograms and other visualization techniques.
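
    The following Python sketch illustrates the bottom-up construction described above; it is a minimal single-threaded approximation, not the paper's CUDA implementation, and the radius and growth parameters are invented for illustration. Each level runs an incremental assign-or-spawn pass over the centroids of the level below, and widening the radius between levels loosely mirrors the distance-distribution heuristic used to control height and branching.

        import numpy as np

        def incremental_kmeans_level(points, radius):
            """One bottom-up level: assign each point to the nearest existing
            centroid if it lies within `radius`, otherwise spawn a new cluster."""
            centroids, counts = [], []
            for p in points:
                if centroids:
                    d = np.linalg.norm(np.asarray(centroids) - p, axis=1)
                    j = int(np.argmin(d))
                    if d[j] < radius:
                        counts[j] += 1
                        # running mean keeps the update incremental
                        centroids[j] = centroids[j] + (p - centroids[j]) / counts[j]
                        continue
                centroids.append(np.asarray(p, dtype=float))
                counts.append(1)
            return np.array(centroids)

        def build_tree(points, radius, growth=2.0, max_root=4):
            """Stack levels until few enough centroids remain to form the root."""
            levels = [np.asarray(points, dtype=float)]
            while len(levels[-1]) > max_root:
                levels.append(incremental_kmeans_level(levels[-1], radius))
                radius *= growth  # coarser clusters higher up the tree
            return levels

        rng = np.random.default_rng(0)
        levels = build_tree(rng.normal(size=(1000, 8)), radius=2.0)
        print([len(level) for level in levels])  # shrinking level sizes, leaf to root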

  17. Studies of Big Data metadata segmentation between relational and non-relational databases

    Science.gov (United States)

    Golosova, M. V.; Grigorieva, M. A.; Klimentov, A. A.; Ryabinkin, E. A.; Dimitrov, G.; Potekhin, M.

    2015-12-01

    In recent years the concept of Big Data has become well established in IT. Systems managing large data volumes produce metadata that describe data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time, the amount of stored metadata can grow dramatically. In this article we present our studies demonstrating how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.
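
    A minimal sketch of the hybrid idea, under the assumption (not spelled out in the abstract) that the stable, frequently queried core of each record lives in the relational store while the sparse, schema-less remainder goes to the NoSQL side; here sqlite3 plays the RDBMS and a plain dict stands in for the NoSQL document store, and all field names are invented.

        import json
        import sqlite3

        # Relational side: the stable, indexed core of each task record.
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE tasks (task_id TEXT PRIMARY KEY, status TEXT, created TEXT)")

        # Stand-in for the NoSQL store holding the flexible remainder of the metadata.
        documents = {}

        def store(record):
            core = {k: record[k] for k in ("task_id", "status", "created")}
            extra = {k: v for k, v in record.items() if k not in core}
            db.execute("INSERT INTO tasks VALUES (:task_id, :status, :created)", core)
            documents[record["task_id"]] = json.dumps(extra)

        def load(task_id):
            status, created = db.execute(
                "SELECT status, created FROM tasks WHERE task_id = ?", (task_id,)
            ).fetchone()
            return {"task_id": task_id, "status": status, "created": created,
                    **json.loads(documents[task_id])}

        store({"task_id": "t1", "status": "done", "created": "2015-12-01",
               "site": "CERN", "attempts": 3})
        print(load("t1"))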

  18. Studies of Big Data metadata segmentation between relational and non-relational databases

    CERN Document Server

    Golosova, M V; Klimentov, A A; Ryabinkin, E A; Dimitrov, G; Potekhin, M

    2015-01-01

    In recent years the concept of Big Data has become well established in IT. Systems managing large data volumes produce metadata that describe data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time, the amount of stored metadata can grow dramatically. In this article we present our studies demonstrating how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.

  19. New 'bigs' in cosmology

    International Nuclear Information System (INIS)

    Yurov, Artyom V.; Martin-Moruno, Prado; Gonzalez-Diaz, Pedro F.

    2006-01-01

    This paper contains a detailed discussion of new cosmic solutions describing the early and late evolution of a universe that is filled with a kind of dark energy that may or may not satisfy the energy conditions. The main distinctive property of the resulting space-times is that they cause the single singular events predicted by the corresponding quintessential (phantom) models to appear twice, in a manner which can be made symmetric with respect to the origin of cosmic time. Thus, the big bang and big rip singularities are shown to take place twice, once on the positive branch of time and once on the negative one. We have also considered dark energy and phantom energy accretion onto black holes and wormholes in the context of these new cosmic solutions. It is seen that the space-times of these holes would then undergo swelling processes leading to big trip and big hole events taking place at distinct epochs along the evolution of the universe. In this way, the possibility is considered that the past and future be connected in a non-paradoxical manner in the universes described by means of the new symmetric solutions.

  20. 2nd INNS Conference on Big Data

    CERN Document Server

    Manolopoulos, Yannis; Iliadis, Lazaros; Roy, Asim; Vellasco, Marley

    2017-01-01

    The book offers a timely snapshot of neural network technologies as a significant component of big data analytics platforms. It promotes new advances and research directions in efficient and innovative algorithmic approaches to analyzing big data (e.g. deep networks, nature-inspired and brain-inspired algorithms); implementations on different computing platforms (e.g. neuromorphic, graphics processing units (GPUs), clouds, clusters); and big data analytics applications to solve real-world problems (e.g. weather prediction, transportation, energy management). The book, which reports on the second edition of the INNS Conference on Big Data, held on October 23–25, 2016, in Thessaloniki, Greece, depicts an interesting collaborative adventure of neural networks with big data and other learning technologies.

  1. Big Data Solution for CTBT Monitoring Using Global Cross Correlation

    Science.gov (United States)

    Gaillard, P.; Bobrov, D.; Dupont, A.; Grenouille, A.; Kitov, I. O.; Rozhkov, M.

    2014-12-01

    Due to the mismatch between data volume and the performance of the Information Technology infrastructure used in seismic data centers, it is becoming more and more difficult to process all the data with traditional applications in a reasonable elapsed time. To fulfill their missions, the International Data Centre of the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO/IDC) and the Département Analyse Surveillance Environnement of the Commissariat à l'Energie atomique et aux énergies alternatives (CEA/DASE) collect, process and produce complex data sets whose volume is growing exponentially. In the medium term, computer architectures, data management systems and application algorithms will require fundamental changes to meet the needs. This problem is well known and identified as a "Big Data" challenge. To tackle this major task, the CEA/DASE is taking part for two years in the "DataScale" project. Started in September 2013, DataScale gathers a large set of partners (research laboratories, SMEs and big companies). The common objective is to design efficient solutions using the synergy between Big Data solutions and High Performance Computing (HPC). The project will evaluate the relevance of these technological solutions by implementing a demonstrator for seismic event detection based on massive waveform correlations. The IDC has developed expertise in such techniques, leading to an algorithm called "Master Event", and provides a high-quality dataset for an extensive cross-correlation study. The objective of the project is to enhance the Master Event algorithm and to reanalyze 10 years of waveform data from the International Monitoring System (IMS) network using a dedicated HPC infrastructure operated by the "Centre de Calcul Recherche et Technologie" at the CEA of Bruyères-le-Châtel. The dataset used for the demonstrator includes more than 300,000 seismic events, tens of millions of raw detections and more than 30 terabytes of continuous seismic data.
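
    The Master Event algorithm itself is not spelled out in the abstract; the sketch below shows only the underlying primitive, a normalized sliding cross-correlation of a template waveform against a continuous trace, with synthetic data and an arbitrary detection threshold standing in for real IMS waveforms.

        import numpy as np

        def sliding_corr(trace, template):
            """Normalized cross-correlation of a master-event template against a
            continuous trace; values near 1 flag candidate repeating events."""
            n = len(template)
            t = (template - template.mean()) / template.std()
            out = np.empty(len(trace) - n + 1)
            for i in range(len(out)):
                w = trace[i:i + n]
                w = (w - w.mean()) / (w.std() + 1e-12)
                out[i] = np.dot(w, t) / n
            return out

        rng = np.random.default_rng(1)
        template = np.sin(np.linspace(0, 20, 200)) * np.hanning(200)
        trace = rng.normal(scale=0.2, size=5000)
        trace[3100:3300] += template                 # bury one copy of the event
        cc = sliding_corr(trace, template)
        hits = np.where(cc > 0.6)[0]                 # illustrative threshold
        print("first detection near sample:", hits.min() if hits.size else None)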

  2. The ethics of biomedical big data

    CERN Document Server

    Mittelstadt, Brent Daniel

    2016-01-01

    This book presents cutting edge research on the new ethical challenges posed by biomedical Big Data technologies and practices. ‘Biomedical Big Data’ refers to the analysis of aggregated, very large datasets to improve medical knowledge and clinical care. The book describes the ethical problems posed by aggregation of biomedical datasets and re-use/re-purposing of data, in areas such as privacy, consent, professionalism, power relationships, and ethical governance of Big Data platforms. Approaches and methods are discussed that can be used to address these problems to achieve the appropriate balance between the social goods of biomedical Big Data research and the safety and privacy of individuals. Seventeen original contributions analyse the ethical, social and related policy implications of the analysis and curation of biomedical Big Data, written by leading experts in the areas of biomedical research, medical and technology ethics, privacy, governance and data protection. The book advances our understan...

  3. Scalable privacy-preserving big data aggregation mechanism

    Directory of Open Access Journals (Sweden)

    Dapeng Wu

    2016-08-01

    Full Text Available As the massive sensor data generated by large-scale Wireless Sensor Networks (WSNs) have recently become an indispensable part of ‘Big Data’, the collection, storage, transmission and analysis of big sensor data attract considerable attention from researchers. Targeting the privacy requirements of large-scale WSNs and focusing on the energy-efficient collection of big sensor data, a Scalable Privacy-preserving Big Data Aggregation (Sca-PBDA) method is proposed in this paper. Firstly, according to the pre-established gradient topology structure, sensor nodes in the network are divided into clusters. Secondly, sensor data is modified by each node according to the privacy-preserving configuration message received from the sink. Subsequently, intra- and inter-cluster data aggregation is employed during the big sensor data reporting phase to reduce energy consumption. Lastly, aggregated results are recovered by the sink to complete the privacy-preserving big data aggregation. Simulation results validate the efficacy and scalability of Sca-PBDA and show that the big sensor data generated by large-scale WSNs is efficiently aggregated to reduce network resource consumption and the sensor data privacy is effectively protected to meet the ever-growing application requirements.
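
    The abstract does not disclose Sca-PBDA's actual perturbation scheme, so the sketch below only illustrates the general pattern such protocols follow: the sink distributes per-node additive masks that cancel in aggregate, every hop sees only masked values, and the sink still recovers the exact total. The mask construction and all values here are illustrative assumptions, not the paper's method.

        import random

        def make_masks(n, modulus=2**32, seed=42):
            """Per-node additive masks whose sum is 0 (mod modulus), playing the
            role of the sink's privacy-preserving configuration message."""
            rng = random.Random(seed)
            masks = [rng.randrange(modulus) for _ in range(n - 1)]
            masks.append((-sum(masks)) % modulus)    # last mask cancels the rest
            return masks

        modulus = 2**32
        readings = [17, 42, 8, 23, 15]                     # raw values stay local
        masks = make_masks(len(readings), modulus)
        reported = [(r + m) % modulus for r, m in zip(readings, masks)]

        # Intra- and inter-cluster aggregation only ever sees masked values...
        aggregate = sum(reported) % modulus
        # ...yet the masks cancel, so the sink recovers the true total.
        print(aggregate == sum(readings))                  # True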

  4. Ethische aspecten van big data

    NARCIS (Netherlands)

    N. (Niek) van Antwerpen; Klaas Jan Mollema

    2017-01-01

    Big data has not only led to challenging technical questions; it also comes with all kinds of new ethical and moral issues. To handle big data responsibly, these issues must be considered as well, because poor use of data can have adverse consequences for

  5. Epidemiology in wonderland: Big Data and precision medicine.

    Science.gov (United States)

    Saracci, Rodolfo

    2018-03-01

    Big Data and precision medicine, two major contemporary challenges for epidemiology, are critically examined from two different angles. In Part 1, Big Data collected for research purposes (Big research Data) and Big Data used for research although collected for other primary purposes (Big secondary Data) are discussed in the light of the fundamental common requirement of data validity, prevailing over "bigness". Precision medicine is treated by developing the key point that high relative risks are as a rule required to make a variable or combination of variables suitable for prediction of disease occurrence, outcome or response to treatment; the commercial proliferation of allegedly predictive tests of unknown or poor validity is commented upon. Part 2 proposes a "wise epidemiology" approach to: (a) choosing, in a context imprinted by Big Data and precision medicine, epidemiological research projects actually relevant to population health, (b) training epidemiologists, (c) investigating the impact on clinical practices and the doctor-patient relation of the influx of Big Data and computerized medicine and (d) clarifying whether today "health" may be redefined, as some maintain, in purely technological terms.

  6. Big Data and Analytics in Healthcare.

    Science.gov (United States)

    Tan, S S-L; Gao, G; Koch, S

    2015-01-01

    This editorial is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". The amount of data being generated in the healthcare industry is growing at a rapid rate. This has generated immense interest in leveraging the availability of healthcare data (and "big data") to improve health outcomes and reduce costs. However, the nature of healthcare data, and especially big data, presents unique challenges in processing and analyzing big data in healthcare. This Focus Theme aims to disseminate some novel approaches to address these challenges. More specifically, approaches ranging from efficient methods of processing large clinical data to predictive models that could generate better predictions from healthcare data are presented.

  7. Big Data for Business Ecosystem Players

    Directory of Open Access Journals (Sweden)

    Perko Igor

    2016-06-01

    Full Text Available In this research, some of the most promising Big Data usage domains are connected with distinct player groups found in the business ecosystem. Literature analysis is used to identify the state of the art of Big Data related research in the major domains of its use, namely individual marketing, health treatment, work opportunities, financial services, and security enforcement. System theory was used to identify the business ecosystem's major player types disrupted by Big Data: individuals, small and mid-sized enterprises, large organizations, information providers, and regulators. Relationships between the domains and players were explained through new Big Data opportunities and threats and by players’ responsive strategies. System dynamics was used to visualize relationships in the provided model.

  8. "Big data" in economic history.

    Science.gov (United States)

    Gutmann, Myron P; Merchant, Emily Klancher; Roberts, Evan

    2018-03-01

    Big data is an exciting prospect for the field of economic history, which has long depended on the acquisition, keying, and cleaning of scarce numerical information about the past. This article examines two areas in which economic historians are already using big data - population and environment - discussing ways in which increased frequency of observation, denser samples, and smaller geographic units allow us to analyze the past with greater precision and often to track individuals, places, and phenomena across time. We also explore promising new sources of big data: organically created economic data, high resolution images, and textual corpora.

  9. Big Data Knowledge in Global Health Education.

    Science.gov (United States)

    Olayinka, Olaniyi; Kekeh, Michele; Sheth-Chandra, Manasi; Akpinar-Elci, Muge

    The ability to synthesize and analyze massive amounts of data is critical to the success of organizations, including those that involve global health. As countries become highly interconnected, increasing the risk for pandemics and outbreaks, the demand for big data is likely to increase. This requires a global health workforce that is trained in the effective use of big data. To assess implementation of big data training in global health, we conducted a pilot survey of members of the Consortium of Universities of Global Health. More than half the respondents did not have a big data training program at their institution. Additionally, the majority agreed that big data training programs will improve global health deliverables, among other favorable outcomes. Given the observed gap and benefits, global health educators may consider investing in big data training for students seeking a career in global health. Copyright © 2017 Icahn School of Medicine at Mount Sinai. Published by Elsevier Inc. All rights reserved.

  10. Big data for bipolar disorder.

    Science.gov (United States)

    Monteith, Scott; Glenn, Tasha; Geddes, John; Whybrow, Peter C; Bauer, Michael

    2016-12-01

    The delivery of psychiatric care is changing with a new emphasis on integrated care, preventative measures, population health, and the biological basis of disease. Fundamental to this transformation are big data and advances in the ability to analyze these data. The impact of big data on the routine treatment of bipolar disorder today and in the near future is discussed, with examples that relate to health policy, the discovery of new associations, and the study of rare events. The primary sources of big data today are electronic medical records (EMR), claims, and registry data from providers and payers. In the near future, data created by patients from active monitoring, passive monitoring of Internet and smartphone activities, and from sensors may be integrated with the EMR. Diverse data sources from outside of medicine, such as government financial data, will be linked for research. Over the long term, genetic and imaging data will be integrated with the EMR, and there will be more emphasis on predictive models. Many technical challenges remain when analyzing big data that relates to size, heterogeneity, complexity, and unstructured text data in the EMR. Human judgement and subject matter expertise are critical parts of big data analysis, and the active participation of psychiatrists is needed throughout the analytical process.

  11. BIG DATA IN TAMIL: OPPORTUNITIES, BENEFITS AND CHALLENGES

    OpenAIRE

    R.S. Vignesh Raj; Babak Khazaei; Ashik Ali

    2015-01-01

    This paper gives an overall introduction to big data and has tried to introduce Big Data in Tamil. It discusses the potential opportunities, benefits and likely challenges from a very Tamil and Tamil Nadu perspective. The paper has also made an original contribution by proposing ‘big data’ terminology in Tamil. The paper further suggests a few areas to explore using big data in Tamil on the lines of the Tamil Nadu Government ‘vision 2023’. Whilst big data has something to offer everyone, it ...

  12. Big data in biomedicine.

    Science.gov (United States)

    Costa, Fabricio F

    2014-04-01

    The increasing availability and growth rate of biomedical information, also known as 'big data', provides an opportunity for future personalized medicine programs that will significantly improve patient care. Recent advances in information technology (IT) applied to biomedicine are changing the landscape of privacy and personal information, with patients getting more control of their health information. Conceivably, big data analytics is already impacting health decisions and patient care; however, specific challenges need to be addressed to integrate current discoveries into medical practice. In this article, I will discuss the major breakthroughs achieved in combining omics and clinical health data in terms of their application to personalized medicine. I will also review the challenges associated with using big data in biomedicine and translational science. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Big inquiry

    Energy Technology Data Exchange (ETDEWEB)

    Wynne, B [Lancaster Univ. (UK)]

    1979-06-28

    The recently published report entitled 'The Big Public Inquiry' from the Council for Science and Society and the Outer Circle Policy Unit is considered, with especial reference to any future enquiry which may take place into the first commercial fast breeder reactor. Proposals embodied in the report include stronger rights for objectors and an attempt is made to tackle the problem that participation in a public inquiry is far too late to be objective. It is felt by the author that the CSS/OCPU report is a constructive contribution to the debate about big technology inquiries but that it fails to understand the deeper currents in the economic and political structure of technology which so influence the consequences of whatever formal procedures are evolved.

  14. The Promise and Potential Perils of Big Data for Advancing Symptom Management Research in Populations at Risk for Health Disparities.

    Science.gov (United States)

    Bakken, Suzanne; Reame, Nancy

    2016-01-01

    Symptom management research is a core area of nursing science and one of the priorities for the National Institute of Nursing Research, which specifically focuses on understanding the biological and behavioral aspects of symptoms such as pain and fatigue, with the goal of developing new knowledge and new strategies for improving patient health and quality of life. The types and volume of data related to the symptom experience, symptom management strategies, and outcomes are increasingly accessible for research. Traditional data streams are now complemented by consumer-generated (i.e., quantified self) and "omic" data streams. Thus, the data available for symptom science can be considered big data. The purposes of this chapter are to (a) briefly summarize the current drivers for the use of big data in research; (b) describe the promise of big data and associated data science methods for advancing symptom management research; (c) explicate the potential perils of big data and data science from the perspective of the ethical principles of autonomy, beneficence, and justice; and (d) illustrate strategies for balancing the promise and the perils of big data through a case study of a community at high risk for health disparities. Big data and associated data science methods offer the promise of multidimensional data sources and new methods to address significant research gaps in symptom management. If nurse scientists wish to apply big data and data science methods to advance symptom management research and promote health equity, they must carefully consider both the promise and perils.

  15. Big data analytics with R and Hadoop

    CERN Document Server

    Prajapati, Vignesh

    2013-01-01

    Big Data Analytics with R and Hadoop is a tutorial-style book that focuses on all the powerful big data tasks that can be achieved by integrating R and Hadoop. This book is ideal for R developers who are looking for a way to perform big data analytics with Hadoop. It is also aimed at those who know Hadoop and want to build intelligent applications over big data with R packages. It would be helpful if readers have a basic knowledge of R.

  16. NASA's Big Data Task Force

    Science.gov (United States)

    Holmes, C. P.; Kinter, J. L.; Beebe, R. F.; Feigelson, E.; Hurlburt, N. E.; Mentzel, C.; Smith, G.; Tino, C.; Walker, R. J.

    2017-12-01

    Two years ago NASA established the Ad Hoc Big Data Task Force (BDTF - https://science.nasa.gov/science-committee/subcommittees/big-data-task-force), an advisory working group within the NASA Advisory Council system. The scope of the Task Force included all NASA Big Data programs, projects, missions, and activities. The Task Force focused on such topics as exploring the existing and planned evolution of NASA's science data cyber-infrastructure that supports broad access to data repositories for NASA Science Mission Directorate missions; best practices within NASA, other Federal agencies, private industry and research institutions; and Federal initiatives related to big data and data access. The BDTF has completed its two-year term and produced several recommendations plus four white papers for NASA's Science Mission Directorate. This presentation will discuss the activities and results of the Task Force, including summaries of key points from its focused study topics. The paper serves as an introduction to the papers following in this ESSI session.

  17. Big Data Technologies

    Science.gov (United States)

    Bellazzi, Riccardo; Dagliati, Arianna; Sacchi, Lucia; Segagni, Daniele

    2015-01-01

    The so-called big data revolution provides substantial opportunities for diabetes management. At least 3 important directions are currently of great interest. First, the integration of different sources of information, from primary and secondary care to administrative information, may allow depicting a novel view of patients’ care processes and of single patients’ behaviors, taking into account the multifaceted nature of chronic care. Second, the availability of novel diabetes technologies, able to gather large amounts of real-time data, requires the implementation of distributed platforms for data analysis and decision support. Finally, the inclusion of geographical and environmental information into such complex IT systems may further increase the capability of interpreting the data gathered and of extracting new knowledge from them. This article reviews the main concepts and definitions related to big data, presents some efforts in health care, and discusses the potential role of big data in diabetes care. Finally, as an example, it describes the research efforts carried out in the MOSAIC project, funded by the European Commission. PMID:25910540

  18. Eksplorasi Teknologi Big Data Hadoop Untuk Sistem Aplikasi Berbasis Komunitas

    Directory of Open Access Journals (Sweden)

    Gede Karya

    2017-11-01

    Full Text Available In 2014, a mobile-cloud-based bookkeeping application for micro and small enterprises (UMK) was developed. The application was built with Android-based mobile technology, web and web service technologies, and uses a MySQL database as the back-end. With a population of 55.1 million micro enterprises in Indonesia, and growing, the UMK bookkeeping application could potentially be used by many users. This creates a need for data management services that are very large in both volume and growth. The back-end must therefore be prepared with big data processing technology to guarantee the availability and reliability of the service to UMK users. This paper focuses on exploring the Hadoop big data technology that is currently widely applied in community-based applications such as Google, Facebook, Twitter, and Amazon. The discussion begins with a study of Hadoop and its ecosystem, and then formulates an adoption pattern for community-based applications. The pattern and technology are then applied to develop the back-end of the mobile-cloud-based UMK bookkeeping application. The results of the study and its application show that Hadoop, in particular HBase, can be adopted for the UMK bookkeeping application. To simplify access and minimize modification effort, HBase can be accessed from the application through Apache Phoenix Java Database Connectivity (JDBC), one of several available options.
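
    As a hint of what the Phoenix route looks like in practice, the snippet below uses the phoenixdb Python client (an alternative to the JDBC driver named in the abstract) against a Phoenix Query Server in front of HBase; the endpoint URL and the ledger schema are invented for illustration.

        import phoenixdb

        # Assumes a Phoenix Query Server in front of HBase at this address.
        conn = phoenixdb.connect("http://localhost:8765/", autocommit=True)
        cur = conn.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS ledger ("
                    "id BIGINT PRIMARY KEY, shop VARCHAR, amount DECIMAL)")
        cur.execute("UPSERT INTO ledger VALUES (1, 'warung_a', 125000)")  # Phoenix uses UPSERT
        cur.execute("UPSERT INTO ledger VALUES (2, 'warung_a', 80000)")
        cur.execute("SELECT shop, SUM(amount) FROM ledger GROUP BY shop")
        print(cur.fetchall())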

  19. Big Data, Data Analyst, and Improving the Competence of Librarian

    Directory of Open Access Journals (Sweden)

    Albertus Pramukti Narendra

    2018-01-01

    Full Text Available The issue of Big Data was already raised by Fremont Rider, an American librarian from Wesleyan University, in 1944. He predicted that the volume of American university collections would reach 200 million copies in 2040. As a result, it brings to the fore multiple issues such as big data users, storage capacity, and the need for data analysts. In Indonesia, data analyst is still a rare profession, and therefore urgently needed. One of its distinctive tasks is to conduct visual analyses of various data resources and to present the results visually as interesting knowledge; it becomes a science enlivened by interactive visualization (Thomas and Cook, 2005). In response to the issue, librarians are already equipped with basic information management skills. They can therefore see the opportunity and improve themselves as data analysts. In developed countries, it is common that librarians are also regarded as data analysts. They enhance themselves with the various skills required, such as cloud computing and smart computing. In the end, librarians with data analyst competency are well placed to extract and present complex data resources as “interesting and discernible” knowledge.

  20. The Berlin Inventory of Gambling behavior - Screening (BIG-S): Validation using a clinical sample.

    Science.gov (United States)

    Wejbera, Martin; Müller, Kai W; Becker, Jan; Beutel, Manfred E

    2017-05-18

    Published diagnostic questionnaires for gambling disorder in German are either based on DSM-III criteria or focus on aspects other than life time prevalence. This study was designed to assess the usability of the DSM-IV criteria based Berlin Inventory of Gambling Behavior Screening tool in a clinical sample and adapt it to DSM-5 criteria. In a sample of 432 patients presenting for behavioral addiction assessment at the University Medical Center Mainz, we checked the screening tool's results against clinical diagnosis and compared a subsample of n=300 clinically diagnosed gambling disorder patients with a comparison group of n=132. The BIG-S produced a sensitivity of 99.7% and a specificity of 96.2%. The instrument's unidimensionality and the diagnostic improvements of DSM-5 criteria were verified by exploratory and confirmatory factor analysis as well as receiver operating characteristic analysis. The BIG-S is a reliable and valid screening tool for gambling disorder and demonstrated its concise and comprehensible operationalization of current DSM-5 criteria in a clinical setting.

  1. Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution.

    Science.gov (United States)

    Sebaa, Abderrazak; Chikh, Fatima; Nouicer, Amina; Tari, AbdelKamel

    2018-02-19

    The huge increase in medical devices and clinical applications which generate enormous data has raised a big issue in managing, processing, and mining this massive amount of data. Indeed, traditional data warehousing frameworks cannot be effective when managing the volume, variety, and velocity of current medical applications. As a result, several data warehouses face many issues over medical data and many challenges need to be addressed. New solutions have emerged, and Hadoop is one of the best examples; it can be used to process these streams of medical data. However, without an efficient system design and architecture, the performance gains will not be significant and valuable for medical managers. In this paper, we provide a short review of the literature about research issues of traditional data warehouses and we present some important Hadoop-based data warehouses. In addition, a Hadoop-based architecture and a conceptual data model for designing a medical Big Data warehouse are given. In our case study, we provide implementation details of a big data warehouse based on the proposed architecture and data model on the Apache Hadoop platform to ensure an optimal allocation of health resources.
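
    One way to query such a Hadoop-based warehouse from Python is through the PyHive client against a HiveServer2 endpoint; the host and the star-schema table names below are invented for illustration and are not the schema proposed in the paper.

        from pyhive import hive

        conn = hive.connect(host="warehouse.example.org", port=10000, database="health")
        cur = conn.cursor()
        # Hypothetical star schema: an admissions fact table joined to a region dimension.
        cur.execute("""
            SELECT r.region_name, COUNT(*) AS admissions
            FROM fact_admission f JOIN dim_region r ON f.region_id = r.region_id
            GROUP BY r.region_name
            ORDER BY admissions DESC
        """)
        for region, n in cur.fetchall():
            print(region, n)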

  2. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was made public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total number of 4,861 individuals (= 1,417 + 940 + 2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analyzing each of the three data sets separately. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.
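
    The integration step is described only as being done "by use of information on the reference human genome sequence"; a toy version of that idea is to key each cohort's variant table on reference coordinates and outer-join them, as in the pandas sketch below (all column names and values are invented).

        import pandas as pd

        hapmap = pd.DataFrame({"chrom": ["1", "1"], "pos": [1000, 2000],
                               "alt": ["A", "G"], "freq_hapmap": [0.12, 0.40]})
        hgdp = pd.DataFrame({"chrom": ["1", "2"], "pos": [1000, 500],
                             "alt": ["A", "T"], "freq_hgdp": [0.10, 0.05]})
        kg = pd.DataFrame({"chrom": ["1", "1"], "pos": [1000, 2000],
                           "alt": ["A", "G"], "freq_1kg": [0.11, 0.38]})

        # Reference coordinates (chrom, pos, alt) act as the shared key.
        merged = (hapmap
                  .merge(hgdp, on=["chrom", "pos", "alt"], how="outer")
                  .merge(kg, on=["chrom", "pos", "alt"], how="outer"))
        print(merged)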

  3. Traffic information computing platform for big data

    Energy Technology Data Exchange (ETDEWEB)

    Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying; Zheng, Xibin; Liu, Yan; Dai, Jiting; Kang, Jun [Chang'an University School of Information Engineering, Xi'an, China and Shaanxi Engineering and Technical Research Center for Road and Traffic Detection, Xi'an (China)]

    2014-10-06

    The big data environment creates the data conditions for improving the quality of traffic information services. The goal of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technological characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee traffic safety and efficient operation, and enables more intelligent and personalized traffic information services for traffic information users.

  4. Traffic information computing platform for big data

    International Nuclear Information System (INIS)

    Duan, Zongtao; Li, Ying; Zheng, Xibin; Liu, Yan; Dai, Jiting; Kang, Jun

    2014-01-01

    The big data environment creates the data conditions for improving the quality of traffic information services. The goal of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technological characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee traffic safety and efficient operation, and enables more intelligent and personalized traffic information services for traffic information users.

  5. Fremtidens landbrug bliver big business

    DEFF Research Database (Denmark)

    Hansen, Henning Otte

    2016-01-01

    Agriculture's external conditions and competitive environment are changing, and this will necessitate a development towards "big business", in which farms become even larger, more industrialized and more concentrated. Big business will become a dominant development in Danish agriculture - but not the only one...

  6. Quantum nature of the big bang.

    Science.gov (United States)

    Ashtekar, Abhay; Pawlowski, Tomasz; Singh, Parampreet

    2006-04-14

    Some long-standing issues concerning the quantum nature of the big bang are resolved in the context of homogeneous isotropic models with a scalar field. Specifically, the known results on the resolution of the big-bang singularity in loop quantum cosmology are significantly extended as follows: (i) the scalar field is shown to serve as an internal clock, thereby providing a detailed realization of the "emergent time" idea; (ii) the physical Hilbert space, Dirac observables, and semiclassical states are constructed rigorously; (iii) the Hamiltonian constraint is solved numerically to show that the big bang is replaced by a big bounce. Thanks to the nonperturbative, background independent methods, unlike in other approaches the quantum evolution is deterministic across the deep Planck regime.

  7. Mentoring in Schools: An Impact Study of Big Brothers Big Sisters School-Based Mentoring

    Science.gov (United States)

    Herrera, Carla; Grossman, Jean Baldwin; Kauh, Tina J.; McMaken, Jennifer

    2011-01-01

    This random assignment impact study of Big Brothers Big Sisters School-Based Mentoring involved 1,139 9- to 16-year-old students in 10 cities nationwide. Youth were randomly assigned to either a treatment group (receiving mentoring) or a control group (receiving no mentoring) and were followed for 1.5 school years. At the end of the first school…

  8. Big Data in Designing Clinical Trials: Opportunities and Challenges.

    Science.gov (United States)

    Mayo, Charles S; Matuszak, Martha M; Schipper, Matthew J; Jolly, Shruti; Hayman, James A; Ten Haken, Randall K

    2017-01-01

    Emergence of big data analytics resource systems (BDARSs) as a part of routine practice in Radiation Oncology is on the horizon. Gradually, individual researchers, vendors, and professional societies are leading initiatives to create and demonstrate use of automated systems. What are the implications for the design of clinical trials as these systems emerge? Gold-standard randomized controlled trials (RCTs) have high internal validity for the patients and settings fitting the constraints of the trial, but also have limitations including: reproducibility, generalizability to routine practice, infrequent external validation, selection bias, characterization of confounding factors, ethics, and use for rare events. BDARSs present opportunities to augment and extend RCTs. Preliminary modeling using single- and multi-institutional BDARSs may lead to better design and less cost. Standardizations in data elements, clinical processes, and nomenclatures used to decrease variability and increase veracity, needed for automation and multi-institutional data pooling in BDARSs, also support the ability to add clinical validation phases to clinical trial design and increase participation. However, volume and variety in BDARSs present other technical, policy, and conceptual challenges, including applicable statistical concepts and cloud-based technologies. In this summary, we will examine both the opportunities and the challenges for the use of big data in the design of clinical trials.

  9. Big Data in Designing Clinical Trials: Opportunities and Challenges

    Directory of Open Access Journals (Sweden)

    Charles S. Mayo

    2017-08-01

    Full Text Available Emergence of big data analytics resource systems (BDARSs) as a part of routine practice in Radiation Oncology is on the horizon. Gradually, individual researchers, vendors, and professional societies are leading initiatives to create and demonstrate use of automated systems. What are the implications for the design of clinical trials as these systems emerge? Gold-standard randomized controlled trials (RCTs) have high internal validity for the patients and settings fitting the constraints of the trial, but also have limitations including: reproducibility, generalizability to routine practice, infrequent external validation, selection bias, characterization of confounding factors, ethics, and use for rare events. BDARSs present opportunities to augment and extend RCTs. Preliminary modeling using single- and multi-institutional BDARSs may lead to better design and less cost. Standardizations in data elements, clinical processes, and nomenclatures used to decrease variability and increase veracity, needed for automation and multi-institutional data pooling in BDARSs, also support the ability to add clinical validation phases to clinical trial design and increase participation. However, volume and variety in BDARSs present other technical, policy, and conceptual challenges, including applicable statistical concepts and cloud-based technologies. In this summary, we will examine both the opportunities and the challenges for the use of big data in the design of clinical trials.

  10. Big data processing in the cloud - Challenges and platforms

    Science.gov (United States)

    Zhelev, Svetoslav; Rozeva, Anna

    2017-12-01

    Choosing the appropriate architecture and technologies for a big data project is a difficult task, which requires extensive knowledge of both the problem domain and the big data landscape. The paper analyzes the main big data architectures and the most widely implemented technologies for processing and persisting big data. Clouds provide dynamic resource scaling, which makes them a natural fit for big data applications. Basic cloud computing service models are presented. Two architectures for processing big data are discussed, the Lambda and Kappa architectures. Technologies for big data persistence are presented and analyzed. Stream processing, as the most important and most difficult aspect to manage, is outlined. The paper highlights the main advantages of the cloud and potential problems.
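
    A minimal sketch of the Lambda idea discussed above, with in-memory counters standing in for the batch, speed and serving layers; the event log and view names are invented. A Kappa design would drop the batch recomputation and rebuild views by replaying the same stream.

        from collections import Counter

        master = []              # immutable, append-only event log (batch input)
        batch_view = Counter()   # periodically recomputed over all of `master`
        speed_view = Counter()   # incremental view of events since the last batch run

        def ingest(event):
            master.append(event)
            speed_view[event] += 1          # speed layer: cheap real-time update

        def run_batch():
            global batch_view, speed_view
            batch_view = Counter(master)    # full recomputation over all data
            speed_view = Counter()          # batch has caught up; reset the delta

        def query(key):
            # Serving layer merges the (stale) batch view with the real-time delta.
            return batch_view[key] + speed_view[key]

        for e in ["click", "click", "view"]:
            ingest(e)
        run_batch()
        ingest("click")
        print(query("click"))               # 3: two from batch, one from speed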

  11. Ethics and Epistemology in Big Data Research.

    Science.gov (United States)

    Lipworth, Wendy; Mason, Paul H; Kerridge, Ian; Ioannidis, John P A

    2017-12-01

    Biomedical innovation and translation are increasingly emphasizing research using "big data." The hope is that big data methods will both speed up research and make its results more applicable to "real-world" patients and health services. While big data research has been embraced by scientists, politicians, industry, and the public, numerous ethical, organizational, and technical/methodological concerns have also been raised. With respect to technical and methodological concerns, there is a view that these will be resolved through sophisticated information technologies, predictive algorithms, and data analysis techniques. While such advances will likely go some way towards resolving technical and methodological issues, we believe that the epistemological issues raised by big data research have important ethical implications and raise questions about the very possibility of big data research achieving its goals.

  12. Precise Plan in the analysis of volume precision in Synergy™ conebeam CT image

    International Nuclear Information System (INIS)

    Bai Sen; Xu Qingfeng; Zhong Renming; Jiang Xiaoqin; Jiang Qingfeng; Xu Feng

    2007-01-01

    Objective: To establish a method for checking the volume precision of Synergy™ cone-beam CT images. Methods: Known phantoms (big, middle and small spheres, cubes and a cuneiform cavity) were scanned with cone-beam CT at different positions (the CBCT centre and 5, 8 and 10 cm away from the centre along the accelerator G-T direction), and the phantom volumes in the reconstructed images were measured. The volumes measured with Synergy™ cone-beam CT were then compared with fan-beam CT results and nominal values. Results: The middle spheres showed a 1.5% discrepancy between nominal and measured average values at the CBCT centre and at 5 and 8 cm from the centre along the accelerator G-T direction. The small spheres showed a discrepancy of 8.1%, the big cube 0.8% and the small cube 2.9% between nominal and measured average values at the CBCT centre and at 5, 8 and 10 cm from the centre along the accelerator G-T direction. Conclusion: Within the valid scan range of Synergy™ cone-beam CT, reconstruction precision is independent of the distance from the centre. (authors)
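
    For reference, the quoted discrepancies are relative differences between nominal phantom volumes and the mean volumes measured on the reconstructed images; a small Python version of that comparison, with an invented example value, is:

        def pct_discrepancy(nominal, measured):
            """Relative nominal-vs-measured volume difference, in percent."""
            return abs(measured - nominal) / nominal * 100

        # Hypothetical example: a 4.19 cm^3 sphere measured as 4.25 cm^3 on CBCT.
        print(round(pct_discrepancy(4.19, 4.25), 1), "%")   # -> 1.4 %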

  13. Digital curation: a proposal of a semi-automatic digital object selection-based model for digital curation in Big Data environments

    Directory of Open Access Journals (Sweden)

    Moisés Lima Dutra

    2016-08-01

    Full Text Available Introduction: This work presents a new approach to Digital Curation from a Big Data perspective. Objective: The objective is to propose techniques for digital curation for selecting and evaluating digital objects that take into account the volume, velocity, variety, veracity, and value of the data collected from multiple knowledge domains. Methodology: This is exploratory research of an applied nature, which addresses the research problem in a qualitative way. Heuristics allow this semi-automatic process to be carried out either by human curators or by software agents. Results: As a result, a model was proposed for searching, processing, evaluating and selecting digital objects to be processed by digital curation. Conclusions: It is possible to use Big Data environments as a source of information resources for Digital Curation; besides, Big Data techniques and tools can support the search and selection process of information resources by digital curation.

  14. Victoria Stodden: Scholarly Communication in the Era of Big Data and Big Computation

    OpenAIRE

    Stodden, Victoria

    2015-01-01

    Victoria Stodden gave the keynote address for Open Access Week 2015. "Scholarly communication in the era of big data and big computation" was sponsored by the University Libraries, Computational Modeling and Data Analytics, the Department of Computer Science, the Department of Statistics, the Laboratory for Interdisciplinary Statistical Analysis (LISA), and the Virginia Bioinformatics Institute. Victoria Stodden is an associate professor in the Graduate School of Library and Information Scien...

  15. Big data analytics a management perspective

    CERN Document Server

    Corea, Francesco

    2016-01-01

    This book is about innovation, big data, and data science seen from a business perspective. Big data is a buzzword nowadays, and there is a growing necessity among practitioners to understand the phenomenon better, starting from a clearly stated definition. This book aims to be a starting point for executives who want (and need) to keep pace with the technological breakthrough introduced by new analytical techniques and piles of data. Common myths about big data will be explained, and a series of different strategic approaches will be provided. By browsing the book, it will be possible to learn how to implement a big data strategy and how to use a maturity framework to monitor the progress of the data science team, as well as how to move forward from one stage to the next. Crucial challenges related to big data will be discussed, where some of them are more general - such as ethics, privacy, and ownership - while others concern more specific business situations (e.g., initial public offering, growth st...

  16. Human factors in Big Data

    NARCIS (Netherlands)

    Boer, J. de

    2016-01-01

    Since 2014 I have been involved in various (research) projects that try to make the hype around Big Data more concrete and tangible for industry and government. Big Data is about multiple sources of (real-time) data that can be analysed, transformed into information and used to make 'smart' decisions.

  17. Slaves to Big Data. Or Are We?

    Directory of Open Access Journals (Sweden)

    Mireille Hildebrandt

    2013-10-01

    Full Text Available In this contribution, the notion of Big Data is discussed in relation to the monetisation of personal data. The claim of some proponents, as well as adversaries, that Big Data implies that ‘n = all’, meaning that we no longer need to rely on samples because we have all the data, is scrutinised and found to be both overly optimistic and unnecessarily pessimistic. A set of epistemological and ethical issues is presented, focusing on the implications of Big Data for our perception, cognition, fairness, privacy and due process. The article then looks into the idea of user-centric personal data management to investigate to what extent it provides solutions for some of the problems triggered by the Big Data conundrum. Special attention is paid to the core principle of data protection legislation, namely purpose binding. Finally, this contribution seeks to inquire into the influence of Big Data politics on self, mind and society, and asks how we can prevent ourselves from becoming slaves to Big Data.

  18. Official statistics and Big Data

    Directory of Open Access Journals (Sweden)

    Peter Struijs

    2014-07-01

    Full Text Available The rise of Big Data changes the context in which organisations producing official statistics operate. Big Data provides opportunities, but in order to make optimal use of Big Data, a number of challenges have to be addressed. This stimulates increased collaboration between National Statistical Institutes, Big Data holders, businesses and universities. In time, this may lead to a shift in the role of statistical institutes in the provision of high-quality and impartial statistical information to society. In this paper, the changes in context, the opportunities, the challenges and the way to collaborate are addressed. The collaboration between the various stakeholders will involve each partner building on and contributing different strengths. For national statistical offices, traditional strengths include, on the one hand, the ability to collect data and combine data sources with statistical products and, on the other hand, their focus on quality, transparency and sound methodology. In the Big Data era of competing and multiplying data sources, they continue to have a unique knowledge of official statistical production methods. And their impartiality and respect for privacy as enshrined in law uniquely position them as a trusted third party. Based on this, they may advise on the quality and validity of information of various sources. By thus positioning themselves, they will be able to play their role as key information providers in a changing society.

  19. Big Data

    OpenAIRE

    Bútora, Matúš

    2017-01-01

    The aim of the bachelor thesis is to describe the Big Data issue and the OLAP aggregation operations for decision support that are applied to it using the Apache Hadoop technology. The major part of the thesis is devoted to describing this technology. The last chapter deals with the way the aggregation operations are applied and the issues involved in their implementation. An overall evaluation of the work and the possibilities for future use of the resulting system follow.

  20. Artificial intelligence and big data management: the dynamic duo for moving forward data centric sciences

    OpenAIRE

    Vargas Solar, Genoveva

    2017-01-01

    After vivid discussions led by the emergence of the buzzword “Big Data”, it seems that industry and academia have reached an objective understanding about data properties (volume, velocity, variety, veracity and value), the resources and “know how” it requires, and the opportunities it opens. Indeed, new applications promising fundamental changes in society, industry and science, include face recognition, machine translation, digital assistants, self-driving cars, ad-serving, chat-bots, perso...

  1. Big data in the construction industry: A review of present status, opportunities, and future trends

    OpenAIRE

    Bilal, M.; Oyedele, L.; Qadir, J.; Munir, K.; Ajayi, S. ed; Akinade, O.; Owolabi, H. A.; Alaka, H. A.; Pasha, M.

    2016-01-01

    The ability to process large amounts of data and to extract useful insights from data has revolutionised society. This phenomenon, dubbed Big Data, has applications for a wide assortment of industries, including the construction industry. The construction industry already deals with large volumes of heterogeneous data, which are expected to increase exponentially as technologies such as sensor networks and the Internet of Things are commoditised. In this paper, we present a detailed survey of...

  2. BigDansing

    KAUST Repository

    Khayyat, Zuhair; Ilyas, Ihab F.; Jindal, Alekh; Madden, Samuel; Ouzzani, Mourad; Papotti, Paolo; Quiané -Ruiz, Jorge-Arnulfo; Tang, Nan; Yin, Si

    2015-01-01

    of the underlying distributed platform. BigDansing turns these rules into a series of transformations that enable distributed computations and several optimizations, such as shared scans and specialized join operators. Experimental results on both synthetic

  3. Leveraging Mobile Network Big Data for Developmental Policy ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Some argue that big data and big data users offer advantages to generate evidence. ... Supported by IDRC, this research focused on transportation planning in urban ... Using mobile network big data for land use classification CPRsouth 2015.

  4. Practice Variation in Big-4 Transparency Reports

    DEFF Research Database (Denmark)

    Girdhar, Sakshi; Klarskov Jeppesen, Kim

    2018-01-01

    Purpose: The purpose of this paper is to examine the transparency reports published by the Big-4 public accounting firms in the UK, Germany and Denmark to understand the determinants of their content within the networks of big accounting firms. Design/methodology/approach: The study draws on a qualitative research approach, in which the content of transparency reports is analyzed and semi-structured interviews are conducted with key people from the Big-4 firms who are responsible for developing the transparency reports. Findings: The findings show that the content of transparency reports is inconsistent and the transparency reporting practice is not uniform within the Big-4 networks. Differences were found in the way in which the transparency reporting practices are coordinated globally by the respective central governing bodies of the Big-4. The content of the transparency reports

  5. Was the big bang hot

    International Nuclear Information System (INIS)

    Wright, E.L.

    1983-01-01

    The author considers experiments to confirm the substantial deviations from a Planck curve in the Woody and Richards spectrum of the microwave background, and a search for conducting needles in our galaxy. Spectral deviations and needle-shaped grains are expected for a cold Big Bang, but are not required by a hot Big Bang. (Auth.)

  6. Passport to the Big Bang

    CERN Multimedia

    De Melis, Cinzia

    2013-01-01

    On 2 June 2013, CERN inaugurates the Passport to the Big Bang project at a major public event. Poster and programme. On 2 June 2013 CERN launches a scientific tourist trail through the Pays de Gex and the Canton of Geneva known as the Passport to the Big Bang. Poster and Programme.

  7. Keynote: Big Data, Big Opportunities

    OpenAIRE

    Borgman, Christine L.

    2014-01-01

    The enthusiasm for big data is obscuring the complexity and diversity of data in scholarship and the challenges for stewardship. Inside the black box of data are a plethora of research, technology, and policy issues. Data are not shiny objects that are easily exchanged. Rather, data are representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship. Data practices are local, varying from field to field, individual to indiv...

  8. Evaluation of Big Data and Innovation Interaction in Increase Supply Chain Competencies

    Directory of Open Access Journals (Sweden)

    Zumrut Ecevit Sati

    2017-12-01

    Full Text Available In business today, much value lies in uncovering meaningful relationships, patterns and trends from the huge stores of data that are now available. The explosion in the diversity and volume of data coming from enterprise content and application data, social media, sensors and third-party streams is significantly changing the ways companies and their customers interact. This pressure is felt considerably more in the management of innovation, where companies must develop the capability to integrate the supply chain and match the right methods with the right information. This situation has directed companies towards using “big data” to manage both their structured and unstructured data. Big data, information held on a vast scale, holds significant potential in its transparency and convenience. A balanced approach to the use of internal and external information, supported by business analytics, can improve the capability to predict future competence, provide the all-important “big picture”, and give businesses more in-depth information about how best to reach their customers. Improved communication and information links between supply chain partners may create major sources of information by bringing together internal and external resources for customers, partners, stakeholders and suppliers in managing innovation. This study aims to provide an extensive literature review on the interaction of innovation and big data in increasing supply chain competencies, to study the problems, obstacles and driving forces for such interactions, and to consider projections for the future through the application of technology-based methods.

  9. Integrating R and Hadoop for Big Data Analysis

    OpenAIRE

    Bogdan Oancea; Raluca Mariana Dragoescu

    2014-01-01

    Analyzing and working with big data could be very difficult using classical means like relational database management systems or desktop software packages for statistics and visualization. Instead, big data requires large clusters with hundreds or even thousands of computing nodes. Official statistics is increasingly considering big data for deriving new statistics because big data sources could produce more relevant and timely statistics than traditional sources. One of the software tools ...

  10. The challenges of big data.

    Science.gov (United States)

    Mardis, Elaine R

    2016-05-01

    The largely untapped potential of big data analytics has created a feeding frenzy, fueled by the production of many next-generation-sequencing-based data sets that seek to answer long-held questions about the biology of human diseases. Although these approaches are likely to be a powerful means of revealing new biological insights, there are a number of substantial challenges that currently hamper efforts to harness the power of big data. This Editorial outlines several such challenges as a means of illustrating that the path to big data revelations is paved with perils that the scientific community must overcome to pursue this important quest. © 2016. Published by The Company of Biologists Ltd.

  11. Big Data Usage Patterns in the Health Care Domain: A Use Case Driven Approach Applied to the Assessment of Vaccination Benefits and Risks

    Science.gov (United States)

    Liyanage, H.; Liaw, S-T.; Kuziemsky, C.; Mold, F.; Krause, P.; Fleming, D.; Jones, S.

    2014-01-01

    Background: Generally, the benefits and risks of vaccines can be determined from studies carried out as part of regulatory compliance, followed by surveillance of routine data; however, there are some rarer and longer-term events that require new methods. Big data generated by increasingly affordable personalised computing, and from pervasive computing devices, is growing rapidly, and low-cost, high-volume cloud computing makes the processing of these data inexpensive. Objective: To describe how big data and related analytical methods might be applied to assess the benefits and risks of vaccines. Method: We reviewed the literature on the use of big data to improve health, applied to generic vaccine use cases that illustrate the benefits and risks of vaccination. We defined a use case as the interaction between a user and an information system to achieve a goal. We used flu vaccination and pre-school childhood immunisation as exemplars. Results: We reviewed three big data use cases relevant to assessing vaccine benefits and risks: (i) big data processing using crowd-sourcing, distributed big data processing, and predictive analytics; (ii) data integration from heterogeneous big data sources, e.g. the increasing range of devices in the “internet of things”; and (iii) real-time monitoring for the direct monitoring of epidemics as well as vaccine effects via social media and other data sources. Conclusions: Big data raises new ethical dilemmas, though its analysis methods can bring complementary real-time capabilities for monitoring epidemics and assessing vaccine benefit-risk balance. PMID:25123718

  12. Big³. Editorial.

    Science.gov (United States)

    Lehmann, C U; Séroussi, B; Jaulent, M-C

    2014-05-22

    To provide an editorial introduction to the 2014 IMIA Yearbook of Medical Informatics with an overview of the content, the new publishing scheme, and the upcoming 25th anniversary. A brief overview of the 2014 special topic, Big Data - Smart Health Strategies, and an outline of the novel publishing model are provided in conjunction with a call for proposals to celebrate the 25th anniversary of the Yearbook. 'Big Data' has become the latest buzzword in informatics and promises new approaches and interventions that can improve health, well-being, and quality of life. This edition of the Yearbook acknowledges the fact that we have just started to explore the opportunities that 'Big Data' will bring. However, it will become apparent to the reader that its pervasive nature has invaded all aspects of biomedical informatics - some to a higher degree than others. It was our goal to provide a comprehensive view of the state of 'Big Data' today, explore its strengths and weaknesses as well as its risks, discuss emerging trends, tools, and applications, and stimulate the development of the field through the aggregation of excellent survey papers and working group contributions on the topic. For the first time in its history, the IMIA Yearbook will be published in an open access online format, allowing a broader readership, especially in resource-poor countries. Also for the first time, thanks to the online format, the Yearbook will be published twice in the year, with two different tracks of papers. We anticipate that the important role of the IMIA Yearbook will further increase with these changes, just in time for its 25th anniversary in 2016.

  13. Cloud Based Big Data Infrastructure: Architectural Components and Automated Provisioning

    OpenAIRE

    Demchenko, Yuri; Turkmen, Fatih; Blanchet, Christophe; Loomis, Charles; Laat, Caees de

    2016-01-01

    This paper describes the general architecture and functional components of the cloud based Big Data Infrastructure (BDI). The proposed BDI architecture is based on the analysis of the emerging Big Data and data intensive technologies and supported by the definition of the Big Data Architecture Framework (BDAF) that defines the following components of the Big Data technologies: Big Data definition, Data Management including data lifecycle and data structures, Big Data Infrastructure (generical...

  14. Physics with Big Karl Brainstorming. Abstracts

    International Nuclear Information System (INIS)

    Machner, H.; Lieb, J.

    2000-08-01

    Before summarizing details of the meeting, a short description of the spectrometer facility Big Karl is given. The facility is essentially a new instrument using refurbished dipole magnets from its predecessor. The large-acceptance quadrupole magnets and the beam optics are new. Big Karl has a design very similar to the focussing spectrometers at MAMI (Mainz), AGOR (Groningen) and the high resolution spectrometer (HRS) in Hall A at Jefferson Laboratory, with ΔE/E = 10⁻⁴ but at a somewhat lower maximum momentum. The focal plane detectors, consisting of multiwire drift chambers and scintillating hodoscopes, are similar. Unlike HRS, Big Karl still needs Cerenkov counters and polarimeters in its focal plane, detectors which are necessary to perform some of the experiments proposed during the brainstorming. In addition, Big Karl allows emission angle reconstruction via track measurements in its focal plane with high resolution. In the following, the physics highlights and the proposed and potential experiments are summarized. During the meeting it became obvious that the physics to be explored at Big Karl can be grouped into five distinct categories, and this summary is organized accordingly. (orig.)

  15. Seed bank and big sagebrush plant community composition in a range margin for big sagebrush

    Science.gov (United States)

    Martyn, Trace E.; Bradford, John B.; Schlaepfer, Daniel R.; Burke, Ingrid C.; Laurenroth, William K.

    2016-01-01

    The potential influence of seed bank composition on range shifts of species due to climate change is unclear. Seed banks can provide a means of both species persistence in an area and local range expansion in the case of increasing habitat suitability, as may occur under future climate change. However, a mismatch between the seed bank and the established plant community may represent an obstacle to persistence and expansion. In big sagebrush (Artemisia tridentata) plant communities in Montana, USA, we compared the seed bank to the established plant community. There was less than a 20% similarity in the relative abundance of species between the established plant community and the seed bank. This difference was primarily driven by an overrepresentation of native annual forbs and an underrepresentation of big sagebrush in the seed bank compared to the established plant community. Even though we expect an increase in habitat suitability for big sagebrush under future climate conditions at our sites, the current mismatch between the plant community and the seed bank could impede big sagebrush range expansion into increasingly suitable habitat in the future.
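    The "less than 20% similarity" figure above is the kind of number a percentage-similarity (Renkonen) index produces: sum, over all species, the minimum relative abundance observed in the two communities. The sketch below illustrates this; the index choice and the abundance figures are our own illustrative assumptions, not values taken from the study.

    ```python
    # Percentage (Renkonen) similarity between two communities: the sum of
    # per-species minimum relative abundances. The index choice and the
    # numbers are illustrative assumptions, not data from the study.

    def renkonen_similarity(a, b):
        species = set(a) | set(b)
        total_a, total_b = sum(a.values()), sum(b.values())
        return sum(
            min(a.get(s, 0) / total_a, b.get(s, 0) / total_b) for s in species
        )

    seed_bank = {"annual_forb": 88, "big_sagebrush": 2, "grass": 10}
    plant_community = {"annual_forb": 5, "big_sagebrush": 65, "grass": 30}
    print(f"similarity: {renkonen_similarity(seed_bank, plant_community):.0%}")
    # -> 17%, i.e. under 20%, echoing the mismatch described above
    ```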

  16. Application and Prospect of Big Data in Water Resources

    Science.gov (United States)

    Xi, Danchi; Xu, Xinyi

    2017-04-01

    Because of developed information technology and affordable data storage, we have entered the era of data explosion. The term "Big Data" and the technology related to it have been created and are commonly applied in many fields. However, academic studies have only recently paid attention to Big Data applications in water resources. As a result, water resource Big Data technology has not been fully developed. This paper introduces the concept of Big Data and its key technologies, including the Hadoop system and MapReduce. In addition, this paper focuses on the significance of applying big data in water resources and summarizes prior research by others. Most studies in this field only set up a theoretical frame, but we define "Water Big Data" and explain its three-dimensional properties: the time dimension, the spatial dimension and the intelligent dimension. Based on HBase, a classification system for Water Big Data is introduced: hydrology data, ecology data and socio-economic data. Then, after analyzing the challenges in water resources management, a series of solutions using Big Data technologies, such as data mining and web crawlers, are proposed. Finally, the prospect of applying big data in water resources is discussed; it can be predicted that as Big Data technology keeps developing, "3D" (Data Driven Decision) will be used more in water resources management in the future.
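    As a quick illustration of the MapReduce model the paper builds on, the following self-contained sketch runs both phases in-process over hypothetical hydrology readings; a real Hadoop job would distribute the same map, shuffle and reduce steps across cluster nodes.

    ```python
    from collections import defaultdict

    # In-process illustration of MapReduce; station names and readings
    # are hypothetical. Hadoop would distribute these phases across nodes.
    records = [
        ("station-A", 12.4), ("station-B", 7.1),
        ("station-A", 13.0), ("station-B", 6.8),
    ]

    def map_phase(record):
        """Emit (key, value) pairs: here, (station, discharge reading)."""
        station, discharge = record
        yield station, discharge

    def reduce_phase(key, values):
        """Aggregate all values for one key: here, the mean discharge."""
        vals = list(values)
        return key, sum(vals) / len(vals)

    # Shuffle step: group intermediate pairs by key, as the framework would.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            groups[key].append(value)

    print([reduce_phase(k, v) for k, v in groups.items()])
    # [('station-A', 12.7), ('station-B', 6.95)]
    ```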

  17. Big Data in food and agriculture

    Directory of Open Access Journals (Sweden)

    Kelly Bronson

    2016-06-01

    Full Text Available Farming is undergoing a digital revolution. Our existing review of current Big Data applications in the agri-food sector has revealed several collection and analytics tools that may have implications for relationships of power between players in the food system (e.g. between farmers and large corporations). For example, who retains ownership of the data generated by applications like Monsanto Corporation's Weed I.D. “app”? Are there privacy implications with the data gathered by John Deere's precision agricultural equipment? Systematically tracing the digital revolution in agriculture, and charting the affordances as well as the limitations of Big Data applied to food and agriculture, should be a broad research goal for Big Data scholarship. Such a goal brings data scholarship into conversation with food studies and it allows for a focus on the material consequences of big data in society.

  18. Big data optimization recent developments and challenges

    CERN Document Server

    2016-01-01

    The main objective of this book is to provide the necessary background to work with big data by introducing some novel optimization algorithms and codes capable of working in the big data setting as well as introducing some applications in big data optimization for both academics and practitioners, and to benefit society, industry, academia, and government. Presenting applications in a variety of industries, this book will be useful for researchers aiming to analyse large-scale data. Several optimization algorithms for big data, including convergent parallel algorithms, the limited memory bundle algorithm, the diagonal bundle method, network analytics, and many more, are explored in this book.

  19. Una aproximación a Big Data = An approach to Big Data

    OpenAIRE

    Puyol Moreno, Javier

    2014-01-01

    Big Data can be considered a trend in the advance of technology that has opened the door to a new approach to understanding and decision-making, and is used to describe the enormous quantities of data (structured, unstructured and semi-structured) that would take too long and cost too much to load into a relational database for analysis. Thus, the concept of Big Data applies to all the information that cannot be processed or analyzed using traditional tools...

  20. Big Data Analytic, Big Step for Patient Management and Care in Puerto Rico.

    Science.gov (United States)

    Borrero, Ernesto E

    2018-01-01

    This letter provides an overview of the application of big data in the health care system to improve quality of care, including predictive modelling for risk and resource use, precision medicine and clinical decision support, quality of care and performance measurement, and public health and research applications, among others. The author delineates the tremendous potential for big data analytics and discusses how it can be successfully implemented in clinical practice, as an important component of a learning health-care system.

  1. The impact of information and communication technology on decision making process in the big data era

    Directory of Open Access Journals (Sweden)

    Lukić Jelena

    2014-01-01

    Full Text Available The information necessary to make important decisions is held at many different hierarchical levels in organizations, and management needs to answer the question of whether decisions should be centralized and made by top management or decentralized and made by the managers and employees of lower-level units. This question becomes more important in the big data era, which is characterized by the volume, velocity, and variety of data. The aim of this paper is to analyze whether information and communication technology leads to centralization or decentralization tendencies in organizations and to answer the question of what new challenges the decision-making process faces in the big data era. The conclusion is that information and communication technology provides all organizational levels with information that traditionally was used by only a few levels, reducing internal coordination costs and enabling organizations to allow decision making across a wider range of hierarchical levels. But the final allocation of decision rights depends on the knowledge of employees, especially in the big data era, where professionals with new knowledge and skills (known as data scientists) have become tremendously important.

  2. The Big Challenge in Big Earth Science Data: Maturing to Transdisciplinary Data Platforms that are Relevant to Government, Research and Industry

    Science.gov (United States)

    Wyborn, Lesley; Evans, Ben

    2016-04-01

    Collecting data for the Earth Sciences has a particularly long history going back centuries. Initially scientific data came only from simple human observations recorded by pen on paper. Scientific instruments soon supplemented data capture, and as these instruments became more capable (e.g., automation, more information captured, generation of digitally-born outputs), Earth Scientists entered the 'Big Data' era where progressively data became too big to store and process locally in the old-style vaults. To date, most funding initiatives for collection and storage of large-volume data sets in the Earth Sciences have been specialised within a single discipline (e.g., climate, geophysics, and Earth Observation) or specific to an individual institution. To undertake interdisciplinary research, it is hard for users to integrate data from these individual repositories, mainly due to limitations on physical access to/movement of the data, and/or data being organised without enough information to make sense of it without discipline-specialised knowledge. Smaller repositories have also gradually been seen as inefficient in terms of the cost to manage and access (including scarce skills) and effective implementation of new technology and techniques. Within the last decade, the trend is towards fewer and larger data repositories that increasingly are collocated with HPC/cloud resources. There has also been a growing recognition that digital data can be a valuable resource that can be reused and repurposed - publicly funded data from either the academic or government sector is seen as a shared resource, and efficiencies can be gained by co-location. These new, highly capable, 'transdisciplinary' data repositories are emerging as a fundamental 'infrastructure' both for research and other innovation. The sharing of academic and government data resources on the same infrastructures is enabling new research programmes that will enable integration beyond the traditional physical

  3. Big Data and historical social science

    Directory of Open Access Journals (Sweden)

    Peter Bearman

    2015-11-01

    Full Text Available “Big Data” can revolutionize historical social science if it arises from substantively important contexts and is oriented towards answering substantively important questions. Such data may be especially important for answering previously largely intractable questions about the timing and sequencing of events, and of event boundaries. That said, “Big Data” makes no difference for social scientists and historians whose accounts rest on narrative sentences. Since such accounts are the norm, the effects of Big Data on the practice of historical social science may be more limited than one might wish.

  4. Understanding the Performance of Low Power Raspberry Pi Cloud for Big Data

    Directory of Open Access Journals (Sweden)

    Wajdi Hajji

    2016-06-01

    Full Text Available Nowadays, Internet-of-Things (IoT) devices generate data at high speed and large volume. Often the data require real-time processing to support high system responsiveness, which can be supported by localised Cloud and/or Fog computing paradigms. However, there are considerably large deployments of IoT, such as sensor networks in remote areas where Internet connectivity is sparse, challenging the localised Cloud and/or Fog computing paradigms. With the advent of the Raspberry Pi, a credit-card-sized single board computer, there is a great opportunity to construct a low-cost, low-power portable cloud to support real-time data processing next to IoT deployments. In this paper, we extend our previous work on constructing a Raspberry Pi Cloud to study its feasibility for real-time big data analytics under realistic application-level workloads in both native and virtualised environments. We have extensively tested the performance of a single-node Raspberry Pi 2 Model B with httperf and a cluster of 12 nodes with Apache Spark and HDFS (Hadoop Distributed File System). Our results have demonstrated that our portable cloud is useful for supporting real-time big data analytics. On the other hand, our results have also unveiled that the overhead for CPU-bound workloads in a virtualised environment is surprisingly high, at 67.2%. We have found that, for big data applications, the virtualisation overhead is fractional for small jobs but becomes more significant for large jobs, up to 28.6%.
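    For readers unfamiliar with the kind of Spark workload benchmarked above, a minimal PySpark aggregation job is sketched below. The HDFS path and column names are hypothetical; the study's actual jobs are not reproduced here.

    ```python
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Minimal PySpark job of the sort a small Raspberry Pi cluster could run
    # over HDFS. The input path and column names are hypothetical.
    spark = SparkSession.builder.appName("pi-cluster-demo").getOrCreate()

    readings = spark.read.csv(
        "hdfs:///data/sensor_readings.csv",  # hypothetical HDFS path
        header=True,
        inferSchema=True,
    )

    # Per-sensor mean and maximum of a measurement column.
    summary = (
        readings.groupBy("sensor_id")
        .agg(F.avg("value").alias("mean_value"),
             F.max("value").alias("max_value"))
    )

    summary.show()
    spark.stop()
    ```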

  5. The Inverted Big-Bang

    OpenAIRE

    Vaas, Ruediger

    2004-01-01

    Our universe appears to have been created not out of nothing but from a strange space-time dust. Quantum geometry (loop quantum gravity) makes it possible to avoid the ominous beginning of our universe with its physically unrealistic (i.e. infinite) curvature, extreme temperature, and energy density. This could be the long sought after explanation of the big-bang and perhaps even opens a window into a time before the big-bang: Space itself may have come from an earlier collapsing universe tha...

  6. Minsky on "Big Government"

    Directory of Open Access Journals (Sweden)

    Daniel de Santana Vasconcelos

    2014-03-01

    Full Text Available This paper's objective is to assess, in light of the main works of Minsky, his view and analysis of what he called "Big Government" as that huge institution which, in parallel with the "Big Bank", was capable of ensuring stability in the capitalist system and regulating its inherently unstable financial system in the mid-20th century. In this work, we analyze how Minsky proposes an active role for the government in a complex economic system flawed by financial instability.

  7. Classical propagation of strings across a big crunch/big bang singularity

    International Nuclear Information System (INIS)

    Niz, Gustavo; Turok, Neil

    2007-01-01

    One of the simplest time-dependent solutions of M theory consists of nine-dimensional Euclidean space times 1+1-dimensional compactified Milne space-time. With a further modding out by Z₂, the space-time represents two orbifold planes which collide and re-emerge, a process proposed as an explanation of the hot big bang [J. Khoury, B. A. Ovrut, P. J. Steinhardt, and N. Turok, Phys. Rev. D 64, 123522 (2001).][P. J. Steinhardt and N. Turok, Science 296, 1436 (2002).][N. Turok, M. Perry, and P. J. Steinhardt, Phys. Rev. D 70, 106004 (2004).]. When the two planes are near, the light states of the theory consist of winding M2-branes, describing fundamental strings in a particular ten-dimensional background. They suffer no blue-shift as the M theory dimension collapses, and their equations of motion are regular across the transition from big crunch to big bang. In this paper, we study the classical evolution of fundamental strings across the singularity in some detail. We also develop a simple semiclassical approximation to the quantum evolution which allows one to compute the quantum production of excitations on the string and implement it in a simplified example

  8. Leveraging the big-data revolution: CMS is expanding capabilities to spur health system transformation.

    Science.gov (United States)

    Brennan, Niall; Oelschlaeger, Allison; Cox, Christine; Tavenner, Marilyn

    2014-07-01

    As the largest single payer for health care in the United States, the Centers for Medicare and Medicaid Services (CMS) generates enormous amounts of data. Historically, CMS has faced technological challenges in storing, analyzing, and disseminating this information because of its volume and privacy concerns. However, rapid progress in the fields of data architecture, storage, and analysis--the big-data revolution--over the past several years has given CMS the capabilities to use data in new and innovative ways. We describe the different types of CMS data being used both internally and externally, and we highlight a selection of innovative ways in which big-data techniques are being used to generate actionable information from CMS data more effectively. These include the use of real-time analytics for program monitoring and detecting fraud and abuse and the increased provision of data to providers, researchers, beneficiaries, and other stakeholders. Project HOPE—The People-to-People Health Foundation, Inc.

  9. PROCESSING BIG REMOTE SENSING DATA FOR FAST FLOOD DETECTION IN A DISTRIBUTED COMPUTING ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    A. Olasz

    2017-07-01

    Full Text Available The Earth observation (EO) missions of the space agencies and the space industry (ESA, NASA, national and commercial companies) are evolving as never before. These missions aim to develop and launch next-generation series of satellites and sensors and often provide huge amounts of data, even free of charge, to enable novel monitoring services. The wider geospatial sector is expected to handle new challenges in storing, processing and visualizing these geospatial data, which reach the level of Big Data in their volume, variety and velocity, along with the need for multi-source spatio-temporal geospatial data processing. Handling and analysis of remote sensing data has always been a cumbersome task due to the ever-increasing size and frequency of the collected information. This paper presents the achievements of the IQmulus EU FP7 research and development project with respect to the processing and analysis of geospatial big data in the context of flood and waterlogging detection.
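    To give a feel for the tile-parallel processing style such distributed pipelines rely on, the sketch below thresholds a water index over raster strips with Python's multiprocessing. The index (NDWI) and the threshold are illustrative assumptions, not the IQmulus processing chain itself.

    ```python
    import numpy as np
    from multiprocessing import Pool

    # Illustrative tile-parallel water detection; NDWI and the 0.2 threshold
    # are assumptions for this sketch, not the IQmulus pipeline.

    def detect_water(tile):
        green, nir = tile
        ndwi = (green - nir) / (green + nir + 1e-9)  # avoid divide-by-zero
        return ndwi > 0.2

    def split_into_strips(band, n):
        """Split a raster band into n horizontal strips for the workers."""
        return np.array_split(band, n)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        green = rng.random((1024, 1024), dtype=np.float32)  # stand-in bands
        nir = rng.random((1024, 1024), dtype=np.float32)

        tiles = list(zip(split_into_strips(green, 8), split_into_strips(nir, 8)))
        with Pool(processes=4) as pool:
            masks = pool.map(detect_water, tiles)

        water_mask = np.vstack(masks)
        print("water fraction:", water_mask.mean())
    ```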

  10. Big data from small data: data-sharing in the ‘long tail’ of neuroscience

    Science.gov (United States)

    Ferguson, Adam R; Nielson, Jessica L; Cragin, Melissa H; Bandrowski, Anita E; Martone, Maryann E

    2016-01-01

    The launch of the US BRAIN and European Human Brain Projects coincides with growing international efforts toward transparency and increased access to publicly funded research in the neurosciences. The need for data-sharing standards and neuroinformatics infrastructure is more pressing than ever. However, ‘big science’ efforts are not the only drivers of data-sharing needs, as neuroscientists across the full spectrum of research grapple with the overwhelming volume of data being generated daily and a scientific environment that is increasingly focused on collaboration. In this commentary, we consider the issue of sharing of the richly diverse and heterogeneous small data sets produced by individual neuroscientists, so-called long-tail data. We consider the utility of these data, the diversity of repositories and options available for sharing such data, and emerging best practices. We provide use cases in which aggregating and mining diverse long-tail data convert numerous small data sources into big data for improved knowledge about neuroscience-related disorders. PMID:25349910

  11. The Information Panopticon in the Big Data Era

    Directory of Open Access Journals (Sweden)

    Martin Berner

    2014-04-01

    Full Text Available Taking advantage of big data opportunities is challenging for traditional organizations. In this article, we take a panoptic view of big data – obtaining information from more sources and making it visible to all organizational levels. We suggest that big data requires the transformation from command and control hierarchies to post-bureaucratic organizational structures wherein employees at all levels can be empowered while simultaneously being controlled. We derive propositions that show how to best exploit big data technologies in organizations.

  12. Big Data Analytics for Demand Response: Clustering Over Space and Time

    Energy Technology Data Exchange (ETDEWEB)

    Chelmis, Charalampos [Univ. of Southern California, Los Angeles, CA (United States); Kolte, Jahanvi [Nirma Univ., Gujarat (India); Prasanna, Viktor K. [Univ. of Southern California, Los Angeles, CA (United States)

    2015-10-29

    The pervasive deployment of advanced sensing infrastructure in Cyber-Physical systems, such as the Smart Grid, has resulted in an unprecedented data explosion. Such data exhibit both large volume and high velocity, two of the three pillars of Big Data, and have a time-series notion, as datasets in this context typically consist of successive measurements made over a time interval. Time-series data can be valuable for data mining and analytics tasks such as identifying the “right” customers among a diverse population to target for Demand Response programs. However, time series are challenging to mine due to their high dimensionality. In this paper, we motivate this problem using a real application from the smart grid domain. We explore novel representations of time-series data for Big Data analytics, and propose a clustering technique for determining a natural segmentation of customers and identifying temporal consumption patterns. Our method is generalizable to large-scale, real-world scenarios, without making any assumptions about the data. We evaluate our technique using real datasets from smart meters, totaling ~18,200,000 data points, and show the efficacy of our technique in efficiently detecting the optimal number of clusters.
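    A common baseline for this kind of customer segmentation, not the authors' method (which makes no assumptions about the data), is k-means over load profiles with a score such as the silhouette used to pick the number of clusters. A minimal sketch on hypothetical smart-meter data:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    # Baseline segmentation sketch: cluster daily load profiles with k-means
    # and choose k by silhouette score. The data here are hypothetical.
    rng = np.random.default_rng(42)
    profiles = rng.random((500, 24))  # 500 customers x 24 hourly readings

    best_k, best_score = None, -1.0
    for k in range(2, 8):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
        score = silhouette_score(profiles, labels)
        if score > best_score:
            best_k, best_score = k, score

    print(f"chosen k={best_k}, silhouette={best_score:.3f}")
    ```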

  13. WE-H-BRB-00: Big Data in Radiation Oncology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2016-06-15

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data impact on patient quality care and enhancing potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: (1) to discuss current and future sources of big data for use in radiation oncology research; (2) to optimize our current data collection by adopting new strategies from outside radiation oncology; (3) to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing, NIH/NCI; Google Inc.

  14. WE-H-BRB-00: Big Data in Radiation Oncology

    International Nuclear Information System (INIS)

    2016-01-01

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data impact on patient quality care and enhancing potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: (1) to discuss current and future sources of big data for use in radiation oncology research; (2) to optimize our current data collection by adopting new strategies from outside radiation oncology; (3) to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing, NIH/NCI; Google Inc.

  15. De impact van Big Data op Internationale Betrekkingen

    NARCIS (Netherlands)

    Zwitter, Andrej

    Big Data changes our daily lives, but does it also change international politics? In this contribution, Andrej Zwitter (NGIZ chair at Groningen University) argues that Big Data impacts on international relations in ways that we only now start to understand. To comprehend how Big Data influences

  16. Big data and analytics strategic and organizational impacts

    CERN Document Server

    Morabito, Vincenzo

    2015-01-01

    This book presents and discusses the main strategic and organizational challenges posed by Big Data and analytics in a manner relevant to both practitioners and scholars. The first part of the book analyzes strategic issues relating to the growing relevance of Big Data and analytics for competitive advantage, which is also attributable to empowerment of activities such as consumer profiling, market segmentation, and development of new products or services. Detailed consideration is also given to the strategic impact of Big Data and analytics on innovation in domains such as government and education and to Big Data-driven business models. The second part of the book addresses the impact of Big Data and analytics on management and organizations, focusing on challenges for governance, evaluation, and change management, while the concluding part reviews real examples of Big Data and analytics innovation at the global level. The text is supported by informative illustrations and case studies, so that practitioners...

  17. Big Science and Long-tail Science

    CERN Document Server

    2008-01-01

    Jim Downing and I were privileged to be the guests of Salvatore Mele at CERN yesterday and to see the Atlas detector of the Large Hadron Collider. This is a wow experience - although I knew it was big, I hadn't realised how big.

  18. Big-Eyed Bugs Have Big Appetite for Pests

    Science.gov (United States)

    Many kinds of arthropod natural enemies (predators and parasitoids) inhabit crop fields in Arizona and can have a large negative impact on several pest insect species that also infest these crops. Geocoris spp., commonly known as big-eyed bugs, are among the most abundant insect predators in field c...

  19. Big Data - What is it and why it matters.

    Science.gov (United States)

    Tattersall, Andy; Grant, Maria J

    2016-06-01

    Big data, like MOOCs, altmetrics and open access, is a term that has been commonplace in the library community for some time; yet, despite its prevalence, many in the library and information sector remain unsure of the relationship between big data and their roles. This editorial explores what big data could mean for the day-to-day practice of health library and information workers, presenting examples of big data in action, considering the ethics of accessing big data sets, and the potential for new roles for library and information workers. © 2016 Health Libraries Group.

  20. Research on information security in big data era

    Science.gov (United States)

    Zhou, Linqi; Gu, Weihong; Huang, Cheng; Huang, Aijun; Bai, Yongbin

    2018-05-01

    Big data is becoming another hotspot in the field of information technology, after cloud computing and the Internet of Things. However, existing information security methods can no longer meet the information security requirements of the big data era. This paper analyzes the challenges to data security brought by big data and their causes, discusses the development trend of network attacks against the background of big data, and puts forward our own opinions on the development of security defense in technology, strategy and products.

  1. BIG DATA IN BUSINESS ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    Logica BANICA

    2015-06-01

    Full Text Available In recent years, dealing with a lot of data originating from social media sites and mobile communications, along with data from business environments and institutions, has led to the definition of a new concept, known as Big Data. The economic impact of the sheer amount of data produced in the last two years has increased rapidly. It is necessary to aggregate all types of data (structured and unstructured) in order to improve current transactions, to develop new business models, to provide a real image of supply and demand and thereby generate market advantages. So, the companies that turn to Big Data have a competitive advantage over other firms. Looking from the perspective of IT organizations, they must accommodate the storage and processing of Big Data, and provide analysis tools that are easily integrated into business processes. This paper aims to discuss aspects regarding the Big Data concept and the principles for building, organizing and analysing huge datasets in the business environment, offering a three-layer architecture based on actual software solutions. The article also refers to the graphical tools for exploring and representing unstructured data, Gephi and NodeXL.

  2. Fuzzy 2-partition entropy threshold selection based on Big Bang–Big Crunch Optimization algorithm

    Directory of Open Access Journals (Sweden)

    Baljit Singh Khehra

    2015-03-01

    Full Text Available The fuzzy 2-partition entropy approach has been widely used to select threshold values for image segmentation. This approach uses two parameterized fuzzy membership functions to form a fuzzy 2-partition of the image. The optimal threshold is selected by searching for an optimal combination of parameters of the membership functions such that the entropy of the fuzzy 2-partition is maximized. In this paper, a new fuzzy 2-partition entropy thresholding approach based on the Big Bang–Big Crunch Optimization (BBBCO) technique is proposed. The new thresholding approach is called the BBBCO-based fuzzy 2-partition entropy thresholding algorithm. BBBCO is used to search for an optimal combination of parameters of the membership functions that maximizes the entropy of the fuzzy 2-partition. BBBCO is inspired by the theory of the evolution of the universe, namely the Big Bang and Big Crunch Theory. The proposed algorithm is tested on a number of standard test images. For comparison, three different algorithms, including Genetic Algorithm (GA)-based, Biogeography-based Optimization (BBO)-based and recursive approaches, are also implemented. From the experimental results, it is observed that the proposed algorithm is more effective than the GA-based, BBO-based and recursion-based approaches.
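    The Big Bang–Big Crunch loop itself is compact: scatter candidate solutions around a centre (Big Bang), collapse them to a fitness-weighted centre of mass (Big Crunch), and shrink the scatter each iteration. The sketch below shows that generic loop with a toy objective standing in for the paper's fuzzy 2-partition entropy.

    ```python
    import numpy as np

    # Generic Big Bang-Big Crunch (BBBC) loop. The objective is a toy
    # placeholder; in the paper it would be the fuzzy 2-partition entropy
    # of a candidate set of membership-function parameters.

    def objective(x):
        return -np.sum((x - 3.0) ** 2)  # toy objective, maximized at (3, 3)

    def bbbc(dim=2, pop=50, iters=100, bounds=(0.0, 10.0), seed=0):
        rng = np.random.default_rng(seed)
        lo, hi = bounds
        center = rng.uniform(lo, hi, size=dim)
        for t in range(1, iters + 1):
            # Big Bang: scatter candidates around the center; radius ~ 1/t.
            spread = (hi - lo) / t
            candidates = np.clip(
                center + spread * rng.standard_normal((pop, dim)), lo, hi)
            # Big Crunch: fitness-weighted center of mass of the population.
            fitness = np.array([objective(c) for c in candidates])
            weights = fitness - fitness.min() + 1e-12  # shift weights positive
            center = (weights[:, None] * candidates).sum(axis=0) / weights.sum()
        return center

    print(bbbc())  # converges near [3. 3.]
    ```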

  3. A little big history of Tiananmen

    NARCIS (Netherlands)

    Quaedackers, E.; Grinin, L.E.; Korotayev, A.V.; Rodrigue, B.H.

    2011-01-01

    This contribution aims at demonstrating the usefulness of studying small-scale subjects such as Tiananmen, or the Gate of Heavenly Peace, in Beijing - from a Big History perspective. By studying such a ‘little big history’ of Tiananmen, previously overlooked yet fundamental explanations for why

  4. Exploring the connectome: Petascale volume visualization of microscopy data streams

    KAUST Repository

    Beyer, Johanna; Hadwiger, Markus; Al-Awami, Ali K.; Jeong, Wonki; Kasthuri, Narayanan; Lichtman, Jeff W M D; Pfister, Hanspeter

    2013-01-01

    Recent advances in high-resolution microscopy let neuroscientists acquire neural-tissue volume data of extremely large sizes. However, the tremendous resolution and the high complexity of neural structures present big challenges to storage, processing, and visualization at interactive rates. A proposed system provides interactive exploration of petascale (petavoxel) volumes resulting from high-throughput electron microscopy data streams. The system can concurrently handle multiple volumes and can support the simultaneous visualization of high-resolution voxel segmentation data. Its visualization-driven design restricts most computations to a small subset of the data. It employs a multiresolution virtual-memory architecture for better scalability than previous approaches and for handling incomplete data. Researchers have employed it for a 1-teravoxel mouse cortex volume, of which several hundred axons and dendrites as well as synapses have been segmented and labeled. © 1981-2012 IEEE.

  5. Exploring the connectome: Petascale volume visualization of microscopy data streams

    KAUST Repository

    Beyer, Johanna

    2013-07-01

    Recent advances in high-resolution microscopy let neuroscientists acquire neural-tissue volume data of extremely large sizes. However, the tremendous resolution and the high complexity of neural structures present big challenges to storage, processing, and visualization at interactive rates. A proposed system provides interactive exploration of petascale (petavoxel) volumes resulting from high-throughput electron microscopy data streams. The system can concurrently handle multiple volumes and can support the simultaneous visualization of high-resolution voxel segmentation data. Its visualization-driven design restricts most computations to a small subset of the data. It employs a multiresolution virtual-memory architecture for better scalability than previous approaches and for handling incomplete data. Researchers have employed it for a 1-teravoxel mouse cortex volume, of which several hundred axons and dendrites as well as synapses have been segmented and labeled. © 1981-2012 IEEE.

  6. Patterns of digital volume pulse waveform and pulse transit time in ...

    African Journals Online (AJOL)

    In this study the digital volume pulse wave and the pulse transit time of the thumb and big toe were analyzed in young and older subjects, some of whom were hypertensive. We aimed to study the components and patterns of the pulse waveform and the pulse transit time and how they might change. Material and Methods: ...

  7. Improving Healthcare Using Big Data Analytics

    Directory of Open Access Journals (Sweden)

    Revanth Sonnati

    2017-03-01

    Full Text Available In everyday terms we call the current era the Modern Era, which in the field of Information Technology can also be named the era of Big Data. Our daily lives in today's world are rapidly advancing, never quenching one's thirst. The fields of science, engineering and technology are producing data at an exponential rate, leading to exabytes of data every day. Big data helps us to explore and re-invent many areas, not limited to education, health and law. The primary purpose of this paper is to provide an in-depth analysis of the area of healthcare using big data and analytics. The main purpose is to emphasize not only the usage of the big data that is being stored all the time, helping to look back into the history, but also its analysis to improve medication and services. Although many big data implementations happen to be in-house developments, this proposed implementation aims at a broader extent using Hadoop, which happens to be just the tip of the iceberg. The focus of this paper is not limited to the improvement and analysis of the data; it also focuses on the strengths and drawbacks compared to the conventional techniques available.

  8. About Big Data and its Challenges and Benefits in Manufacturing

    OpenAIRE

    Bogdan NEDELCU

    2013-01-01

    The aim of this article is to show the importance of Big Data and its growing influence on companies. It also shows what kind of big data is currently generated and how much big data is estimated to be generated in the future. We can also see how much companies are willing to invest in big data and how much they are currently gaining from it. The article also shows some major influences that big data has on one major industry segment (manufacturing) and the challenges that appear.

  9. Big Data Management in US Hospitals: Benefits and Barriers.

    Science.gov (United States)

    Schaeffer, Chad; Booton, Lawrence; Halleck, Jamey; Studeny, Jana; Coustasse, Alberto

    Big data has been considered an effective tool for reducing health care costs by eliminating adverse events and reducing readmissions to hospitals. The purposes of this study were to examine the emergence of big data in the US health care industry, to evaluate a hospital's ability to effectively use complex information, and to predict the potential benefits that hospitals might realize if they are successful in using big data. The findings of the research suggest that there were a number of benefits expected by hospitals when using big data analytics, including cost savings and business intelligence. By using big data, many hospitals have recognized that there have been challenges, including lack of experience and the cost of developing the analytics. Many hospitals will need to invest in acquiring personnel with adequate experience in big data analytics and data integration. The findings of this study suggest that the adoption, implementation, and utilization of big data technology will have a profound positive effect among health care providers.

  10. Big Data Strategy for Telco: Network Transformation

    OpenAIRE

    F. Amin; S. Feizi

    2014-01-01

    Big data has the potential to improve the quality of services; enable infrastructure that businesses depend on to adapt continually and efficiently; improve the performance of employees; help organizations better understand customers; and reduce liability risks. The analytics and marketing models of fixed and mobile operators are falling short in combating churn and declining revenue per user. Big Data presents new methods to reverse this trend and improve profitability. The benefits of Big Data and ...

  11. Big Data in Shipping - Challenges and Opportunities

    OpenAIRE

    Rødseth, Ørnulf Jan; Perera, Lokukaluge Prasad; Mo, Brage

    2016-01-01

    Big Data is gaining popularity in shipping, where large amounts of information are collected to better understand and improve logistics, emissions, energy consumption and maintenance. Constraints on the use of big data include the cost and quality of on-board sensors and data acquisition systems, satellite communication, data ownership and technical obstacles to effective collection and use of big data. New protocol standards may simplify the process of collecting and organizing the data, including in...

  12. Herpetofaunal Inventories of the National Parks of South Florida and the Caribbean: Volume III. Big Cypress National Preserve

    Science.gov (United States)

    Rice, Kenneth G.; Waddle, J. Hardin; Crockett, Marquette E.; Jeffrey, Brian M.; Rice, Amanda N.; Percival, H. Franklin

    2005-01-01

    Amphibian declines and extinctions have been documented around the world, often in protected natural areas. Concern for this trend has prompted the U.S. Geological Survey and the National Park Service to document all species of amphibians that occur within U.S. National Parks and to search for any signs that amphibians may be declining. This study, an inventory of amphibian species in Big Cypress National Preserve, was conducted from 2002 to 2003. The goals of the project were to create a georeferenced inventory of amphibian species, use new analytical techniques to estimate proportion of sites occupied by each species, look for any signs of amphibian decline (missing species, disease, die-offs, and so forth.), and to establish a protocol that could be used for future monitoring efforts. Several sampling methods were used to accomplish these goals. Visual encounter surveys and anuran vocalization surveys were conducted in all habitats throughout the park to estimate the proportion of sites or proportion of area occupied (PAO) by each amphibian species in each habitat. Opportunistic collections, as well as limited drift fence data, were used to augment the visual encounter methods for highly aquatic or cryptic species. A total of 545 visits to 104 sites were conducted for standard sampling alone, and 2,358 individual amphibians and 374 reptiles were encountered. Data analysis was conducted in program PRESENCE to provide PAO estimates for each of the anuran species. All of the amphibian species historically found in Big Cypress National Preserve were detected during this project. At least one individual of each of the four salamander species was captured during sampling. Each of the anuran species in the preserve was adequately sampled using standard herpetological sampling methods, and PAO estimates were produced for each species of anuran by habitat. This information serves as an indicator of habitat associations of the species and relative abundance of sites
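    The PAO estimates mentioned above come from single-season occupancy modelling, which separates the probability ψ that a site is occupied from the probability p of detecting the species on any one visit. Below is a minimal maximum-likelihood version under constant-ψ, constant-p assumptions with hypothetical detection histories; it is a sketch of the model family, not the program PRESENCE itself.

    ```python
    import numpy as np
    from scipy.optimize import minimize

    # Minimal single-season occupancy model (constant psi, constant p).
    # Hypothetical data: detections[i] = visits with a detection at site i.
    detections = np.array([0, 2, 0, 1, 3, 0, 0, 2, 1, 0])
    visits = np.full_like(detections, 5)  # five survey visits per site

    def neg_log_lik(params):
        # Logit parameterization keeps psi and p inside (0, 1).
        psi = 1 / (1 + np.exp(-params[0]))
        p = 1 / (1 + np.exp(-params[1]))
        # Binomial coefficients are constant in the parameters, so omitted.
        like = np.where(
            detections > 0,
            psi * p ** detections * (1 - p) ** (visits - detections),
            psi * (1 - p) ** visits + (1 - psi),  # occupied-but-missed or absent
        )
        return -np.sum(np.log(like))

    fit = minimize(neg_log_lik, x0=[0.0, 0.0])
    psi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
    print(f"PAO (psi) = {psi_hat:.2f}, detection probability p = {p_hat:.2f}")
    ```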

  13. [Relevance of big data for molecular diagnostics].

    Science.gov (United States)

    Bonin-Andresen, M; Smiljanovic, B; Stuhlmüller, B; Sörensen, T; Grützkau, A; Häupl, T

    2018-04-01

    Big data analysis raises the expectation that computerized algorithms may extract new knowledge from otherwise unmanageable vast data sets. What are the algorithms behind the big data discussion? In principle, high-throughput technologies in molecular research already introduced big data and the development and application of analysis tools into the field of rheumatology some 15 years ago. This includes especially omics technologies, such as genomics, transcriptomics and cytomics. Some basic methods of data analysis are provided along with the technology; however, functional analysis and interpretation require adaptation of existing or development of new software tools. For these steps, structuring and evaluating according to the biological context is extremely important and not only a mathematical problem. This aspect has to be considered much more for molecular big data than for data analyzed in health economy or epidemiology. Molecular data are structured in a first order determined by the applied technology and present quantitative characteristics that follow the principles of their biological nature. These biological dependencies have to be integrated into software solutions, which may require networks of molecular big data of the same or even different technologies in order to achieve cross-technology confirmation. Increasingly extensive recording of molecular processes, including in individual patients, is generating personal big data and requires new strategies for management in order to develop data-driven individualized interpretation concepts. With this perspective in mind, translation of information derived from molecular big data will also require new specifications for education and professional competence.

  14. Big data in psychology: A framework for research advancement.

    Science.gov (United States)

    Adjerid, Idris; Kelley, Ken

    2018-02-22

    The potential for big data to provide value for psychology is significant. However, the pursuit of big data remains an uncertain and risky undertaking for the average psychological researcher. In this article, we address some of this uncertainty by discussing the potential impact of big data on the type of data available for psychological research, addressing the benefits and most significant challenges that emerge from these data, and organizing a variety of research opportunities for psychology. Our article yields two central insights. First, we highlight that big data research efforts are more readily accessible than many researchers realize, particularly with the emergence of open-source research tools, digital platforms, and instrumentation. Second, we argue that opportunities for big data research are diverse and differ both in their fit for varying research goals, as well as in the challenges they bring about. Ultimately, our outlook for researchers in psychology using and benefiting from big data is cautiously optimistic. Although not all big data efforts are suited for all researchers or all areas within psychology, big data research prospects are diverse, expanding, and promising for psychology and related disciplines. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  15. 'Big data' in pharmaceutical science: challenges and opportunities.

    Science.gov (United States)

    Dossetter, Al G; Ecker, Gerhard; Laverty, Hugh; Overington, John

    2014-05-01

    Future Medicinal Chemistry invited a selection of experts to express their views on the current impact of big data in drug discovery and design, as well as speculate on future developments in the field. The topics discussed include the challenges of implementing big data technologies, maintaining the quality and privacy of data sets, and how the industry will need to adapt to welcome the big data era. Their enlightening responses provide a snapshot of the many and varied contributions being made by big data to the advancement of pharmaceutical science.

  16. Towards Large Volume Big Divisor D3-D7 "mu-Split Supersymmetry" and Ricci-Flat Swiss-Cheese Metrics, and Dimension-Six Neutrino Mass Operators

    CERN Document Server

    Dhuria, Mansi

    2012-01-01

    We show that it is possible to realize a "mu-split SUSY" scenario [1] in the context of large volume limit of type IIB compactifications on Swiss-Cheese Calabi-Yau's in the presence of a mobile space-time filling D3-brane and a (stack of) D7-brane(s) wrapping the "big" divisor Sigma_B. For this, we investigate the possibility of getting one Higgs to be light while other to be heavy in addition to a heavy Higgsino mass parameter. Further, we examine the existence of long lived gluino that manifests one of the major consequences of mu-split SUSY scenario, by computing its decay width as well as lifetime corresponding to the 3-body decays of the gluino into a quark, a squark and a neutralino or Goldstino, as well as 2-body decays of the gluino into either a neutralino or a Goldstino and a gluon. Guided by the geometric Kaehler potential for Sigma_B obtained in [2] based on GLSM techniques, and the Donaldson's algorithm [3] for obtaining numerically a Ricci-flat metric, we give details of our calculation in [4] p...

  17. Solution of a braneworld big crunch/big bang cosmology

    International Nuclear Information System (INIS)

    McFadden, Paul L.; Turok, Neil; Steinhardt, Paul J.

    2007-01-01

    We solve for the cosmological perturbations in a five-dimensional background consisting of two separating or colliding boundary branes, as an expansion in the collision speed V divided by the speed of light c. Our solution permits a detailed check of the validity of four-dimensional effective theory in the vicinity of the event corresponding to the big crunch/big bang singularity. We show that the four-dimensional description fails at the first nontrivial order in (V/c)². At this order, there is nontrivial mixing of the two relevant four-dimensional perturbation modes (the growing and decaying modes) as the boundary branes move from the narrowly separated limit described by Kaluza-Klein theory to the well-separated limit where gravity is confined to the positive-tension brane. We comment on the cosmological significance of the result and compute other quantities of interest in five-dimensional cosmological scenarios

  18. Current applications of big data in obstetric anesthesiology.

    Science.gov (United States)

    Klumpner, Thomas T; Bauer, Melissa E; Kheterpal, Sachin

    2017-06-01

    The narrative review aims to highlight several recently published 'big data' studies pertinent to the field of obstetric anesthesiology. Big data has been used to study rare outcomes, to identify trends within the healthcare system, to identify variations in practice patterns, and to highlight potential inequalities in obstetric anesthesia care. Big data studies have helped define the risk of rare complications of obstetric anesthesia, such as the risk of neuraxial hematoma in thrombocytopenic parturients. Also, large national databases have been used to better understand trends in anesthesia-related adverse events during cesarean delivery as well as outline potential racial/ethnic disparities in obstetric anesthesia care. Finally, real-time analysis of patient data across a number of disparate health information systems through the use of sophisticated clinical decision support and surveillance systems is one promising application of big data technology on the labor and delivery unit. 'Big data' research has important implications for obstetric anesthesia care and warrants continued study. Real-time electronic surveillance is a potentially useful application of big data technology on the labor and delivery unit.

  19. Using Big Book to Teach Things in My House

    OpenAIRE

    Effrien, Intan; Lailatus, Sa’diyah; Nuruliftitah Maja, Neneng

    2017-01-01

    The purpose of this study is to determine students' interest in learning using the big book medium. A big book is a large-format version of an ordinary book. The big book contains simple words and images that match the content of the sentences and their spelling. From this, researchers can gauge students' interest and the development of their knowledge, and it also trains researchers to remain creative in developing learning media for students.

  20. Big Data Analytics Methodology in the Financial Industry

    Science.gov (United States)

    Lawler, James; Joseph, Anthony

    2017-01-01

    Firms in industry continue to be attracted by the benefits of Big Data Analytics. The benefits of Big Data Analytics projects may not be as evident as frequently indicated in the literature. The authors of the study evaluate factors in a customized methodology that may increase the benefits of Big Data Analytics projects. Evaluating firms in the…

  1. NURE aerial gamma-ray and magnetic reconnaissance survey: Big Bend area, Marfa MH 13-5, Fort Stockton MH 13-6, Presidio MH 13-8, Emory Peak MH 13-9 Quadrangles. Volume I. Narrative report

    International Nuclear Information System (INIS)

    1979-02-01

    A rotary-wing, reconnaissance, high sensitivity, radiometric and magnetic survey was performed in the Big Bend area of Texas. Four 1:250,000 scale NTMS quadrangles (Marfa, Ft. Stockton, Presidio, and Emory Peak) were surveyed. A total of 7,529 line miles (12,115 kilometers) of data were collected utilizing a Sikorsky S58T helicopter. Traverse lines were flown in an east-west direction at 3.0 mile (5 kilometer) spacing, with tie lines flown in a north-south direction at 12.5 mile (20 kilometer) spacing. The data were digitally recorded at 1.0 second intervals. The NaI terrestrial detectors used in this survey had a total volume of 2,154 cubic inches. The magnetometer employed was a modified ASQ-10 fluxgate system. The radiometric data were normalized to 400 feet terrain clearance and are presented in the form of computer listings on microfiche and as stacked profile plots. Profile plots are contained in Volume II of this report. A geologic interpretation of the radiometric and magnetic data is included as part of this report.
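
    The altitude normalization mentioned above can be illustrated with a standard exponential air-attenuation correction. The following Python sketch is illustrative only; the attenuation coefficient is an assumed textbook-style value, not one taken from the survey report.

        import math

        def normalize_counts(counts_obs, altitude_ft, ref_altitude_ft=400.0, mu_per_ft=0.002):
            """Correct an observed gamma-ray count rate to a reference terrain clearance.

            Assumes exponential air attenuation, N(h) = N(h_ref) * exp(-mu * (h - h_ref)),
            so the corrected rate is N(h_ref) = N(h) * exp(mu * (h - h_ref)).
            """
            return counts_obs * math.exp(mu_per_ft * (altitude_ft - ref_altitude_ft))

        # A sample recorded at 520 ft terrain clearance, corrected to the 400 ft datum.
        print(round(normalize_counts(850.0, 520.0), 1))  # -> 1080.6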

  2. Opportunity and Challenges for Migrating Big Data Analytics in Cloud

    Science.gov (United States)

    Amitkumar Manekar, S.; Pradeepini, G., Dr.

    2017-08-01

    Big Data Analytics is a big buzzword nowadays. As data generation becomes more demanding and scalable, data acquisition and storage become crucial issues. Cloud storage is a widely used platform, and the technology will become crucial to executives handling data powered by analytics. The trend towards “big data-as-a-service” is now talked about everywhere. On one hand, cloud-based big data analytics directly tackles ongoing issues of scale, speed, and cost. On the other, researchers are working to solve the security and other real-time problems of migrating big data to cloud-based platforms. This article focuses on finding possible ways to migrate big data to the cloud. Technology that supports coherent data migration, and the possibility of doing big data analytics on a cloud platform, is in demand for a new era of growth. The article also gives information about available technologies and techniques for migrating big data to the cloud.

  3. Hot big bang or slow freeze?

    Science.gov (United States)

    Wetterich, C.

    2014-09-01

    We confront the big bang for the beginning of the universe with an equivalent picture of a slow freeze - a very cold and slowly evolving universe. In the freeze picture the masses of elementary particles increase and the gravitational constant decreases with cosmic time, while the Newtonian attraction remains unchanged. The freeze and big bang pictures both describe the same observations or physical reality. We present a simple “crossover model” without a big bang singularity. In the infinite past space-time is flat. Our model is compatible with present observations, describing the generation of primordial density fluctuations during inflation as well as the present transition to a dark energy-dominated universe.

  4. Big Data

    DEFF Research Database (Denmark)

    Aaen, Jon; Nielsen, Jeppe Agger

    2016-01-01

    Big Data presents itself as one of the most hyped technological innovations of our time, proclaimed to hold the seeds of new, valuable operational insights for private companies and public organizations. While the optimistic pronouncements are many, research on Big Data in the public sector has so far been limited. This article examines how the public health sector can reuse and exploit an ever-growing volume of data while taking public values into account. The article builds on a case study of the use of large volumes of health data in the Danish general practice database, Dansk AlmenMedicinsk Database (DAMD). The analysis shows that (re)use of data in new contexts is a multifaceted weighing not only of economic rationales and quality considerations, but also of control over sensitive personal data and ethical implications for the citizen. In the DAMD case, data are on the one hand used “in the service of a good cause” to...

  5. Big data analytics in healthcare: promise and potential.

    Science.gov (United States)

    Raghupathi, Wullianallur; Raghupathi, Viju

    2014-01-01

    To describe the promise and potential of big data analytics in healthcare. The paper describes the nascent field of big data analytics in healthcare, discusses the benefits, outlines an architectural framework and methodology, describes examples reported in the literature, briefly discusses the challenges, and offers conclusions. The paper provides a broad overview of big data analytics for healthcare researchers and practitioners. Big data analytics in healthcare is evolving into a promising field for providing insight from very large data sets and improving outcomes while reducing costs. Its potential is great; however, there remain challenges to overcome.

  6. Data warehousing in the age of big data

    CERN Document Server

    Krishnan, Krish

    2013-01-01

    Data Warehousing in the Age of Big Data will help you and your organization make the most of unstructured data with your existing data warehouse. As Big Data continues to revolutionize how we use data, it doesn't have to create more confusion. Expert author Krish Krishnan helps you make sense of how Big Data fits into the world of data warehousing in clear and concise detail. The book is presented in three distinct parts. Part 1 discusses Big Data, its technologies and use cases from early adopters. Part 2 addresses data warehousing, its shortcomings, and new architecture

  7. The Death of the Big Men

    DEFF Research Database (Denmark)

    Martin, Keir

    2010-01-01

    Recently Tolai people of Papua New Guinea have adopted the term 'Big Shot' to describe an emerging post-colonial political elite. The emergence of the term is a negative moral evaluation of new social possibilities that have arisen as a consequence of the Big Shots' privileged position within a glo...

  8. Big data and software defined networks

    CERN Document Server

    Taheri, Javid

    2018-01-01

    Big Data Analytics and Software Defined Networking (SDN) are helping to drive the management of the extraordinary increase in data usage enabled by the computer processing power of Cloud Data Centres (CDCs). This new book investigates areas where Big Data and SDN can help each other in delivering more efficient services.

  9. Big Data-Survey

    Directory of Open Access Journals (Sweden)

    P.S.G. Aruna Sri

    2016-03-01

    Full Text Available Big data is the term for any collection of data sets so large and complex that it becomes difficult to process them using conventional data-processing applications. The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. To spot business trends, prevent diseases, combat crime, and so on, we require bigger data sets than before. Big data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead massively parallel software running on tens, hundreds, or even thousands of servers. This paper presents an overview of the Hadoop architecture, the different tools used for big data, and its security issues.
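
    Since the survey centers on Hadoop, the massively parallel model it refers to can be sketched compactly. The following pure-Python sketch mimics the MapReduce programming model that Hadoop implements; no cluster is involved, and the function names and sample data are illustrative only.

        from collections import defaultdict
        from itertools import chain

        def map_phase(record):
            # Emit (key, 1) pairs; word counting is the canonical example.
            for word in record.split():
                yield word.lower(), 1

        def shuffle(pairs):
            # Group intermediate values by key, as the framework does between phases.
            groups = defaultdict(list)
            for key, value in pairs:
                groups[key].append(value)
            return groups

        def reduce_phase(key, values):
            # Combine all values observed for one key.
            return key, sum(values)

        records = ["big data needs parallel processing",
                   "hadoop processes big data in parallel"]
        intermediate = chain.from_iterable(map_phase(r) for r in records)
        counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
        print(counts["big"])  # -> 2

    In a real Hadoop deployment, the map and reduce functions run on many servers, with the framework handling the shuffle, fault tolerance, and data placement.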

  10. A Comparative Quantitative Analysis of Contemporary Big Data Clustering Algorithms for Market Segmentation in Hospitality Industry

    OpenAIRE

    Bose, Avishek; Munir, Arslan; Shabani, Neda

    2017-01-01

    The hospitality industry is one of the data-rich industries that receives huge Volumes of data streaming at high Velocity with considerable Variety, Veracity, and Variability. These properties make data analysis in the hospitality industry a big data problem. Meeting customers' expectations is a key factor in the hospitality industry for earning customers' loyalty. To achieve this goal, marketing professionals in this industry actively look for ways to utilize their data in the best ...
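
    As a concrete illustration of the kind of clustering-based segmentation the paper compares, here is a minimal Python sketch using k-means; the guest features, their distributions, and the number of segments are all illustrative assumptions, not the paper's setup.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(42)
        # Synthetic guest features: [stays per year, average spend, review score]
        guests = np.column_stack([
            rng.poisson(4, 500),
            rng.gamma(2.0, 150.0, 500),
            rng.uniform(1.0, 5.0, 500),
        ])

        X = StandardScaler().fit_transform(guests)  # put features on a common scale
        labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
        for k in range(3):
            print(f"segment {k}: {np.sum(labels == k)} guests")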

  11. Seeing the "Big" Picture: Big Data Methods for Exploring Relationships Between Usage, Language, and Outcome in Internet Intervention Data.

    Science.gov (United States)

    Carpenter, Jordan; Crutchley, Patrick; Zilca, Ran D; Schwartz, H Andrew; Smith, Laura K; Cobb, Angela M; Parks, Acacia C

    2016-08-31

    Assessing the efficacy of Internet interventions that are already on the market introduces both challenges and opportunities. While vast, often unprecedented amounts of data may be available (hundreds of thousands, and sometimes millions, of participants with high dimensions of assessed variables), the data are observational in nature, are partly unstructured (eg, free text, images, sensor data), do not include a natural control group to be used for comparison, and typically exhibit high attrition rates. New approaches are therefore needed to use these existing data and derive new insights that can augment traditional smaller-group randomized controlled trials. Our objective was to demonstrate how emerging big data approaches can help explore questions about the effectiveness and process of an Internet well-being intervention. We drew data from the user base of a well-being website and app called Happify. To explore effectiveness, multilevel models focusing on within-person variation explored whether greater usage predicted higher well-being in a sample of 152,747 users. In addition, to explore the underlying processes that accompany improvement, we analyzed language for 10,818 users who had a sufficient volume of free-text response and timespan of platform usage. A topic model constructed from this free text provided language-based correlates of individual user improvement in outcome measures, providing insights into the beneficial underlying processes experienced by users. On a measure of positive emotion, the average user improved 1.38 points per week (SE 0.01, t(122,455)=113.60, P<.001). Several language topics had a significant effect on change in well-being over time, illustrating which topics may be more beneficial than others when engaging with the interventions. In particular, topics that are related to addressing negative thoughts and feelings were correlated with improvement over time. Using observational analyses on naturalistic big data, we can explore the relationship between usage and well-being among
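
    The multilevel modelling approach described above can be sketched in a few lines. The following Python example uses synthetic data; the variable names and effect sizes are illustrative, not the study's actual schema or results.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(0)
        n_users, n_weeks = 200, 8
        df = pd.DataFrame({
            "user": np.repeat(np.arange(n_users), n_weeks),
            "week": np.tile(np.arange(n_weeks), n_users),
        })
        df["usage"] = rng.poisson(3, len(df))                 # activities completed that week
        baseline = rng.normal(50, 10, n_users)[df["user"]]    # stable between-person differences
        df["well_being"] = baseline + 1.4 * df["week"] + 0.5 * df["usage"] + rng.normal(0, 5, len(df))

        # A random intercept per user separates within-person change from
        # between-person differences, in the spirit of the study's multilevel models.
        model = smf.mixedlm("well_being ~ week + usage", df, groups=df["user"]).fit()
        print(model.summary())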

  12. Big Data Analytics, Infectious Diseases and Associated Ethical Impacts

    OpenAIRE

    Garattini, C.; Raffle, J.; Aisyah, D. N.; Sartain, F.; Kozlakidis, Z.

    2017-01-01

    The exponential accumulation, processing and accrual of big data in healthcare are only possible through an equally rapidly evolving field of big data analytics. The latter offers the capacity to rationalize, understand and use big data to serve many different purposes, from improved services modelling to prediction of treatment outcomes, to greater patient and disease stratification. In the area of infectious diseases, the application of big data analytics has introduced a number of changes ...

  13. Evaluation of Data Management Systems for Geospatial Big Data

    OpenAIRE

    Amirian, Pouria; Basiri, Anahid; Winstanley, Adam C.

    2014-01-01

    Big Data encompasses the collection, management, processing and analysis of huge amounts of data that vary in type and change with high frequency. Often the data component of Big Data has a positional component as an important part of it, in various forms such as postal address, Internet Protocol (IP) address and geographical location. If the positional components in Big Data are extensively used in storage, retrieval, analysis, processing, visualization and knowledge discovery (geospatial Big Dat...

  14. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies.

    Science.gov (United States)

    de Brevern, Alexandre G; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain

    2015-01-01

    Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from the web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

  15. A New Look at Big History

    Science.gov (United States)

    Hawkey, Kate

    2014-01-01

    The article sets out a "big history" which resonates with the priorities of our own time. A globalizing world calls for new spatial scales to underpin what the history curriculum addresses, "big history" calls for new temporal scales, while concern over climate change calls for a new look at subject boundaries. The article…

  16. West Virginia's big trees: setting the record straight

    Science.gov (United States)

    Melissa Thomas-Van Gundy; Robert Whetsell

    2016-01-01

    People love big trees, people love to find big trees, and people love to find big trees in the place they call home. Having been suspicious for years, my coauthor, historian Rob Whetsell, approached me with a species identification challenge. There are several photographs of giant trees used by many people to illustrate the past forests of West Virginia,...

  17. Sosiaalinen asiakassuhdejohtaminen ja big data [Social customer relationship management and big data]

    OpenAIRE

    Toivonen, Topi-Antti

    2015-01-01

    This thesis examines social customer relationship management and the benefits that big data can bring to it. Social customer relationship management is a new term, unfamiliar to many. The study is motivated by the limited research on the topic, the complete absence of Finnish-language research, and the potentially essential role that social customer relationship management may play in companies' operations in the future. Studies dealing with big data often concentrate on its technical side rather than on applicat...

  18. A View on Fuzzy Systems for Big Data: Progress and Opportunities

    Directory of Open Access Journals (Sweden)

    Alberto Fernandez

    2016-04-01

    Full Text Available Currently, we are witnessing a growing trend in the study and application of problems in the framework of Big Data. This is mainly due to the great advantages that come from extracting knowledge from high volumes of information. For this reason, we observe a migration of standard Data Mining systems towards a new functional paradigm that allows working with Big Data. By means of the MapReduce model and its different extensions, scalability can be successfully addressed, while maintaining good fault tolerance during the execution of the algorithms. Among the different approaches used in Data Mining, models based on fuzzy systems stand out for many applications. Among their advantages, we must stress the use of a representation close to natural language. Additionally, they use an inference model that allows good adaptation to different scenarios, especially those with a given degree of uncertainty. Despite the success of this type of system, their migration to the Big Data environment in the different learning areas is still at a preliminary stage. In this paper, we carry out an overview of the main existing proposals on the topic, analyzing the design of these models. Additionally, we discuss the problems related to data distribution and the parallelization of current algorithms, as well as their relationship with the fuzzy representation of the information. Finally, we provide our view on the expectations for the future in this framework regarding the design of methods based on fuzzy sets, as well as the open challenges on the topic.
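
    To make the fuzzy representation discussed above concrete, here is a minimal Python sketch of triangular membership functions and one rule evaluated with min-style inference; the linguistic labels and cut-points are illustrative assumptions, not taken from any system in the survey.

        def triangular(a, b, c):
            """Triangular membership function peaking at b, with support [a, c]."""
            def mu(x):
                if x < a or x > c:
                    return 0.0
                if x <= b:
                    return 1.0 if a == b else (x - a) / (b - a)
                return 1.0 if b == c else (c - x) / (c - b)
            return mu

        # Linguistic terms for two inputs; the cut-points are assumed for illustration.
        high_load = triangular(50.0, 100.0, 100.0)      # shoulder peaking at 100
        high_latency = triangular(100.0, 300.0, 300.0)  # shoulder peaking at 300

        # Rule: IF load is high AND latency is high THEN risk is high (min models AND).
        load, latency = 72.0, 180.0
        risk_high = min(high_load(load), high_latency(latency))
        print(f"degree to which 'risk is high' fires: {risk_high:.2f}")  # -> 0.40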

  19. D-branes in a big bang/big crunch universe: Misner space

    International Nuclear Information System (INIS)

    Hikida, Yasuaki; Nayak, Rashmi R.; Panigrahi, Kamal L.

    2005-01-01

    We study D-branes in a two-dimensional lorentzian orbifold R^{1,1}/Γ with a discrete boost Γ. This space is known as Misner or Milne space, and includes a big crunch/big bang singularity. In this space, there are D0-branes in spiral orbits and D1-branes with or without flux on them. In particular, we observe imaginary parts of partition functions, and interpret them as the rates of open string pair creation for D0-branes and emission of winding closed strings for D1-branes. These phenomena occur due to the time-dependence of the background. The open string 2→2 scattering amplitude on a D1-brane is also computed and found to be less singular than the closed string case.

  20. D-branes in a big bang/big crunch universe: Misner space

    Energy Technology Data Exchange (ETDEWEB)

    Hikida, Yasuaki [Theory Group, High Energy Accelerator Research Organization (KEK), Tsukuba, Ibaraki 305-0801 (Japan); Nayak, Rashmi R. [Dipartimento di Fisica and INFN, Sezione di Roma 2, 'Tor Vergata', Rome 00133 (Italy); Panigrahi, Kamal L. [Dipartimento di Fisica and INFN, Sezione di Roma 2, 'Tor Vergata', Rome 00133 (Italy)

    2005-09-01

    We study D-branes in a two-dimensional lorentzian orbifold R^{1,1}/Γ with a discrete boost Γ. This space is known as Misner or Milne space, and includes a big crunch/big bang singularity. In this space, there are D0-branes in spiral orbits and D1-branes with or without flux on them. In particular, we observe imaginary parts of partition functions, and interpret them as the rates of open string pair creation for D0-branes and emission of winding closed strings for D1-branes. These phenomena occur due to the time-dependence of the background. The open string 2→2 scattering amplitude on a D1-brane is also computed and found to be less singular than the closed string case.

  1. Astroinformatics: the big data of the universe

    OpenAIRE

    Barmby, Pauline

    2016-01-01

    In astrophysics we like to think that our field was the originator of big data, back when it had to be carried around in big sky charts and books full of tables. These days, it's easier to move astrophysics data around, but we still have a lot of it, and upcoming telescope facilities will generate even more. I discuss how astrophysicists approach big data in general, and give examples from some Western Physics & Astronomy research projects. I also give an overview of ho...

  2. Recent big flare

    International Nuclear Information System (INIS)

    Moriyama, Fumio; Miyazawa, Masahide; Yamaguchi, Yoshisuke

    1978-01-01

    The features of three big solar flares observed at Tokyo Observatory are described in this paper. The active region, McMath 14943, caused a big flare on September 16, 1977. The flare appeared on both sides of a long dark line which runs along the boundary of the magnetic field. Two-ribbon structure was seen. The electron density of the flare observed at Norikura Corona Observatory was 3 x 10^12/cc. Several arc lines which connect both bright regions of different magnetic polarity were seen in the H-α monochrome image. The active region, McMath 15056, caused a big flare on December 10, 1977. At the beginning, several bright spots were observed in the region between two main solar spots. Then, the area and the brightness increased, and the bright spots became two ribbon-shaped bands. A solar flare was observed on April 8, 1978. At first, several bright spots were seen around the solar spot in the active region, McMath 15221. Then, these bright spots developed to a large bright region. On both sides of a dark line along the magnetic neutral line, bright regions were generated. These developed to a two-ribbon flare. The time required for growth was more than one hour. A bright arc which connects two ribbons was seen, and this arc may be a loop prominence system. (Kato, T.)

  3. Big Bang Day: The Great Big Particle Adventure - 3. Origins

    CERN Multimedia

    2008-01-01

    In this series, comedian and physicist Ben Miller asks the CERN scientists what they hope to find. If the LHC is successful, it will explain the nature of the Universe around us in terms of a few simple ingredients and a few simple rules. But the Universe now was forged in a Big Bang where conditions were very different, and the rules were very different, and those early moments were crucial to determining how things turned out later. At the LHC they can recreate conditions as they were billionths of a second after the Big Bang, before atoms and nuclei existed. They can find out why matter and antimatter didn't mutually annihilate each other to leave behind a Universe of pure, brilliant light. And they can look into the very structure of space and time - the fabric of the Universe.

  4. Big Data Reduction and Optimization in Sensor Monitoring Network

    Directory of Open Access Journals (Sweden)

    Bin He

    2014-01-01

    Full Text Available Wireless sensor networks (WSNs) are increasingly being utilized to monitor the structural health of underground subway tunnels, showing many promising advantages over traditional monitoring schemes. Meanwhile, as the network size increases, the system becomes incapable of dealing with big data in a way that ensures efficient data communication, transmission, and storage. Considered a feasible solution to these issues, data compression can reduce the volume of data travelling between sensor nodes. In this paper, an optimization algorithm based on spatial and temporal data compression is proposed to cope with these issues in WSNs in the underground tunnel environment. Spatial and temporal correlation functions are introduced for data compression and data recovery. It is verified that the proposed algorithm is applicable to WSNs in the underground tunnel.
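
    The temporal side of such a scheme can be illustrated with a simple hold-last-value compressor: a node transmits a reading only when it deviates from the last transmitted value by more than a tolerance, and the sink reconstructs the gaps. This Python sketch is illustrative only; the paper's actual algorithm also exploits spatial correlation across nodes, and the tolerance and readings below are assumptions.

        def compress(readings, tol=0.5):
            """Keep only samples that differ from the last kept value by more than tol."""
            kept = [(0, readings[0])]                  # always transmit the first sample
            for i, r in enumerate(readings[1:], 1):
                if abs(r - kept[-1][1]) > tol:         # temporal correlation: drop near-duplicates
                    kept.append((i, r))
            return kept

        def reconstruct(kept, n):
            """Rebuild a full series at the sink by holding the last transmitted value."""
            values, j = [], 0
            for i in range(n):
                if j + 1 < len(kept) and kept[j + 1][0] == i:
                    j += 1
                values.append(kept[j][1])
            return values

        readings = [20.0, 20.1, 20.2, 21.5, 21.6, 23.0, 23.1]
        kept = compress(readings)
        print(len(kept), "of", len(readings), "samples transmitted")  # -> 3 of 7
        print(reconstruct(kept, len(readings)))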

  5. Inflated granularity: Spatial “Big Data” and geodemographics

    Directory of Open Access Journals (Sweden)

    Craig M Dalton

    2015-08-01

    Full Text Available Data analytics, particularly the current rhetoric around “Big Data”, tend to be presented as new and innovative, emerging ahistorically to revolutionize modern life. In this article, we situate one branch of Big Data analytics, spatial Big Data, through a historical predecessor, geodemographic analysis, to help develop a critical approach to current data analytics. Spatial Big Data promises an epistemic break in marketing, a leap from targeting geodemographic areas to targeting individuals. Yet it inherits characteristics and problems from geodemographics, including a justification through the market, and a process of commodification through the black-boxing of technology. As researchers develop sustained critiques of data analytics and its effects on everyday life, we must do so with a grounding in the cultural and historical contexts from which data technologies emerged. This article and others (Barnes and Wilson, 2014) develop a historically situated, critical approach to spatial Big Data. This history illustrates connections to the critical issues of surveillance, redlining, and the production of consumer subjects and geographies. The shared histories and structural logics of spatial Big Data and geodemographics create the space for a continued critique of data analyses' role in society.

  6. Big data analysis for smart farming

    NARCIS (Netherlands)

    Kempenaar, C.; Lokhorst, C.; Bleumer, E.J.B.; Veerkamp, R.F.; Been, Th.; Evert, van F.K.; Boogaardt, M.J.; Ge, L.; Wolfert, J.; Verdouw, C.N.; Bekkum, van Michael; Feldbrugge, L.; Verhoosel, Jack P.C.; Waaij, B.D.; Persie, van M.; Noorbergen, H.

    2016-01-01

    In this report we describe the results of a one-year TO2 institutes project on the development of big data technologies within the milk production chain. The goal of this project is to ‘create’ an integration platform for big data analysis for smart farming and to develop a showcase. This includes both

  7. A survey on Big Data Stream Mining

    African Journals Online (AJOL)

    pc

    2018-03-05

    Mar 5, 2018 ... Big Data can be static on one machine or distributed ... decision making, and process automation. Big data .... Concept Drifting: concept drifting means the classifier .... transactions generated by a prefix tree structure. EstDec ...

  8. Emerging technology and architecture for big-data analytics

    CERN Document Server

    Chang, Chip; Yu, Hao

    2017-01-01

    This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. The presentation is designed to be accessible to a broad audience, with general knowledge of hardware design and some interest in big-data analytics. Coverage includes emerging technology and devices for data-analytics, circuit design for data-analytics, and architecture and algorithms to support data-analytics. Readers will benefit from the realistic context used by the authors, which demonstrates what works, what doesn’t work, and what are the fundamental problems, solutions, upcoming challenges and opportunities. Provides a single-source reference to hardware architectures for big-data analytics; Covers various levels of big-data analytics hardware design abstraction and flow, from device, to circuits and systems; Demonstrates how non-volatile memory (NVM) based hardware platforms can be a viable solution to existing challenges in hardware architecture for big-data analytics.

  9. Toward a manifesto for the 'public understanding of big data'.

    Science.gov (United States)

    Michael, Mike; Lupton, Deborah

    2016-01-01

    In this article, we sketch a 'manifesto' for the 'public understanding of big data'. On the one hand, this entails such public understanding of science and public engagement with science and technology-tinged questions as follows: How, when and where are people exposed to, or do they engage with, big data? Who are regarded as big data's trustworthy sources, or credible commentators and critics? What are the mechanisms by which big data systems are opened to public scrutiny? On the other hand, big data generate many challenges for public understanding of science and public engagement with science and technology: How do we address publics that are simultaneously the informant, the informed and the information of big data? What counts as understanding of, or engagement with, big data, when big data themselves are multiplying, fluid and recursive? As part of our manifesto, we propose a range of empirical, conceptual and methodological exhortations. We also provide Appendix 1 that outlines three novel methods for addressing some of the issues raised in the article. © The Author(s) 2015.

  10. What do Big Data do in Global Governance?

    DEFF Research Database (Denmark)

    Krause Hansen, Hans; Porter, Tony

    2017-01-01

    Two paradoxes associated with big data are relevant to global governance. First, while promising to increase the capacities of humans in governance, big data also involve an increasingly independent role for algorithms, technical artifacts, the Internet of things, and other objects, which can reduce the control of human actors. Second, big data involve new boundary transgressions as data are brought together from multiple sources while also creating new boundary conflicts as powerful actors seek to gain advantage by controlling big data and excluding competitors. These changes are not just about new data sources for global decision-makers, but instead signal more profound changes in the character of global governance.

  11. White House announces “big data” initiative

    Science.gov (United States)

    Showstack, Randy

    2012-04-01

    The world is now generating zettabytes of information every year (one zettabyte is 10 to the 21st power, or a billion trillion, bytes), according to John Holdren, director of the White House Office of Science and Technology Policy. With data volumes growing exponentially from a variety of sources such as computers running large-scale models, scientific instruments including telescopes and particle accelerators, and even online retail transactions, a key challenge is to better manage and utilize the data. The Big Data Research and Development Initiative, launched by the White House at a 29 March briefing, initially includes six federal departments and agencies providing more than $200 million in new commitments to improve tools and techniques for better accessing, organizing, and using data for scientific advances. The agencies and departments include the National Science Foundation (NSF), Department of Energy, U.S. Geological Survey (USGS), National Institutes of Health (NIH), Department of Defense, and Defense Advanced Research Projects Agency.

  12. Big Data in Caenorhabditis elegans: quo vadis?

    Science.gov (United States)

    Hutter, Harald; Moerman, Donald

    2015-11-05

    A clear definition of what constitutes "Big Data" is difficult to identify, but we find it most useful to define Big Data as a data collection that is complete. By this criterion, researchers on Caenorhabditis elegans have a long history of collecting Big Data, since the organism was selected with the idea of obtaining a complete biological description and understanding of development. The complete wiring diagram of the nervous system, the complete cell lineage, and the complete genome sequence provide a framework to phrase and test hypotheses. Given this history, it might be surprising that the number of "complete" data sets for this organism is actually rather small--not because of lack of effort, but because most types of biological experiments are not currently amenable to complete large-scale data collection. Many are also not inherently limited, so that it becomes difficult to even define completeness. At present, we only have partial data on mutated genes and their phenotypes, gene expression, and protein-protein interaction--important data for many biological questions. Big Data can point toward unexpected correlations, and these unexpected correlations can lead to novel investigations; however, Big Data cannot establish causation. As a result, there is much excitement about Big Data, but there is also a discussion on just what Big Data contributes to solving a biological problem. Because of its relative simplicity, C. elegans is an ideal test bed to explore this issue and at the same time determine what is necessary to build a multicellular organism from a single cell. © 2015 Hutter and Moerman. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  13. 76 FR 7810 - Big Horn County Resource Advisory Committee

    Science.gov (United States)

    2011-02-11

    ..., Wyoming 82801. Comments may also be sent via e-mail to [email protected], with the words Big... DEPARTMENT OF AGRICULTURE Forest Service Big Horn County Resource Advisory Committee AGENCY: Forest Service, USDA. ACTION: Notice of meeting. SUMMARY: The Big Horn County Resource Advisory Committee...

  14. Hot big bang or slow freeze?

    Energy Technology Data Exchange (ETDEWEB)

    Wetterich, C.

    2014-09-07

    We confront the big bang for the beginning of the universe with an equivalent picture of a slow freeze — a very cold and slowly evolving universe. In the freeze picture the masses of elementary particles increase and the gravitational constant decreases with cosmic time, while the Newtonian attraction remains unchanged. The freeze and big bang pictures both describe the same observations or physical reality. We present a simple “crossover model” without a big bang singularity. In the infinite past space–time is flat. Our model is compatible with present observations, describing the generation of primordial density fluctuations during inflation as well as the present transition to a dark energy-dominated universe.

  15. Hot big bang or slow freeze?

    International Nuclear Information System (INIS)

    Wetterich, C.

    2014-01-01

    We confront the big bang for the beginning of the universe with an equivalent picture of a slow freeze — a very cold and slowly evolving universe. In the freeze picture the masses of elementary particles increase and the gravitational constant decreases with cosmic time, while the Newtonian attraction remains unchanged. The freeze and big bang pictures both describe the same observations or physical reality. We present a simple “crossover model” without a big bang singularity. In the infinite past space–time is flat. Our model is compatible with present observations, describing the generation of primordial density fluctuations during inflation as well as the present transition to a dark energy-dominated universe.

  16. Hot big bang or slow freeze?

    Directory of Open Access Journals (Sweden)

    C. Wetterich

    2014-09-01

    Full Text Available We confront the big bang for the beginning of the universe with an equivalent picture of a slow freeze — a very cold and slowly evolving universe. In the freeze picture the masses of elementary particles increase and the gravitational constant decreases with cosmic time, while the Newtonian attraction remains unchanged. The freeze and big bang pictures both describe the same observations or physical reality. We present a simple “crossover model” without a big bang singularity. In the infinite past space–time is flat. Our model is compatible with present observations, describing the generation of primordial density fluctuations during inflation as well as the present transition to a dark energy-dominated universe.

  17. Pre-big bang cosmology and quantum fluctuations

    International Nuclear Information System (INIS)

    Ghosh, A.; Pollifrone, G.; Veneziano, G.

    2000-01-01

    The quantum fluctuations of a homogeneous, isotropic, open pre-big bang model are discussed. By solving exactly the equations for tensor and scalar perturbations, we find that particle production is negligible during the perturbative pre-big bang phase.

  18. Analysis of Big Data Maturity Stage in Hospitality Industry

    OpenAIRE

    Shabani, Neda; Munir, Arslan; Bose, Avishek

    2017-01-01

    Big data analytics has an extremely significant impact on many areas in all businesses and industries, including hospitality. This study aims to guide information technology (IT) professionals in hospitality on their big data expedition. In particular, the purpose of this study is to identify the maturity stage of big data in the hospitality industry in an objective way, so that hotels are able to understand their progress and realize what it will take to get to the next stage of big data matur...

  19. A Multidisciplinary Perspective of Big Data in Management Research

    OpenAIRE

    Sheng, Jie; Amankwah-Amoah, J.; Wang, X.

    2017-01-01

    In recent years, big data has emerged as one of the prominent buzzwords in business and management. In spite of the mounting body of research on big data across the social science disciplines, scholars have offered little synthesis on the current state of knowledge. To take stock of academic research that contributes to the big data revolution, this paper tracks scholarly work's perspectives on big data in the management domain over the past decade. We identify key themes emerging in manageme...

  20. An embedding for the big bang

    Science.gov (United States)

    Wesson, Paul S.

    1994-01-01

    A cosmological model is given that has good physical properties for the early and late universe but is a hypersurface in a flat five-dimensional manifold. The big bang can therefore be regarded as an effect of a choice of coordinates in a truncated higher-dimensional geometry. Thus the big bang is in some sense a geometrical illusion.