WorldWideScience

Sample records for sampling system big

  1. Sampling Operations on Big Data

    Science.gov (United States)

    Gadepally, Vijay; Herr, Taylor; Johnson, Luke; Milechin, Lauren; Milosavljevic, Maja; Miller, Benjamin A.

    2015-11-29

    …categories. These include edge sampling methods, where edges are selected by predetermined criteria, and snowball sampling methods, where algorithms start… …process and disseminate information for discovery and exploration under real-time constraints. Common signal processing operations such as sampling and…
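    As an illustration of the snowball idea named in the snippet, here is a minimal sketch (not the report's own code), assuming Python with the networkx package; the graph, seed, and size cap are arbitrary assumptions:

        import networkx as nx
        from collections import deque

        def snowball_sample(G, seed, max_nodes=200):
            """Expand a BFS frontier from a seed node until the cap is reached."""
            visited = {seed}
            frontier = deque([seed])
            while frontier and len(visited) < max_nodes:
                node = frontier.popleft()
                for nbr in G.neighbors(node):
                    if nbr not in visited:
                        visited.add(nbr)
                        frontier.append(nbr)
                        if len(visited) >= max_nodes:
                            break
            return G.subgraph(visited).copy()

        G = nx.barabasi_albert_graph(10000, 3)   # stand-in for a big graph
        S = snowball_sample(G, seed=0)
        print(S.number_of_nodes(), S.number_of_edges())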

  2. The Berlin Inventory of Gambling behavior - Screening (BIG-S): Validation using a clinical sample.

    Science.gov (United States)

    Wejbera, Martin; Müller, Kai W; Becker, Jan; Beutel, Manfred E

    2017-05-18

    Published diagnostic questionnaires for gambling disorder in German are either based on DSM-III criteria or focus on aspects other than lifetime prevalence. This study was designed to assess the usability of the DSM-IV-based Berlin Inventory of Gambling Behavior - Screening (BIG-S) in a clinical sample and to adapt it to DSM-5 criteria. In a sample of 432 patients presenting for behavioral addiction assessment at the University Medical Center Mainz, we checked the screening tool's results against clinical diagnosis and compared a subsample of n=300 clinically diagnosed gambling disorder patients with a comparison group of n=132. The BIG-S produced a sensitivity of 99.7% and a specificity of 96.2%. The instrument's unidimensionality and the diagnostic improvements of DSM-5 criteria were verified by exploratory and confirmatory factor analysis as well as receiver operating characteristic analysis. The BIG-S is a reliable and valid screening tool for gambling disorder and demonstrated its concise and comprehensible operationalization of current DSM-5 criteria in a clinical setting.

  3. Benchmarking Big Data Systems and the BigData Top100 List.

    Science.gov (United States)

    Baru, Chaitanya; Bhandarkar, Milind; Nambiar, Raghunath; Poess, Meikel; Rabl, Tilmann

    2013-03-01

    "Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TCP), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.

  4. Proxy Graph: Visual Quality Metrics of Big Graph Sampling.

    Science.gov (United States)

    Nguyen, Quan Hoang; Hong, Seok-Hee; Eades, Peter; Meidiana, Amyra

    2017-06-01

    Data sampling has been extensively studied for large-scale graph mining. Many analyses and tasks become more efficient when performed on graph samples of much smaller size. The use of proxy objects is common in software engineering for analysis and interaction with heavy objects or systems. In this paper, we coin the term 'proxy graph' and empirically investigate how well a proxy graph visualization can represent a big graph. Our investigation focuses on proxy graphs obtained by sampling; this is one of the most common proxy approaches. Despite the plethora of data sampling studies, this is the first evaluation of sampling in the context of graph visualization. For an objective evaluation, we propose a new family of quality metrics for visual quality of proxy graphs. Our experiments cover popular sampling techniques. Our experimental results lead to guidelines for using sampling-based proxy graphs in visualization.
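    The paper's metrics target visual quality of the drawing itself; as a rough stand-in illustration (an assumption, not the authors' metrics), one can at least compare a structural property of a sampled proxy against the original, here degree distributions via a Kolmogorov-Smirnov statistic, assuming networkx and scipy:

        import random
        import networkx as nx
        from scipy.stats import ks_2samp

        def random_node_sample(G, fraction=0.1, seed=42):
            rng = random.Random(seed)
            k = max(1, int(fraction * G.number_of_nodes()))
            return G.subgraph(rng.sample(list(G.nodes()), k)).copy()

        G = nx.barabasi_albert_graph(5000, 4)
        proxy = random_node_sample(G)
        stat, _ = ks_2samp([d for _, d in G.degree()],
                           [d for _, d in proxy.degree()])
        print(f"KS distance between degree distributions: {stat:.3f}")  # lower = closer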

  5. Big Data: Implications for Health System Pharmacy.

    Science.gov (United States)

    Stokes, Laura B; Rogers, Joseph W; Hertig, John B; Weber, Robert J

    2016-07-01

    Big Data refers to datasets so large and complex that they cannot be collected, shared, and analyzed with traditional methods and hardware. Big Data that is accurate leads to more confident decision making, improved operational efficiency, and reduced costs. The rapid growth of health care information results in Big Data around health services, treatments, and outcomes, and Big Data can be used to analyze the benefit of health system pharmacy services. The goal of this article is to provide a perspective on how Big Data can be applied to health system pharmacy. It will define Big Data, describe the impact of Big Data on population health, review specific implications of Big Data in health system pharmacy, and describe an approach for pharmacy leaders to effectively use Big Data. A few strategies involved in managing Big Data in health system pharmacy include identifying potential opportunities for Big Data, prioritizing those opportunities, protecting privacy concerns, promoting data transparency, and communicating outcomes. As health care information expands in its content and becomes more integrated, Big Data can enhance the development of patient-centered pharmacy services.

  6. Big Data, Small Sample.

    Science.gov (United States)

    Gerlovina, Inna; van der Laan, Mark J; Hubbard, Alan

    2017-05-20

    Multiple comparisons and small sample size, common characteristics of many types of "Big Data" including those that are produced by genomic studies, present specific challenges that affect reliability of inference. Use of multiple testing procedures necessitates calculation of very small tail probabilities of a test statistic distribution. Results based on large deviation theory provide a formal condition that is necessary to guarantee error rate control given practical sample sizes, linking the number of tests and the sample size; this condition, however, is rarely satisfied. Using methods that are based on Edgeworth expansions (relying especially on the work of Peter Hall), we explore the impact of departures of sampling distributions from typical assumptions on actual error rates. Our investigation illustrates how far the actual error rates can be from the declared nominal levels, suggesting potentially widespread problems with error rate control, specifically excessive false positives. This is an important factor that contributes to the "reproducibility crisis". We also review some other commonly used methods (such as permutation and methods based on finite sampling inequalities) in their application to multiple testing/small sample data. We point out that Edgeworth expansions, providing higher order approximations to the sampling distribution, offer a promising direction for data analysis that could improve reliability of studies relying on large numbers of comparisons with modest sample sizes.
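    For context, a standard result underlying the article's approach (not quoted from it): the first-order Edgeworth expansion of the distribution of the standardized sample mean T_n is

        P(T_n \le x) = \Phi(x) + \frac{\gamma}{6\sqrt{n}}\,(1 - x^2)\,\phi(x) + O(n^{-1}),

    where \Phi and \phi are the standard normal CDF and density and \gamma is the skewness of the underlying distribution. Since the leading correction decays only at rate n^{-1/2}, the extreme tail probabilities demanded by stringent multiple-testing thresholds can be poorly approximated by the normal limit at modest sample sizes.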

  7. Big Data in the Earth Observing System Data and Information System

    Science.gov (United States)

    Lynnes, Chris; Baynes, Katie; McInerney, Mark

    2016-01-01

    Approaches that are being pursued for the Earth Observing System Data and Information System (EOSDIS) data system to address the challenges of Big Data were presented to the NASA Big Data Task Force. Cloud prototypes are underway to tackle the volume challenge of Big Data. However, advances in computer hardware or cloud computing won't help (much) with variety. Rather, interoperability standards, conventions, and community engagement are the key to addressing variety.

  8. On the Performance Evaluation of Big Data Systems

    OpenAIRE

    Pirzadeh, Pouria

    2015-01-01

    Big Data is turning to be a key basis for the competition and growth among various businesses. The emerging need to store and process huge volumes of data has resulted in the appearance of different Big Data serving systems with fundamental differences. Big Data benchmarking is a means to assist users to pick the correct system to fulfill their applications' needs. It can also help the developers of these systems to make the correct decisions in building and extending them. While there have b...

  9. Adolescent Psychopathy and the Big Five: Results from Two Samples

    Science.gov (United States)

    Lynam, Donald R.; Caspi, Avshalom; Moffitt, Terrie E.; Raine, Adrian; Loeber, Rolf; Stouthamer-Loeber, Magda

    2005-01-01

    The present study examines the relation between psychopathy and the Big Five dimensions of personality in two samples of adolescents. Specifically, the study tests the hypothesis that the aspect of psychopathy representing selfishness, callousness, and interpersonal manipulation (Factor 1) is most strongly associated with low Agreeableness,…

  10. BigOP: Generating Comprehensive Big Data Workloads as a Benchmarking Framework

    OpenAIRE

    Zhu, Yuqing; Zhan, Jianfeng; Weng, Chuliang; Nambiar, Raghunath; Zhang, Jinchao; Chen, Xingzhen; Wang, Lei

    2014-01-01

    Big Data is considered a proprietary asset of companies, organizations, and even nations. Turning big data into real treasure requires the support of big data systems. A variety of commercial and open source products have been unleashed for big data storage and processing. While big data users are facing the choice of which system best suits their needs, big data system developers are facing the question of how to evaluate their systems with regard to general big data processing needs. System b...

  11. The relationship of trait emotional intelligence with the Big Five in Croatian and Slovene university student samples

    Directory of Open Access Journals (Sweden)

    Andreja Avsec

    2009-11-01

    The aim of the study was to examine the relationship between trait emotional intelligence (EI) and the Big Five factors of personality in two samples of Croatian and Slovenian university students. If EI is to be of significant value, it must measure something unique and distinct from standard personality traits. The Croatian sample consisted of 257 undergraduate students from the Universities of Rijeka and Osijek, and the Slovene sample of 171 undergraduate students from the University of Ljubljana. Participants filled out the Emotional Skills and Competences Questionnaire (ESCQ; Takšić, 1998) and the Big Five Inventory (BFI; John, Donahue, & Kentle, 1991). After controlling for nationality and gender, the Big Five explained up to 33% of the variance of EI. For the Perceive and Understand Emotions Scale, only openness and extraversion explain an important part of the variance; for the Express and Label Emotions Scale, extraversion and conscientiousness are important predictors. The Big Five traits explain the highest proportion of the variance in the Manage and Regulate Emotion Scale; neuroticism is the strongest predictor, but extraversion and conscientiousness also predict an important part of the variance. Although high, this percentage of explained variance does not put in question the discriminant validity of the EI questionnaire.
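    As a schematic of the kind of analysis described (hypothetical file and column names, not the study's code), the incremental variance explained by the Big Five after nationality and gender could be computed with statsmodels:

        import pandas as pd
        import statsmodels.formula.api as smf

        df = pd.read_csv("ei_bigfive.csv")  # hypothetical: EI scales, Big Five, demographics

        base = smf.ols("manage_regulate ~ C(nationality) + C(gender)", data=df).fit()
        full = smf.ols("manage_regulate ~ C(nationality) + C(gender) + neuroticism"
                       " + extraversion + openness + agreeableness + conscientiousness",
                       data=df).fit()
        print(f"Incremental R^2 of the Big Five: {full.rsquared - base.rsquared:.2f}")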

  12. BigDataBench: a Big Data Benchmark Suite from Internet Services

    OpenAIRE

    Wang, Lei; Zhan, Jianfeng; Luo, Chunjie; Zhu, Yuqing; Yang, Qiang; He, Yongqiang; Gao, Wanling; Jia, Zhen; Shi, Yingjie; Zhang, Shujie; Zheng, Chen; Lu, Gang; Zhan, Kent; Li, Xiaona; Qiu, Bizhu

    2014-01-01

    As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data systems, big data benchmarks must include diversity of data and workloads. Most of the state-of-the-art big data benchmarking efforts target evaluating specific types of applications or system software stacks, and hence they are not qualified for serving the purpo...

  13. Issues in Big-Data Database Systems

    Science.gov (United States)

    2014-06-01

    …that big data will not be manageable using conventional relational database technology, and it is true that alternative paradigms, such as NoSQL systems and search engines, have much to offer… …scale well, and because integration with external data sources is so difficult. NoSQL systems are more open to this integration, and provide excellent…

  14. A Proposed Concentration Curriculum Design for Big Data Analytics for Information Systems Students

    Science.gov (United States)

    Molluzzo, John C.; Lawler, James P.

    2015-01-01

    Big Data is becoming a critical component of the Information Systems curriculum. Educators are gradually enhancing the concentration curriculum for Big Data in schools of computer science and information systems. This paper proposes a creative curriculum design for Big Data Analytics for a program at a major metropolitan university. The design…

  15. Survey of real-time processing systems for big data

    DEFF Research Database (Denmark)

    Liu, Xiufeng; Iftikhar, Nadeem; Xie, Xike

    2014-01-01

    In recent years, real-time processing and analytics systems for big data, in the context of Business Intelligence (BI), have received growing attention. Traditional BI platforms that perform regular updates on a daily, weekly or monthly basis are no longer adequate for fast-changing business environments. However, due to the nature of big data, it has become a challenge to achieve real-time capability using traditional technologies. The recent distributed computing technology, MapReduce, provides off-the-shelf high scalability that can significantly shorten the processing time for big data; its open-source implementation, Hadoop, has become the de facto standard for processing big data. However, Hadoop is limited in its support for real-time updates. The improvements in Hadoop for real-time capability, and the other alternative real-time frameworks, have been…

  16. Database Resources of the BIG Data Center in 2018.

    Science.gov (United States)

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Big Surveys, Big Data Centres

    Science.gov (United States)

    Schade, D.

    2016-06-01

    Well-designed astronomical surveys are powerful and have consistently been keystones of scientific progress. The Byurakan Surveys using a Schmidt telescope with an objective prism produced a list of about 3000 UV-excess Markarian galaxies, and these objects have stimulated an enormous amount of further study and appear in over 16,000 publications. The CFHT Legacy Surveys used a wide-field imager to cover thousands of square degrees and those surveys are mentioned in over 1100 publications since 2002. Both ground and space-based astronomy have been increasing their investments in survey work. Survey instrumentation strives toward fair samples and large sky coverage and therefore strives to produce massive datasets. Thus we are faced with the "big data" problem in astronomy. Survey datasets require specialized approaches to data management. Big data places additional challenging requirements for data management. If the term "big data" is defined as data collections that are too large to move then there are profound implications for the infrastructure that supports big data science. The current model of data centres is obsolete. In the era of big data the central problem is how to create architectures that effectively manage the relationship between data collections, networks, processing capabilities, and software, given the science requirements of the projects that need to be executed. A stand-alone data silo cannot support big data science. I'll describe the current efforts of the Canadian community to deal with this situation and our successes and failures. I'll talk about how we are planning in the next decade to try to create a workable and adaptable solution to support big data science.

  18. Big Data and medicine: a big deal?

    Science.gov (United States)

    Mayer-Schönberger, V; Ingelsson, E

    2018-05-01

    Big Data promises huge benefits for medical research. Looking beyond superficial increases in the amount of data collected, we identify three key areas where Big Data differs from conventional analyses of data samples: (i) data are captured more comprehensively relative to the phenomenon under study; this reduces some bias but surfaces important trade-offs, such as between data quantity and data quality; (ii) data are often analysed using machine learning tools, such as neural networks rather than conventional statistical methods resulting in systems that over time capture insights implicit in data, but remain black boxes, rarely revealing causal connections; and (iii) the purpose of the analyses of data is no longer simply answering existing questions, but hinting at novel ones and generating promising new hypotheses. As a consequence, when performed right, Big Data analyses can accelerate research. Because Big Data approaches differ so fundamentally from small data ones, research structures, processes and mindsets need to adjust. The latent value of data is being reaped through repeated reuse of data, which runs counter to existing practices not only regarding data privacy, but data management more generally. Consequently, we suggest a number of adjustments such as boards reviewing responsible data use, and incentives to facilitate comprehensive data sharing. As data's role changes to a resource of insight, we also need to acknowledge the importance of collecting and making data available as a crucial part of our research endeavours, and reassess our formal processes from career advancement to treatment approval. © 2017 The Association for the Publication of the Journal of Internal Medicine.

  19. BigDansing

    KAUST Repository

    Khayyat, Zuhair

    2015-06-02

    Data cleansing approaches have usually focused on detecting and fixing errors with little attention to scaling to big datasets. This presents a serious impediment since data cleansing often involves costly computations such as enumerating pairs of tuples, handling inequality joins, and dealing with user-defined functions. In this paper, we present BigDansing, a Big Data Cleansing system to tackle efficiency, scalability, and ease-of-use issues in data cleansing. The system can run on top of most common general purpose data processing platforms, ranging from DBMSs to MapReduce-like frameworks. A user-friendly programming interface allows users to express data quality rules both declaratively and procedurally, with no requirement of being aware of the underlying distributed platform. BigDansing translates these rules into a series of transformations that enable distributed computations and several optimizations, such as shared scans and specialized join operators. Experimental results on both synthetic and real datasets show that BigDansing outperforms existing baseline systems by up to more than two orders of magnitude without sacrificing the quality provided by the repair algorithms.
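    To make the notion of a declarative quality rule concrete, here is a tiny pandas sketch (illustrative only; BigDansing's actual interface and distributed engine are not shown) that finds violations of a functional dependency zipcode -> city:

        import pandas as pd

        df = pd.DataFrame({
            "zipcode": ["10001", "10001", "60601", "60601"],
            "city":    ["New York", "New York", "Chicago", "Evanston"],
        })

        # A zipcode mapping to more than one city violates the dependency.
        cities_per_zip = df.groupby("zipcode")["city"].nunique()
        bad_zips = cities_per_zip[cities_per_zip > 1].index
        print(df[df["zipcode"].isin(bad_zips)])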

  20. Reflection on Quality Assurance System of Higher Vocational Education under Big Data Era

    Directory of Open Access Journals (Sweden)

    Jiang Xinlan

    2015-01-01

    Big data is characterized by volume, variety, value and velocity. The big data era brings new opportunities and challenges for the construction of the Chinese quality assurance system for higher vocational education. The current system suffers from problems such as an incomplete set of responsible bodies, the absence of an integrated internal and external quality assurance system, unscientific assurance standards and insufficient investment in assurance. Building the system in the big data era therefore requires a change in the idea behind its construction, a move toward multiple responsible bodies and multiple layers, and a stronger information platform underpinning quality assurance.

  1. Infectious Disease Surveillance in the Big Data Era: Towards Faster and Locally Relevant Systems

    Science.gov (United States)

    Simonsen, Lone; Gog, Julia R.; Olson, Don; Viboud, Cécile

    2016-01-01

    While big data have proven immensely useful in fields such as marketing and earth sciences, public health is still relying on more traditional surveillance systems and awaiting the fruits of a big data revolution. A new generation of big data surveillance systems is needed to achieve rapid, flexible, and local tracking of infectious diseases, especially for emerging pathogens. In this opinion piece, we reflect on the long and distinguished history of disease surveillance and discuss recent developments related to use of big data. We start with a brief review of traditional systems relying on clinical and laboratory reports. We then examine how large-volume medical claims data can, with great spatiotemporal resolution, help elucidate local disease patterns. Finally, we review efforts to develop surveillance systems based on digital and social data streams, including the recent rise and fall of Google Flu Trends. We conclude by advocating for increased use of hybrid systems combining information from traditional surveillance and big data sources, which seems the most promising option moving forward. Throughout the article, we use influenza as an exemplar of an emerging and reemerging infection which has traditionally been considered a model system for surveillance and modeling. PMID:28830112

  2. Big data in complex systems challenges and opportunities

    CERN Document Server

    Azar, Ahmad; Snasael, Vaclav; Kacprzyk, Janusz; Abawajy, Jemal

    2015-01-01

    This volume provides updated, in-depth material on the application of big data to complex systems, in order to find solutions to the challenges and problems facing big data applications. Much data today is not natively in structured format; for example, tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search. Transforming such content into a structured format for later analysis is therefore a major challenge. Data analysis, organization, retrieval, and modeling are other foundational challenges treated in this book. The material will be useful for researchers and practitioners in the field of big data as well as advanced undergraduate and graduate students. Each of the 17 chapters opens with an abstract and a key terms list. The chapters are organized along the lines of problem description, related works, and analysis of the results and ...

  3. Big Data and Public Health Systems: Issues and Opportunities

    Directory of Open Access Journals (Sweden)

    David Rojas

    2018-03-01

    Over the last few years, the need to change the current model of European public health systems has been repeatedly addressed in order to ensure their sustainability. Along this line, IT has always been referred to as one of the key instruments for enhancing the information management processes of healthcare organizations, thus contributing to the improvement and evolution of health systems. In the IT field, Big Data solutions are expected to play a main role, since they are designed to handle huge amounts of information in a fast and efficient way, allowing users to make important decisions quickly. This article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the challenges that these health systems are currently facing, and the possible contributions of Big Data solutions to this field. To that end, the authors share their professional experience with the Spanish public health system and review the existing literature related to this topic.

  4. Pillow seal system at the BigRIPS separator

    Energy Technology Data Exchange (ETDEWEB)

    Tanaka, K., E-mail: ktanaka@riken.jp; Inabe, N.; Yoshida, K.; Kusaka, K.; Kubo, T.

    2013-12-15

    Highlights: • A pillow seal system has been installed for a high-intensity RI-beam facility at RIKEN. • It is aimed at facilitating remote maintenance under high residual radiation. • Local radiation shields are integrated with one of the pillow seals. • Pillow seals have been aligned to the beam axis within 1 mm accuracy. • A leakage rate of 10⁻⁹ Pa·m³/s has been achieved with our pillow seal system. -- Abstract: We have designed and installed a pillow seal system for the BigRIPS fragment separator at the RIKEN Radioactive Isotope Beam Factory (RIBF) to facilitate remote maintenance in a radioactive environment. The pillow seal system is a device to connect a vacuum chamber and a beam tube. It allows quick attachment and detachment of vacuum connections in the BigRIPS separator and consists of a double diaphragm with a differential pumping system. The leakage rate achieved with this system is as low as 10⁻⁹ Pa·m³/s. We have also designed and installed a local radiation-shielding system, integrated with the pillow seal system, to protect the superconducting magnets and to reduce the heat load on the cryogenic system. We present an overview of the pillow seal and the local shielding systems.

  5. Networking for big data

    CERN Document Server

    Yu, Shui; Misic, Jelena; Shen, Xuemin (Sherman)

    2015-01-01

    Networking for Big Data supplies an unprecedented look at cutting-edge research on the networking and communication aspects of Big Data. Starting with a comprehensive introduction to Big Data and its networking issues, it offers deep technical coverage of both theory and applications.The book is divided into four sections: introduction to Big Data, networking theory and design for Big Data, networking security for Big Data, and platforms and systems for Big Data applications. Focusing on key networking issues in Big Data, the book explains network design and implementation for Big Data. It exa

  6. Big Data, Big Problems: A Healthcare Perspective.

    Science.gov (United States)

    Househ, Mowafa S; Aldosari, Bakheet; Alanazi, Abdullah; Kushniruk, Andre W; Borycki, Elizabeth M

    2017-01-01

    Much has been written on the benefits of big data for healthcare, such as improving patient outcomes, public health surveillance, and healthcare policy decisions. Over the past five years, Big Data, and the data sciences field in general, has been hyped as the "Holy Grail" for the healthcare industry, promising a more efficient healthcare system and improved healthcare outcomes. More recently, however, healthcare researchers have been exposing the potentially harmful effects Big Data can have on patient care, associating it with increased medical costs, patient mortality, and misguided decision making by clinicians and healthcare policy makers. In this paper, we review current Big Data trends with a specific focus on the inadvertent negative impacts that Big Data could have on healthcare in general and on patient and clinical care specifically. Our study results show that although Big Data is built up to be the "Holy Grail" for healthcare, small data techniques using traditional statistical methods are, in many cases, more accurate and can lead to better healthcare outcomes than Big Data methods. In sum, Big Data for healthcare may cause more problems than solutions for the healthcare industry, and in short, when it comes to the use of data in healthcare, "size isn't everything."

  7. Next Generation Workload Management and Analysis System for Big Data

    Energy Technology Data Exchange (ETDEWEB)

    De, Kaushik [Univ. of Texas, Arlington, TX (United States)

    2017-04-24

    We report on the activities and accomplishments of a four-year project (a three-year grant followed by a one-year no-cost extension) to develop a next generation workload management system for Big Data. The new system is based on the highly successful PanDA software developed for High Energy Physics (HEP) in 2005. PanDA is used by the ATLAS experiment at the Large Hadron Collider (LHC), and the AMS experiment at the space station. The program of work described here was carried out by two teams of developers working collaboratively at Brookhaven National Laboratory (BNL) and the University of Texas at Arlington (UTA). These teams worked closely with the original PanDA team – for the sake of clarity the work of the next generation team will be referred to as the BigPanDA project. Their work has led to the adoption of BigPanDA by the COMPASS experiment at CERN, and many other experiments and science projects worldwide.

  8. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data

    Science.gov (United States)

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks. PMID:29706880
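    A minimal stand-in sketch of the overall idea (random signal sampling before dictionary learning; assumed data shapes and scikit-learn, not the authors' structurally guided pipeline):

        import numpy as np
        from sklearn.decomposition import MiniBatchDictionaryLearning

        rng = np.random.default_rng(0)
        X = rng.standard_normal((20000, 200))   # stand-in: 20k fMRI signals x 200 time points

        idx = rng.choice(X.shape[0], size=2000, replace=False)  # sample 10% of the signals
        learner = MiniBatchDictionaryLearning(n_components=50, alpha=1.0,
                                              batch_size=256, random_state=0)
        D = learner.fit(X[idx]).components_     # dictionary learned on the sample only
        codes = learner.transform(X)            # sparse-code the full data set
        print(D.shape, codes.shape)             # (50, 200), (20000, 50)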

  9. A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data.

    Science.gov (United States)

    Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming

    2018-01-01

    The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully explored yet due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases, e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size but without losing important information becomes a more and more pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally-guided dictionary learning and sparse coding of whole brain's fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve more than 15 times speed-up without sacrificing the accuracy in identifying task-evoked functional brain networks.

  10. Harnessing Big Data for Systems Pharmacology.

    Science.gov (United States)

    Xie, Lei; Draizen, Eli J; Bourne, Philip E

    2017-01-06

    Systems pharmacology aims to holistically understand mechanisms of drug actions to support drug discovery and clinical practice. Systems pharmacology modeling (SPM) is data driven. It integrates an exponentially growing amount of data at multiple scales (genetic, molecular, cellular, organismal, and environmental). The goal of SPM is to develop mechanistic or predictive multiscale models that are interpretable and actionable. The current explosions in genomics and other omics data, as well as the tremendous advances in big data technologies, have already enabled biologists to generate novel hypotheses and gain new knowledge through computational models of genome-wide, heterogeneous, and dynamic data sets. More work is needed to interpret and predict a drug response phenotype, which is dependent on many known and unknown factors. To gain a comprehensive understanding of drug actions, SPM requires close collaborations between domain experts from diverse fields and integration of heterogeneous models from biophysics, mathematics, statistics, machine learning, and the semantic web. This creates challenges in model management, model integration, model translation, and knowledge integration. In this review, we discuss several emergent issues in SPM and potential solutions using big data technology and analytics. The concurrent development of high-throughput techniques, cloud computing, data science, and the semantic web will likely allow SPM to be findable, accessible, interoperable, reusable, reliable, interpretable, and actionable.

  11. The Design of Intelligent Repair Welding Mechanism and Relative Control System of Big Gear

    Directory of Open Access Journals (Sweden)

    Hong-Yu LIU

    2014-10-01

    Effective repair of worn big gears has a large influence on ensuring production safety and enhancing economic benefits. An intelligent repair-welding method was put forward, mainly aimed at the high production cost, long production cycle and labor-intensive manual repair welding that constrain big-gear restoration. A big-gear repair-welding mechanism was designed in this paper, and its working principle and part selection are introduced. A three-dimensional model of the mechanism was constructed with the Pro/E design software. Three-dimensional motions are realized by motors driving ball screws. Exploiting the involute gear profile, the complicated curve motion over the curved gear surface can be transformed into linear motion by orientation, so that repair welding of the worn gear area can be carried out. For the control system, Siemens S7-200 series hardware was chosen, with Siemens STEP7 programming software as the design tool. The entire repair-welding process was verified by simulation experiments. This provides a practical and feasible method for the intelligent repair welding of big worn gears.

  12. Big Data challenges and solutions in building the Global Earth Observation System of Systems (GEOSS)

    Science.gov (United States)

    Mazzetti, Paolo; Nativi, Stefano; Santoro, Mattia; Boldrini, Enrico

    2014-05-01

    The Group on Earth Observation (GEO) is a voluntary partnership of governments and international organizations launched in response to calls for action by the 2002 World Summit on Sustainable Development and by the G8 (Group of Eight) leading industrialized countries. These high-level meetings recognized that international collaboration is essential for exploiting the growing potential of Earth observations to support decision making in an increasingly complex and environmentally stressed world. To this aim, GEO is constructing the Global Earth Observation System of Systems (GEOSS) on the basis of a 10-Year Implementation Plan for the period 2005 to 2015, when it will become operational. As a large-scale integrated system handling large datasets such as those provided by Earth Observation, GEOSS needs to face several challenges related to big data handling and big data infrastructure management. Referring to the traditional multiple-V characteristics of Big Data (volume, variety, velocity, veracity and visualization), it is evident that most of them can be found in data handled by GEOSS. In particular, concerning Volume, Earth Observation already generates a large amount of data, which can be estimated in the range of Petabytes (10¹⁵ bytes), with Exabytes (10¹⁸ bytes) already targeted. Moreover, the challenge is related not only to the data size, but also to the large number of datasets (not necessarily of big size) that systems need to manage. Variety is the other main challenge, since datasets coming from different sensors, processed for different use-cases, are published with highly heterogeneous metadata and data models, through different service interfaces. Innovative multidisciplinary applications need to access and use those datasets in a harmonized way. Moreover, Earth Observation data are growing in size and variety at an exceptionally fast rate, and new technologies and applications, including crowdsourcing, will further increase data volume and variety in the near future…

  13. Understanding Big Data for Industrial Innovation and Design: The Missing Information Systems Perspective

    Directory of Open Access Journals (Sweden)

    Miguel Baptista Nunes

    2017-12-01

    This paper identifies a need to complement the current rich technical and mathematical research agenda on big data with an information systems and information science strand that focuses on the business value of big data. Such an agenda would explore motives for using big data in real organizational contexts, and consider proposed benefits such as increased effectiveness and efficiency, production of high-quality products and services, creation of added business value, and stimulation of innovation and design. Impacts of such research on the academic community, the industrial and business world, and policy-makers are discussed.

  14. Infectious Disease Surveillance in the Big Data Era: Towards Faster and Locally Relevant Systems.

    Science.gov (United States)

    Simonsen, Lone; Gog, Julia R; Olson, Don; Viboud, Cécile

    2016-12-01

    While big data have proven immensely useful in fields such as marketing and earth sciences, public health is still relying on more traditional surveillance systems and awaiting the fruits of a big data revolution. A new generation of big data surveillance systems is needed to achieve rapid, flexible, and local tracking of infectious diseases, especially for emerging pathogens. In this opinion piece, we reflect on the long and distinguished history of disease surveillance and discuss recent developments related to use of big data. We start with a brief review of traditional systems relying on clinical and laboratory reports. We then examine how large-volume medical claims data can, with great spatiotemporal resolution, help elucidate local disease patterns. Finally, we review efforts to develop surveillance systems based on digital and social data streams, including the recent rise and fall of Google Flu Trends. We conclude by advocating for increased use of hybrid systems combining information from traditional surveillance and big data sources, which seems the most promising option moving forward. Throughout the article, we use influenza as an exemplar of an emerging and reemerging infection which has traditionally been considered a model system for surveillance and modeling. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  15. Design of Flow Big Data System Based on Smart Pipeline Theory

    Directory of Open Access Journals (Sweden)

    Zhang Jianqing

    2017-01-01

    As telecom operators build more and more smart pipelines, using big data technology to analyze and process the huge amounts of data those pipelines generate has become an inevitable trend. Smart-pipeline data describe operations and sales; an operator's pipeline flow data create value for e-commerce business forms and business models in the mobile e-business environment. The smart pipeline is the third dimension of the three-dimensional pipeline mobile e-commerce system, and the intelligent operation dimension makes mobile e-business three-dimensional. This paper discusses smart pipeline theory, the smart-pipeline flow big data system, and their system framework and core technology.

  16. Big data system for disaster warning of solar greenhouse vegetables

    OpenAIRE

    Li, M; Zhao, L; Chen, M; Wen, D; Liu, R; Yang, X

    2017-01-01

    Background: Solar greenhouses are very popular in the north of China as a way of meeting the demand for fresh local winter vegetables. Nonetheless, they are more susceptible to biological and meteorological disasters, such as diseases, pests, fog, haze and cold temperatures. Although much record-keeping equipment and many weather stations have been deployed, the resulting data are used inefficiently. Big data has great potential in the future. Thus, our aim is to investigate a big data system for di...

  17. Big-Data Based Decision-Support Systems to Improve Clinicians' Cognition.

    Science.gov (United States)

    Roosan, Don; Samore, Matthew; Jones, Makoto; Livnat, Yarden; Clutter, Justin

    2016-01-01

    Complex clinical decision-making could be facilitated by using population health data to inform clinicians. In two previous studies, we interviewed 16 infectious disease experts to understand complex clinical reasoning. For this study, we focused on the experts' answers on how clinical reasoning can be supported by population-based Big-Data. We found that cognitive strategies such as trajectory tracking, perspective taking, and metacognition have the potential to improve clinicians' cognition when dealing with complex problems. These cognitive strategies could be supported by population health data, and all have important implications for the design of Big-Data based decision-support tools that could be embedded in electronic health records. Our findings provide directions for task allocation and for the design of Big-Data based decision-support applications for the healthcare industry.

  18. Traffic Congestion Detection System through Connected Vehicles and Big Data

    Directory of Open Access Journals (Sweden)

    Néstor Cárdenas-Benítez

    2016-04-01

    This article discusses the simulation and evaluation of a traffic congestion detection system which combines inter-vehicular communications, fixed roadside infrastructure, infrastructure-to-infrastructure connectivity and big data. The system permits drivers to identify traffic congestion and change their routes accordingly, thus reducing the total emissions of CO2 and decreasing travel time. It monitors, processes and stores large amounts of data, which can detect traffic congestion in a precise way by means of a series of algorithms that reduce localized vehicular emissions by rerouting vehicles. To simulate and evaluate the proposed system, a big data cluster was developed based on Cassandra, which was used in tandem with the OMNeT++ discrete event network simulator, coupled with the SUMO (Simulation of Urban MObility) traffic simulator and the Veins vehicular network framework. The results validate the efficiency of the traffic detection system and its positive impact in detecting, reporting and rerouting traffic when traffic events occur.

  19. Traffic Congestion Detection System through Connected Vehicles and Big Data.

    Science.gov (United States)

    Cárdenas-Benítez, Néstor; Aquino-Santos, Raúl; Magaña-Espinoza, Pedro; Aguilar-Velazco, José; Edwards-Block, Arthur; Medina Cass, Aldo

    2016-04-28

    This article discusses the simulation and evaluation of a traffic congestion detection system which combines inter-vehicular communications, fixed roadside infrastructure and infrastructure-to-infrastructure connectivity and big data. The system discussed in this article permits drivers to identify traffic congestion and change their routes accordingly, thus reducing the total emissions of CO₂ and decreasing travel time. This system monitors, processes and stores large amounts of data, which can detect traffic congestion in a precise way by means of a series of algorithms that reduce localized vehicular emissions by rerouting vehicles. To simulate and evaluate the proposed system, a big data cluster was developed based on Cassandra, which was used in tandem with the OMNeT++ discrete event network simulator, coupled with the SUMO (Simulation of Urban MObility) traffic simulator and the Veins vehicular network framework. The results validate the efficiency of the traffic detection system and its positive impact in detecting, reporting and rerouting traffic when traffic events occur.
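    As a toy sketch of the detection idea only (the threshold and window size are assumptions; the paper's algorithms, Cassandra cluster and simulators are not reproduced here), a segment can be flagged as congested when the rolling mean speed of recent vehicle reports drops below a cutoff:

        from collections import defaultdict, deque

        SPEED_THRESHOLD_KMH = 20        # assumed congestion cutoff
        WINDOW = 50                     # recent reports kept per road segment

        recent = defaultdict(lambda: deque(maxlen=WINDOW))

        def report(segment_id, speed_kmh):
            """Ingest one vehicle speed report; return True once the segment looks congested."""
            w = recent[segment_id]
            w.append(speed_kmh)
            return len(w) == WINDOW and sum(w) / len(w) < SPEED_THRESHOLD_KMH

        for _ in range(60):             # a burst of slow reports trips the detector
            congested = report("segment-42", 12.0)
        print(congested)                # True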

  20. Big-Data Based Decision-Support Systems to Improve Clinicians’ Cognition

    Science.gov (United States)

    Roosan, Don; Samore, Matthew; Jones, Makoto; Livnat, Yarden; Clutter, Justin

    2016-01-01

    Complex clinical decision-making could be facilitated by using population health data to inform clinicians. In two previous studies, we interviewed 16 infectious disease experts to understand complex clinical reasoning. For this study, we focused on the experts' answers on how clinical reasoning can be supported by population-based Big-Data. We found that cognitive strategies such as trajectory tracking, perspective taking, and metacognition have the potential to improve clinicians' cognition when dealing with complex problems. These cognitive strategies could be supported by population health data, and all have important implications for the design of Big-Data based decision-support tools that could be embedded in electronic health records. Our findings provide directions for task allocation and for the design of Big-Data based decision-support applications for the healthcare industry. PMID:27990498

  1. Analyses of Digman's child-personality data: derivation of Big-Five factor scores from each of six samples.

    Science.gov (United States)

    Goldberg, L R

    2001-10-01

    One of the world's richest collections of teacher descriptions of elementary-school children was obtained by John M. Digman from 1959 to 1967 in schools on two Hawaiian islands. In six phases of data collection, 88 teachers described 2,572 of their students, using one of five different sets of personality variables. The present report provides findings from new analyses of these important data, which have never before been analyzed in a comprehensive manner. When factors developed from carefully selected markers of the Big-Five factor structure were compared to those based on the total set of variables in each sample, the congruence between both types of factors was quite high. Attempts to extend the structure to 6 and 7 factors revealed no other broad factors beyond the Big Five in any of the 6 samples. These robust findings provide significant new evidence for the structure of teacher-based assessments of child personality attributes.

  2. Big questions come in bundles, hence they should be tackled systemically.

    Science.gov (United States)

    Bunge, Mario

    2014-01-01

    A big problem is, by definition, one involving either multiple traits of a thing, or a collection of things. Because such problems are systemic or global rather than local or sectoral, they call for a systemic approach, and often a multidisciplinary one as well. Just think of individual and public health problems, or of income inequality, gender discrimination, housing, environmental, or political participation issues. All of them are inter-related, so that focusing on one of them at a time is bound to produce either short-term solutions or utter failure. This is also why single-issue political movements are bound to fail. In other words, big problems are systemic and must, therefore, be approached systemically--though with the provisos that systemism must not be confused with holism and that synthesis complements analysis instead of replacing it.

  3. Generalized formal model of Big Data

    OpenAIRE

    Shakhovska, N.; Veres, O.; Hirnyak, M.

    2016-01-01

    This article dwells on the basic characteristics of Big Data technologies. It analyzes existing definitions of the term "big data" and proposes and describes the elements of a generalized formal model of big data. The peculiarities of applying the proposed model components are analyzed, and the fundamental differences between Big Data technology and business analytics are described. Big Data is supported by the distributed file system Google File System ...

  4. The big data-big model (BDBM) challenges in ecological research

    Science.gov (United States)

    Luo, Y.

    2015-12-01

    The field of ecology has become a big-data science in the past decades due to the development of new sensors used in numerous studies in the ecological community. Many sensor networks have been established to collect data. For example, satellites, such as Terra and OCO-2 among others, have collected data relevant to the global carbon cycle. Thousands of field manipulative experiments have been conducted to examine feedback of the terrestrial carbon cycle to global changes. Networks of observations, such as FLUXNET, have measured land processes. In particular, the implementation of the National Ecological Observatory Network (NEON), which is designed to network different kinds of sensors at many locations over the nation, will generate large volumes of ecological data every day. The raw data from the sensors of those networks offer an unprecedented opportunity for accelerating advances in our knowledge of ecological processes, educating teachers and students, supporting decision-making, testing ecological theory, and forecasting changes in ecosystem services. Currently, ecologists do not have the infrastructure in place to synthesize massive yet heterogeneous data into resources for decision support. It is urgent to develop an ecological forecasting system that can make the best use of multiple sources of data to assess long-term biosphere change and anticipate future states of ecosystem services at regional and continental scales. Forecasting relies on big models that describe the major processes that underlie complex system dynamics. Ecological system models, despite greatly simplifying the real systems, are still complex in order to address real-world problems. For example, the Community Land Model (CLM) incorporates thousands of processes related to energy balance, hydrology, and biogeochemistry. Integration of massive data from multiple big data sources with complex models has to tackle Big Data-Big Model (BDBM) challenges. Those challenges include interoperability of multiple…

  5. Enhancing Trust Management for Wireless Intrusion Detection via Traffic Sampling in the Era of Big Data

    DEFF Research Database (Denmark)

    Meng, Weizhi; Li, Wenjuan; Su, Chunhua

    2017-01-01

    …many kinds of information among sensors, whereas such a network is vulnerable to a wide range of attacks, especially insider attacks, due to its natural environment and inherently unreliable transmission. To safeguard its security, intrusion detection systems (IDSs) are widely adopted in a WSN to defend against insider attacks through proper trust-based mechanisms. However, in the era of big data, sensors may generate excessive information and data, which could degrade the effectiveness of trust computation. In this paper, we focus on this challenge and propose a way of combining Bayesian-based trust management with traffic sampling for wireless intrusion detection under a hierarchical structure. In the evaluation, we investigate the performance of our approach in both a simulated and a real network environment. Experimental results demonstrate that packet-based trust management would become…
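    The classic Bayesian building block for such trust computation is a Beta posterior over a node's observed behavior; a generic sketch follows (a standard construction, not necessarily the paper's exact scheme):

        def beta_trust(successes: int, failures: int) -> float:
            """Expected trust = mean of the Beta(successes + 1, failures + 1) posterior."""
            return (successes + 1) / (successes + failures + 2)

        # Outcomes of sampled packets from a neighbor node: True = correctly forwarded.
        samples = [True, True, False, True, True, True, False, True]
        s = sum(samples)
        trust = beta_trust(s, len(samples) - s)
        print(f"trust = {trust:.2f}")   # e.g. flag the node as malicious below 0.5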

  6. [Big data, medical language and biomedical terminology systems].

    Science.gov (United States)

    Schulz, Stefan; López-García, Pablo

    2015-08-01

    A variety of rich terminology systems, such as thesauri, classifications, nomenclatures and ontologies support information and knowledge processing in health care and biomedical research. Nevertheless, human language, manifested as individually written texts, persists as the primary carrier of information, in the description of disease courses or treatment episodes in electronic medical records, and in the description of biomedical research in scientific publications. In the context of the discussion about big data in biomedicine, we hypothesize that the abstraction of the individuality of natural language utterances into structured and semantically normalized information facilitates the use of statistical data analytics to distil new knowledge out of textual data from biomedical research and clinical routine. Computerized human language technologies are constantly evolving and are increasingly ready to annotate narratives with codes from biomedical terminology. However, this depends heavily on linguistic and terminological resources. The creation and maintenance of such resources is labor-intensive. Nevertheless, it is sensible to assume that big data methods can be used to support this process. Examples include the learning of hierarchical relationships, the grouping of synonymous terms into concepts and the disambiguation of homonyms. Although clear evidence is still lacking, the combination of natural language technologies, semantic resources, and big data analytics is promising.

  7. What is beyond the big five?

    Science.gov (United States)

    Saucier, G; Goldberg, L R

    1998-08-01

    Previous investigators have proposed that various kinds of person-descriptive content--such as differences in attitudes or values, in sheer evaluation, in attractiveness, or in height and girth--are not adequately captured by the Big Five Model. We report on a rather exhaustive search for reliable sources of Big Five-independent variation in data from person-descriptive adjectives. Fifty-three candidate clusters were developed in a college sample using diverse approaches and sources. In a nonstudent adult sample, clusters were evaluated with respect to a minimax criterion: minimum multiple correlation with factors from Big Five markers and maximum reliability. The most clearly Big Five-independent clusters referred to Height, Girth, Religiousness, Employment Status, Youthfulness and Negative Valence (or low-base-rate attributes). Clusters referring to Fashionableness, Sensuality/Seductiveness, Beauty, Masculinity, Frugality, Humor, Wealth, Prejudice, Folksiness, Cunning, and Luck appeared to be potentially beyond the Big Five, although each of these clusters demonstrated Big Five multiple correlations of .30 to .45, and at least one correlation of .20 and over with a Big Five factor. Of all these content areas, Religiousness, Negative Valence, and the various aspects of Attractiveness were found to be represented by a substantial number of distinct, common adjectives. Results suggest directions for supplementing the Big Five when one wishes to extend variable selection outside the domain of personality traits as conventionally defined.
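    To illustrate the minimax criterion on synthetic data (hypothetical numbers, not the study's data), one can regress a candidate cluster score on Big Five marker factors and inspect the multiple correlation:

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(1)
        big5 = rng.standard_normal((500, 5))                     # five marker factor scores
        cluster = 0.2 * big5[:, 0] + rng.standard_normal(500)    # candidate cluster score

        R2 = LinearRegression().fit(big5, cluster).score(big5, cluster)
        print(f"multiple R with the Big Five: {R2 ** 0.5:.2f}")  # keep clusters with low R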

  8. Nursing Needs Big Data and Big Data Needs Nursing.

    Science.gov (United States)

    Brennan, Patricia Flatley; Bakken, Suzanne

    2015-09-01

    Contemporary big data initiatives in health care will benefit from greater integration with nursing science and nursing practice; in turn, nursing science and nursing practice have much to gain from the data science initiatives. Big data arises secondary to scholarly inquiry (e.g., -omics) and everyday observations like cardiac flow sensors or Twitter feeds. Emerging data science methods ensure that these data can be leveraged to improve patient care. Big data encompasses data that exceed human comprehension, that exist at a volume unmanageable by standard computer systems, that arrive at a velocity not under the control of the investigator, and that possess a level of imprecision not found in traditional inquiry. Data science methods are emerging to manage and gain insights from big data. The primary methods included investigation of emerging federal big data initiatives, and exploration of exemplars from nursing informatics research to benchmark where nursing is already poised to participate in the big data revolution. We provide observations and reflections on experiences in the emerging big data initiatives. Existing approaches to large data set analysis provide a necessary but not sufficient foundation for nursing to participate in the big data revolution. Nursing's Social Policy Statement guides a principled, ethical perspective on big data and data science. There are implications for basic and advanced practice clinical nurses in practice, for the nurse scientist who collaborates with data scientists, and for the nurse data scientist. Big data and data science have the potential to provide greater richness in understanding patient phenomena and in tailoring interventional strategies that are personalized to the patient. © 2015 Sigma Theta Tau International.

  9. The Prospect of Internet of Things and Big Data Analytics in Transportation System

    Science.gov (United States)

    Noori Hussein, Waleed; Kamarudin, L. M.; Hussain, Haider N.; Zakaria, A.; Badlishah Ahmed, R.; Zahri, N. A. H.

    2018-05-01

    Internet of Things (IoT), the new-dawn technology that describes how data, people and interconnected physical objects act based on communicated information, and big data analytics have been adopted by diverse domains for varying purposes. Manufacturing, agriculture, banking, oil and gas, healthcare, retail, hospitality, and food services are a few of the sectors that have adopted and massively utilized IoT and big data analytics. The transportation industry is also an early adopter, with significant attendant effects on its processes of shipment tracking, freight monitoring, and transparent warehousing. This is recorded in countries like England, Singapore, Portugal, and Germany, while Malaysia is currently assessing the potential and researching a purpose-driven adoption and implementation. This paper, based on a review of related literature, presents a summary of the inherent prospects of adopting IoT and big data analytics in the Malaysian transportation system. An efficient and safe port environment, predictive maintenance and remote management, and a boundary-less software platform and connected ecosystem, among others, are the inherent benefits of IoT and big data analytics for the Malaysian transportation system.

  10. A study and analysis of recommendation systems for location-based social network (LBSN) with big data

    Directory of Open Access Journals (Sweden)

    Murale Narayanan

    2016-03-01

    Recommender systems play an important role in our day-to-day life. A recommender system automatically suggests an item to a user that he/she might be interested in. Small-scale datasets are used to provide recommendations based on location, but in real time, the volume of data is large. We have selected the Foursquare dataset to study the need for big data in recommendation systems for location-based social networks (LBSNs). A few quality parameters like parallel processing and multimodal interface have been selected to study the need for big data in recommender systems. This paper provides a study and analysis of quality parameters of recommendation systems for LBSN with big data.

  11. Satellite Telemetry and Command using Big LEO Mobile Telecommunications Systems

    Science.gov (United States)

    Huegel, Fred

    1998-01-01

    Various issues associated with satellite telemetry and command using Big LEO mobile telecommunications systems are presented in viewgraph form. Specific topics include: 1) Commercial Satellite system overviews: Globalstar, ICO, and Iridium; 2) System capabilities and cost reduction; 3) Satellite constellations and contact limitations; 4) Capabilities of Globalstar, ICO and Iridium with emphasis on Globalstar; and 5) Flight transceiver issues and security.

  12. Evaluation of Data Management Systems for Geospatial Big Data

    OpenAIRE

    Amirian, Pouria; Basiri, Anahid; Winstanley, Adam C.

    2014-01-01

    Big Data encompasses collection, management, processing and analysis of the huge amount of data that varies in type and changes with high frequency. Often the data component of Big Data has a positional component as an important part of it, in various forms such as postal address, Internet Protocol (IP) address and geographical location. If the positional components in Big Data are extensively used in storage, retrieval, analysis, processing, visualization and knowledge discovery (geospatial Big Dat...

  13. Innovating Big Data Computing Geoprocessing for Analysis of Engineered-Natural Systems

    Science.gov (United States)

    Rose, K.; Baker, V.; Bauer, J. R.; Vasylkivska, V.

    2016-12-01

    Big data computing and analytical techniques offer opportunities to improve predictions about subsurface systems while quantifying and characterizing associated uncertainties from these analyses. Spatial analysis, big data and otherwise, of subsurface natural and engineered systems are based on variable resolution, discontinuous, and often point-driven data to represent continuous phenomena. We will present examples from two spatio-temporal methods that have been adapted for use with big datasets and big data geo-processing capabilities. The first approach uses regional earthquake data to evaluate spatio-temporal trends associated with natural and induced seismicity. The second algorithm, the Variable Grid Method (VGM), is a flexible approach that presents spatial trends and patterns, such as those resulting from interpolation methods, while simultaneously visualizing and quantifying uncertainty in the underlying spatial datasets. In this presentation we will show how we are utilizing Hadoop to store and perform spatial analyses to efficiently consume and utilize large geospatial data in these custom analytical algorithms through the development of custom Spark and MapReduce applications that incorporate ESRI Hadoop libraries. The team will present custom "Big Data" geospatial applications that run on the Hadoop cluster and integrate with ESRI ArcMap with the team's probabilistic VGM approach. The VGM-Hadoop tool has been specially built as a multi-step MapReduce application running on the Hadoop cluster for the purpose of data reduction. This reduction is accomplished by generating multi-resolution, non-overlapping, attributed topology that is then further processed using ESRI's geostatistical analyst to convey a probabilistic model of a chosen study region. Finally, we will share our approach for implementation of data reduction and topology generation via custom multi-step Hadoop applications, performance benchmarking comparisons, and Hadoop
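
    As a rough illustration of the kind of grid-based data reduction described here, the PySpark sketch below aggregates point observations into grid cells; the schema, cell size, and statistics are assumptions for illustration, not the VGM-Hadoop tool itself.

        # Simplified sketch of grid-based reduction of big point data (not the actual VGM).
        from pyspark.sql import SparkSession
        import pyspark.sql.functions as F

        spark = SparkSession.builder.appName("grid-reduction").getOrCreate()
        points = spark.createDataFrame(
            [(-80.1, 40.2, 3.1), (-80.4, 40.3, 2.7), (-79.9, 40.9, 5.0)],
            ["lon", "lat", "value"],                      # assumed schema for observations
        )

        cell = 0.5                                        # coarse cell size in degrees
        reduced = (
            points
            .withColumn("gx", F.floor(F.col("lon") / cell))   # grid index per point
            .withColumn("gy", F.floor(F.col("lat") / cell))
            .groupBy("gx", "gy")
            .agg(F.mean("value").alias("mean_value"),
                 F.count("*").alias("n"),                 # n conveys confidence in the cell estimate
                 F.stddev("value").alias("spread"))       # spread conveys uncertainty
        )
        reduced.show()

    A variable-resolution scheme would additionally merge sparse neighboring cells into coarser ones, which is the data-reduction step the abstract attributes to the multi-step MapReduce application.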

  14. Big data, big knowledge: big data for personalized healthcare.

    Science.gov (United States)

    Viceconti, Marco; Hunter, Peter; Hose, Rod

    2015-07-01

    The idea that the purely phenomenological knowledge that we can extract by analyzing large amounts of data can be useful in healthcare seems to contradict the desire of VPH researchers to build detailed mechanistic models for individual patients. But in practice no model is ever entirely phenomenological or entirely mechanistic. We propose in this position paper that big data analytics can be successfully combined with VPH technologies to produce robust and effective in silico medicine solutions. In order to do this, big data technologies must be further developed to cope with some specific requirements that emerge from this application. Such requirements are: working with sensitive data; analytics of complex and heterogeneous data spaces, including nontextual information; distributed data management under security and performance constraints; specialized analytics to integrate bioinformatics and systems biology information with clinical observations at tissue, organ and organism scales; and specialized analytics to define the "physiological envelope" during the daily life of each patient. These domain-specific requirements suggest a need for targeted funding, in which big data technologies for in silico medicine becomes the research priority.

  15. Big data analytics turning big data into big money

    CERN Document Server

    Ohlhorst, Frank J

    2012-01-01

    Unique insights to implement big data analytics and reap big returns to your bottom line Focusing on the business and financial value of big data analytics, respected technology journalist Frank J. Ohlhorst shares his insights on the newly emerging field of big data analytics in Big Data Analytics. This breakthrough book demonstrates the importance of analytics, defines the processes, highlights the tangible and intangible values and discusses how you can turn a business liability into actionable material that can be used to redefine markets, improve profits and identify new business opportuni

  16. Application of the hybrid Big Bang-Big Crunch algorithm to optimal reconfiguration and distributed generation power allocation in distribution systems

    International Nuclear Information System (INIS)

    Sedighizadeh, Mostafa; Esmaili, Masoud; Esmaeili, Mobin

    2014-01-01

    In this paper, a multi-objective framework is proposed for simultaneous optimal network reconfiguration and DG (distributed generation) power allocation. The proposed method encompasses objective functions of power losses, voltage stability, DG cost, and greenhouse gas emissions, and it is optimized subject to power system operational and technical constraints. In order to solve the optimization problem, the HBB-BC (Hybrid Big Bang-Big Crunch) algorithm, one of the most recent heuristic tools, is modified and employed here by introducing a mutation operator to enhance its exploration capability. To resolve the scaling problem of differently-scaled objective functions, a fuzzy membership is used to bring them onto the same scale, and then the fuzzy fitness of the final objective function is utilized to measure the satisfaction level of the obtained solution. The proposed method is tested on balanced and unbalanced test systems and its results are comprehensively compared with previous methods considering different scenarios. According to the results, the proposed method not only offers an enhanced exploration capability but also has a better convergence rate compared with previous methods. In addition, simultaneous network reconfiguration and DG power allocation leads to a more optimal result than performing reconfiguration and DG power allocation separately. - Highlights: • Hybrid Big Bang-Big Crunch algorithm is applied to the network reconfiguration problem. • Joint reconfiguration and DG power allocation leads to a more optimal solution. • A mutation operator is used to improve the exploration capability of the HBB-BC method. • The HBB-BC has a better convergence rate than the compared algorithms
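
    For readers unfamiliar with Big Bang-Big Crunch optimization, the Python sketch below implements its basic loop (a random "explosion" of candidates around a center of mass that contracts over iterations), with a mutation step in the spirit of the paper's modification; the objective function and all parameters are illustrative assumptions, not the paper's power-system formulation.

        import numpy as np

        def sphere(x):                    # illustrative objective (not the paper's)
            return float(np.sum(x**2))

        rng = np.random.default_rng(1)
        dim, pop, iters = 5, 30, 100
        center = rng.uniform(-5, 5, dim)  # initial "center of mass"

        for k in range(1, iters + 1):
            # Big Bang: scatter candidates around the center; radius shrinks as 1/k.
            candidates = center + rng.normal(size=(pop, dim)) * 5.0 / k
            # Mutation (the HBB-BC-style tweak): perturb a few candidates at random
            # to preserve exploration late in the run.
            mutate = rng.random(pop) < 0.1
            candidates[mutate] += rng.uniform(-5, 5, (mutate.sum(), dim))
            # Big Crunch: collapse to the fitness-weighted center of mass.
            fitness = np.array([sphere(c) for c in candidates])
            weights = 1.0 / (fitness + 1e-12)
            center = (weights[:, None] * candidates).sum(axis=0) / weights.sum()

        print("best point:", np.round(center, 3), "objective:", round(sphere(center), 6))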

  17. Identifying Dwarfs Workloads in Big Data Analytics

    OpenAIRE

    Gao, Wanling; Luo, Chunjie; Zhan, Jianfeng; Ye, Hainan; He, Xiwen; Wang, Lei; Zhu, Yuqing; Tian, Xinhui

    2015-01-01

    Big data benchmarking is particularly important and provides applicable yardsticks for evaluating booming big data systems. However, wide coverage and great complexity of big data computing impose big challenges on big data benchmarking. How can we construct a benchmark suite using a minimum set of units of computation to represent diversity of big data analytics workloads? Big data dwarfs are abstractions of extracting frequently appearing operations in big data computing. One dwarf represen...

  18. Kazakhstan's Environment-Health system, a Big Data challenge

    Science.gov (United States)

    Vitolo, Claudia; Gazdiyeva, Bella; Tucker, Allan; Russell, Andrew; Ali, Maged; Althonayan, Abraham

    2016-04-01

    Kazakhstan has witnessed remarkable economic development in the past 15 years, becoming an upper-middle-income country. However, it is still widely regarded as a developing nation, partially because of its population's low life expectancy, which is 5 years below the average in similar economies. The environment is in a rather fragile state, affected by soil, water and air pollution, radioactive contamination and climate change. However, Kazakhstan's government is moving towards clean energy and environmental protection and calling on scientists to help prioritise investments. The British Council-funded "Kazakhstan's Environment-Health Risk Analysis (KEHRA)" project is one of the recently launched initiatives to support a healthier future for Kazakhstan. The underlying hypothesis of this research is that the above-mentioned factors (air/water/soil pollution, etc.) affecting public health almost certainly do not act independently but rather trigger and exacerbate each other. Exploring the environment-health links in a multi-dimensional framework is a typical Big Data problem, in which the volume and variety of the data needed pose technical as well as scientific challenges. In Kazakhstan, the complexities related to managing and analysing Big Data are worsened by a number of obstacles at the data acquisition step: most of the data are not in digital form, spatial and temporal attributes are often ambiguous, and the re-use and re-purposing of the information is subject to restrictive licenses and other mechanisms of control. In this work, we document the first steps taken towards building an understanding of the complex environment-health system in Kazakhstan, using interactive visualisation tools to identify and compare hot-spots of pollution and poor health outcomes, and Big Data and web technologies to collect, manage and explore available information. In the future, the knowledge acquired will be modelled to develop evidence-based recommendation systems for decision makers in

  19. Traffic Congestion Detection System through Connected Vehicles and Big Data

    OpenAIRE

    Néstor Cárdenas-Benítez; Raúl Aquino-Santos; Pedro Magaña-Espinoza; José Aguilar-Velazco; Arthur Edwards-Block; Aldo Medina Cass

    2016-01-01

    This article discusses the simulation and evaluation of a traffic congestion detection system which combines inter-vehicular communications, fixed roadside infrastructure and infrastructure-to-infrastructure connectivity and big data. The system discussed in this article permits drivers to identify traffic congestion and change their routes accordingly, thus reducing the total emissions of CO2 and decreasing travel time. This system monitors, processes and stores large amounts of data, which ...

  20. Big Opportunities and Big Concerns of Big Data in Education

    Science.gov (United States)

    Wang, Yinying

    2016-01-01

    Against the backdrop of the ever-increasing influx of big data, this article examines the opportunities and concerns over big data in education. Specifically, this article first introduces big data, followed by delineating the potential opportunities of using big data in education in two areas: learning analytics and educational policy. Then, the…

  1. Deductive systems for BigData integration

    Directory of Open Access Journals (Sweden)

    Radu BUCEA-MANEA-TONIS

    2018-03-01

    Globalization is associated with increased volumes of data to be processed from e-commerce transactions. Specialists are looking at different solutions, such as Big Data, Hadoop and data warehouses, but it seems that the future lies in predicate logic implemented through deductive database technology. A shift has to be made from imperative languages to declarative languages for application development. Deductive databases are very useful in student teaching programs, too. Thus, the article makes a consistent literature review in the field and shows practical examples of using predicate logic in deductive systems, in order to integrate different kinds of data types.
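
    To make the deductive-database idea concrete, the Python sketch below evaluates a classic Datalog-style transitive-closure rule by naive bottom-up fixpoint iteration; the facts and the rule are invented for illustration rather than taken from the article.

        # Naive bottom-up evaluation of: reachable(X, Y) :- edge(X, Y).
        #                                reachable(X, Y) :- edge(X, Z), reachable(Z, Y).
        edges = {("a", "b"), ("b", "c"), ("c", "d")}   # base facts (illustrative)

        reachable = set(edges)
        while True:
            derived = {(x, y) for (x, z) in edges for (z2, y) in reachable if z == z2}
            if derived <= reachable:                   # fixpoint: nothing new derivable
                break
            reachable |= derived

        print(sorted(reachable))
        # [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]

    Production deductive databases evaluate such rules with semi-naive strategies and indexes, but the declarative style, stating what follows from the facts rather than how to compute it, is exactly the shift the article argues for.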

  2. Comparative growth models of big-scale sand smelt (Atherina boyeri Risso, 1810) sampled from Hirfanlı Dam Lake, Kırşehir, Ankara, Turkey

    Directory of Open Access Journals (Sweden)

    S. Benzer

    2017-06-01

    In this publication, the growth characteristics of big-scale sand smelt were compared within artificial neural network and length-weight relationship models of population dynamics. This study aims to determine the optimal growth model of big-scale sand smelt using artificial neural networks and length-weight relationship models at Hirfanlı Dam Lake, Kırşehir, Turkey. A total of 1449 samples were collected from Hirfanlı Dam Lake between May 2015 and May 2016. The results of both models were compared with each other and evaluated using MAPE (mean absolute percentage error), MSE (mean squared error) and r² (correlation coefficient) as performance criteria. The results of the current study show that artificial neural networks are a superior estimation tool compared to length-weight relationship models of big-scale sand smelt in Hirfanlı Dam Lake.
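
    The conventional length-weight relationship referenced here is W = aL^b, usually fitted by log-log regression; the sketch below fits it and computes the MAPE and r² criteria on made-up measurements, to hint at the comparison the authors perform (their actual data and ANN model are not reproduced).

        import numpy as np

        # Hypothetical length (cm) and weight (g) measurements, not the study's data.
        L = np.array([6.1, 7.0, 7.8, 8.5, 9.2, 10.1])
        W = np.array([1.9, 2.9, 4.1, 5.2, 6.8, 9.0])

        # Fit W = a * L**b via linear regression in log space: log W = log a + b log L.
        b, log_a = np.polyfit(np.log(L), np.log(W), 1)
        a = np.exp(log_a)
        W_hat = a * L**b

        mape = np.mean(np.abs((W - W_hat) / W)) * 100
        r2 = np.corrcoef(W, W_hat)[0, 1] ** 2
        print(f"a={a:.4f}, b={b:.2f}, MAPE={mape:.1f}%, r^2={r2:.3f}")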

  3. Challenges of Big Data Analysis.

    Science.gov (United States)

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features impact paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
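
    One of the challenges named above, spurious correlation, is easy to demonstrate: with many independent noise variables and a modest sample, some variable will correlate strongly with the response purely by chance, as in this Python sketch (all quantities invented for the demonstration).

        import numpy as np

        rng = np.random.default_rng(42)
        n, p = 50, 5000                        # small sample, high dimension
        X = rng.normal(size=(n, p))            # p independent noise predictors
        y = rng.normal(size=n)                 # response unrelated to every predictor

        # Correlation of y with each column; the maximum looks impressive by chance.
        Xc = (X - X.mean(0)) / X.std(0)
        yc = (y - y.mean()) / y.std()
        cors = Xc.T @ yc / n
        print(f"max |correlation| among {p} pure-noise variables: {np.abs(cors).max():.2f}")
        # Typically around 0.5 here, despite zero true association.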

  4. Limited sampling hampers "big data" estimation of species richness in a tropical biodiversity hotspot.

    Science.gov (United States)

    Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian

    2015-02-01

    Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort, and only rarefaction was able to remove this effect; we therefore recommend this method for estimation of species richness with "big data" collections.
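
    Among the estimators listed, the Chao1 estimator corrects observed richness using the counts of singletons and doubletons, S_Chao1 = S_obs + f1²/(2·f2); the sketch below computes it for an invented abundance sample (not the Ecuador data).

        # Hypothetical specimen counts per species (not the study's data).
        abundances = [1, 1, 1, 2, 2, 3, 5, 8, 13, 21]

        s_obs = len(abundances)
        f1 = sum(1 for a in abundances if a == 1)   # singletons
        f2 = sum(1 for a in abundances if a == 2)   # doubletons

        # Chao1: observed richness plus a correction for undetected rare species;
        # the bias-corrected form is used when no doubletons are present.
        s_chao1 = s_obs + (f1 * f1) / (2 * f2) if f2 > 0 else s_obs + f1 * (f1 - 1) / 2
        print(f"S_obs = {s_obs}, Chao1 estimate = {s_chao1:.1f}")   # 10 and about 12.2

    The intuition is that many singletons relative to doubletons signal that sampling effort was too low to detect all species, which is exactly the bias the abstract describes.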

  5. "Big data" in economic history.

    Science.gov (United States)

    Gutmann, Myron P; Merchant, Emily Klancher; Roberts, Evan

    2018-03-01

    Big data is an exciting prospect for the field of economic history, which has long depended on the acquisition, keying, and cleaning of scarce numerical information about the past. This article examines two areas in which economic historians are already using big data - population and environment - discussing ways in which increased frequency of observation, denser samples, and smaller geographic units allow us to analyze the past with greater precision and often to track individuals, places, and phenomena across time. We also explore promising new sources of big data: organically created economic data, high resolution images, and textual corpora.

  6. Influence of big power motors for irrigation on electric systems

    International Nuclear Information System (INIS)

    Shimoda, M.; Gialuca, V.; Trombetta, O.R.

    1988-01-01

    The evolution of rural electrification in CPFL - Companhia Paulista de Forca e Luz, Sao Paulo State, Brazil, and the influence of installing big power motors for irrigation on the electric system are shown. Considerations about the rural market, energy consumption, planning of distribution and transmission lines, and some calculations are also presented. (author)

  7. Big Data as a Driver for Clinical Decision Support Systems: A Learning Health Systems Perspective

    Directory of Open Access Journals (Sweden)

    Arianna Dagliati

    2018-05-01

    Big data technologies are nowadays providing health care with powerful instruments to gather and analyze large volumes of heterogeneous data collected for different purposes, including clinical care, administration, and research. This makes it possible to design IT infrastructures that favor the implementation of the so-called "Learning Healthcare System Cycle," where healthcare practice and research are part of a unique and synergic process. In this paper we highlight how "Big Data enabled" integrated data collections may support clinical decision-making together with biomedical research. Two effective implementations are reported, concerning decision support in Diabetes and in Inherited Arrhythmogenic Diseases.

  8. An Open Framework for Dynamic Big-data-driven Application Systems (DBDDAS) Development

    KAUST Repository

    Douglas, Craig

    2014-01-01

    In this paper, we outline key features that dynamic data-driven application systems (DDDAS) have. A DDDAS is an application that has data assimilation that can change the models and/or scales of the computation, and in which the application controls the data collection based on the computational results. The term Big Data (BD) has come into being in recent years and is highly applicable to most DDDAS, since most applications use networks of sensors that generate an overwhelming amount of data in the lifespan of the application runs. We describe what a dynamic big-data-driven application system (DBDDAS) toolkit must have in order to provide all of the essential building blocks that are necessary to easily create new DDDAS without re-inventing the building blocks.

  10. Big data need big theory too.

    Science.gov (United States)

    Coveney, Peter V; Dougherty, Edward R; Highfield, Roger R

    2016-11-13

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut to reveal correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches with particular focus on biology and medicine, which fail to provide conceptual accounts for the processes to which they are applied. No matter their 'depth' and the sophistication of data-driven methods, such as artificial neural nets, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote 'blind' big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare.This article is part of the themed issue 'Multiscale modelling at the physics-chemistry-biology interface'. © 2015 The Authors.

  11. Minsky on "Big Government"

    Directory of Open Access Journals (Sweden)

    Daniel de Santana Vasconcelos

    2014-03-01

    This paper's objective is to assess, in light of Minsky's main works, his view and analysis of what he called "Big Government": the huge institution which, in parallel with the "Big Bank," was capable of ensuring stability in the capitalist system and regulating its inherently unstable financial system in the mid-20th century. In this work, we analyze how Minsky proposes an active role for the government in a complex economic system flawed by financial instability.

  12. Water quality and trend analysis of Colorado--Big Thompson system reservoirs and related conveyances, 1969 through 2000

    Science.gov (United States)

    Stevens, Michael R.

    2003-01-01

    The U.S. Geological Survey, in an ongoing cooperative monitoring program with the Northern Colorado Water Conservancy District, Bureau of Reclamation, and City of Fort Collins, has collected water-quality data in north-central Colorado since 1969 in reservoirs and conveyances, such as canals and tunnels, related to the Colorado?Big Thompson Project, a water-storage, collection, and distribution system. Ongoing changes in water use among agricultural and municipal users on the eastern slope of the Rocky Mountains in Colorado, changing land use in reservoir watersheds, and other water-quality issues among Northern Colorado Water Conservancy District customers necessitated a reexamination of water-quality trends in the Colorado?Big Thompson system reservoirs and related conveyances. The sampling sites are on reservoirs, canals, and tunnels in the headwaters of the Colorado River (on the western side of the transcontinental diversion operations) and the headwaters of the Big Thompson River (on the eastern side of the transcontinental diversion operations). Carter Lake Reservoir and Horsetooth Reservoir are off-channel water-storage facilities, located in the foothills of the northern Colorado Front Range, for water supplied from the Colorado?Big Thompson Project. The length of water-quality record ranges from approximately 3 to 30 years depending on the site and the type of measurement or constituent. Changes in sampling frequency, analytical methods, and minimum reporting limits have occurred repeatedly over the period of record. The objective of this report was to complete a retrospective water-quality and trend analysis of reservoir profiles, nutrients, major ions, selected trace elements, chlorophyll-a, and hypolimnetic oxygen data from 1969 through 2000 in Lake Granby, Shadow Mountain Lake, and the Granby Pump Canal in Grand County, Colorado, and Horsetooth Reservoir, Carter Lake, Lake Estes, Alva B. Adams Tunnel, and Olympus Tunnel in Larimer County, Colorado

  13. Consideration of the direction for improving RI-biomics information system for using big data in radiation field

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Seung Hyun; Kim, Joo Yeon; Park, Tai Jin [Korean Association for Radiation Application, Seoul (Korea, Republic of); Lim, Young Khi [Dept. of Radiological Science, Gachon University, Incheon (Korea, Republic of)

    2017-03-15

    RI-Biomics is a fusion technology in radiation fields for evaluating in-vivo dynamics such as absorption, distribution, metabolism and excretion (RI-ADME) of new drugs and materials using radioisotopes, and for quantitative evaluation of their efficacy. RI-Biomics information is provided by RIBio-Info, an information system developed for distributing this information, and in this study three requirements for improving the RIBio-Info system have been derived through reviewing recent big data trends. The three requirements are defined as resource, technology and manpower, and some considerations for applying big data in the RIBio-Info system are suggested. First, applicable external big data have to be obtained; second, infrastructures for applying big data have to be expanded; and finally, data scientists able to analyze large-scale information have to be trained. In this way, an original technology for analyzing atypical and large-scale data can be created, and this technology can contribute to establishing a basis for creating new value in the RI-Biomics field.

  14. Consideration of the direction for improving RI-biomics information system for using big data in radiation field

    International Nuclear Information System (INIS)

    Lee, Seung Hyun; Kim, Joo Yeon; Park, Tai Jin; Lim, Young Khi

    2017-01-01

    RI-Biomics is a fusion technology in radiation fields for evaluating in-vivo dynamics such as absorption, distribution, metabolism and excretion (RI-ADME) of new drugs and materials using radioisotopes, and for quantitative evaluation of their efficacy. RI-Biomics information is provided by RIBio-Info, an information system developed for distributing this information, and in this study three requirements for improving the RIBio-Info system have been derived through reviewing recent big data trends. The three requirements are defined as resource, technology and manpower, and some considerations for applying big data in the RIBio-Info system are suggested. First, applicable external big data have to be obtained; second, infrastructures for applying big data have to be expanded; and finally, data scientists able to analyze large-scale information have to be trained. In this way, an original technology for analyzing atypical and large-scale data can be created, and this technology can contribute to establishing a basis for creating new value in the RI-Biomics field.

  15. Efficient data redistribution to speedup big data analytics in large systems

    NARCIS (Netherlands)

    Cheng, L.; Li, T.

    2017-01-01

    The performance of parallel data analytics systems becomes increasingly important with the rise of Big Data. An essential operation in such an environment is the parallel join, which always incurs significant network communication cost. State-of-the-art approaches have achieved performance improvements
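
    Parallel joins of the kind discussed here typically hash-partition both inputs by key so that matching rows land on the same worker; the Python sketch below mimics that redistribution step with workers simulated as lists, as an illustration rather than the paper's method.

        # Toy hash-partitioned equi-join: redistribute rows so that equal keys
        # meet on the same (simulated) worker, then join locally.
        workers = 4
        R = [(1, "r1"), (2, "r2"), (5, "r5")]          # (key, payload) relations
        S = [(2, "s2"), (5, "s5"), (7, "s7")]

        parts_R = [[] for _ in range(workers)]
        parts_S = [[] for _ in range(workers)]
        for k, v in R:
            parts_R[hash(k) % workers].append((k, v))  # network "shuffle" stand-in
        for k, v in S:
            parts_S[hash(k) % workers].append((k, v))

        result = []
        for w in range(workers):                       # each worker joins its own partition
            local = {}
            for k, v in parts_R[w]:
                local.setdefault(k, []).append(v)
            for k, v in parts_S[w]:
                result += [(k, r, v) for r in local.get(k, [])]

        print(sorted(result))                          # [(2, 'r2', 's2'), (5, 'r5', 's5')]

    The shuffle step is where the communication cost the abstract mentions arises, which is why redistribution strategies matter.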

  16. Big Data in Medicine is Driving Big Changes

    Science.gov (United States)

    Verspoor, K.

    2014-01-01

    Summary Objectives To summarise current research that takes advantage of “Big Data” in health and biomedical informatics applications. Methods Survey of trends in this work, and exploration of literature describing how large-scale structured and unstructured data sources are being used to support applications from clinical decision making and health policy, to drug design and pharmacovigilance, and further to systems biology and genetics. Results The survey highlights ongoing development of powerful new methods for turning that large-scale, and often complex, data into information that provides new insights into human health, in a range of different areas. Consideration of this body of work identifies several important paradigm shifts that are facilitated by Big Data resources and methods: in clinical and translational research, from hypothesis-driven research to data-driven research, and in medicine, from evidence-based practice to practice-based evidence. Conclusions The increasing scale and availability of large quantities of health data require strategies for data management, data linkage, and data integration beyond the limits of many existing information systems, and substantial effort is underway to meet those needs. As our ability to make sense of that data improves, the value of the data will continue to increase. Health systems, genetics and genomics, population and public health; all areas of biomedicine stand to benefit from Big Data and the associated technologies. PMID:25123716

  17. Big Data for Business Ecosystem Players

    Directory of Open Access Journals (Sweden)

    Perko Igor

    2016-06-01

    In this research, some of the most promising Big Data usage domains are connected with the distinguished player groups found in the business ecosystem. Literature analysis is used to identify the state of the art of Big Data-related research in the major domains of its use, namely individual marketing, health treatment, work opportunities, financial services, and security enforcement. System theory was used to identify the business ecosystem's major player types disrupted by Big Data: individuals, small and mid-sized enterprises, large organizations, information providers, and regulators. Relationships between the domains and players were explained through new Big Data opportunities and threats and by players' responsive strategies. System dynamics was used to visualize the relationships in the provided model.

  18. Big data in fashion industry

    Science.gov (United States)

    Jain, S.; Bruniaux, J.; Zeng, X.; Bruniaux, P.

    2017-10-01

    Significant work has been done in the field of big data in the last decade. The concept of big data involves analysing voluminous data to extract valuable information. In the fashion world, big data is increasingly playing a part in trend forecasting and in analysing consumer behaviour, preferences and emotions. The purpose of this paper is to introduce the term fashion data and explain why it can be considered big data. It also gives a broad classification of the types of fashion data and briefly defines them. The methodology and working of a system that will use this data are also briefly described.

  19. Soft computing in big data processing

    CERN Document Server

    Park, Seung-Jong; Lee, Jee-Hyong

    2014-01-01

    Big data is an essential key to building a smart world, in the sense of the streaming, continuous integration of large-volume, high-velocity data from all sources to final destinations. Big data ranges over data mining, data analysis and decision making, drawing statistical rules and mathematical patterns through systematic or automatic reasoning. Big data helps serve our life better, clarify our future and deliver greater value. We can discover how to capture and analyze data. Readers will be guided through processing-system integrity and implementing intelligent systems. With intelligent systems, we deal with the fundamental data management and visualization challenges in effective management of dynamic and large-scale data, and efficient processing of real-time and spatio-temporal data. Advanced intelligent systems have led to managing data monitoring, data processing and decision-making in a realistic and effective way. Considering the big size of data, variety of data and frequent chan...

  20. ATLAS BigPanDA Monitoring and Its Evolution

    CERN Document Server

    Wenaus, Torre; The ATLAS collaboration; Korchuganova, Tatiana

    2016-01-01

    BigPanDA is the latest generation of the monitoring system for the Production and Distributed Analysis (PanDA) system. The BigPanDA monitor is a core component of PanDA and also serves the monitoring needs of the new ATLAS Production System Prodsys-2. BigPanDA has been developed to serve the growing computation needs of the ATLAS Experiment and the wider applications of PanDA beyond ATLAS. Through a system-wide job database, the BigPanDA monitor provides a comprehensive and coherent view of the tasks and jobs executed by the system, from high level summaries to detailed drill-down job diagnostics. The system has been in production and has remained in continuous development since mid 2014, today effectively managing more than 2 million jobs per day distributed over 150 computing centers worldwide. BigPanDA also delivers web-based analytics and system state views to groups of users including distributed computing systems operators, shifters, physicist end-users, computing managers and accounting services. Provi...

  1. Big Data Analytic, Big Step for Patient Management and Care in Puerto Rico.

    Science.gov (United States)

    Borrero, Ernesto E

    2018-01-01

    This letter provides an overview of the application of big data in the health care system to improve quality of care, including predictive modelling for risk and resource use, precision medicine and clinical decision support, quality of care and performance measurement, and public health and research applications, among others. The author delineates the tremendous potential of big data analytics and discusses how it can be successfully implemented in clinical practice, as an important component of a learning health-care system.

  2. Big Data in industry

    Science.gov (United States)

    Latinović, T. S.; Preradović, D. M.; Barz, C. R.; Latinović, M. T.; Petrica, P. P.; Pop-Vadean, A.

    2016-08-01

    The amount of data at the global level has grown exponentially. Along with this phenomenon, we need new units of measure like the exabyte, zettabyte, and yottabyte to express the amount of data. The growth of data creates a situation where the classic systems for the collection, storage, processing, and visualization of data are losing the battle with the large amount, speed, and variety of data that are generated continuously. Much of this data is created by the Internet of Things, IoT (cameras, satellites, cars, GPS navigation, etc.). It is our challenge to come up with new technologies and tools for the management and exploitation of these large amounts of data. Big Data has been a hot topic in recent years in IT circles. Moreover, Big Data is recognized in the business world, and increasingly in public administration. This paper proposes an ontology of big data analytics and examines how to enhance business intelligence through big data analytics as a service by presenting a big data analytics service-oriented architecture. This paper also discusses the interrelationship between business intelligence and big data analytics. The proposed approach in this paper might facilitate the research and development of business analytics, big data analytics, and business intelligence as well as intelligent agents.

  3. Big Data Analytics An Overview

    Directory of Open Access Journals (Sweden)

    Jayshree Dwivedi

    2015-08-01

    Big data are data beyond storage capacity and beyond processing power. The term big data is used for data sets so large or complex that traditional tools cannot handle them. Big data size is a constantly moving target, year by year ranging from a few dozen terabytes to many petabytes; on social networking sites, for example, the amount of data produced by people is growing rapidly every year. Big data is not only data; rather, it has become a complete subject which includes various tools, techniques and frameworks. It reflects the rapid growth and evolution of data, both structured and unstructured. Big data is a set of techniques and technologies that require new forms of integration to uncover large hidden values from large datasets that are diverse, complex and of a massive scale. Such data are difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead massively parallel software running on tens, hundreds or even thousands of servers. A big data environment is used to grab, organize and resolve the various types of data. In this paper we describe applications, problems and tools of big data and give an overview of big data.

  4. A View on Fuzzy Systems for Big Data: Progress and Opportunities

    Directory of Open Access Journals (Sweden)

    Alberto Fernandez

    2016-04-01

    Currently, we are witnessing a growing trend in the study and application of problems in the framework of Big Data. This is mainly due to the great advantages that come from knowledge extraction from high volumes of information. For this reason, we observe a migration of standard Data Mining systems towards a new functional paradigm that allows working with Big Data. By means of the MapReduce model and its different extensions, scalability can be successfully addressed, while maintaining good fault tolerance during the execution of the algorithms. Among the different approaches used in Data Mining, models based on fuzzy systems stand out for many applications. Among their advantages, we must stress the use of a representation close to natural language. Additionally, they use an inference model that allows good adaptation to different scenarios, especially those with a given degree of uncertainty. Despite the success of this type of system, its migration to the Big Data environment in the different learning areas is still at a preliminary stage. In this paper, we will carry out an overview of the main existing proposals on the topic, analyzing the design of these models. Additionally, we will discuss problems related to data distribution and the parallelization of current algorithms, and also their relationship with the fuzzy representation of the information. Finally, we will provide our view on the expectations for the future in this framework, according to the design of methods based on fuzzy sets, as well as the open challenges on the topic.
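
    The MapReduce migration this abstract describes can be illustrated with a toy example: mappers compute fuzzy membership degrees on data partitions and a reducer merges the partial results, here in plain Python (the triangular membership function, the fuzzy set, and the partitioning are assumptions for illustration).

        from functools import reduce

        def triangular(x, a, b, c):
            # Simple triangular fuzzy membership function on [a, c] peaking at b.
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

        def mapper(partition):
            # Emit (label, (membership_sum, count)) pairs for the "medium" fuzzy set.
            out = {}
            for x in partition:
                mu = triangular(x, 20.0, 50.0, 80.0)
                s, n = out.get("medium", (0.0, 0))
                out["medium"] = (s + mu, n + 1)
            return out

        def reducer(acc, part):
            # Merge partial sums coming from every partition.
            for label, (s, n) in part.items():
                s0, n0 = acc.get(label, (0.0, 0))
                acc[label] = (s0 + s, n0 + n)
            return acc

        partitions = [[10, 35, 50], [55, 70, 95]]          # data split across "nodes"
        merged = reduce(reducer, map(mapper, partitions), {})
        s, n = merged["medium"]
        print(f"mean membership in 'medium': {s / n:.2f}")

    The same shape (local fuzzy computation in the map phase, global aggregation in the reduce phase) underlies the distributed fuzzy rule learning the paper surveys.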

  5. How Big Are "Martin's Big Words"? Thinking Big about the Future.

    Science.gov (United States)

    Gardner, Traci

    "Martin's Big Words: The Life of Dr. Martin Luther King, Jr." tells of King's childhood determination to use "big words" through biographical information and quotations. In this lesson, students in grades 3 to 5 explore information on Dr. King to think about his "big" words, then they write about their own…

  6. BigWig and BigBed: enabling browsing of large distributed datasets.

    Science.gov (United States)

    Kent, W J; Zweig, A S; Barber, G; Hinrichs, A S; Karolchik, D

    2010-09-01

    BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols and Linux and UNIX operating systems files, R trees and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets. Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu.
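
    For programmatic access of the kind this record describes, one commonly used option is the pyBigWig library; the sketch below is an illustration with a placeholder file name and assumes the file contains a chromosome named chr1.

        # Sketch: query a BigWig file without reading it whole (requires pyBigWig).
        import pyBigWig

        bw = pyBigWig.open("example.bw")         # placeholder path; remote URLs also work
        print(bw.chroms())                       # chromosome names and lengths
        # Summary statistics over an interval are served from the index/zoom levels,
        # so only the needed blocks are fetched and decompressed.
        print(bw.stats("chr1", 0, 1_000_000, type="mean"))
        print(bw.values("chr1", 0, 10))          # per-base values for a small window
        bw.close()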

  7. Small decisions with big impact on data analytics

    OpenAIRE

    Jana Diesner

    2015-01-01

    Big social data have enabled new opportunities for evaluating the applicability of social science theories that were formulated decades ago and were often based on small- to medium-sized samples. Big Data coupled with powerful computing has the potential to replace the statistical practice of sampling and estimating effects by measuring phenomena based on full populations. Preparing these data for analysis and conducting analytics involves a plethora of decisions, some of which are already em...

  8. Big data bioinformatics.

    Science.gov (United States)

    Greene, Casey S; Tan, Jie; Ung, Matthew; Moore, Jason H; Cheng, Chao

    2014-12-01

    Recent technological advances allow for high throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is leading us to the "big data" era. The availability of big data provides unprecedented opportunities but also raises new challenges for data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including "machine learning" algorithms as well as "unsupervised" and "supervised" examples of each. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming-based solutions, we review webservers that allow users with limited or no programming background to perform these analyses on large data compendia. © 2014 Wiley Periodicals, Inc.
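
    As a pointer to what "supervised" versus "unsupervised" means in practice, the scikit-learn sketch below fits one model of each kind on synthetic "expression" data; the review itself discusses R packages, so this Python version is only an analogous illustration.

        # Minimal supervised vs. unsupervised examples on synthetic data.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (50, 20)), rng.normal(2, 1, (50, 20))])
        y = np.array([0] * 50 + [1] * 50)        # labels, e.g. healthy vs. disease

        # Unsupervised: discover structure without using the labels.
        clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

        # Supervised: learn to predict the labels from the features.
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        print("cluster sizes:", np.bincount(clusters))
        print("training accuracy:", clf.score(X, y))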

  9. Investigating Seed Longevity of Big Sagebrush (Artemisia tridentata)

    Science.gov (United States)

    Wijayratne, Upekala C.; Pyke, David A.

    2009-01-01

    The Intermountain West is dominated by big sagebrush communities (Artemisia tridentata subspecies) that provide habitat and forage for wildlife, prevent erosion, and are economically important to recreation and livestock industries. The two most prominent subspecies of big sagebrush in this region are Wyoming big sagebrush (A. t. ssp. wyomingensis) and mountain big sagebrush (A. t. ssp. vaseyana). Increased understanding of seed bank dynamics will assist with sustainable management and persistence of sagebrush communities. For example, mountain big sagebrush may be subjected to shorter fire return intervals and prescribed fire is a tool used often to rejuvenate stands and reduce tree (Juniperus sp. or Pinus sp.) encroachment into these communities. A persistent seed bank for mountain big sagebrush would be advantageous under these circumstances. Laboratory germination trials indicate that seed dormancy in big sagebrush may be habitat-specific, with collections from colder sites being more dormant. Our objective was to investigate seed longevity of both subspecies by evaluating viability of seeds in the field with a seed retrieval experiment and sampling for seeds in situ. We chose six study sites for each subspecies. These sites were dispersed across eastern Oregon, southern Idaho, northwestern Utah, and eastern Nevada. Ninety-six polyester mesh bags, each containing 100 seeds of a subspecies, were placed at each site during November 2006. Seed bags were placed in three locations: (1) at the soil surface above litter, (2) on the soil surface beneath litter, and (3) 3 cm below the soil surface to determine whether dormancy is affected by continued darkness or environmental conditions. Subsets of seeds were examined in April and November in both 2007 and 2008 to determine seed viability dynamics. Seed bank samples were taken at each site, separated into litter and soil fractions, and assessed for number of germinable seeds in a greenhouse. Community composition data

  10. Potential Solution of a Hardware-Software System V-Cluster for Big Data Analysis

    Science.gov (United States)

    Morra, G.; Tufo, H.; Yuen, D. A.; Brown, J.; Zihao, S.

    2017-12-01

    Today it cannot be denied that the Big Data revolution is taking place and is replacing HPC and numerical simulation as the main driver in society. Outside the immediate scientific arena, the Big Data market encompasses much more than the AGU. There are many sectors in society that Big Data can ably serve, such as government finances, hospitals, tourism, mass media, and, last but not least, scientific and engineering problems. In many countries, education has not kept pace with the demand from students outside computer science to get into Big Data science. Ultimate Vision (UV) in Beijing attempts to address this need in China by focusing part of our energy on education and training outside the immediate university environment. UV plans a strategy to maximize profits from the beginning. Therefore, we will focus on growing markets such as provincial governments, medical sectors, mass media, and education, and will not address issues such as performance for scientific collaboration, such as seismic networks, where the market share and profits are small by comparison. We have developed a software-hardware system, called V-Cluster, built with the latest NVIDIA GPUs and Intel CPUs with ample amounts of RAM (over a couple of terabytes) and local storage. We have put in an internal network with high bandwidth (over 100 Gbit/s), and each node of V-Cluster can run at around 40 Tflops. Our system scales linearly with the number of nodes. Our main strength in data analytics is the use of the graph-computing paradigm for optimizing the transfer rate in collaborative efforts. We focus on training and education with our clients in order to gain experience in learning about new applications. We will present the philosophy of this second generation of our data analytic system, whose costs fall far below those offered elsewhere.

  11. Rule based systems for big data a machine learning approach

    CERN Document Server

    Liu, Han; Cocea, Mihaela

    2016-01-01

    The ideas introduced in this book explore the relationships among rule based systems, machine learning and big data. Rule based systems are seen as a special type of expert systems, which can be built by using expert knowledge or learning from real data. The book focuses on the development and evaluation of rule based systems in terms of accuracy, efficiency and interpretability. In particular, a unified framework for building rule based systems, which consists of the operations of rule generation, rule simplification and rule representation, is presented. Each of these operations is detailed using specific methods or techniques. In addition, this book also presents some ensemble learning frameworks for building ensemble rule based systems.

  12. Changing the personality of a face: Perceived Big Two and Big Five personality factors modeled in real photographs.

    Science.gov (United States)

    Walker, Mirella; Vetter, Thomas

    2016-04-01

    General, spontaneous evaluations of strangers based on their faces have been shown to reflect judgments of these persons' intention and ability to harm. These evaluations can be mapped onto a 2D space defined by the dimensions trustworthiness (intention) and dominance (ability). Here we go beyond general evaluations and focus on more specific personality judgments derived from the Big Two and Big Five personality concepts. In particular, we investigate whether Big Two/Big Five personality judgments can be mapped onto the 2D space defined by the dimensions trustworthiness and dominance. Results indicate that judgments of the Big Two personality dimensions almost perfectly map onto the 2D space. In contrast, at least 3 of the Big Five dimensions (i.e., neuroticism, extraversion, and conscientiousness) go beyond the 2D space, indicating that additional dimensions are necessary to describe more specific face-based personality judgments accurately. Building on this evidence, we model the Big Two/Big Five personality dimensions in real facial photographs. Results from 2 validation studies show that the Big Two/Big Five are perceived reliably across different samples of faces and participants. Moreover, results reveal that participants differentiate reliably between the different Big Two/Big Five dimensions. Importantly, this high level of agreement and differentiation in personality judgments from faces likely creates a subjective reality which may have serious consequences for those being perceived-notably, these consequences ensue because the subjective reality is socially shared, irrespective of the judgments' validity. The methodological approach introduced here might prove useful in various psychological disciplines. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  13. A paper recommender system based on user's profile in big data ...

    African Journals Online (AJOL)

    These systems present personalized proposals to users who seek to find particular kinds of relevant data or their priorities within a big volume of data. A recommender system based on personalization uses the user profile, and in view of the fact that the user profile encompasses information pertaining to the user's priorities; ...

  14. GraphStore: A Distributed Graph Storage System for Big Data Networks

    Science.gov (United States)

    Martha, VenkataSwamy

    2013-01-01

    Networks, such as social networks, are a universal solution for modeling complex problems in real time, especially in the Big Data community. While previous studies have attempted to enhance network processing algorithms, none have paved a path for the development of a persistent storage system. The proposed solution, GraphStore, provides an…

  15. Scaling Big Data Cleansing

    KAUST Repository

    Khayyat, Zuhair

    2017-07-31

    Data cleansing approaches have usually focused on detecting and fixing errors with little attention to big data scaling. This presents a serious impediment since identifying and repairing dirty data often involves processing huge input datasets, handling sophisticated error discovery approaches and managing huge arbitrary errors. With large datasets, error detection becomes overly expensive and complicated especially when considering user-defined functions. Furthermore, a distinctive algorithm is desired to optimize inequality joins in sophisticated error discovery rather than naïvely parallelizing them. Also, when repairing large errors, their skewed distribution may obstruct effective error repairs. In this dissertation, I present solutions to overcome the above three problems in scaling data cleansing. First, I present BigDansing as a general system to tackle efficiency, scalability, and ease-of-use issues in data cleansing for Big Data. It automatically parallelizes the user's code on top of general-purpose distributed platforms. Its programming interface allows users to express data quality rules independently from the requirements of parallel and distributed environments. Without sacrificing their quality, BigDansing also enables parallel execution of serial repair algorithms by exploiting the graph representation of discovered errors. The experimental results show that BigDansing outperforms existing baselines up to more than two orders of magnitude. Although BigDansing scales cleansing jobs, it still lacks the ability to handle sophisticated error discovery requiring inequality joins. Therefore, I developed IEJoin as an algorithm for fast inequality joins. It is based on sorted arrays and space efficient bit-arrays to reduce the problem's search space. By comparing IEJoin against well-known optimizations, I show that it is more scalable, and several orders of magnitude faster. BigDansing depends on vertex-centric graph systems, i.e., Pregel
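
    To give a feel for the inequality joins that IEJoin targets, the Python sketch below joins two tables on a "<" predicate using sorting plus binary search instead of a naive nested loop; it captures only the flavor of the sorted-array idea, not the actual IEJoin algorithm, and the tables are invented.

        import bisect

        # Toy inequality join: all pairs (r, s) with r.value < s.value.
        R = [("r1", 10), ("r2", 40), ("r3", 25)]
        S = [("s1", 30), ("s2", 15), ("s3", 50)]

        s_sorted = sorted(S, key=lambda t: t[1])
        s_values = [v for _, v in s_sorted]

        result = []
        for rid, rv in R:
            # All S tuples strictly greater than rv sit right of this position.
            pos = bisect.bisect_right(s_values, rv)
            result += [(rid, sid) for sid, _ in s_sorted[pos:]]

        print(sorted(result))
        # [('r1', 's1'), ('r1', 's2'), ('r1', 's3'), ('r2', 's3'), ('r3', 's1'), ('r3', 's3')]

    Sorting turns the per-tuple search from a scan of all of S into a binary search over a shrinking range, which is the search-space reduction the abstract attributes to sorted arrays.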

  16. Gossip Management at Universities Using Big Data Warehouse Model Integrated with a Decision Support System

    Directory of Open Access Journals (Sweden)

    Pelin Vardarlier

    2016-01-01

    Full Text Available Big Data has recently been used for many purposes in fields like medicine, marketing and sports, and it has helped improve management decisions. However, for almost every case a unique data warehouse must be built to benefit from the merits of data mining and Big Data; hence, each time we start from scratch to form and build a Big Data warehouse. In this study, we propose a Big Data warehouse and a model for universities to be used for information management, or more specifically gossip management. The overall model is a decision support system that may help university administrations when they are making decisions and also provide them with information on the gossip circulating among students and staff. In the model, unsupervised machine learning algorithms have been employed. A prototype of the proposed system is also presented in the study. User-generated data has been collected from students in order to learn about gossip and students' problems related to school, classes, staff and instructors. The findings and results of the pilot study suggest that social media messages among students may give important clues about happenings at school, and this information may be used for management purposes. The model may be developed and implemented not only by universities but also by other organisations.
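
    The abstract does not name the specific algorithms used; the sketch below shows one plausible realization of such an unsupervised pipeline, clustering student messages with TF-IDF and k-means and surfacing the top terms per cluster as candidate topics. The algorithm choice, all parameters, and the sample messages are illustrative assumptions, not the authors' published configuration.

    ```python
    # One plausible unsupervised pipeline for surfacing message topics:
    # vectorize free-text messages, cluster them, and summarize each
    # cluster by its highest-weighted centroid terms.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    messages = [
        "the cafeteria food got worse this semester",
        "heard the algorithms exam was leaked",
        "registration system is down again",
        "someone said the algorithms exam questions leaked",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(messages)

    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    terms = vectorizer.get_feature_names_out()
    for c in range(km.n_clusters):
        # the highest-weighted terms in a centroid summarize that cluster
        top = km.cluster_centers_[c].argsort()[::-1][:3]
        print(f"cluster {c}:", [terms[t] for t in top])
    ```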

  17. Object oriented software for simulation and reconstruction of big alignment systems

    International Nuclear Information System (INIS)

    Arce, P.

    2003-01-01

    Modern high-energy physics experiments require tracking detectors to provide high precision under difficult working conditions (high magnetic field, gravity loads and temperature gradients). This is why several of them have decided to implement optical alignment systems to monitor the displacement of tracking elements during operation. To simulate and reconstruct optical alignment systems, a general-purpose software package named COCOA has been developed using the object-oriented paradigm and software engineering techniques. Thanks to the great flexibility of its design, COCOA is able to reconstruct any optical system made of a combination of the following objects: laser, x-hair laser, incoherent source--pinhole, lens, mirror, plate splitter, cube splitter, optical square, rhomboid prism, 2D sensor, 1D sensor, distance-meter, tilt-meter, user-defined. COCOA was designed to satisfy the requirements of the CMS alignment system, which has several thousand components. Sparse matrix techniques have been investigated for solving non-linear least-squares fits with such a large number of parameters. The soundness of COCOA has already been demonstrated in the reconstruction of the data of a full simulation of a quarter plane of the CMS muon alignment system, which implied solving a system of 900 equations with 850 unknown parameters. Full simulation of the whole CMS alignment system, with over 30,000 parameters, is quite advanced. The integration of COCOA into the CMS software framework is also in progress

  18. BIG DATA-Related Challenges and Opportunities in Earth System Modeling

    Science.gov (United States)

    Bamzai, A. S.

    2012-12-01

    Knowledge of the Earth's climate has increased immensely in recent decades, both through observational analysis and modeling. BIG DATA-related challenges emerge in our quest for understanding the variability and predictability of the climate and earth system on a range of time scales, as well as in our endeavor to improve predictive capability using state-of-the-science models. To enable further scientific discovery, bottlenecks in current paradigms need to be addressed. An overview of current NSF activities in Earth System Modeling, with a focus on associated data-related challenges and opportunities, will be presented.

  19. A Control Approach for Performance of Big Data Systems

    OpenAIRE

    Berekmeri , Mihaly; Serrano , Damián; Bouchenak , Sara; Marchand , Nicolas; Robu , Bogdan

    2014-01-01

    We are at the dawn of a huge data explosion; companies therefore have fast-growing amounts of data to process. For this purpose Google developed MapReduce, a parallel programming paradigm which is slowly becoming the de facto tool for Big Data analytics. Although its use is already fairly widespread in industry, ensuring performance constraints for such a complex system poses great challenges, and its management requires a high level of expertise. This paper...

  20. Current applications of big data in obstetric anesthesiology.

    Science.gov (United States)

    Klumpner, Thomas T; Bauer, Melissa E; Kheterpal, Sachin

    2017-06-01

    This narrative review aims to highlight several recently published 'big data' studies pertinent to the field of obstetric anesthesiology. Big data has been used to study rare outcomes, to identify trends within the healthcare system, to identify variations in practice patterns, and to highlight potential inequalities in obstetric anesthesia care. Big data studies have helped define the risk of rare complications of obstetric anesthesia, such as the risk of neuraxial hematoma in thrombocytopenic parturients. Also, large national databases have been used to better understand trends in anesthesia-related adverse events during cesarean delivery as well as outline potential racial/ethnic disparities in obstetric anesthesia care. Finally, real-time analysis of patient data across a number of disparate health information systems through the use of sophisticated clinical decision support and surveillance systems is one promising application of big data technology on the labor and delivery unit. 'Big data' research has important implications for obstetric anesthesia care and warrants continued study. Real-time electronic surveillance is a potentially useful application of big data technology on the labor and delivery unit.

  1. Recht voor big data, big data voor recht

    NARCIS (Netherlands)

    Lafarre, Anne

    Big data is a phenomenon that can no longer be ignored in our society. It is past the hype cycle, and the first implementations of big data techniques are being carried out. But what exactly is big data? What do the five V's that are often mentioned in relation to big data entail? As an introduction to

  2. Slaves to Big Data. Or Are We?

    Directory of Open Access Journals (Sweden)

    Mireille Hildebrandt

    2013-10-01

    Full Text Available

    In this contribution, the notion of Big Data is discussed in relation to the monetisation of personal data. The claim of some proponents, as well as adversaries, that Big Data implies that ‘n = all’, meaning that we no longer need to rely on samples because we have all the data, is scrutinised and found to be both overly optimistic and unnecessarily pessimistic. A set of epistemological and ethical issues is presented, focusing on the implications of Big Data for our perception, cognition, fairness, privacy and due process. The article then looks into the idea of user-centric personal data management to investigate to what extent it provides solutions for some of the problems triggered by the Big Data conundrum. Special attention is paid to the core principle of data protection legislation, namely purpose binding. Finally, this contribution seeks to inquire into the influence of Big Data politics on self, mind and society, and asks how we can prevent ourselves from becoming slaves to Big Data.

  3. GEOSS: Addressing Big Data Challenges

    Science.gov (United States)

    Nativi, S.; Craglia, M.; Ochiai, O.

    2014-12-01

    In the sector of Earth Observation, the explosion of data is due to many factors including: new satellite constellations, the increased capabilities of sensor technologies, social media, crowdsourcing, and the need for multidisciplinary and collaborative research to face Global Changes. In this area, there are many expectations and concerns about Big Data. Vendors have attempted to use this term for their commercial purposes. It is necessary to understand whether Big Data is a radical shift or an incremental change for the existing digital infrastructures. This presentation tries to explore and discuss the impact of Big Data challenges and new capabilities on the Global Earth Observation System of Systems (GEOSS) and particularly on its common digital infrastructure called GCI. GEOSS is a global and flexible network of content providers allowing decision makers to access an extraordinary range of data and information at their desk. The impact of the Big Data dimensionalities (commonly known as 'V' axes: volume, variety, velocity, veracity, visualization) on GEOSS is discussed. The main solutions and experimentation developed by GEOSS along these axes are introduced and analyzed. GEOSS is a pioneering framework for global and multidisciplinary data sharing in the Earth Observation realm; its experience on Big Data is valuable for the many lessons learned.

  4. Reframing Open Big Data

    DEFF Research Database (Denmark)

    Marton, Attila; Avital, Michel; Jensen, Tina Blegind

    2013-01-01

    Recent developments in the techniques and technologies of collecting, sharing and analysing data are challenging the field of information systems (IS) research let alone the boundaries of organizations and the established practices of decision-making. Coined ‘open data’ and ‘big data’, these developments introduce an unprecedented level of societal and organizational engagement with the potential of computational data to generate new insights and information. Based on the commonalities shared by open data and big data, we develop a research framework that we refer to as open big data (OBD) by employing the dimensions of ‘order’ and ‘relationality’. We argue that these dimensions offer a viable approach for IS research on open and big data because they address one of the core value propositions of IS; i.e. how to support organizing with computational data. We contrast these dimensions with two...

  5. Study on Big Database Construction and its Application of Sample Data Collected in CHINA'S First National Geographic Conditions Census Based on Remote Sensing Images

    Science.gov (United States)

    Cheng, T.; Zhou, X.; Jia, Y.; Yang, G.; Bai, J.

    2018-04-01

    In the project of China's First National Geographic Conditions Census, millions of data samples have been collected all over the country for interpreting land cover based on remote sensing images; the number of data files exceeds 12,000,000 and has grown in the follow-on project of National Geographic Conditions Monitoring. At this scale, using a database such as Oracle to store the big data is the most effective approach; however, a suitable method for managing and applying the sample data is even more significant. This paper studies a database construction method based on a relational database combined with a distributed file system, in which the vector data and the file data are saved in different physical locations. The key issues and their solutions are discussed. On this basis, the paper studies how the sample data can be applied and analyzes several use cases, which could lay the foundation for applying the sample data. In particular, sample data located in Shaanxi province are selected to verify the method. The paper also takes the 10 first-level classes defined in the land cover classification system as an example and analyzes the spatial distribution and density characteristics of each kind of sample data. The results verify that the database construction method based on a relational database combined with a distributed file system is useful and applicable for searching, analyzing and applying sample data. Furthermore, sample data collected in the project of China's First National Geographic Conditions Census could be useful in earth observation and land cover quality assessment.
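
    To make the storage split concrete, the sketch below keeps sample metadata in a relational table while file payloads go to a separate file system, linked by a stored path. SQLite and a local directory stand in for the production relational database and distributed file system, and the table and column names are hypothetical.

    ```python
    # Minimal sketch of the described storage split: metadata in a
    # relational table, bulky sample files elsewhere, linked by path.
    import pathlib
    import shutil
    import sqlite3

    DFS_ROOT = pathlib.Path("dfs_root")   # stand-in for the distributed FS
    DFS_ROOT.mkdir(exist_ok=True)

    con = sqlite3.connect("samples.db")
    con.execute("""CREATE TABLE IF NOT EXISTS sample (
        id INTEGER PRIMARY KEY,
        land_cover_class TEXT,    -- e.g. one of the 10 first-level classes
        province TEXT,
        file_path TEXT            -- where the payload lives in the DFS
    )""")

    def ingest(sample_id, cls, province, local_file):
        dest = DFS_ROOT / f"{sample_id}{pathlib.Path(local_file).suffix}"
        shutil.copy(local_file, dest)          # payload goes to the DFS
        con.execute("INSERT INTO sample VALUES (?, ?, ?, ?)",
                    (sample_id, cls, province, str(dest)))
        con.commit()

    def files_for(cls, province):
        # queries hit the relational side only; payloads are fetched
        # lazily via the stored path
        rows = con.execute("SELECT file_path FROM sample WHERE "
                           "land_cover_class=? AND province=?",
                           (cls, province))
        return [r[0] for r in rows]
    ```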

  6. Exploring the potential of open big data from ticketing websites to characterize travel patterns within the Chinese high-speed rail system.

    Directory of Open Access Journals (Sweden)

    Sheng Wei

    Full Text Available Big data have contributed to deepening our understanding of many human systems, particularly human mobility patterns and the structure and functioning of transportation systems. Resonating with the recent call for 'open big data,' big data from various sources on a range of scales have become increasingly accessible to the public. However, open big data relevant to travelers within public transit tools remain scarce, hindering further in-depth study of human mobility patterns. Here, we explore ticketing-website-derived data that are publicly available but have been largely neglected. We demonstrate the power, potential and limitations of this open big data, using the Chinese high-speed rail (HSR) system as an example. Using an application programming interface, we automatically collected the data on remaining tickets (RTD) for scheduled trains at the last second before departure in order to retrieve information on unused transit capacity, occupancy rate of trains, and passenger flux at stations. We show that this information is highly useful in characterizing the spatiotemporal patterns of traveling behaviors on the Chinese HSR, such as weekend traveling behavior, imbalanced commuting behavior, and station functionality. Our work facilitates the understanding of human traveling patterns along the Chinese HSR and the functionality of the largest HSR system in the world. We expect our work to attract attention to this unique open big data source for the study of analogous transportation systems.
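
    A minimal sketch of such a collection pass is shown below. The endpoint URL, query parameter, and response fields are hypothetical placeholders; the abstract does not document the actual ticketing-website API.

    ```python
    # Sketch of the collection step the abstract describes: query a
    # ticketing API for remaining-ticket counts shortly before each
    # scheduled train's departure. All names here are placeholders.
    import time
    import requests

    API = "https://example-tickets.example/api/remaining"  # placeholder URL

    def snapshot(train_ids):
        """One pass over the scheduled trains; returns remaining-ticket counts."""
        records = []
        for train in train_ids:
            try:
                resp = requests.get(API, params={"train": train}, timeout=10)
                resp.raise_for_status()
                data = resp.json()   # assumed shape: {"remaining": <int>}
                records.append({"train": train,
                                "remaining": data["remaining"],
                                "ts": time.time()})
            except requests.RequestException:
                continue             # transient failures are skipped
        return records

    # the occupancy rate then follows from seat capacity, e.g.
    # occupancy = 1 - remaining / capacity[train]
    ```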

  7. Thick-Big Descriptions

    DEFF Research Database (Denmark)

    Lai, Signe Sophus

    The paper discusses the rewards and challenges of employing commercial audience measurement data – gathered by media industries for profit-making purposes – in ethnographic research on the Internet in everyday life. It questions claims to the objectivity of big data (Anderson 2008), the assumption ... communication systems, language and behavior appear as texts, outputs, and discourses (data to be ‘found’) – big data then documents things that in earlier research required interviews and observations (data to be ‘made’) (Jensen 2014). However, web-measurement enterprises build audiences according to a commercial logic (boyd & Crawford 2011) and are as such directed by motives that call for specific types of sellable user data and specific segmentation strategies. In combining big data and ‘thick descriptions’ (Geertz 1973), scholars need to question how ethnographic fieldwork might map the ‘data not seen...

  8. 78 FR 56264 - Big Bear Mining Corp., Four Rivers BioEnergy, Inc., Mainland Resources, Inc., QI Systems Inc...

    Science.gov (United States)

    2013-09-12

    ... SECURITIES AND EXCHANGE COMMISSION [File No. 500-1] Big Bear Mining Corp., Four Rivers BioEnergy, Inc., Mainland Resources, Inc., QI Systems Inc., South Texas Oil Co., and Synova Healthcare Group, Inc... that there is a lack of current and accurate information concerning the securities of Big Bear Mining...

  9. Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

    International Nuclear Information System (INIS)

    Klimentov, A; Maeno, T; Nilsson, P; Panitkin, S; Wenaus, T; Buncic, P; De, K; Oleynik, D; Petrosyan, A; Jha, S; Mount, R; Porter, R J; Read, K F; Wells, J C; Vaniachine, A

    2015-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center 'Kurchatov Institute' together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the

  10. Big data analytics methods and applications

    CERN Document Server

    Rao, BLS; Rao, SB

    2016-01-01

    This book has a collection of articles written by Big Data experts to describe some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; benchmarking of systems; and others. In addition to different applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.

  11. ATLAS BigPanDA Monitoring

    CERN Document Server

    Padolski, Siarhei; The ATLAS collaboration; Klimentov, Alexei; Korchuganova, Tatiana

    2017-01-01

    BigPanDA monitoring is a web-based application which provides various processing and representation of the Production and Distributed Analysis (PanDA) system objects' states. Analyzing hundreds of millions of computation entities, such as events or jobs, BigPanDA monitoring builds reports at different scales and levels of abstraction in real time. The information provided allows users to drill down into the reason for a concrete event failure or to observe the bigger picture of the system, such as tracking the performance of nucleus and satellite computing sites or the progress of a whole production campaign. The PanDA system was originally developed for the ATLAS experiment and today effectively manages more than 2 million jobs per day distributed over 170 computing centers worldwide. BigPanDA is its core component, commissioned in the middle of 2014, and is now the primary source of information for ATLAS users about the state of their computations and the source of decision-support information for shifters, operators and managers. In this wor...

  12. ATLAS BigPanDA Monitoring

    CERN Document Server

    Padolski, Siarhei; The ATLAS collaboration

    2017-01-01

    BigPanDA monitoring is a web-based application that provides various processing and representation of the Production and Distributed Analysis (PanDA) system objects' states. Analysing hundreds of millions of computation entities, such as events or jobs, BigPanDA monitoring builds reports at different scales and levels of abstraction in real time. The information provided allows users to drill down into the reason for a concrete event failure or to observe the bigger picture of the system, such as tracking the performance of nucleus and satellite computing sites or the progress of a whole production campaign. The PanDA system was originally developed for the ATLAS experiment and today effectively manages more than 2 million jobs per day distributed over 170 computing centers worldwide. BigPanDA is its core component, commissioned in the middle of 2014, and is now the primary source of information for ATLAS users about the state of their computations and the source of decision-support information for shifters, operators and managers. In this work...

  13. Development of the Lymphoma Enterprise Architecture Database: a caBIG Silver level compliant system.

    Science.gov (United States)

    Huang, Taoying; Shenoy, Pareen J; Sinha, Rajni; Graiser, Michael; Bumpers, Kevin W; Flowers, Christopher R

    2009-04-03

    Lymphomas are the fifth most common cancer in the United States, with numerous histological subtypes. Integrating existing clinical information on lymphoma patients provides a platform for understanding biological variability in presentation and treatment response and aids development of novel therapies. We developed a cancer Biomedical Informatics Grid (caBIG) Silver-level compliant lymphoma database, called the Lymphoma Enterprise Architecture Data-system (LEAD), which integrates the pathology, pharmacy, laboratory, cancer registry, clinical trials, and clinical data from institutional databases. We utilized the Cancer Common Ontological Representation Environment Software Development Kit (caCORE SDK) provided by the National Cancer Institute's Center for Bioinformatics to establish the LEAD platform for data management. The caCORE SDK-generated system utilizes an n-tier architecture with open Application Programming Interfaces, controlled vocabularies, and registered metadata to achieve semantic integration across multiple cancer databases. We demonstrated that the data elements and structures within LEAD could be used to manage clinical research data from phase 1 clinical trials, cohort studies, and registry data from the Surveillance Epidemiology and End Results database. This work provides a clear example of how semantic technologies from caBIG can be applied to support a wide range of clinical and research tasks, and integrate data from disparate systems into a single architecture. This illustrates the central importance of caBIG to the management of clinical and biological data.

  14. Integrating R and Hadoop for Big Data Analysis

    OpenAIRE

    Bogdan Oancea; Raluca Mariana Dragoescu

    2014-01-01

    Analyzing and working with big data could be very difficult using classical means like relational database management systems or desktop software packages for statistics and visualization. Instead, big data requires large clusters with hundreds or even thousands of computing nodes. Official statistics is increasingly considering big data for deriving new statistics because big data sources could produce more relevant and timely statistics than traditional sources. One of the software tools ...

  15. The measurement equivalence of Big Five factor markers for persons with different levels of education.

    Science.gov (United States)

    Rammstedt, Beatrice; Goldberg, Lewis R; Borg, Ingwer

    2010-02-01

    Previous findings suggest that the Big-Five factor structure is not guaranteed in samples with lower educational levels. The present study investigates the Big-Five factor structure in two large samples representative of the German adult population. In both samples, the Big-Five factor structure emerged only in a blurry way at lower educational levels, whereas for highly educated persons it emerged with textbook-like clarity. Because well-educated persons are most comparable to the usual subjects of psychological research, it might be asked if the Big Five are limited to such persons. Our data contradict this conclusion. There are strong individual differences in acquiescence response tendencies among less highly educated persons. After controlling for this bias the Big-Five model holds at all educational levels.

  16. Smart Information Management in Health Big Data.

    Science.gov (United States)

    Muteba A, Eustache

    2017-01-01

    The smart information management system (SIMS) is concerned with the organization of anonymous patient records in big data and their extraction in order to provide needful real-time intelligence. The purpose of the present study is to highlight the design and the implementation of the smart information management system. We emphasize, on the one hand, the organization of big data in flat files to simulate a NoSQL database and, on the other hand, the extraction of information based on a lookup table and a cache mechanism. Within health big data, the SIMS aims at the identification of new therapies and approaches to delivering care.
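
    A minimal sketch of the extraction side described above, assuming (hypothetically) one JSON record per line in the flat file: a lookup table maps record ids to byte offsets, and a cache in front of the disk reads serves hot records. The file name and record fields are illustrative.

    ```python
    # Lookup table plus cache over a flat file simulating a key-value
    # (NoSQL-style) store: one scan builds the id -> byte-offset table,
    # and an LRU cache lets repeated reads skip the disk entirely.
    import json
    from functools import lru_cache

    DATA_FILE = "patients.jsonl"   # anonymized records, one JSON object per line

    def build_lookup(path=DATA_FILE):
        """Scan once, recording the byte offset of every record id."""
        offsets = {}
        with open(path, "rb") as f:
            while True:
                pos = f.tell()
                line = f.readline()
                if not line:
                    break
                offsets[json.loads(line)["id"]] = pos
        return offsets

    LOOKUP = build_lookup()

    @lru_cache(maxsize=4096)       # the cache mechanism: hot ids skip the disk
    def get_record(record_id):
        with open(DATA_FILE, "rb") as f:
            f.seek(LOOKUP[record_id])
            return json.loads(f.readline())
    ```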

  17. A roadmap for big-data research and education

    OpenAIRE

    Schelén, Olov; Elragal, Ahmed; Haddara, Moutaz

    2015-01-01

    The research area known as big data is characterized by the 3 V's: volume, variety, and velocity. Recently, veracity and value have also been associated with big data, which adds up to the 5 V's. Big-data-related information systems (IS) are typically highly distributed and scalable in order to handle the huge datasets in organizations. Data processing in such systems includes creation, retrieval, storage, analysis, presentation, visualization, and any other activity that is typ...

  18. Peran Dimensi Kepribadian Big Five terhadap Psychological Adjustment Pada Mahasiswa Indonesia yang Studi Keluar Negeri

    OpenAIRE

    Adelia, Cindy Inge

    2012-01-01

    This study aims to examine the effect of the Big Five personality dimensions on psychological adjustment among Indonesian sojourners. The instruments used to collect the data are a psychological adjustment scale and the Big Five Inventory. The psychological adjustment scale comprises 33 items; the Big Five Inventory used is a version that had been adapted by a professional translator. A convenience sampling method was used to gather responses from 117 participants. The data obtained are later analyzed us...

  19. How Big Is Too Big?

    Science.gov (United States)

    Cibes, Margaret; Greenwood, James

    2016-01-01

    Media Clips appears in every issue of Mathematics Teacher, offering readers contemporary, authentic applications of quantitative reasoning based on print or electronic media. This issue features "How Big is Too Big?" (Margaret Cibes and James Greenwood) in which students are asked to analyze the data and tables provided and answer a…

  20. Application and Prospect of Big Data in Water Resources

    Science.gov (United States)

    Xi, Danchi; Xu, Xinyi

    2017-04-01

    Because of developed information technology and affordable data storage, we have entered the era of the data explosion. The term "Big Data" and the technology related to it have been created and are commonly applied in many fields. However, academic studies have only recently paid attention to Big Data applications in water resources. As a result, water-resource Big Data technology has not been fully developed. This paper introduces the concept of Big Data and its key technologies, including the Hadoop system and MapReduce. In addition, this paper focuses on the significance of applying big data in water resources and summarizes prior research by others. Most studies in this field only set up a theoretical frame, but we define "Water Big Data" and explain its three-dimensional properties: the time dimension, the spatial dimension and the intelligent dimension. Based on HBase, a classification system for Water Big Data is introduced: hydrology data, ecology data and socio-economic data. Then, after analyzing the challenges in water resources management, a series of solutions using Big Data technologies such as data mining and web crawling are proposed. Finally, the prospect of applying big data in water resources is discussed; it can be predicted that as Big Data technology keeps developing, "3D" (Data-Driven Decision) will be utilized more in water resources management in the future.

  1. WE-H-BRB-00: Big Data in Radiation Oncology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2016-06-15

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data to impact patient quality care and to enhance the potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: to discuss current and future sources of big data for use in radiation oncology research; to optimize our current data collection by adopting new strategies from outside radiation oncology; to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing, NIH/NCI Google Inc.

  2. WE-H-BRB-00: Big Data in Radiation Oncology

    International Nuclear Information System (INIS)

    2016-01-01

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data to impact patient quality care and to enhance the potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: to discuss current and future sources of big data for use in radiation oncology research; to optimize our current data collection by adopting new strategies from outside radiation oncology; to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing, NIH/NCI Google Inc.

  3. Research on the Method of Big Data Collecting, Storing and Analyzing of Tongue Diagnosis System

    Science.gov (United States)

    Chen, Xiaowei; Wu, Qingfeng

    2018-03-01

    This paper analyzes the contents of the clinical data of tongue diagnosis in TCM (Traditional Chinese Medicine) and puts forward a method to collect, store and analyze the clinical data of tongue diagnosis. Under the guidance of the TCM theory of syndrome differentiation and treatment, this method builds on Hadoop, a distributed computing system with strong scalability, and integrates functions for analyzing and converting the big data of clinical tongue diagnosis. At the same time, the consistency, scalability and security of the big data of tongue diagnosis are ensured.

  4. Big-data-based edge biomarkers: study on dynamical drug sensitivity and resistance in individuals.

    Science.gov (United States)

    Zeng, Tao; Zhang, Wanwei; Yu, Xiangtian; Liu, Xiaoping; Li, Meiyi; Chen, Luonan

    2016-07-01

    The big-data-based edge biomarker is a new concept for characterizing disease features based on biomedical big data in a dynamical and network manner, which also provides alternative strategies to indicate disease status in single samples. This article gives a comprehensive review of big-data-based edge biomarkers for complex diseases in an individual patient, which are defined as biomarkers based on network information and high-dimensional data. Specifically, we first introduce the sources and structures of biomedical big data accessible in public for edge biomarker and disease study. We show that biomedical big data are typically 'small-sample size in high-dimension space', i.e. small samples but with high dimensions on features (e.g. omics data) for each individual, in contrast to traditional big data in many other fields characterized as 'large-sample size in low-dimension space', i.e. big samples but with low dimensions on features. Then, we demonstrate the concept, model and algorithm for edge biomarkers and further big-data-based edge biomarkers. Unlike conventional biomarkers, edge biomarkers, e.g. module biomarkers in module network rewiring-analysis, are able to predict the disease state by learning differential associations between molecules rather than differential expressions of molecules during disease progression or treatment in individual patients. In particular, in contrast to using the information of the common molecules or edges (i.e. molecule-pairs) across a population in traditional biomarkers including network and edge biomarkers, big-data-based edge biomarkers are specific for each individual and thus can accurately evaluate the disease state by considering the individual heterogeneity. Therefore, the measurement of big data in a high-dimensional space is required not only in the learning process but also in the diagnosing or predicting process of the tested individual. Finally, we provide a case study on analyzing the temporal expression
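
    The contrast between differential expression and differential association can be illustrated in a few lines. The sketch below scores each gene pair (edge) by how much its correlation changes between two conditions; this is only the underlying intuition, not the module-network rewiring analysis the article reviews, and the simulated data are illustrative.

    ```python
    # Score each gene pair (edge) by the change in its correlation
    # between two conditions, instead of scoring single genes by
    # differential expression.
    import numpy as np

    def edge_scores(expr_a, expr_b):
        """expr_*: samples x genes matrices for the two conditions."""
        corr_a = np.corrcoef(expr_a, rowvar=False)
        corr_b = np.corrcoef(expr_b, rowvar=False)
        return np.abs(corr_a - corr_b)     # genes x genes edge-score matrix

    rng = np.random.default_rng(0)
    healthy = rng.normal(size=(30, 5))     # 30 samples, 5 genes
    disease = healthy.copy()
    # rewire edge (0, 1): gene 1 tracks gene 0 only in the disease group
    disease[:, 1] = disease[:, 0] + 0.1 * rng.normal(size=30)

    scores = edge_scores(healthy, disease)
    i, j = np.unravel_index(np.argmax(np.triu(scores, k=1)), scores.shape)
    print(f"most rewired edge: genes {i} and {j}")   # expected: 0 and 1
    ```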

  5. Big Data in Shipping - Challenges and Opportunities

    OpenAIRE

    Rødseth, Ørnulf Jan; Perera, Lokukaluge Prasad; Mo, Brage

    2016-01-01

    Big Data is getting popular in shipping, where large amounts of information are collected to better understand and improve logistics, emissions, energy consumption and maintenance. Constraints on the use of big data include the cost and quality of on-board sensors and data acquisition systems, satellite communication, data ownership and technical obstacles to the effective collection and use of big data. New protocol standards may simplify the process of collecting and organizing the data, including in...

  6. BIG Data - BIG Gains? Understanding the Link Between Big Data Analytics and Innovation

    OpenAIRE

    Niebel, Thomas; Rasel, Fabienne; Viete, Steffen

    2017-01-01

    This paper analyzes the relationship between firms’ use of big data analytics and their innovative performance in terms of product innovations. Since big data technologies provide new data information practices, they create new decision-making possibilities, which firms can use to realize innovations. Applying German firm-level data, we find suggestive evidence that big data analytics matters for the likelihood of becoming a product innovator as well as for the market success of the firms’ product innovat...

  7. The community-driven BiG CZ software system for integration and analysis of bio- and geoscience data in the critical zone

    Science.gov (United States)

    Aufdenkampe, A. K.; Mayorga, E.; Horsburgh, J. S.; Lehnert, K. A.; Zaslavsky, I.; Valentine, D. W., Jr.; Richard, S. M.; Cheetham, R.; Meyer, F.; Henry, C.; Berg-Cross, G.; Packman, A. I.; Aronson, E. L.

    2014-12-01

    Here we present the prototypes of a new scientific software system designed around the new Observations Data Model version 2.0 (ODM2, https://github.com/UCHIC/ODM2) to substantially enhance integration of biological and Geological (BiG) data for Critical Zone (CZ) science. The CZ science community takes as its charge the effort to integrate theory, models and data from the multitude of disciplines collectively studying processes on the Earth's surface. The central scientific challenge of the CZ science community is to develop a "grand unifying theory" of the critical zone through a theory-model-data fusion approach, for which the key missing need is a cyberinfrastructure for seamless 4D visual exploration of the integrated knowledge (data, model outputs and interpolations) from all the bio and geoscience disciplines relevant to critical zone structure and function, similar to today's ability to easily explore historical satellite imagery and photographs of the earth's surface using Google Earth. This project takes the first "BiG" steps toward answering that need. The overall goal of this project is to co-develop with the CZ science and broader community, including natural resource managers and stakeholders, a web-based integration and visualization environment for joint analysis of cross-scale bio and geoscience processes in the critical zone (BiG CZ), spanning experimental and observational designs. We will: (1) Engage the CZ and broader community to co-develop and deploy the BiG CZ software stack; (2) Develop the BiG CZ Portal web application for intuitive, high-performance map-based discovery, visualization, access and publication of data by scientists, resource managers, educators and the general public; (3) Develop the BiG CZ Toolbox to enable cyber-savvy CZ scientists to access BiG CZ Application Programming Interfaces (APIs); and (4) Develop the BiG CZ Central software stack to bridge data systems developed for multiple critical zone domains into a single

  8. Global fluctuation spectra in big-crunch-big-bang string vacua

    International Nuclear Information System (INIS)

    Craps, Ben; Ovrut, Burt A.

    2004-01-01

    We study big-crunch-big-bang cosmologies that correspond to exact world-sheet superconformal field theories of type II strings. The string theory spacetime contains a big crunch and a big bang cosmology, as well as additional 'whisker' asymptotic and intermediate regions. Within the context of free string theory, we compute, unambiguously, the scalar fluctuation spectrum in all regions of spacetime. Generically, the big crunch fluctuation spectrum is altered while passing through the bounce singularity. The change in the spectrum is characterized by a function Δ, which is momentum and time dependent. We compute Δ explicitly and demonstrate that it arises from the whisker regions. The whiskers are also shown to lead to 'entanglement' entropy in the big bang region. Finally, in the Milne orbifold limit of our superconformal vacua, we show that Δ→1 and, hence, the fluctuation spectrum is unaltered by the big-crunch-big-bang singularity. We comment on, but do not attempt to resolve, subtleties related to gravitational back reaction and light winding modes when interactions are taken into account

  9. Big Book of Windows Hacks

    CERN Document Server

    Gralla, Preston

    2008-01-01

    Bigger, better, and broader in scope, the Big Book of Windows Hacks gives you everything you need to get the most out of your Windows Vista or XP system, including its related applications and the hardware it runs on or connects to. Whether you want to tweak Vista's Aero interface, build customized sidebar gadgets and run them from a USB key, or hack the "unhackable" screensavers, you'll find quick and ingenious ways to bend these recalcitrant operating systems to your will. The Big Book of Windows Hacks focuses on Vista, the new bad boy on Microsoft's block, with hacks and workarounds that

  10. Age differences in personality traits from 10 to 65: Big Five domains and facets in a large cross-sectional sample.

    Science.gov (United States)

    Soto, Christopher J; John, Oliver P; Gosling, Samuel D; Potter, Jeff

    2011-02-01

    Hypotheses about mean-level age differences in the Big Five personality domains, as well as 10 more specific facet traits within those domains, were tested in a very large cross-sectional sample (N = 1,267,218) of children, adolescents, and adults (ages 10-65) assessed over the World Wide Web. The results supported several conclusions. First, late childhood and adolescence were key periods. Across these years, age trends for some traits (a) were especially pronounced, (b) were in a direction different from the corresponding adult trends, or (c) first indicated the presence of gender differences. Second, there were some negative trends in psychosocial maturity from late childhood into adolescence, whereas adult trends were overwhelmingly in the direction of greater maturity and adjustment. Third, the related but distinguishable facet traits within each broad Big Five domain often showed distinct age trends, highlighting the importance of facet-level research for understanding life span age differences in personality. (PsycINFO Database Record (c) 2010 APA, all rights reserved).

  11. Medical big data: promise and challenges.

    Science.gov (United States)

    Lee, Choong Ho; Yoon, Hyung-Jin

    2017-03-01

    The concept of big data, commonly characterized by volume, variety, velocity, and veracity, goes far beyond the data type and includes the aspects of data analysis, such as hypothesis-generating, rather than hypothesis-testing. Big data focuses on temporal stability of the association, rather than on causal relationship and underlying probability distribution assumptions are frequently not required. Medical big data as material to be analyzed has various features that are not only distinct from big data of other disciplines, but also distinct from traditional clinical epidemiology. Big data technology has many areas of application in healthcare, such as predictive modeling and clinical decision support, disease or safety surveillance, public health, and research. Big data analytics frequently exploits analytic methods developed in data mining, including classification, clustering, and regression. Medical big data analyses are complicated by many technical issues, such as missing values, curse of dimensionality, and bias control, and share the inherent limitations of observation study, namely the inability to test causality resulting from residual confounding and reverse causation. Recently, propensity score analysis and instrumental variable analysis have been introduced to overcome these limitations, and they have accomplished a great deal. Many challenges, such as the absence of evidence of practical benefits of big data, methodological issues including legal and ethical issues, and clinical integration and utility issues, must be overcome to realize the promise of medical big data as the fuel of a continuous learning healthcare system that will improve patient outcome and reduce waste in areas including nephrology.
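
    As an illustration of the propensity-score approach the review mentions, the sketch below models the probability of treatment from measured confounders and compares each treated patient with the nearest propensity-matched control. Variable names and the 1-nearest-neighbor matching rule are illustrative choices, not a prescription from the review.

    ```python
    # Minimal propensity-score matching sketch: estimate each patient's
    # probability of treatment from confounders, then pair every treated
    # patient with the control whose propensity score is closest.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def matched_effect(X, treated, outcome):
        """X: confounder matrix; treated: 0/1 array; outcome: outcomes."""
        ps = LogisticRegression(max_iter=1000).fit(X, treated)
        scores = ps.predict_proba(X)[:, 1]          # propensity scores
        t_idx = np.where(treated == 1)[0]
        c_idx = np.where(treated == 0)[0]
        diffs = []
        for t in t_idx:                             # 1-NN matching on the score
            c = c_idx[np.argmin(np.abs(scores[c_idx] - scores[t]))]
            diffs.append(outcome[t] - outcome[c])
        return np.mean(diffs)                       # crude effect estimate
    ```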

  12. Medical big data: promise and challenges

    Directory of Open Access Journals (Sweden)

    Choong Ho Lee

    2017-03-01

    Full Text Available The concept of big data, commonly characterized by volume, variety, velocity, and veracity, goes far beyond the data type and includes the aspects of data analysis, such as hypothesis-generating, rather than hypothesis-testing. Big data focuses on temporal stability of the association, rather than on causal relationship and underlying probability distribution assumptions are frequently not required. Medical big data as material to be analyzed has various features that are not only distinct from big data of other disciplines, but also distinct from traditional clinical epidemiology. Big data technology has many areas of application in healthcare, such as predictive modeling and clinical decision support, disease or safety surveillance, public health, and research. Big data analytics frequently exploits analytic methods developed in data mining, including classification, clustering, and regression. Medical big data analyses are complicated by many technical issues, such as missing values, curse of dimensionality, and bias control, and share the inherent limitations of observation study, namely the inability to test causality resulting from residual confounding and reverse causation. Recently, propensity score analysis and instrumental variable analysis have been introduced to overcome these limitations, and they have accomplished a great deal. Many challenges, such as the absence of evidence of practical benefits of big data, methodological issues including legal and ethical issues, and clinical integration and utility issues, must be overcome to realize the promise of medical big data as the fuel of a continuous learning healthcare system that will improve patient outcome and reduce waste in areas including nephrology.

  13. Big Argumentation?

    Directory of Open Access Journals (Sweden)

    Daniel Faltesek

    2013-08-01

    Full Text Available Big Data is nothing new. Public concern regarding the mass diffusion of data has appeared repeatedly with computing innovations; in the formulation before Big Data, it was most recently referred to as the information explosion. In this essay, I argue that the appeal of Big Data is not a function of computational power, but of a synergistic relationship between aesthetic order and a politics evacuated of meaningful public deliberation. Understanding, and challenging, Big Data requires attention to the aesthetics of data visualization and the ways in which those aesthetics would seem to depoliticize information. The conclusion proposes an alternative argumentative aesthetic as the appropriate response to the depoliticization posed by the popular imaginary of Big Data.

  14. [Big data in official statistics].

    Science.gov (United States)

    Zwick, Markus

    2015-08-01

    The concept of "big data" stands to change the face of official statistics over the coming years, having an impact on almost all aspects of data production. The tasks of future statisticians will not necessarily be to produce new data, but rather to identify and make use of existing data to adequately describe social and economic phenomena. Until big data can be used correctly in official statistics, a lot of questions need to be answered and problems solved: the quality of data, data protection, privacy, and the sustainable availability are some of the more pressing issues to be addressed. The essential skills of official statisticians will undoubtedly change, and this implies a number of challenges to be faced by statistical education systems, in universities, and inside the statistical offices. The national statistical offices of the European Union have concluded a concrete strategy for exploring the possibilities of big data for official statistics, by means of the Big Data Roadmap and Action Plan 1.0. This is an important first step and will have a significant influence on implementing the concept of big data inside the statistical offices of Germany.

  15. Big-hole drilling - the state of the art

    International Nuclear Information System (INIS)

    Lackey, M.D.

    1983-01-01

    The art of big-hole drilling has been in a continual state of evolution at the Nevada Test Site since the start of underground testing in 1961. Emplacement holes for nuclear devices are still being drilled by the rotary-drilling process, but almost all the hardware and systems have undergone many changes during the intervening years. The current design of bits, cutters, and other big-hole-drilling hardware results from contributions of manufacturers and Test Site personnel. The dual-string, air-lift, reverse-circulation system was developed at the Test Site. Necessity was really the Mother of this invention, but this circulation system is worthy of consideration under almost any condition. Drill rigs for big-hole drilling are usually adaptations of large oil-well drill rigs with minor modifications required to handle the big bits and drilling assemblies. Steel remains the favorite shaft lining material, but a lot of thought is being given to concrete linings, especially precast concrete

  16. Revisiting the Big Six and the Big Five among Hong Kong University Students

    Science.gov (United States)

    Zhang, Li-fang

    2008-01-01

    The present study replicated investigation of the link between Holland's six career interest types and Costa and McCrae's big five personality traits in a Chinese context. A sample of 79 university students from Hong Kong evaluated their own abilities and responded to the Short-Version Self-Directed Search (SVSDS) and the NEO Five-Factor…

  17. A Rigorous Test of the Fit of the Circumplex Model to Big Five Personality Data: Theoretical and Methodological Issues and Two Large Sample Empirical Tests.

    Science.gov (United States)

    DeGeest, David Scott; Schmidt, Frank

    2015-01-01

    Our objective was to apply the rigorous test developed by Browne (1992) to determine whether the circumplex model fits Big Five personality data. This test has yet to be applied to personality data. Another objective was to determine whether blended items explained correlations among the Big Five traits. We used two working adult samples, the Eugene-Springfield Community Sample and the Professional Worker Career Experience Survey. Fit to the circumplex was tested via Browne's (1992) procedure. Circumplexes were graphed to identify items with loadings on multiple traits (blended items), and to determine whether removing these items changed five-factor model (FFM) trait intercorrelations. In both samples, the circumplex structure fit the FFM traits well. Each sample had items with dual-factor loadings (8 items in the first sample, 21 in the second). Removing blended items had little effect on construct-level intercorrelations among FFM traits. We conclude that rigorous tests show that the fit of personality data to the circumplex model is good. This finding means the circumplex model is competitive with the factor model in understanding the organization of personality traits. The circumplex structure also provides a theoretically and empirically sound rationale for evaluating intercorrelations among FFM traits. Even after eliminating blended items, FFM personality traits remained correlated.

  18. Main Issues in Big Data Security

    Directory of Open Access Journals (Sweden)

    Julio Moreno

    2016-09-01

    Full Text Available Data is currently one of the most important assets for companies in every field. The continuous growth in the importance and volume of data has created a new problem: it cannot be handled by traditional analysis techniques. This problem was, therefore, solved through the creation of a new paradigm: Big Data. However, Big Data originated new issues related not only to the volume or the variety of the data, but also to data security and privacy. In order to obtain a full perspective of the problem, we decided to carry out an investigation with the objective of highlighting the main issues regarding Big Data security, and also the solutions proposed by the scientific community to solve them. In this paper, we explain the results obtained after applying a systematic mapping study to security in the Big Data ecosystem. It is almost impossible to carry out detailed research into the entire topic of security, and the outcome of this research is, therefore, a big picture of the main problems related to security in a Big Data system, along with the principal solutions to them proposed by the research community.

  19. NASA's Big Data Task Force

    Science.gov (United States)

    Holmes, C. P.; Kinter, J. L.; Beebe, R. F.; Feigelson, E.; Hurlburt, N. E.; Mentzel, C.; Smith, G.; Tino, C.; Walker, R. J.

    2017-12-01

    Two years ago NASA established the Ad Hoc Big Data Task Force (BDTF - https://science.nasa.gov/science-committee/subcommittees/big-data-task-force), an advisory working group within the NASA Advisory Council system. The scope of the Task Force included all NASA Big Data programs, projects, missions, and activities. The Task Force focused on such topics as exploring the existing and planned evolution of NASA's science data cyber-infrastructure that supports broad access to data repositories for NASA Science Mission Directorate missions; best practices within NASA, other Federal agencies, private industry and research institutions; and Federal initiatives related to big data and data access. The BDTF has completed its two-year term and produced several recommendations plus four white papers for NASA's Science Mission Directorate. This presentation will discuss the activities and results of the TF, including summaries of key points from its focused study topics. The paper serves as an introduction to the papers following in this ESSI session.

  20. Big Data and reality

    Directory of Open Access Journals (Sweden)

    Ryan Shaw

    2015-11-01

    Full Text Available DNA sequencers, Twitter, MRIs, Facebook, particle accelerators, Google Books, radio telescopes, Tumblr: what do these things have in common? According to the evangelists of “data science,” all of these are instruments for observing reality at unprecedentedly large scales and fine granularities. This perspective ignores the social reality of these very different technological systems, ignoring how they are made, how they work, and what they mean in favor of an exclusive focus on what they generate: Big Data. But no data, big or small, can be interpreted without an understanding of the process that generated them. Statistical data science is applicable to systems that have been designed as scientific instruments, but is likely to lead to confusion when applied to systems that have not. In those cases, a historical inquiry is preferable.

  1. New Evidence on the Development of the Word "Big."

    Science.gov (United States)

    Sena, Rhonda; Smith, Linda B.

    1990-01-01

    Results indicate that curvilinear trend in children's understanding of word "big" is not obtained in all stimulus contexts. This suggests that meaning and use of "big" is complex, and may not refer simply to larger objects in a set. Proposes that meaning of "big" constitutes a dynamic system driven by many perceptual,…

  2. Feature driven survey of big data systems

    NARCIS (Netherlands)

    Salma, Cigdem Avci; Tekinerdogan, Bedir; Athanasiadis, Ioannis N.

    2016-01-01

    Big Data has become a very important driver for innovation and growth for various industries such as health, administration, agriculture, defence, and education. Storing and analysing large amounts of data are becoming increasingly common in many of these application areas. In general, different

  3. Big Data Technologies

    Science.gov (United States)

    Bellazzi, Riccardo; Dagliati, Arianna; Sacchi, Lucia; Segagni, Daniele

    2015-01-01

    The so-called big data revolution provides substantial opportunities to diabetes management. At least 3 important directions are currently of great interest. First, the integration of different sources of information, from primary and secondary care to administrative information, may allow depicting a novel view of patient’s care processes and of single patient’s behaviors, taking into account the multifaceted nature of chronic care. Second, the availability of novel diabetes technologies, able to gather large amounts of real-time data, requires the implementation of distributed platforms for data analysis and decision support. Finally, the inclusion of geographical and environmental information into such complex IT systems may further increase the capability of interpreting the data gathered and extract new knowledge from them. This article reviews the main concepts and definitions related to big data, it presents some efforts in health care, and discusses the potential role of big data in diabetes care. Finally, as an example, it describes the research efforts carried on in the MOSAIC project, funded by the European Commission. PMID:25910540

  4. Design of Flow Big Data System Based on Smart Pipeline Theory

    OpenAIRE

    Zhang Jianqing; Li Shuai; Liu Lilan

    2017-01-01

    As telecom operators build more and more intelligent pipes, using big data technology to analyze and process the huge amounts of data that intelligent pipelines generate has become an inevitable trend. The intelligent pipe describes operational data and sales data; the operator's pipe flow data creates value for e-commerce business forms and business models in the mobile e-business environment. The intelligent pipe is the third dimension of the 3D-pipeline mobile electronic commerce system. Intelligent operation...

  5. BRIC Health Systems and Big Pharma: A Challenge for Health Policy and Management.

    Science.gov (United States)

    Rodwin, Victor G; Fabre, Guilhem; Ayoub, Rafael F

    2018-01-02

    BRIC nations - Brazil, Russia, India, and China - represent 40% of the world's population, including a growing aging population and middle class with an increasing prevalence of chronic disease. Their healthcare systems increasingly rely on prescription drugs, but they differ from most other healthcare systems because healthcare expenditures in BRIC nations have exhibited the highest revenue growth rates for pharmaceutical multinational corporations (MNCs), Big Pharma. The response of BRIC nations to Big Pharma presents contrasting cases of how governments manage the tensions posed by rising public expectations and limited resources to satisfy them. Understanding these tensions represents an emerging area of research and an important challenge for all those who work in the field of health policy and management (HPAM). © 2018 The Author(s); Published by Kerman University of Medical Sciences. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

  6. Integrating R and Hadoop for Big Data Analysis

    Directory of Open Access Journals (Sweden)

    Bogdan Oancea

    2014-06-01

    Analyzing and working with big data can be very difficult using classical means like relational database management systems or desktop software packages for statistics and visualization. Instead, big data requires large clusters with hundreds or even thousands of computing nodes. Official statistics is increasingly considering big data for deriving new statistics, because big data sources can produce more relevant and timely statistics than traditional sources. One of the software tools successfully and widely used for the storage and processing of big data sets on clusters of commodity hardware is Hadoop. The Hadoop framework contains libraries, a distributed file system (HDFS), a resource-management platform, and an implementation of a version of the MapReduce programming model for large-scale data processing. In this paper we investigate the possibilities of integrating Hadoop with R, a popular software environment for statistical computing and data visualization. We present three ways of integrating them: R with Hadoop Streaming, Rhipe, and RHadoop, and we emphasize the advantages and disadvantages of each solution.
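
    The Streaming route described here is language-agnostic: Hadoop pipes input splits to any executable over stdin and collects tab-separated key-value lines from stdout. The following is a minimal sketch of that contract, shown in Python rather than R for brevity; the word-count task and file layout are illustrative assumptions, not taken from the paper.

```python
#!/usr/bin/env python3
# streaming_wordcount.py -- a minimal Hadoop Streaming job in Python.
# Hadoop pipes input splits to the mapper on stdin and, after sorting by key,
# pipes the mapper's output lines to the reducer on stdin.
import sys

def mapper():
    # Emit a tab-separated (word, 1) pair for every token seen.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Keys arrive sorted, so equal words are adjacent and can be summed in one pass.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word != current and current is not None:
            print(f"{current}\t{total}")
            total = 0
        current = word
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

    A job would then be submitted with the streaming jar, along the lines of `hadoop jar hadoop-streaming.jar -files streaming_wordcount.py -mapper 'python3 streaming_wordcount.py map' -reducer 'python3 streaming_wordcount.py reduce' -input /data -output /counts` (jar name and paths are placeholders); Rhipe and RHadoop wrap this same mechanism behind R functions.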

  7. Big data

    DEFF Research Database (Denmark)

    Madsen, Anders Koed; Flyverbom, Mikkel; Hilbert, Martin

    2016-01-01

    The claim that big data can revolutionize strategy and governance in the context of international relations is increasingly hard to ignore. Scholars of international political sociology have mainly discussed this development through the themes of security and surveillance. The aim of this paper is to outline a research agenda that can be used to raise a broader set of sociological and practice-oriented questions about the increasing datafication of international relations and politics. First, it proposes a way of conceptualizing big data that is broad enough to open fruitful investigations into the emerging use of big data in these contexts. This conceptualization includes the identification of three moments contained in any big data practice. Second, it suggests a research agenda built around a set of subthemes that each deserve dedicated scrutiny when studying the interplay between big data...

  8. Performance of automatic exposure control of digital imaging systems in three big hospitals

    International Nuclear Information System (INIS)

    Avramova-Cholakova, S.; Dyakov, I.

    2012-01-01

    Digital imaging systems are quite new in Bulgaria and there is still no consensus among the servicing companies on how to set the automatic exposure control (AEC) of these systems. The aim of this work is to study the AEC settings of the digital imaging systems in three big hospitals serviced by different engineering companies. The study included seven systems, one with a digital radiography (DR) detector and the others with computed radiography (CR) detectors. The AEC settings were tested in terms of the dependence of mean pixel value (MPV), air kerma at phantom exit, and detector dose indicator (DDI) on AEC chamber selection, tube voltage, and phantom thickness. Relatively small variations in MPV, up to 8 %, were observed for four of the CR systems, usually with small variations in DDI as well (except for one system, for which up to 40 % difference in DDI consistency between chambers was found). Air kerma at phantom exit for these systems showed bigger variations, up to 29 %. The other CR systems had big variations in MPV, up to 57 %, with DDI variations up to 30 %, while air kerma changes were also not small, from 8 to 38 %. The DR system showed smaller variation in air kerma at phantom exit, up to 8 %, and bigger variations in MPV and DDI, up to 20 % and 15 % respectively. There is no systematic approach to the AEC settings used in the three hospitals. Further investigation and collaboration with the servicing companies is needed, aiming to establish the optimized selection of AEC calibration parameters in each case. (authors)

  9. The Big Data Tools Impact on Development of Simulation-Concerned Academic Disciplines

    Directory of Open Access Journals (Sweden)

    A. A. Sukhobokov

    2015-01-01

    The article gives a definition of Big Data on the basis of the 5Vs (Volume, Variety, Velocity, Veracity, Value) and shows examples of tasks that require Big Data tools in a diversity of areas, namely health, education, financial services, industry, agriculture, logistics, retail, information technology, telecommunications, and others. An overview of Big Data tools is delivered, including open source products and the IBM Bluemix and SAP HANA platforms. Examples of architectures of corporate data processing and management systems using Big Data tools are shown for big Internet companies and for enterprises in traditional industries. Within the overview, a classification of Big Data tools is proposed that fills gaps in previously developed similar classifications. The new classification contains 19 classes and embraces several hundred existing and emerging products. The rise and use of Big Data tools, in addition to solving practical problems, affects the development of scientific disciplines concerned with the simulation of technical, natural, or socio-economic systems and the solution of practical problems based on the developed models. New schools are arising in these disciplines; they address the tasks peculiar to each discipline, but for systems with a much larger number of internal elements and connections between them. The characteristics of the problems to be solved within these new schools do not always meet the criteria for Big Data. It is suggested that Big Data itself be identified as a part of the theory of sorting and searching algorithms. In the other disciplines the new schools are named by analogy with Big Data: Big Calculation in numerical methods, Big Simulation in simulation modeling, Big Management in the management of socio-economic systems, and Big Optimal Control in optimal control theory. The paper shows examples of tasks and methods to be developed within the new schools. The observed tendency is not limited to the disciplines considered: there are

  10. Big Data in food and agriculture

    Directory of Open Access Journals (Sweden)

    Kelly Bronson

    2016-06-01

    Farming is undergoing a digital revolution. Our review of current Big Data applications in the agri-food sector has revealed several collection and analytics tools that may have implications for relationships of power between players in the food system (e.g., between farmers and large corporations). For example, who retains ownership of the data generated by applications like Monsanto Corporation's Weed I.D. “app”? Are there privacy implications with the data gathered by John Deere's precision agricultural equipment? Systematically tracing the digital revolution in agriculture, and charting the affordances as well as the limitations of Big Data applied to food and agriculture, should be a broad research goal for Big Data scholarship. Such a goal brings data scholarship into conversation with food studies and allows for a focus on the material consequences of big data in society.

  11. Big data computing

    CERN Document Server

    Akerkar, Rajendra

    2013-01-01

    Due to market forces and technological evolution, Big Data computing is developing at an increasing rate. A wide variety of novel approaches and tools have emerged to tackle the challenges of Big Data, creating both more opportunities and more challenges for students and professionals in the field of data computation and analysis. Presenting a mix of industry cases and theory, Big Data Computing discusses the technical and practical issues related to Big Data in intelligent information management. Emphasizing the adoption and diffusion of Big Data tools and technologies in industry, the book i

  12. Notes on the Diet of Reproductively Active Male Rafinesque's Big Eared Bats

    International Nuclear Information System (INIS)

    Menzel, M.A.; Carter, T.C.; Menzel, J.M.; Edwards, J.W.; Ford, W.M.

    2002-01-01

    The diet of five reproductively active male Rafinesque's big-eared bats from the Upper Coastal Plain of South Carolina was examined through fecal samples collected during August and September 1999. The diets of these individuals in upland pine stands were similar to the diets of Rafinesque's big-eared bats in bottomland and upland hardwood habitats. Although the fecal samples contained three insect orders, the diet consisted primarily of lepidopterans.

  13. Big Data: Concept, Potentialities and Vulnerabilities

    Directory of Open Access Journals (Sweden)

    Fernando Almeida

    2018-03-01

    The evolution of information systems and the growth in the use of the Internet and social networks have caused an explosion in the amount of available data relevant to the activities of companies. Therefore, the treatment of these available data is vital to support operational, tactical, and strategic decisions. This paper aims to present the concept of big data and the main technologies that support the analysis of large data volumes. The potential of big data is explored across nine sectors of activity: financial, retail, healthcare, transport, agriculture, energy, manufacturing, public, and media and entertainment. In addition, the main current opportunities, vulnerabilities, and privacy challenges of big data are discussed. It was possible to conclude that, despite the potential for big data to grow in the previously identified areas, there are still some challenges that need to be considered and mitigated, namely the privacy of information, the existence of human resources qualified to work with Big Data, and the promotion of a data-driven organizational culture.

  14. Network Analysis of Urban Traffic with Big Bus Data

    OpenAIRE

    Zhao, Kai

    2016-01-01

    Urban traffic analysis is crucial for traffic forecasting systems, urban planning and, more recently, various mobile and network applications. In this paper, we analyse urban traffic with network and statistical methods. Our analysis is based on one big bus dataset containing 45 million bus arrival samples in Helsinki. We mainly address the following questions: 1. How can we identify the areas that cause most of the traffic in the city? 2. Why is there urban traffic? Is bus traffic a key cause ...
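
    One plausible way to make the first question concrete is to aggregate the arrival records into a stop-to-stop graph and rank stops by weighted throughput and betweenness. A minimal sketch with networkx follows; the stop names and trip counts are invented for illustration, not drawn from the Helsinki dataset.

```python
# A minimal sketch, assuming the arrival records have been aggregated into
# stop-to-stop links weighted by daily bus trips (all values hypothetical).
import networkx as nx

G = nx.DiGraph()
links = [("Kamppi", "Pasila", 310), ("Pasila", "Kamppi", 295),
         ("Pasila", "Malmi", 180), ("Malmi", "Pasila", 175),
         ("Kamppi", "Toolo", 90), ("Toolo", "Pasila", 85)]
for u, v, trips in links:
    G.add_edge(u, v, trips=trips)

# Throughput: total trips into and out of each stop (weighted degree).
throughput = {n: G.in_degree(n, weight="trips") + G.out_degree(n, weight="trips")
              for n in G}
# Betweenness: stops that many routes must pass through.
between = nx.betweenness_centrality(G)

for stop in sorted(G, key=lambda n: -throughput[n]):
    print(f"{stop:8s} trips={throughput[stop]:4d} betweenness={between[stop]:.2f}")
```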

  15. Data management by using R: big data clinical research series.

    Science.gov (United States)

    Zhang, Zhongheng

    2015-11-01

    Electronic medical record (EMR) systems have been widely adopted in clinical practice. Unlike traditional handwritten records, the EMR makes big data clinical research feasible. The most important feature of big data research is its real-world setting. Furthermore, big data research can provide all aspects of information related to healthcare. However, big data research requires data management skills, which are often lacking in the medical curriculum. This greatly hinders doctors from testing their clinical hypotheses by using EMRs. To bridge this gap, a series of articles introducing data management techniques is put forward to guide clinicians toward big data clinical research. The present educational article first introduces some basic knowledge of the R language, followed by data management skills for creating new variables, recoding variables, and renaming variables. These are very basic skills that may be used in every big data research project.
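
    The article teaches these three operations in R; for readers outside the R ecosystem, an analogous sketch in Python/pandas is shown below. The column names, toy records, and age cut-off are hypothetical.

```python
# Analogous pandas versions of the three data management skills the article
# covers in R: create, recode, and rename variables. Toy EMR-style data.
import pandas as pd

emr = pd.DataFrame({"age": [34, 71, 56],
                    "weight_kg": [70, 82, 65],
                    "height_m": [1.72, 1.80, 1.60]})

# 1. Creating a new variable (derived from existing ones).
emr["bmi"] = emr["weight_kg"] / emr["height_m"] ** 2

# 2. Recoding a variable (continuous age into categories).
emr["age_group"] = pd.cut(emr["age"], bins=[0, 65, 120],
                          labels=["<65", ">=65"])

# 3. Renaming a variable.
emr = emr.rename(columns={"weight_kg": "weight"})
print(emr)
```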

  16. Implementing the “Big Data” Concept in Official Statistics

    Directory of Open Access Journals (Sweden)

    О. V.

    2017-02-01

    Big data is a huge resource that needs to be used at all levels of economic planning. The article is devoted to the study of the development of the “Big Data” concept in the world and its impact on the transformation of statistical modeling of economic processes. Statistics at the current stage should take into account the complex system of international economic relations, which functions under globalization and brings new forms of economic development to small open economies. Statistical science should take into account such phenomena as the gig economy, the sharing economy, institutional factors, etc. The concepts of “Big Data” and open data are analyzed, and problems with implementing “Big Data” in official statistics are shown. Ways to implement “Big Data” in the official statistics of Ukraine, through active use of the technological opportunities offered by mobile operators, navigation systems, surveillance cameras, social networks, etc., are presented. The possibilities of using “Big Data” in different sectors of the economy, including at the company level, are shown. The problems of storing large volumes of data are highlighted. The study shows that “Big Data” is a huge resource that should be used across the Ukrainian economy.

  17. From big bang to big crunch and beyond

    International Nuclear Information System (INIS)

    Elitzur, Shmuel; Rabinovici, Eliezer; Giveon, Amit; Kutasov, David

    2002-01-01

    We study a quotient Conformal Field Theory, which describes a 3+1 dimensional cosmological spacetime. Part of this spacetime is the Nappi-Witten (NW) universe, which starts at a 'big bang' singularity, expands and then contracts to a 'big crunch' singularity at a finite time. The gauged WZW model contains a number of copies of the NW spacetime, with each copy connected to the preceding one and to the next one at the respective big bang/big crunch singularities. The sequence of NW spacetimes is further connected at the singularities to a series of non-compact static regions with closed timelike curves. These regions contain boundaries, on which the observables of the theory live. This suggests a holographic interpretation of the physics. (author)

  18. BIG data - BIG gains? Empirical evidence on the link between big data analytics and innovation

    OpenAIRE

    Niebel, Thomas; Rasel, Fabienne; Viete, Steffen

    2017-01-01

    This paper analyzes the relationship between firms’ use of big data analytics and their innovative performance in terms of product innovations. Since big data technologies provide new data information practices, they create novel decision-making possibilities, which are widely believed to support firms’ innovation process. Applying German firm-level data within a knowledge production function framework we find suggestive evidence that big data analytics is a relevant determinant for the likel...

  19. Development of a system using the library of the Genie spectroscopy software and exchange of samples

    International Nuclear Information System (INIS)

    Lapolli, Andre L.; Munita, Casimiro S.

    2011-01-01

    One of the great difficulties in using the NAA method is the time the operator spends exchanging samples after each measurement. This becomes a big problem in routine analyses, when various chemical elements are determined and each sample must be measured at different decay times. An automatic sample exchanger reduces the analysis time by several hours and eliminates tedious manual operation. The effective use of NAA therefore depends on the availability of a suitable automatic sample changer. Some systems are sold commercially, but many laboratories cannot acquire them because they are costly. This paper presents the modified program G2KNAA.REX, which creates a screen that makes automatic or manual acquisition possible by calling the old program NAAACQ.rex for manual acquisition and the new program NAAACQ2.rex for automatic acquisition. In conclusion, as can be seen in the program lines, the synchronization that unites the three systems (the computer, the Canberra set, and the sample exchanger) is done in a timely manner. The system was tested and is functioning in a satisfactory manner. (author)

  20. Harnessing the Power of Big Data to Improve Graduate Medical Education: Big Idea or Bust?

    Science.gov (United States)

    Arora, Vineet M

    2018-06-01

    With the advent of electronic medical records (EMRs) fueling the rise of big data, the use of predictive analytics, machine learning, and artificial intelligence are touted as transformational tools to improve clinical care. While major investments are being made in using big data to transform health care delivery, little effort has been directed toward exploiting big data to improve graduate medical education (GME). Because our current system relies on faculty observations of competence, it is not unreasonable to ask whether big data in the form of clinical EMRs and other novel data sources can answer questions of importance in GME such as when is a resident ready for independent practice. The timing is ripe for such a transformation. A recent National Academy of Medicine report called for reforms to how GME is delivered and financed. While many agree on the need to ensure that GME meets our nation's health needs, there is little consensus on how to measure the performance of GME in meeting this goal. During a recent workshop at the National Academy of Medicine on GME outcomes and metrics in October 2017, a key theme emerged: Big data holds great promise to inform GME performance at individual, institutional, and national levels. In this Invited Commentary, several examples are presented, such as using big data to inform clinical experience and provide clinically meaningful data to trainees, and using novel data sources, including ambient data, to better measure the quality of GME training.

  1. The SIKS/BiGGrid Big Data Tutorial

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Lammerts, Evert; de Vries, A.P.

    2011-01-01

    The School for Information and Knowledge Systems SIKS and the Dutch e-science grid BiG Grid organized a new two-day tutorial on Big Data at the University of Twente on 30 November and 1 December 2011, just preceding the Dutch-Belgian Database Day. The tutorial is on top of some exciting new

  2. Intrusion Detection System Based on Decision Tree over Big Data in Fog Environment

    Directory of Open Access Journals (Sweden)

    Kai Peng

    2018-01-01

    Fog computing, as a supplement to cloud computing, can provide low-latency services between mobile users and the cloud. However, fog devices may encounter security challenges, as the fog nodes are close to end users and have limited computing ability. Traditional network attacks may destroy the system of fog nodes. An intrusion detection system (IDS) is a proactive security protection technology that can be used in the fog environment. Although IDSs in traditional networks have been well investigated, directly using them in the fog environment may be inappropriate. Fog nodes produce massive amounts of data at all times, so enabling an IDS over big data in the fog environment is of paramount importance. In this study, we propose an IDS based on a decision tree. First, we propose a preprocessing algorithm to digitize the strings in the given dataset and then normalize the whole dataset, ensuring the quality of the input data and improving the efficiency of detection. Second, we use the decision tree method for our IDS and compare it with the Naïve Bayes and KNN methods. Both the 10% dataset and the full dataset were tested. Our proposed method not only completely detects four kinds of attacks but also enables the detection of twenty-two kinds of attacks. The experimental results show that our IDS is effective and precise. Above all, our IDS can be used in a fog computing environment over big data.
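
    A minimal sketch of that pipeline using scikit-learn is given below, under the assumption that each connection record reduces to one categorical and one numeric feature; the toy records stand in for a KDD-style intrusion dataset, and the paper's actual preprocessing may differ.

```python
# Sketch of the paper's pipeline: digitize string fields, normalize to [0, 1],
# then compare a decision tree with Naive Bayes and KNN. Toy data only.
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

protocols = ["tcp", "udp", "tcp", "icmp", "udp", "tcp"]   # string feature
durations = [[1.2], [0.1], [5.0], [0.3], [0.2], [4.1]]    # numeric feature
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

# Step 1: digitize the strings, then normalize all features.
proto_num = LabelEncoder().fit_transform(protocols).reshape(-1, 1)
X = MinMaxScaler().fit_transform(np.hstack([proto_num, durations]))
y = np.array(labels)

# Step 2: train and compare the three classifiers discussed in the paper.
for clf in (DecisionTreeClassifier(), GaussianNB(),
            KNeighborsClassifier(n_neighbors=3)):
    print(type(clf).__name__, clf.fit(X, y).score(X, y))
```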

  3. Research on Optimization of Pooling System and Its Application in Drug Supply Chain Based on Big Data Analysis.

    Science.gov (United States)

    Wu, DengFeng; Mao, Hongyi

    2017-01-01

    Reform of drug procurement is being extensively implemented and expanded in China, especially in today's big data environment. However, innovation in the supply mode lags behind the improvements in procurement. Problems such as financial strain and supply breaks frequently occur, affecting the stability of the drug supply. The Drug Pooling System has been proposed and applied in a few pilot cities to resolve these problems. From the perspective of the supply chain, this study analyzes the process of setting important parameters and sets out the tasks of the parties involved in a pooling system, according to the issues identified in the pilot run. The approach is based on big data analysis and simulation using system dynamics theory and modeling with Vensim software to optimize system performance. This study proposes a theoretical framework to resolve these problems and attempts to provide a valuable reference for future applications of pooling systems.
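
    The system-dynamics flavor of such a model can be conveyed with a simple stock-and-flow simulation: pool inventory as the stock, hospital demand as the outflow, and a reorder rule feeding the inflow after a lead time. The sketch below is an analogy in plain Python, not the paper's Vensim model; every parameter is invented for illustration.

```python
# A toy stock-and-flow simulation of a drug pool (all parameters hypothetical).
DEMAND_PER_DAY = 40    # units drawn by hospitals from the pool
REORDER_POINT = 300    # stock-plus-pipeline level that triggers replenishment
ORDER_QTY = 500        # units per replenishment order
LEAD_TIME = 3          # days between ordering and delivery

inventory, pipeline = 600, []   # stock; orders in transit as (arrival_day, qty)
for day in range(30):
    # Inflow: receive any order whose lead time has elapsed.
    inventory += sum(q for d, q in pipeline if d == day)
    pipeline = [(d, q) for d, q in pipeline if d > day]
    # Outflow: hospitals draw from the pool; stock cannot go negative.
    shortfall = max(0, DEMAND_PER_DAY - inventory)
    inventory = max(0, inventory - DEMAND_PER_DAY)
    # Decision rule: reorder when stock plus pipeline falls below the threshold.
    on_order = sum(q for _, q in pipeline)
    if inventory + on_order < REORDER_POINT:
        pipeline.append((day + LEAD_TIME, ORDER_QTY))
    print(f"day {day:2d}: stock={inventory:4d} on_order={on_order:4d} unmet={shortfall}")
```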

  4. Big Data in the Aerospace Industry

    Directory of Open Access Journals (Sweden)

    Victor Emmanuell BADEA

    2018-01-01

    This paper presents approaches related to the need for large-volume data analysis, Big Data, and the information that the beneficiaries of this analysis can interpret. Aerospace companies understand the challenges of Big Data better than the rest of industry. We also describe a novel analytical system that enables query processing and predictive analytics over streams of large aviation data.

  5. 75 FR 3948 - Big Sky Energy Corp., Biomedical Waste Systems, Inc., Biometrics Security Technology, Inc...

    Science.gov (United States)

    2010-01-25

    ... SECURITIES AND EXCHANGE COMMISSION [File No. 500-1] Big Sky Energy Corp., Biomedical Waste Systems, Inc., Biometrics Security Technology, Inc., Biosys, Inc., Bolder Technologies Corp., Boyds Wheels, Inc... securities of Biometrics Security Technology, Inc. because it has not filed any periodic reports since...

  6. Occurrence and transport of nitrogen in the Big Sunflower River, northwestern Mississippi, October 2009-June 2011

    Science.gov (United States)

    Barlow, Jeannie R.B.; Coupe, Richard H.

    2014-01-01

    The Big Sunflower River Basin, located within the Yazoo River Basin, is subject to large annual inputs of nitrogen from agriculture, atmospheric deposition, and point sources. Understanding how nutrients are transported in, and downstream from, the Big Sunflower River is key to quantifying their eutrophying effects on the Gulf. Recent results from two Spatially Referenced Regressions on Watershed attributes (SPARROW models), which include the Big Sunflower River, indicate minimal losses of nitrogen in stream reaches typical of the main channels of major river systems. If SPARROW assumptions of relatively conservative transport of nitrogen are correct and surface-water losses through the bed of the Big Sunflower River are negligible, then options for managing nutrient loads to the Gulf of Mexico may be limited. Simply put, if every pound of nitrogen entering the Delta is eventually delivered to the Gulf, then the only effective nutrient management option in the Delta is to reduce inputs. If, on the other hand, it can be shown that processes within river channels of the Mississippi Delta act to reduce the mass of nitrogen in transport, other hydrologic approaches may be designed to further limit nitrogen transport. Direct validation of existing SPARROW models for the Delta is a first step in assessing the assumptions underlying those models. In order to characterize spatial and temporal variability of nitrogen in the Big Sunflower River Basin, water samples were collected at four U.S. Geological Survey gaging stations located on the Big Sunflower River between October 1, 2009, and June 30, 2011. Nitrogen concentrations were generally highest at each site during the spring of the 2010 water year and the fall and winter of the 2011 water year. Additionally, the dominant form of nitrogen varied between sites. For example, in samples collected from the most upstream site (Clarksdale), the concentration of organic nitrogen was generally higher than the concentrations of

  7. Modeling of the Nuclear Power Plant Life cycle for 'Big Data' Management System: A Systems Engineering Viewpoint

    International Nuclear Information System (INIS)

    Ha, Bui Hoang; Khanh, Tran Quang Diep; Shakirah, Wan; Kahar, Wan Abdul; Jung, Jae Cheon

    2012-01-01

    Together with the significant development of Internet and Web technologies, a rapid evolution of the 'Big Data' idea has been observed since it was first introduced in 1941 as an 'information explosion' (OED). Using the '3Vs' model proposed by Gartner, 'Big Data' can be defined as 'high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.' Big Data technologies and tools have been developed to address the way large quantities of data are stored, accessed, and presented for manipulation or analysis. The idea also focuses on how users can easily access and extract the 'useful and right' data, information, or even knowledge from 'Big Data'.

  8. OXIDATIVE STRESS BIOMARKERS IN MUSSELS SAMPLED FROM FOUR SITES ALONG THE MOROCCAN ATLANTIC COAST (BIG CASABLANCA

    Directory of Open Access Journals (Sweden)

    LAILA EL JOURMI

    2012-12-01

    Catalase (CAT) activity and malondialdehyde (MDA) levels in the whole bodies of the mussel Perna perna, collected from four stations along the Moroccan Atlantic coast (Big Casablanca area), were monitored to evaluate stress effects on mussels from the selected sites. The oxidative stress biomarkers showed statistically significant differences at the polluted sites when compared to the control ones. In general, our data indicate that CAT activity and MDA concentration are significantly higher (p < 0.05) in mussels collected at polluted sites than in specimens sampled from control ones. In conclusion, the oxidative stress biomarker responses obtained for October 2010 and 2011 clearly demonstrate the potential presence of different contaminants at Sites 3 and 4, reflecting the intensity of pollution in these areas.

  9. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  10. Odysseus/DFS: Integration of DBMS and Distributed File System for Transaction Processing of Big Data

    OpenAIRE

    Kim, Jun-Sung; Whang, Kyu-Young; Kwon, Hyuk-Yoon; Song, Il-Yeol

    2014-01-01

    The relational DBMS (RDBMS) has been widely used since it supports various high-level functionalities, such as SQL, schemas, indexes, and transactions, that do not exist in the O/S file system. But the recent advent of big data technology facilitates the development of new systems that sacrifice DBMS functionality in order to efficiently manage large-scale data. These so-called NoSQL systems use a distributed file system, which supports scalability and reliability. They support scalability of the...

  11. [New context for the Individual Healthcare Professions Act (BIG law)].

    Science.gov (United States)

    Sijmons, Jaap G; Winter, Heinrich B; Hubben, Joep H

    2014-01-01

    In 2013 the Dutch Individual Healthcare Professions Act (known as the BIG law) was evaluated for the second time. The research showed that patients have limited awareness of the registration of healthcare professionals and that the system of reserved procedures is almost unknown. On the other hand, healthcare institutions (especially hospitals) frequently check the register, as do healthcare insurance companies when contracting institutions. Knowledge of the reserved procedures system is moderate amongst professionals too, while the organisation of care is to a great extent based on this system. Since the change of system in 2006 quality assurance in professional practice has been much more rooted in the internal structure of care; in this way, the BIG law did not go the way the legislator intended. According to the researchers, this has not prevented the BIG law from still playing an essential function. Indeed, the BIG law has not reached its final destination, but it may reach its goal via another route.

  12. Conociendo Big Data

    Directory of Open Access Journals (Sweden)

    Juan José Camargo-Vega

    2014-12-01

    Given the importance that the term Big Data has acquired, this research sought to study and exhaustively analyze the state of the art of Big Data; in addition, as a second objective, it analyzed the characteristics, tools, technologies, models, and standards related to Big Data; and finally it sought to identify the most relevant characteristics of Big Data management, in order to cover everything concerning the central topic of the research. The methodology used included reviewing the state of the art of Big Data and presenting its current situation; describing Big Data technologies; presenting some of the NoSQL databases, which are the ones that allow processing data in unstructured formats; and showing data models and the technologies for analyzing them, ending with some benefits of Big Data. The methodological design used for the research was non-experimental, since no variables are manipulated, and exploratory, because this research begins to explore the Big Data environment.

  13. Big Data: Challenges, Opportunities and Realities

    OpenAIRE

    Bhadani, Abhay; Jothimani, Dhanya

    2017-01-01

    With the advent of Internet of Things (IoT) and Web 2.0 technologies, there has been tremendous growth in the amount of data generated. This chapter emphasizes the need for big data and discusses the technological advancements, tools, and techniques being used to process big data. Technological improvements and the limitations of existing storage techniques are also presented. Since traditional technologies like the Relational Database Management System (RDBMS) have their own limitations to han...

  14. Big Bayou Creek and Little Bayou Creek Watershed Monitoring Program

    Energy Technology Data Exchange (ETDEWEB)

    Kszos, L.A.; Peterson, M.J.; Ryon; Smith, J.G.

    1999-03-01

    Biological monitoring of Little Bayou and Big Bayou creeks, which border the Paducah Site, has been conducted since 1987. Biological monitoring was conducted by the University of Kentucky from 1987 to 1991 and by staff of the Environmental Sciences Division (ESD) at Oak Ridge National Laboratory (ORNL) from 1991 through March 1999. In March 1998, renewed Kentucky Pollutant Discharge Elimination System (KPDES) permits were issued to the US Department of Energy (DOE) and US Enrichment Corporation. The renewed DOE permit requires that a watershed monitoring program be developed for the Paducah Site within 90 days of the effective date of the renewed permit. This plan outlines the sampling and analysis that will be conducted for the watershed monitoring program. The objectives of the watershed monitoring are to (1) determine whether discharges from the Paducah Site and the Solid Waste Management Units (SWMUs) associated with the Paducah Site are adversely affecting instream fauna, (2) assess the ecological health of Little Bayou and Big Bayou creeks, (3) assess the degree to which abatement actions ecologically benefit Big Bayou Creek and Little Bayou Creek, (4) provide guidance for remediation, (5) provide an evaluation of changes in potential human health concerns, and (6) provide data which could be used to assess the impact of inadvertent spills or fish kills. The cleanup is expected to result in these watersheds [Big Bayou and Little Bayou creeks] achieving compliance with the applicable water quality criteria.

  15. Recent Development in Big Data Analytics for Business Operations and Risk Management.

    Science.gov (United States)

    Choi, Tsan-Ming; Chan, Hing Kai; Yue, Xiaohang

    2017-01-01

    "Big data" is an emerging topic and has attracted the attention of many researchers and practitioners in industrial systems engineering and cybernetics. Big data analytics would definitely lead to valuable knowledge for many organizations. Business operations and risk management can be a beneficiary as there are many data collection channels in the related industrial systems (e.g., wireless sensor networks, Internet-based systems, etc.). Big data research, however, is still in its infancy. Its focus is rather unclear and related studies are not well amalgamated. This paper aims to present the challenges and opportunities of big data analytics in this unique application domain. Technological development and advances for industrial-based business systems, reliability and security of industrial systems, and their operational risk management are examined. Important areas for future research are also discussed and revealed.

  16. Big cats as a model system for the study of the evolution of intelligence.

    Science.gov (United States)

    Borrego, Natalia

    2017-08-01

    Currently, carnivores, and felids in particular, are vastly underrepresented in cognitive literature, despite being an ideal model system for tests of social and ecological intelligence hypotheses. Within Felidae, big cats (Panthera) are uniquely suited to studies investigating the evolutionary links between social, ecological, and cognitive complexity. Intelligence likely did not evolve in a unitary way but instead evolved as the result of mutually reinforcing feedback loops within the physical and social environments. The domain-specific social intelligence hypothesis proposes that social complexity drives only the evolution of cognitive abilities adapted only to social domains. The domain-general hypothesis proposes that the unique demands of social life serve as a bootstrap for the evolution of superior general cognition. Big cats are one of the few systems in which we can directly address conflicting predictions of the domain-general and domain-specific hypothesis by comparing cognition among closely related species that face roughly equivalent ecological complexity but vary considerably in social complexity. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Characterizing Big Data Management

    Directory of Open Access Journals (Sweden)

    Rogério Rossi

    2015-06-01

    Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis, and visualization. However, technological resources, people, and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by three dimensions: technology, people, and processes. Hence, this article discusses these dimensions: the technological dimension, which is related to the storage, analytics, and visualization of big data; the human aspects of big data; and the process management dimension, which addresses big data management from both a technological and a business perspective.

  18. Big science

    CERN Multimedia

    Nadis, S

    2003-01-01

    " "Big science" is moving into astronomy, bringing large experimental teams, multi-year research projects, and big budgets. If this is the wave of the future, why are some astronomers bucking the trend?" (2 pages).

  19. Soil biogeochemistry in the age of big data

    Science.gov (United States)

    Cécillon, Lauric; Barré, Pierre; Coissac, Eric; Plante, Alain; Rasse, Daniel

    2015-04-01

    already been made thanks to meta-analysis, chemometrics, machine-learning systems and bioinformatics. Some techniques, like structural equation modeling, even propose to explore causalities, opening a way toward a mechanistic understanding of soil big data rather than simple correlations. We claim that data science should be fully integrated into soil biogeochemists' basic education schemes. We expect the blooming of a new generation of soil biogeochemists highly skilled in manipulating big data. Will big data represent a net gain for soil biogeochemistry? Increasing the amount of data will increase the associated biases, which may be further exacerbated by the increasing distance between data manipulators, soil sampling, and data acquisition. Integrating data science into soil biogeochemistry should thus not be done at the expense of pedology and metrology. We further expect that the more data there are, the more spurious correlations will appear, leading to possible misinterpretations of the data. Finally, big data on soil characteristics and processes will always need to be confronted with biogeochemical theories and socio-economic knowledge to be useful. Big data could revolutionize soil biogeochemistry, fostering new scientific and business models around the conservation of the soil natural capital, but our community should enter this new era with clear-sightedness and discernment.

  20. A Big Data Decision-making Mechanism for Food Supply Chain

    Directory of Open Access Journals (Sweden)

    Ji Guojun

    2017-01-01

    Many companies have captured and analyzed huge volumes of data to improve supply chain decision-making. This paper presents a big data harvest model that uses big data as input to make more informed decisions in the food supply chain. By introducing a Bayesian network method, the paper integrates sample data and finds cause-and-effect relationships in the data to predict market demand. A deduction graph model that translates food demand into processes and divides processes into tasks and assets is then presented, together with an example of how big data in the food supply chain can be combined with the Bayesian network and deduction graph models to guide production decisions. Our conclusions indicate that the decision-making mechanism has vast potential for extracting value from big data.
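
    The Bayesian-network step can be illustrated with a toy two-parent network for demand. The structure and every probability below are hypothetical placeholders, not the paper's fitted model.

```python
# Toy Bayesian network: Season -> Demand <- Promotion (all numbers invented).
P_season = {"peak": 0.4, "off": 0.6}
P_promo = {"yes": 0.3, "no": 0.7}
# Conditional probability table: P(demand = high | season, promotion).
P_high = {("peak", "yes"): 0.9, ("peak", "no"): 0.6,
          ("off", "yes"): 0.5, ("off", "no"): 0.2}

def p_demand_high(season=None, promo=None):
    """Probability of high demand given partial evidence on the parents."""
    seasons = [season] if season else P_season
    promos = [promo] if promo else P_promo
    total = 0.0
    for s in seasons:
        for p in promos:
            # Unobserved parents are marginalized out with their priors.
            weight = (1.0 if season else P_season[s]) * (1.0 if promo else P_promo[p])
            total += weight * P_high[(s, p)]
    return total

print(p_demand_high())                       # prior over all scenarios
print(p_demand_high(season="peak"))          # evidence: peak season
print(p_demand_high(season="off", promo="yes"))
```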

  1. Emerging technology and architecture for big-data analytics

    CERN Document Server

    Chang, Chip; Yu, Hao

    2017-01-01

    This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. The presentation is designed to be accessible to a broad audience, with general knowledge of hardware design and some interest in big-data analytics. Coverage includes emerging technology and devices for data-analytics, circuit design for data-analytics, and architecture and algorithms to support data-analytics. Readers will benefit from the realistic context used by the authors, which demonstrates what works, what doesn’t work, and what are the fundamental problems, solutions, upcoming challenges and opportunities. Provides a single-source reference to hardware architectures for big-data analytics; Covers various levels of big-data analytics hardware design abstraction and flow, from device, to circuits and systems; Demonstrates how non-volatile memory (NVM) based hardware platforms can be a viable solution to existing challenges in hardware architecture for big-data analytics.

  2. Big bang and big crunch in matrix string theory

    OpenAIRE

    Bedford, J; Papageorgakis, C; Rodríguez-Gómez, D; Ward, J

    2007-01-01

    Following the holographic description of linear dilaton null Cosmologies with a Big Bang in terms of Matrix String Theory put forward by Craps, Sethi and Verlinde, we propose an extended background describing a Universe including both Big Bang and Big Crunch singularities. This belongs to a class of exact string backgrounds and is perturbative in the string coupling far away from the singularities, both of which can be resolved using Matrix String Theory. We provide a simple theory capable of...

  3. Big Five personality group differences across academic majors

    DEFF Research Database (Denmark)

    Vedel, Anna

    2016-01-01

    During the past decades, a number of studies have explored personality group differences in the Big Five personality traits among students in different academic majors. To date, though, this research has not been reviewed systematically. This was the aim of the present review. A systematic literature search identified twelve eligible studies, yielding an aggregated sample size of 13,389. Eleven studies reported significant group differences in one or multiple Big Five personality traits. Consistent findings across studies were that students of arts/humanities and psychology scored high ... on Conscientiousness. Effect sizes were calculated to estimate the magnitude of the personality group differences. These effect sizes were consistent across studies comparing similar pairs of academic majors. For all Big Five personality traits medium effect sizes were found frequently, and for Openness even large...

  4. Characterization and Architectural Implications of Big Data Workloads

    OpenAIRE

    Wang, Lei; Zhan, Jianfeng; Jia, Zhen; Han, Rui

    2015-01-01

    Big data areas are expanding in a fast way in terms of increasing workloads and runtime systems, and this situation imposes a serious challenge to workload characterization, which is the foundation of innovative system and architecture design. The previous major efforts on big data benchmarking either propose a comprehensive but a large amount of workloads, or only select a few workloads according to so-called popularity, which may lead to partial or even biased observations. In this paper, o...

  5. Bliver big data til big business?

    DEFF Research Database (Denmark)

    Ritter, Thomas

    2015-01-01

    Denmark has a digital infrastructure, a culture of record-keeping, and IT-competent employees and customers, which make a leading position possible, but only if companies ready themselves for the next big data wave.

  6. Big data uncertainties.

    Science.gov (United States)

    Maugis, Pierre-André G

    2018-07-01

    Big data - the idea that an ever-larger volume of information is being constantly recorded - suggests that new problems can now be subjected to scientific scrutiny. However, can classical statistical methods be used directly on big data? We analyze the problem by looking at two known pitfalls of big datasets. First, they are biased, in the sense that they do not offer a complete view of the populations under consideration. Second, they present a weak but pervasive level of dependence between all their components. In both cases we observe that the uncertainty of the conclusions obtained by statistical methods is increased when they are used on big data, either because of a systematic error (bias) or because of a larger degree of randomness (increased variance). We argue that the key challenge raised by big data is not only how to use big data to tackle new problems, but how to develop tools and methods able to rigorously articulate the new risks therein. Copyright © 2016. Published by Elsevier Ltd.
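
    Both pitfalls are easy to reproduce numerically. The sketch below estimates a known population mean from (a) a biased subsample and (b) equicorrelated observations; the distributions and parameters are arbitrary illustrations, not from the paper.

```python
# Demonstrating the two pitfalls: bias and weak pervasive dependence.
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=0.0, scale=1.0, size=1_000_000)

# (a) Bias: a "big" sample that systematically over-represents large values.
biased = np.sort(population)[-500_000:]        # top half only
print("biased-sample error:", abs(biased.mean() - population.mean()))

# (b) Dependence: n equicorrelated observations have
# Var(mean) = sigma^2 * (1/n + rho*(n-1)/n), which does not vanish as n grows.
rho, n = 0.1, 100_000
shared = rng.normal() * np.sqrt(rho)           # common shock shared by all
dependent = shared + rng.normal(scale=np.sqrt(1 - rho), size=n)
print("dependent-sample error:", abs(dependent.mean() - 0.0))
```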

  7. Object oriented software for simulation and reconstruction of big alignment systems

    CERN Document Server

    Arce, P

    2003-01-01

    Modern high-energy physics experiments require tracking detectors to provide high precision under difficult working conditions (high magnetic field, gravity loads, and temperature gradients). This is the reason why several of them have decided to implement optical alignment systems to monitor the displacement of tracking elements in operation. To simulate and reconstruct optical alignment systems, a general-purpose software package named COCOA has been developed, using the object-oriented paradigm and software engineering techniques. Thanks to the great flexibility of its design, COCOA is able to reconstruct any optical system made of a combination of the following objects: laser, x-hair laser, incoherent source - pinhole, lens, mirror, plate splitter, cube splitter, optical square, rhomboid prism, 2D sensor, 1D sensor, distance-meter, tilt-meter, user-defined. COCOA was designed to satisfy the requirements of the CMS alignment system, which has several thousands of components. Sparse matrix techniques had been investi...

  8. Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution.

    Science.gov (United States)

    Sebaa, Abderrazak; Chikh, Fatima; Nouicer, Amina; Tari, AbdelKamel

    2018-02-19

    The huge increase in medical devices and clinical applications generating enormous amounts of data has raised a big issue in managing, processing, and mining this massive amount of data. Indeed, traditional data warehousing frameworks cannot be effective when managing the volume, variety, and velocity of current medical applications. As a result, several data warehouses face many issues with medical data, and many challenges need to be addressed. New solutions have emerged, and Hadoop is one of the best examples; it can be used to process these streams of medical data. However, without an efficient system design and architecture, the performance gains will not be significant or valuable for medical managers. In this paper, we provide a short review of the literature on the research issues of traditional data warehouses, and we present some important Hadoop-based data warehouses. In addition, a Hadoop-based architecture and a conceptual data model for designing a medical Big Data warehouse are given. In our case study, we provide implementation details of a big data warehouse based on the proposed architecture and data model on the Apache Hadoop platform, to ensure an optimal allocation of health resources.
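
    On the Hadoop stack such a warehouse builds on, a rollup query might look like the PySpark sketch below. The file path, column names, and the admissions-per-region measure are hypothetical stand-ins, not the paper's actual schema.

```python
# A warehouse-style aggregation over a (hypothetical) admissions fact table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medical-dw-sketch").getOrCreate()

# Hypothetical path and columns: region, department, length_of_stay.
admissions = spark.read.parquet("hdfs:///warehouse/admissions")

# Roll up the fact table by two dimensions, as a resource-distribution report.
report = (admissions
          .groupBy("region", "department")
          .agg(F.count("*").alias("admissions"),
               F.avg("length_of_stay").alias("avg_los")))
report.orderBy(F.desc("admissions")).show()
```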

  9. [Contemplation on the application of big data in clinical medicine].

    Science.gov (United States)

    Lian, Lei

    2015-01-01

    Medicine is another area where big data is being used. The link between clinical treatment and outcome is the key step when applying big data in medicine. In the era of big data, it is critical to collect complete outcome data. Patient follow-up, comprehensive integration of data resources, quality control, and standardized data management are the predominant approaches to avoiding missing data and data islands. Therefore, the establishment of a systematic patient follow-up protocol and a prospective data management strategy are important aspects of big data in medicine.

  10. Big data and biomedical informatics: a challenging opportunity.

    Science.gov (United States)

    Bellazzi, R

    2014-05-22

    Big data are receiving increasing attention in biomedicine and healthcare. It is therefore important to understand why big data are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery. Therefore, it is first necessary to deeply understand the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and their meaning in practice. Then, it is mandatory to understand where big data are present, and where they can be beneficially collected. There are research fields, such as translational bioinformatics, which need to rely on big data technologies to withstand the shock wave of data that is generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from the exploitation of the large amounts of data that are nowadays available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications in terms of reproducibility of research studies and management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and concept drift machine learning algorithms, which will not only contribute to big data research, but may be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions or over-enthusiasm, to fully exploit the available technologies, and to improve data processing and data management regulations.
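
    Of the tools mentioned, concept-drift handling is the least standardized; one minimal form is a rolling error monitor that flags when a deployed model's mistake rate climbs above a threshold, signalling retraining. The window size, threshold, and toy stream below are arbitrary assumptions, not a method from the paper.

```python
# A minimal rolling-error concept-drift monitor (parameters hypothetical).
from collections import deque

class DriftMonitor:
    def __init__(self, window=200, threshold=0.15):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def update(self, predicted, actual):
        self.errors.append(predicted != actual)
        rate = sum(self.errors) / len(self.errors)
        # Only trust the rate once the window has filled up.
        return len(self.errors) == self.errors.maxlen and rate > self.threshold

monitor = DriftMonitor()
# In a live system each (predicted, actual) pair would come from the stream.
for predicted, actual in [(1, 1)] * 300 + [(1, 0)] * 60:
    if monitor.update(predicted, actual):
        print("drift suspected, consider retraining")
        break
```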

  11. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.

  12. HARNESSING BIG DATA VOLUMES

    Directory of Open Access Journals (Sweden)

    Bogdan DINU

    2014-04-01

    Big Data can revolutionize humanity. Hidden within the huge amounts and variety of the data we are creating, we may find information, facts, social insights, and benchmarks that were once virtually impossible to find or simply did not exist. Large volumes of data allow organizations to tap in real time the full potential of all the internal or external information they possess. Big data calls for quick decisions and innovative ways to assist customers and society as a whole. Big data platforms and product portfolios will help customers harness the full value of big data volumes. This paper deals with technical and technological issues related to handling big data volumes in the Big Data environment.

  13. A peek into the future of radiology using big data applications

    Directory of Open Access Journals (Sweden)

    Amit T Kharat

    2017-01-01

    Big data is the extremely large amount of data available in the radiology department. Big data is identified by four Vs – Volume, Velocity, Variety, and Veracity. By applying different algorithmic tools and converting raw data into transformed data in such large datasets, there is the possibility of understanding and using radiology data to gain new knowledge and insights. Big data analytics consists of 6Cs – Connection, Cloud, Cyber, Content, Community, and Customization. The global technological prowess and per-capita capacity to save digital information has roughly doubled every 40 months since the 1980s. By using big data, the planning and implementation of radiological procedures in radiology departments can be given a great boost. Potential future applications of big data include scheduling of scans, creating patient-specific personalized scanning protocols, radiologist decision support, emergency reporting, virtual quality assurance for the radiologist, etc. Big data applications can be targeted at images by supporting the analytic process. Screening software tools designed on big data can be used to highlight a region of interest, such as subtle changes in parenchymal density, a solitary pulmonary nodule, or focal hepatic lesions, by plotting its multidimensional anatomy. Following this, we can run more complex applications such as three-dimensional multiplanar reconstructions (MPR), volumetric rendering (VR), and curved planar reconstruction, which consume higher system resources, on targeted data subsets rather than querying the complete cross-sectional imaging dataset. This pre-emptive selection of the dataset can substantially reduce system requirements such as system memory and server load, and provide prompt results. However, a word of caution: big data should not become “dump data” due to inadequate and poor analysis and non-structured, improperly stored data. In the near future, big data can ring in the
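
    The targeting idea (a cheap screening pass over the whole stack, with the expensive reconstruction confined to the flagged subset) can be sketched with plain NumPy. The synthetic "CT" values, lesion placement, and threshold below are illustrative assumptions, not clinical parameters.

```python
# Cheap screening pass over a synthetic volume, then subset the costly step.
import numpy as np

rng = np.random.default_rng(1)
volume = rng.normal(40.0, 5.0, size=(64, 256, 256))   # background parenchyma
volume[30:34, 100:130, 120:150] += 35.0               # synthetic dense lesion

# Screening: flag slices containing any voxel well above baseline density.
mask = volume > 70.0
flagged = np.flatnonzero(mask.any(axis=(1, 2)))       # slice indices of interest

# The costly analysis (MPR/VR stand-in) now touches only the flagged slices.
roi = volume[flagged.min():flagged.max() + 1]
share = roi.nbytes / volume.nbytes
print(f"slices {flagged.min()}..{flagged.max()} selected ({share:.1%} of the data)")
```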

  14. A peek into the future of radiology using big data applications.

    Science.gov (United States)

    Kharat, Amit T; Singhal, Shubham

    2017-01-01

    Big data is extremely large amount of data which is available in the radiology department. Big data is identified by four Vs - Volume, Velocity, Variety, and Veracity. By applying different algorithmic tools and converting raw data to transformed data in such large datasets, there is a possibility of understanding and using radiology data for gaining new knowledge and insights. Big data analytics consists of 6Cs - Connection, Cloud, Cyber, Content, Community, and Customization. The global technological prowess and per-capita capacity to save digital information has roughly doubled every 40 months since the 1980s. By using big data, the planning and implementation of radiological procedures in radiology departments can be given a great boost. Potential applications of big data in the future are scheduling of scans, creating patient-specific personalized scanning protocols, radiologist decision support, emergency reporting, virtual quality assurance for the radiologist, etc. Targeted use of big data applications can be done for images by supporting the analytic process. Screening software tools designed on big data can be used to highlight a region of interest, such as subtle changes in parenchymal density, solitary pulmonary nodule, or focal hepatic lesions, by plotting its multidimensional anatomy. Following this, we can run more complex applications such as three-dimensional multiplanar reconstructions (MPR), volumetric rendering (VR), and curved planar reconstruction, which consume higher system resources on targeted data subsets rather than querying the complete cross-sectional imaging dataset. This pre-emptive selection of dataset can substantially reduce the system requirements such as system memory, server load and provide prompt results. However, a word of caution: big data should not become "dump data" due to inadequate and poor analysis and non-structured, improperly stored data. In the near future, big data can ring in the era of personalized and

  16. The Study of “big data” to support internal business strategists

    Science.gov (United States)

    Ge, Mei

    2018-01-01

    How is big data different from previous data analysis systems? The primary purpose behind the traditional small-data analytics that all managers are more or less familiar with is to support internal business strategies. But big data also offers a promising new dimension: discovering new opportunities to offer customers high-value products and services. This study introduces some of the strategies that big data supports. Business decisions using big data can also involve several areas of analytics, including customer satisfaction, customer journeys, supply chains, risk management, competitive intelligence, pricing, discovery and experimentation, and facilitating big data discovery.

  17. Big bang and big crunch in matrix string theory

    International Nuclear Information System (INIS)

    Bedford, J.; Ward, J.; Papageorgakis, C.; Rodriguez-Gomez, D.

    2007-01-01

    Following the holographic description of linear dilaton null cosmologies with a big bang in terms of matrix string theory put forward by Craps, Sethi, and Verlinde, we propose an extended background describing a universe including both big bang and big crunch singularities. This belongs to a class of exact string backgrounds and is perturbative in the string coupling far away from the singularities, both of which can be resolved using matrix string theory. We provide a simple theory capable of describing the complete evolution of this closed universe

  18. Big data a primer

    CERN Document Server

    Bhuyan, Prachet; Chenthati, Deepak

    2015-01-01

    This book is a collection of chapters written by experts on various aspects of big data. The book aims to explain what big data is and how it is stored and used. The book starts from  the fundamentals and builds up from there. It is intended to serve as a review of the state-of-the-practice in the field of big data handling. The traditional framework of relational databases can no longer provide appropriate solutions for handling big data and making it available and useful to users scattered around the globe. The study of big data covers a wide range of issues including management of heterogeneous data, big data frameworks, change management, finding patterns in data usage and evolution, data as a service, service-generated data, service management, privacy and security. All of these aspects are touched upon in this book. It also discusses big data applications in different domains. The book will prove useful to students, researchers, and practicing database and networking engineers.

  19. Toward a manifesto for the 'public understanding of big data'.

    Science.gov (United States)

    Michael, Mike; Lupton, Deborah

    2016-01-01

    In this article, we sketch a 'manifesto' for the 'public understanding of big data'. On the one hand, this entails such public understanding of science and public engagement with science and technology-tinged questions as follows: How, when and where are people exposed to, or do they engage with, big data? Who are regarded as big data's trustworthy sources, or credible commentators and critics? What are the mechanisms by which big data systems are opened to public scrutiny? On the other hand, big data generate many challenges for public understanding of science and public engagement with science and technology: How do we address publics that are simultaneously the informant, the informed and the information of big data? What counts as understanding of, or engagement with, big data, when big data themselves are multiplying, fluid and recursive? As part of our manifesto, we propose a range of empirical, conceptual and methodological exhortations. We also provide Appendix 1 that outlines three novel methods for addressing some of the issues raised in the article. © The Author(s) 2015.

  20. Automated Big Traffic Analytics for Cyber Security

    OpenAIRE

    Miao, Yuantian; Ruan, Zichan; Pan, Lei; Wang, Yu; Zhang, Jun; Xiang, Yang

    2018-01-01

    Network traffic analytics technology is a cornerstone for cyber security systems. We demonstrate its use through three popular and contemporary cyber security applications in intrusion detection, malware analysis and botnet detection. However, automated traffic analytics faces the challenges raised by big traffic data. In terms of big data's three characteristics (volume, variety and velocity), we review three state-of-the-art techniques to mitigate the key challenges including real-time tr...

  1. Embracing Big Data in Complex Educational Systems: The Learning Analytics Imperative and the Policy Challenge

    Science.gov (United States)

    Macfadyen, Leah P.; Dawson, Shane; Pardo, Abelardo; Gaševic, Dragan

    2014-01-01

    In the new era of big educational data, learning analytics (LA) offer the possibility of implementing real-time assessment and feedback systems and processes at scale that are focused on improvement of learning, development of self-regulated learning skills, and student success. However, to realize this promise, the necessary shifts in the…

  2. Livermore Big Trees Park Soil Survey

    International Nuclear Information System (INIS)

    McConachie, W.A.; Failor, R.A.

    1995-01-01

    Lawrence Livermore National Laboratory (LLNL) will sample and analyze soil in the Big Trees Park area in Livermore, California, to determine if the initial level of plutonium (Pu) in a soil sample taken by the U.S. Environmental Protection Agency (EPA) in September 1993 can be confirmed. Nineteen samples will be collected and analyzed: 4 in the area where the initial EPA sample was taken, 2 in the nearby Arroyo Seco, 12 in scattered uncovered soil areas in the park and nearby school, and 1 from the sandbox of a nearby apartment complex. Two quality control (QC) samples (field duplicates of the preceding samples) will also be collected and analyzed. This document briefly describes the purpose behind the sampling, the sampling rationale, and the methodology.

  3. Big Data, Big Responsibility! Building best-practice privacy strategies into a large-scale neuroinformatics platform

    Directory of Open Access Journals (Sweden)

    Christina Popovich

    2017-04-01

    OBI’s rigorous approach to data sharing in the field of neuroscience maintains the accessibility of research data for big discoveries without compromising patient privacy and security. We believe that Brain-CODE is a powerful and advantageous tool, moving neuroscience research from independent silos to an integrative, systems-level approach to improving patient health. OBI’s vision of improved brain health for patients living with neurological disorders, paired with Brain-CODE’s best-practice strategies for protecting the privacy of patient data, offers a novel and innovative approach to “big data” initiatives aimed at improving public health and society worldwide.

  4. Microsoft big data solutions

    CERN Document Server

    Jorgensen, Adam; Welch, John; Clark, Dan; Price, Christopher; Mitchell, Brian

    2014-01-01

    Tap the power of Big Data with Microsoft technologies Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies. Best of all,

  5. Transforming Healthcare Delivery: Integrating Dynamic Simulation Modelling and Big Data in Health Economics and Outcomes Research.

    Science.gov (United States)

    Marshall, Deborah A; Burgos-Liz, Lina; Pasupathy, Kalyan S; Padula, William V; IJzerman, Maarten J; Wong, Peter K; Higashi, Mitchell K; Engbers, Jordan; Wiebe, Samuel; Crown, William; Osgood, Nathaniel D

    2016-02-01

    In the era of the Information Age and personalized medicine, healthcare delivery systems need to be efficient and patient-centred. The health system must be responsive to individual patient choices and preferences about their care, while considering the system consequences. While dynamic simulation modelling (DSM) and big data share characteristics, they present distinct and complementary value in healthcare. Big data and DSM are synergistic: big data offer support to enhance the application of dynamic models, but DSM can also greatly enhance the value conferred by big data. Big data can inform patient-centred care with its high velocity, volume, and variety (the three Vs) over traditional data analytics; however, big data are not sufficient to extract meaningful insights to inform approaches to improve healthcare delivery. DSM can serve as a natural bridge between the wealth of evidence offered by big data and informed decision making as a means of faster, deeper, more consistent learning from that evidence. We discuss the synergies between big data and DSM, practical considerations and challenges, and how integrating big data and DSM can be useful to decision makers to address complex, systemic health economics and outcomes questions and to transform healthcare delivery.

  6. Big Data Analytics in Healthcare.

    Science.gov (United States)

    Belle, Ashwin; Thiagarajan, Raghuram; Soroushmehr, S M Reza; Navidi, Fatemeh; Beard, Daniel A; Najarian, Kayvan

    2015-01-01

    The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has been recently applied towards aiding the process of care delivery and disease exploration. However, the adoption rate and research development in this space is still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image, signal, and genomics based analytics. Recent research which targets utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field which have the ability to provide meaningful impact on healthcare delivery are also examined.

  7. Summary big data

    CERN Document Server

    2014-01-01

    This work offers a summary of the book "Big Data: A Revolution That Will Transform How We Live, Work, and Think" by Viktor Mayer-Schönberger and Kenneth Cukier. The summary explains that big data is where we use huge quantities of data to make better predictions, based on the fact that we identify patterns in the data rather than trying to understand the underlying causes in more detail. This summary highlights that big data will be a source of new economic value and innovation in the future. Moreover, it shows that it will

  8. Research in Big Data Warehousing using Hadoop

    Directory of Open Access Journals (Sweden)

    Abderrazak Sebaa

    2017-05-01

    Full Text Available Traditional data warehouses have played a key role in decision support systems until the recent past. However, the rapid growth of data generation by current applications requires new data warehousing systems: the volume and format of collected datasets, data source variety, integration of unstructured data, and powerful analytical processing. In the age of Big Data, it is important to follow this pace and adapt existing warehouse systems to overcome the new issues and challenges. In this paper, we focus on data warehousing over big data. We discuss the limitations of traditional data warehouses, present alternative technologies, and outline related future work for data warehousing.

  9. Stream sediment sampling and analysis. Final report

    International Nuclear Information System (INIS)

    Means, J.L.; Voris, P.V.; Headington, G.L.

    1986-04-01

    The objectives were to sample and analyze sediments from upstream and downstream locations (relative to the Goodyear Atomic plant site) of three streams for selected pollutants. The three streams sampled were the Scioto River, Big Beaver Creek, and Big Run Creek. Sediment samples were analyzed for EPA's 129 priority pollutants (Clean Water Act) as well as isotopic uranium (²³⁴U, ²³⁵U, and ²³⁸U) and technetium-99.

  10. "Too big to fail" or "Too non-traditional to fail"?: The determinants of banks' systemic importance

    OpenAIRE

    Moore, Kyle; Zhou, Chen

    2013-01-01

    This paper empirically analyzes the determinants of banks' systemic importance. In constructing a measure of the systemic importance of financial institutions we find that size is a leading determinant. This confirms the usual "Too big to fail" argument. Nevertheless, banks with size above a sufficiently high level have equal systemic importance. In addition to size, we find that the extent to which banks engage in non-traditional banking activities is also positively related to ...

  11. Big data and visual analytics in anaesthesia and health care.

    Science.gov (United States)

    Simpao, A F; Ahumada, L M; Rehman, M A

    2015-09-01

    Advances in computer technology, patient monitoring systems, and electronic health record systems have enabled rapid accumulation of patient data in electronic form (i.e. big data). Organizations such as the Anesthesia Quality Institute and Multicenter Perioperative Outcomes Group have spearheaded large-scale efforts to collect anaesthesia big data for outcomes research and quality improvement. Analytics, the systematic use of data combined with quantitative and qualitative analysis to make decisions, can be applied to big data for quality and performance improvements, such as predictive risk assessment, clinical decision support, and resource management. Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces, and it can facilitate performance of cognitive activities involving big data. Ongoing integration of big data and analytics within anaesthesia and health care will increase demand for anaesthesia professionals who are well versed in both the medical and the information sciences. © The Author 2015. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. Notes on the Diet of Reproductively Active Male Rafinesque's Big Eared Bats

    Energy Technology Data Exchange (ETDEWEB)

    Menzel, M.A.; Carter, T.C.; Menzel, J.M.; Edwards, J.W.; Ford, W.M.

    2002-01-01

    The diet of five reproductively active male Rafinesque's big-eared bats from the Upper Coastal Plain of South Carolina was examined through the use of fecal samples during August and September 1999. Diets of these individuals in upland pine stands were similar to diets of Rafinesque's big-eared bats in bottomland and upland hardwood habitats. Although the fecal samples contained three insect orders, the diet consisted primarily of lepidopterans.

  13. Big Data and Biomedical Informatics: A Challenging Opportunity

    Science.gov (United States)

    2014-01-01

    Summary Big data are receiving increasing attention in biomedicine and healthcare. It is therefore important to understand the reason why big data are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler to carry out unprecedented research studies and to implement new models of healthcare delivery. Therefore, it is first necessary to deeply understand the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and their meaning in practice. Then, it is mandatory to understand where big data are present, and where they can be beneficially collected. There are research fields, such as translational bioinformatics, which need to rely on big data technologies to withstand the shock wave of data that is generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from the exploitation of the large amounts of data that are nowadays available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications in terms of reproducibility of research studies and management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and concept drift machine learning algorithms, which will not only contribute to big data research, but may be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions or over-enthusiasm, to fully exploit the available technologies, and to improve data processing and data management regulations. PMID:24853034

  14. Big Data - Smart Health Strategies

    Science.gov (United States)

    2014-01-01

    Summary Objectives To select best papers published in 2013 in the field of big data and smart health strategies, and summarize outstanding research efforts. Methods A systematic search was performed using two major bibliographic databases for relevant journal papers. The references obtained were reviewed in a two-stage process, starting with a blinded review performed by the two section editors, and followed by a peer review process operated by external reviewers recognized as experts in the field. Results The complete review process selected four best papers, illustrating various aspects of the special theme, among them: (a) using large volumes of unstructured data and, specifically, clinical notes from Electronic Health Records (EHRs) for pharmacovigilance; (b) knowledge discovery via querying large volumes of complex (both structured and unstructured) biological data using big data technologies and relevant tools; (c) methodologies for applying cloud computing and big data technologies in the field of genomics, and (d) system architectures enabling high-performance access to and processing of large datasets extracted from EHRs. Conclusions The potential of big data in biomedicine has been pinpointed in various viewpoint papers and editorials. The review of current scientific literature illustrated a variety of interesting methods and applications in the field, but still the promises exceed the current outcomes. As we are getting closer towards a solid foundation with respect to common understanding of relevant concepts and technical aspects, and the use of standardized technologies and tools, we can anticipate to reach the potential that big data offer for personalized medicine and smart health strategies in the near future. PMID:25123721

  15. Storage and Database Management for Big Data

    Science.gov (United States)

    2015-07-27

    Only fragments of the report text are preserved in this record: it surveys enterprise big data platforms (interactive, on-demand, virtualized, Java-based) and cloud models that satisfy different problem classes; it notes that, under three-way replication, data loss can only occur if three drives fail prior to any one of the failures being corrected; that Hadoop is written in Java; and that popular database management systems include MySQL, PostgreSQL, and Oracle.
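
    The replication argument in those fragments admits a quick back-of-the-envelope check (assumed, purely illustrative numbers):

        # With 3-way replication, a block is lost only if all three replicas'
        # drives fail before any single failure is repaired (independence assumed).
        p = 1e-3                 # hypothetical per-repair-window drive failure probability
        loss_per_block = p ** 3  # all three replicas fail within one window
        print(f"{loss_per_block:.1e}")  # 1.0e-09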

  16. How Big Data, Comparative Effectiveness Research, and Rapid-Learning Health-Care Systems Can Transform Patient Care in Radiation Oncology.

    Science.gov (United States)

    Sanders, Jason C; Showalter, Timothy N

    2018-01-01

    Big data and comparative effectiveness research methodologies can be applied within the framework of a rapid-learning health-care system (RLHCS) to accelerate discovery and to help turn the dream of fully personalized medicine into a reality. We synthesize recent advances in genomics with trends in big data to provide a forward-looking perspective on the potential of new advances to usher in an era of personalized radiation therapy, with emphases on the power of RLHCS to accelerate discovery and the future of individualized radiation treatment planning.

  17. Clinical research of traditional Chinese medicine in big data era.

    Science.gov (United States)

    Zhang, Junhua; Zhang, Boli

    2014-09-01

    With the advent of the big data era, our thinking, technology and methodology are being transformed. Data-intensive scientific discovery based on big data, named "The Fourth Paradigm," has become a new paradigm of scientific research. Along with the development and application of Internet information technology in the field of healthcare, individual health records, clinical data on diagnosis and treatment, and genomic data have accumulated dramatically, generating big data in the medical field for clinical research and assessment. With the support of big data, the defects and weaknesses in the methodology of conventional, sampling-based clinical evaluation may be overcome. Our research target shifts from "causality inference" to "correlativity analysis." This not only facilitates the evaluation of individualized treatment, disease prediction, prevention and prognosis, but is also suitable for the practice of preventive healthcare and symptom pattern differentiation for treatment in terms of traditional Chinese medicine (TCM), and for the post-marketing evaluation of Chinese patent medicines. To conduct clinical studies involving big data in the TCM domain, top-level design is needed and should be performed in an orderly manner. Fundamental construction and innovation studies should be strengthened in the areas of data platform creation, data analysis technology, and the fostering and training of big data professionals.

  18. Big Data and surveillance, part 1: Definitions and discussions concerning Big Data

    NARCIS (Netherlands)

    Timan, Tjerk

    2016-01-01

    Following a (fairly short) lecture on surveillance and Big Data, I was asked to go somewhat deeper into the theme, the definitions, and the various questions related to big data. In this first part I will try to set out a few things concerning Big Data theory and

  19. Big data analytics to improve cardiovascular care: promise and challenges.

    Science.gov (United States)

    Rumsfeld, John S; Joynt, Karen E; Maddox, Thomas M

    2016-06-01

    The potential for big data analytics to improve cardiovascular quality of care and patient outcomes is tremendous. However, the application of big data in health care is at a nascent stage, and the evidence to date demonstrating that big data analytics will improve care and outcomes is scant. This Review provides an overview of the data sources and methods that comprise big data analytics, and describes eight areas of application of big data analytics to improve cardiovascular care, including predictive modelling for risk and resource use, population management, drug and medical device safety surveillance, disease and treatment heterogeneity, precision medicine and clinical decision support, quality of care and performance measurement, and public health and research applications. We also delineate the important challenges for big data applications in cardiovascular care, including the need for evidence of effectiveness and safety, the methodological issues such as data quality and validation, and the critical importance of clinical integration and proof of clinical utility. If big data analytics are shown to improve quality of care and patient outcomes, and can be successfully implemented in cardiovascular practice, big data will fulfil its potential as an important component of a learning health-care system.

  20. Designing Cloud Infrastructure for Big Data in E-government

    Directory of Open Access Journals (Sweden)

    Jelena Šuh

    2015-03-01

    Full Text Available The development of new information services and technologies, especially in the domains of mobile communications, the Internet of things, and social media, has led to the appearance of large quantities of unstructured data. Pervasive computing also affects e-government systems, where big data emerges and cannot be processed and analyzed in a traditional manner due to its complexity, heterogeneity and size. The subject of this paper is the design of a cloud infrastructure for big data storage and processing in e-government. The goal is to analyze the potential of cloud computing for big data infrastructure, and to propose a model for effective storing, processing and analyzing of big data in e-government. The paper provides an overview of current relevant concepts related to cloud infrastructure design that should provide support for big data. The second part of the paper gives a model of the cloud infrastructure based on the concepts of software defined networks and multi-tenancy. The final goal is to support projects in the field of big data in e-government.

  1. Characterizing Big Data Management

    OpenAIRE

    Rogério Rossi; Kechi Hirama

    2015-01-01

    Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: technology, people, and processes.

  2. Next generation hyper-scale software and hardware systems for big data analytics

    CERN Multimedia

    CERN. Geneva

    2013-01-01

    Building on foundational technologies such as many-core systems, non-volatile memories and photonic interconnects, we describe some current technologies and future research to create real-time, big data analytics, IT infrastructure. We will also briefly describe some of our biologically-inspired software and hardware architecture for creating radically new hyper-scale cognitive computing systems. About the speaker: Rich Friedrich is the director of Strategic Innovation and Research Services (SIRS) at HP Labs. In this strategic role, he is responsible for research investments in nano-technology, exascale computing, cyber security, information management, cloud computing, immersive interaction, sustainability, social computing and commercial digital printing. Rich's philosophy is to fuse strategy and inspiration to create compelling capabilities for next generation information devices, systems and services. Using essential insights gained from the metaphysics of innovation, he effectively leads ...

  3. The Big Fish Down Under: Examining Moderators of the "Big-Fish-Little-Pond" Effect for Australia's High Achievers

    Science.gov (United States)

    Seaton, Marjorie; Marsh, Herbert W.; Yeung, Alexander Seeshing; Craven, Rhonda

    2011-01-01

    Big-fish-little-pond effect (BFLPE) research has demonstrated that academic self-concept is negatively affected by attending high-ability schools. This article examines data from large, representative samples of 15-year-olds from each Australian state, based on the three Program for International Student Assessment (PISA) databases that focus on…

  4. Recent big flare

    International Nuclear Information System (INIS)

    Moriyama, Fumio; Miyazawa, Masahide; Yamaguchi, Yoshisuke

    1978-01-01

    The features of three big solar flares observed at Tokyo Observatory are described in this paper. The active region, McMath 14943, caused a big flare on September 16, 1977. The flare appeared on both sides of a long dark line which runs along the boundary of the magnetic field. Two-ribbon structure was seen. The electron density of the flare observed at Norikura Corona Observatory was 3 × 10¹²/cc. Several arc lines which connect both bright regions of different magnetic polarity were seen in the H-α monochrome image. The active region, McMath 15056, caused a big flare on December 10, 1977. At the beginning, several bright spots were observed in the region between the two main solar spots. Then the area and the brightness increased, and the bright spots became two ribbon-shaped bands. A solar flare was observed on April 8, 1978. At first, several bright spots were seen around the solar spot in the active region, McMath 15221. Then these bright spots developed into a large bright region. On both sides of a dark line along the magnetic neutral line, bright regions were generated. These developed into a two-ribbon flare. The time required for growth was more than one hour. A bright arc which connects the two ribbons was seen, and this arc may be a loop prominence system. (Kato, T.)

  5. Infectious Disease Surveillance in the Big Data Era

    DEFF Research Database (Denmark)

    Simonsen, Lone; Gog, Julia R.; Olson, Don

    2016-01-01

    ... flexible, and local tracking of infectious diseases, especially for emerging pathogens. In this opinion piece, we reflect on the long and distinguished history of disease surveillance and discuss recent developments related to use of big data. We start with a brief review of traditional systems relying ... of Google Flu Trends. We conclude by advocating for increased use of hybrid systems combining information from traditional surveillance and big data sources, which seems the most promising option moving forward. Throughout the article, we use influenza as an exemplar of an emerging and reemerging infection ...

  6. Big Data and Nursing: Implications for the Future.

    Science.gov (United States)

    Topaz, Maxim; Pruinelli, Lisiane

    2017-01-01

    Big data is becoming increasingly more prevalent and it affects the way nurses learn, practice, conduct research and develop policy. The discipline of nursing needs to maximize the benefits of big data to advance the vision of promoting human health and wellbeing. However, current practicing nurses, educators and nurse scientists often lack the required skills and competencies necessary for meaningful use of big data. Some of the key skills for further development include the ability to mine narrative and structured data for new care or outcome patterns, effective data visualization techniques, and further integration of nursing-sensitive data into artificial intelligence systems for better clinical decision support. We provide growth-path vision recommendations for big data competencies for practicing nurses, nurse educators, researchers, and policy makers to help prepare the next generation of nurses and improve patient outcomes through better-quality connected health.

  7. Introducing Big Data Concepts in an Introductory Technology Course

    Science.gov (United States)

    Frydenberg, Mark

    2015-01-01

    From their presence on social media sites to in-house application data files, the amount of data that companies, governments, individuals, and sensors generate is overwhelming. The growth of Big Data in both consumer and enterprise activities has caused educators to consider options for including Big Data in the Information Systems curriculum.…

  8. Towards an understanding of big data analytics as a weapon for competitive performance

    OpenAIRE

    Danielsen, Frank; Framnes, Vetle Augustin

    2017-01-01

    Master's thesis Information Systems IS501 - University of Agder 2017 Context: Big data has in recent years been an area of interest among innovative organizations and has started to become a major priority for organizations in general, either through their own big data departments or by purchasing big data analyses from suppliers. Big data analytics means more knowledge from more data sources and is prophesied by many to be a contributing source of big change in how organizations receive ...

  9. Foreword: Big Data and Its Application in Health Disparities Research.

    Science.gov (United States)

    Onukwugha, Eberechukwu; Duru, O Kenrik; Peprah, Emmanuel

    2017-01-01

    The articles presented in this special issue advance the conversation by describing the current efforts, findings and concerns related to Big Data and health disparities. They offer important recommendations and perspectives to consider when designing systems that can usefully leverage Big Data to reduce health disparities. We hope that ongoing Big Data efforts can build on these contributions to advance the conversation, address our embedded assumptions, and identify levers for action to reduce health care disparities.

  10. Big Data in der Cloud

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2014-01-01

    Technology assessment of big data, in particular cloud based big data services, for the Office for Technology Assessment at the German federal parliament (Bundestag).

  11. An analysis of cross-sectional differences in big and non-big public accounting firms' audit programs

    NARCIS (Netherlands)

    Blokdijk, J.H. (Hans); Drieenhuizen, F.; Stein, M.T.; Simunic, D.A.

    2006-01-01

    A significant body of prior research has shown that audits by the Big 5 (now Big 4) public accounting firms are quality differentiated relative to non-Big 5 audits. This result can be derived analytically by assuming that Big 5 and non-Big 5 firms face different loss functions for "audit failures"

  12. Big Data is invading big places as CERN

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Big Data technologies are becoming more popular with the constant growth of data generation in different fields such as social networks, the internet of things and laboratories like CERN. How is CERN making use of such technologies? How is machine learning applied at CERN with Big Data technologies? How much data do we move, and how is it analyzed? All these questions will be answered during the talk.

  13. The big bang

    International Nuclear Information System (INIS)

    Chown, Marcus.

    1987-01-01

    The paper concerns the 'Big Bang' theory of the creation of the Universe 15 thousand million years ago, and traces events which physicists predict occurred soon after the creation. The unified theory of the moment of creation, evidence of an expanding Universe, the X-boson (the particle produced very soon after the big bang, which vanished from the Universe one-hundredth of a second later), and the fate of the Universe are all discussed. (U.K.)

  14. Small Big Data Congress 2017

    NARCIS (Netherlands)

    Doorn, J.

    2017-01-01

    TNO, in collaboration with the Big Data Value Center, presents the fourth Small Big Data Congress! Our congress aims at providing an overview of practical and innovative applications based on big data. Do you want to know what is happening in applied research with big data? And what can already be

  15. Big data opportunities and challenges

    CERN Document Server

    2014-01-01

    This ebook aims to give practical guidance for all those who want to understand big data better and learn how to make the most of it. Topics range from big data analysis, mobile big data and managing unstructured data to technologies, governance and intellectual property and security issues surrounding big data.

  16. Big Data and Neuroimaging.

    Science.gov (United States)

    Webb-Vargas, Yenny; Chen, Shaojie; Fisher, Aaron; Mejia, Amanda; Xu, Yuting; Crainiceanu, Ciprian; Caffo, Brian; Lindquist, Martin A

    2017-12-01

    Big Data are of increasing importance in a variety of areas, especially in the biosciences. There is an emerging critical need for Big Data tools and methods, because of the potential impact of advancements in these areas. Importantly, statisticians and statistical thinking have a major role to play in creating meaningful progress in this arena. We would like to emphasize this point in this special issue, as it highlights both the dramatic need for statistical input for Big Data analysis and for a greater number of statisticians working on Big Data problems. We use the field of statistical neuroimaging to demonstrate these points. As such, this paper covers several applications and novel methodological developments of Big Data tools applied to neuroimaging data.

  17. Identify too big to fail banks and capital insurance: An equilibrium approach

    OpenAIRE

    Katerina Ivanov

    2017-01-01

    The objective of this paper is to develop a rational expectation equilibrium model of capital insurance to identify too big to fail banks. The main results of this model include (1) too big to fail banks can be identified explicitly by a systemic risk measure, the loss betas of all banks in the entire financial sector; (2) the too big to fail feature can be largely justified by a high level of loss beta; (3) the capital insurance proposal benefits market participants and reduces the systemic risk; ...

  18. Big Data; A Management Revolution : The emerging role of big data in businesses

    OpenAIRE

    Blasiak, Kevin

    2014-01-01

    Big data is a term that was coined in 2012 and has since then emerged as one of the top trends in business and technology. Big data is an agglomeration of different technologies resulting in data processing capabilities that have been unreached before. Big data is generally characterized by three factors: volume, velocity and variety. These three factors distinguish it from traditional data use. The possibilities for utilizing this technology are vast. Big data technology has touch points in different ...

  19. "Big Dee" upgrade of the Doublet III diagnostic data acquisition computer system

    International Nuclear Information System (INIS)

    Mcharg, B.B.

    1983-01-01

    The "Big Dee" upgrade of the Doublet III tokamak facility will begin operation in 1986 with an initial quantity of data expected to be 10 megabytes per shot and eventually attaining 20-25 megabytes per shot. This is in comparison to the 4-5 megabytes of data currently acquired. To handle this greater quantity of data and to serve physics needs for significantly improved between-shot processing of data will require a substantial upgrade of the existing data acquisition system. The key points of the philosophy that have been adopted for the upgraded system to handle the greater quantity of data are (1) preserve existing hardware; (2) preserve existing software; (3) configure the system in a modular fashion; and (4) distribute the data acquisition over multiple computers. The existing system using ModComp CLASSIC 16 bit minicomputers is capable of handling 5 megabytes of data per shot.

  20. Big Dee upgrade of the Doublet III diagnostic data acquisition computer system

    International Nuclear Information System (INIS)

    McHarg, B.B. Jr.

    1983-12-01

    The Big Dee upgrade of the Doublet III tokamak facility will begin operation in 1986 with an initial quantity of data expected to be 10 megabytes per shot and eventually attaining 20 to 25 megabytes per shot. This is in comparison to the 4 to 5 megabytes of data currently acquired. To handle this greater quantity of data and to serve physics needs for significantly improved between-shot processing of data will require a substantial upgrade of the existing data acquisition system. The key points of the philosophy that have been adopted for the upgraded system to handle the greater quantity of data are (1) preserve existing hardware; (2) preserve existing software; (3) configure the system in a modular fashion; and (4) distribute the data acquisition over multiple computers. The existing system using ModComp CLASSIC 16 bit minicomputers is capable of handling 5 megabytes of data per shot

  1. Big data analysis new algorithms for a new society

    CERN Document Server

    Stefanowski, Jerzy

    2016-01-01

    This edited volume is devoted to Big Data Analysis from a Machine Learning standpoint as presented by some of the most eminent researchers in this area. It demonstrates that Big Data Analysis opens up new research problems which were either never considered before, or were only considered within a limited range. In addition to providing methodological discussions on the principles of mining Big Data and the difference between traditional statistical data analysis and newer computing frameworks, this book presents recently developed algorithms affecting such areas as business, financial forecasting, human mobility, the Internet of Things, information networks, bioinformatics, medical systems and life science. It explores, through a number of specific examples, how the study of Big Data Analysis has evolved and how it has started and will most likely continue to affect society. While the benefits brought about by Big Data Analysis are underlined, the book also discusses some of the warnings that have been issued...

  2. A Brief Review on Leading Big Data Models

    Directory of Open Access Journals (Sweden)

    Sugam Sharma

    2014-11-01

    Full Text Available Today, science is passing through an era of transformation, where the inundation of data, dubbed the “data deluge,” is influencing the decision making process. Science is driven by the data and is being termed data science. In this internet age, the volume of data has grown up to petabytes, and this large, complex, structured or unstructured, and heterogeneous data in the form of “Big Data” has gained significant attention. The rapid pace of data growth through various disparate sources, especially social media such as Facebook, has seriously challenged the data analytic capabilities of traditional relational databases. The velocity of the expansion of the amount of data gives rise to a complete paradigm shift in how new-age data is processed. Confidence in the data engineering of existing data processing systems is gradually fading, whereas the capabilities of the new techniques for capturing, storing, visualizing, and analyzing data are evolving. In this review paper, we discuss some of the modern Big Data models that are leading contributors in the NoSQL era and claim to address Big Data challenges in reliable and efficient ways. Also, we take the potential of Big Data into consideration and try to reshape the original operational-oriented definition of “Big Science” (Furner, 2003) into a new data-driven definition and rephrase it as “The science that deals with Big Data is Big Science.”

  3. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  4. Meta-analysis of Big Five personality traits in autism spectrum disorder.

    Science.gov (United States)

    Lodi-Smith, Jennifer; Rodgers, Jonathan D; Cunningham, Sara A; Lopata, Christopher; Thomeer, Marcus L

    2018-04-01

    The present meta-analysis synthesizes the emerging literature on the relationship of Big Five personality traits to autism spectrum disorder. Studies were included if they (1) either (a) measured autism spectrum disorder characteristics using a metric that yielded a single score quantification of the magnitude of autism spectrum disorder characteristics and/or (b) studied individuals with an autism spectrum disorder diagnosis compared to individuals without an autism spectrum disorder diagnosis and (2) measured Big Five traits in the same sample or samples. Fourteen reviewed studies include both correlational analyses and group comparisons. Eighteen effect sizes per Big Five trait were used to calculate two overall effect sizes per trait. Meta-analytic effects were calculated using random effects models. Twelve effects (per trait) from nine studies reporting correlations yielded a negative association between each Big Five personality trait and autism spectrum disorder characteristics (Fisher's z ranged from -.21 (conscientiousness) to -.50 (extraversion)). Six group contrasts (per trait) from six studies comparing individuals diagnosed with autism spectrum disorder to neurotypical individuals were also substantial (Hedges' g ranged from -.88 (conscientiousness) to -1.42 (extraversion)). The potential impact of personality on important life outcomes and new directions for future research on personality in autism spectrum disorder are discussed in light of results.
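
    As a minimal illustration of the pooling machinery behind such estimates (a sketch with made-up correlations and sample sizes, using the common DerSimonian-Laird random-effects estimator; the authors' exact procedure may differ):

        import numpy as np

        def random_effects_fisher_z(rs, ns):
            # Pool correlations on the Fisher z scale with DerSimonian-Laird weights.
            rs, ns = np.asarray(rs, float), np.asarray(ns, float)
            z = np.arctanh(rs)        # Fisher's z transform of each correlation
            v = 1.0 / (ns - 3.0)      # approximate within-study variance of z
            w = 1.0 / v
            z_fixed = np.sum(w * z) / np.sum(w)
            q = np.sum(w * (z - z_fixed) ** 2)        # Cochran's Q heterogeneity
            c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
            tau2 = max(0.0, (q - (len(z) - 1)) / c)   # between-study variance
            w_star = 1.0 / (v + tau2)
            return np.tanh(np.sum(w_star * z) / np.sum(w_star))  # back to r

        # Hypothetical per-study correlations of one trait with ASD characteristics.
        print(random_effects_fisher_z([-0.45, -0.52, -0.38], [60, 85, 40]))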

  5. Really big numbers

    CERN Document Server

    Schwartz, Richard Evan

    2014-01-01

    In the American Mathematical Society's first-ever book for kids (and kids at heart), mathematician and author Richard Evan Schwartz leads math lovers of all ages on an innovative and strikingly illustrated journey through the infinite number system. By means of engaging, imaginative visuals and endearing narration, Schwartz manages the monumental task of presenting the complex concept of Big Numbers in fresh and relatable ways. The book begins with small, easily observable numbers before building up to truly gigantic ones, like a nonillion, a tredecillion, a googol, and even ones too huge for names! Any person, regardless of age, can benefit from reading this book. Readers will find themselves returning to its pages for a very long time, perpetually learning from and growing with the narrative as their knowledge deepens. Really Big Numbers is a wonderful enrichment for any math education program and is enthusiastically recommended to every teacher, parent and grandparent, student, child, or other individual i...

  6. Cryptography for Big Data Security

    Science.gov (United States)

    2015-07-13

    Book chapter for Big Data: Storage, Sharing, and Security (3S). Distribution A: Public Release. Ariel Hamlin, Nabil ... (contact: arkady@ll.mit.edu). Chapter 1, "Cryptography for Big Data Security", Section 1.1, Introduction: "With the amount ..."

  7. A Big Data Decision-making Mechanism for Food Supply Chain

    OpenAIRE

    Ji Guojun; Tan KimHua

    2017-01-01

    Many companies have captured and analyzed huge volumes of data to improve the decision mechanisms of their supply chains. This paper presents a big data harvest model that uses big data as input to make more informed decisions in the food supply chain. By introducing a Bayesian network method, the paper integrates sample data and finds cause-and-effect relationships among the data to predict market demand. A deduction graph model then translates food demand into processes and divides processes into tasks ...
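
    A toy sketch of the Bayesian-network idea (entirely made-up variables and probabilities, written in plain Python; this is not the paper's model):

        # Toy Bayesian network: Season and Promotion are parents of Demand.
        p_season = {"peak": 0.4, "off": 0.6}
        p_promo = {"yes": 0.3, "no": 0.7}
        # Conditional probability table: P(demand | season, promotion)
        cpt = {
            ("peak", "yes"): {"high": 0.8, "low": 0.2},
            ("peak", "no"):  {"high": 0.6, "low": 0.4},
            ("off",  "yes"): {"high": 0.4, "low": 0.6},
            ("off",  "no"):  {"high": 0.1, "low": 0.9},
        }

        def p_demand(level):
            # Marginalize over the parents: sum over s, p of P(s) P(p) P(level | s, p).
            return sum(p_season[s] * p_promo[p] * cpt[(s, p)][level]
                       for s in p_season for p in p_promo)

        print(p_demand("high"))  # ~0.378: prior predictive probability of high demand

    Estimating the tables from observed sales data, rather than setting them by hand, is what would turn this toy into the kind of demand predictor the abstract describes.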

  8. Data: Big and Small.

    Science.gov (United States)

    Jones-Schenk, Jan

    2017-02-01

    Big data is a big topic in all leadership circles. Leaders in professional development must develop an understanding of what data are available across the organization that can inform effective planning for forecasting. Collaborating with others to integrate data sets can increase the power of prediction. Big data alone is insufficient to make big decisions. Leaders must find ways to access small data and triangulate multiple types of data to ensure the best decision making. J Contin Educ Nurs. 2017;48(2):60-61. Copyright 2017, SLACK Incorporated.

  9. Big Data Revisited

    DEFF Research Database (Denmark)

    Kallinikos, Jannis; Constantiou, Ioanna

    2015-01-01

    We elaborate on key issues of our paper New games, new rules: big data and the changing context of strategy as a means of addressing some of the concerns raised by the paper’s commentators. We initially deal with the issue of social data and the role it plays in the current data revolution ... and the technological recording of facts. We further discuss the significance of the very mechanisms by which big data is produced as distinct from the very attributes of big data, often discussed in the literature. In the final section of the paper, we qualify the alleged importance of algorithms and claim ... that the structures of data capture and the architectures in which data generation is embedded are fundamental to the phenomenon of big data...

  10. Implications of Big Data for cell biology

    OpenAIRE

    Dolinski, Kara; Troyanskaya, Olga G.

    2015-01-01

    “Big Data” has surpassed “systems biology” and “omics” as the hottest buzzword in the biological sciences, but is there any substance behind the hype? Certainly, we have learned about various aspects of cell and molecular biology from the many individual high-throughput data sets that have been published in the past 15–20 years. These data, although useful as individual data sets, can provide much more knowledge when interrogated with Big Data approaches, such as applying integrative methods ...

  11. The Relationship between the Big-Five Model of Personality and Self-Regulated Learning Strategies

    Science.gov (United States)

    Bidjerano, Temi; Dai, David Yun

    2007-01-01

    The study examined the relationship between the big-five model of personality and the use of self-regulated learning strategies. Measures of self-regulated learning strategies and big-five personality traits were administered to a sample of undergraduate students. Results from canonical correlation analysis indicated an overlap between the…

  12. Small decisions with big impact on data analytics

    Directory of Open Access Journals (Sweden)

    Jana Diesner

    2015-11-01

    Full Text Available Big social data have enabled new opportunities for evaluating the applicability of social science theories that were formulated decades ago and were often based on small- to medium-sized samples. Big Data coupled with powerful computing has the potential to replace the statistical practice of sampling and estimating effects by measuring phenomena based on full populations. Preparing these data for analysis and conducting analytics involves a plethora of decisions, some of which are already embedded in previously collected data and built tools. These decisions refer to the recording, indexing and representation of data and the settings for analysis methods. While these choices can have tremendous impact on research outcomes, they are not often obvious, not considered or not being made explicit. Consequently, our awareness and understanding of the impact of these decisions on analysis results and derived implications are highly underdeveloped. This might be attributable to occasional high levels of over-confidence in computational solutions as well as the possible yet questionable assumption that Big Data can wash out minor data quality issues, among other reasons. This article provides examples for how to address this issue. It argues that checking, ensuring and validating the quality of big social data and related auxiliary material is a key ingredient for empowering users to gain reliable insights from their work. Scrutinizing data for accuracy issues, systematically fixing them and diligently documenting these processes can have another positive side effect: Closely interacting with the data, thereby forcing ourselves to understand their idiosyncrasies and patterns, can help us to move from being able to precisely model and formally describe effects in society to also understand and explain them.

  13. Priming the Pump for Big Data at Sentara Healthcare.

    Science.gov (United States)

    Kern, Howard P; Reagin, Michael J; Reese, Bertram S

    2016-01-01

    Today's healthcare organizations are facing significant demands with respect to managing population health, demonstrating value, and accepting risk for clinical outcomes across the continuum of care. The patient's environment outside the walls of the hospital and physician's office, and outside the electronic health record (EHR), has a substantial impact on clinical care outcomes. The use of big data is key to understanding factors that affect the patient's health status and enhancing clinicians' ability to anticipate how the patient will respond to various therapies. Big data is essential to delivering sustainable, high-quality, value-based healthcare, as well as to the success of new models of care such as clinically integrated networks (CINs) and accountable care organizations. Sentara Healthcare, based in Norfolk, Virginia, has been an early adopter of the technologies that have readied us for our big data journey: EHRs, telehealth-supported electronic intensive care units, and telehealth primary care support through MDLIVE. Although we would not say Sentara is at the cutting edge of the big data trend, it certainly is among the fast followers. Use of big data in healthcare is still at an early stage compared with other industries. Tools for data analytics are maturing, but traditional challenges such as heightened data security and limited human resources remain the primary focus for regional health systems to improve care and reduce costs. Sentara primarily makes actionable use of big data in our CIN, Sentara Quality Care Network, and at our health plan, Optima Health. Big data projects can be expensive, and justifying the expense organizationally has often been easier in times of crisis. We have developed an analytics strategic plan separate from but aligned with corporate system goals to ensure optimal investment and management of this essential asset.

  14. A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data

    Science.gov (United States)

    Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.

    2017-12-01

    Big Data is becoming a norm in geoscience domains. A platform that is capable of efficiently managing, accessing, analyzing, mining, and learning from big data to extract new information and knowledge is desired. This paper introduces our latest effort to develop such a platform, based on our past years' experience with cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer is a computing infrastructure with proper network, computer, and storage systems; b) the second layer is a cloud computing layer, based on virtualization, that provides on-demand computing services for the upper layers; c) the third layer consists of big data containers customized for dealing with different types of data and functionalities; d) the fourth layer is a big data presentation layer that supports the efficient management, access, analysis, mining, and learning of big geospatial data.

  15. A Hybrid Approach to Processing Big Data Graphs on Memory-Restricted Systems

    KAUST Repository

    Harshvardhan,

    2015-05-01

    With the advent of big data, processing large graphs quickly has become increasingly important. Most existing approaches either utilize in-memory processing techniques that can only process graphs that fit completely in RAM, or disk-based techniques that sacrifice performance. In this work, we propose a novel RAM-Disk hybrid approach to graph processing that can scale well from a single shared-memory node to large distributed-memory systems. It works by partitioning the graph into subgraphs that fit in RAM and uses a paging-like technique to load subgraphs. We show that without modifying the algorithms, this approach can scale from small memory-constrained systems (such as tablets) to large-scale distributed machines with 16,000+ cores.
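
    The paging idea lends itself to a compact sketch. The following is a minimal illustration only, assuming a hypothetical per-partition edge-list file layout and a user-supplied owner() function mapping each vertex to the partition holding its outgoing edges; the actual system's partitioning, scheduling, and distributed execution are far more sophisticated.

    ```python
    # Sketch: out-of-core BFS over a graph stored as per-partition edge lists on disk.
    # Hypothetical layout: partition_paths[pid] is a text file with one "src dst" edge
    # per line; owner(v) returns the partition id holding v's outgoing edges.
    from collections import defaultdict, deque

    def load_partition(path):
        """Page one subgraph (its adjacency lists) into RAM."""
        adj = defaultdict(list)
        with open(path) as f:
            for line in f:
                u, v = map(int, line.split())
                adj[u].append(v)
        return adj

    def out_of_core_bfs(partition_paths, owner, source):
        dist = {source: 0}
        frontier = {owner(source): deque([source])}   # per-partition frontiers
        while any(frontier.values()):
            # snapshot: partitions discovered below are handled in later rounds
            for pid in [p for p, q in frontier.items() if q]:
                adj = load_partition(partition_paths[pid])  # page subgraph in
                queue = frontier[pid]
                while queue:
                    u = queue.popleft()
                    for v in adj.get(u, ()):
                        if v not in dist:
                            dist[v] = dist[u] + 1
                            frontier.setdefault(owner(v), deque()).append(v)
                # adj goes out of scope here, freeing RAM before the next page-in
        return dist
    ```

    The point mirrors the abstract: the algorithm is written against whole-graph semantics, while only one RAM-sized subgraph is resident at a time.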

  16. Big Bang 5

    CERN Document Server

    Apolin, Martin

    2007-01-01

    Physics should be understandable and fun! That is why every chapter in Big Bang begins with a motivating overview and guiding questions, and then moves from the fundamentals to the applications, from the simple to the complicated. Throughout, the language remains simple, everyday-oriented and literary. Volume 5 RG covers the fundamentals (system of units, orders of magnitude) and mechanics (translation, rotation, force, conservation laws).

  17. Urbanising Big

    DEFF Research Database (Denmark)

    Ljungwall, Christer

    2013-01-01

    Development in China raises the question of how big a city can become, and at the same time be sustainable, writes Christer Ljungwall of the Swedish Agency for Growth Policy Analysis.

  18. Big bang nucleosynthesis

    International Nuclear Information System (INIS)

    Boyd, Richard N.

    2001-01-01

    The precision of measurements in modern cosmology has made huge strides in recent years, with measurements of the cosmic microwave background and the determination of the Hubble constant now rivaling the level of precision of the predictions of big bang nucleosynthesis. However, these results are not necessarily consistent with the predictions of the Standard Model of big bang nucleosynthesis. Reconciling these discrepancies may require extensions of the basic tenets of the model, and possibly of the reaction rates that determine the big bang abundances

  19. Image Mosaicking Approach for a Double-Camera System in the GaoFen2 Optical Remote Sensing Satellite Based on the Big Virtual Camera.

    Science.gov (United States)

    Cheng, Yufeng; Jin, Shuying; Wang, Mi; Zhu, Ying; Dong, Zhipeng

    2017-06-20

    The linear array push broom imaging mode is widely used for high resolution optical satellites (HROS). Using double-cameras attached by a high-rigidity support along with push broom imaging is one method to enlarge the field of view while ensuring high resolution. High accuracy image mosaicking is the key factor of the geometrical quality of complete stitched satellite imagery. This paper proposes a high accuracy image mosaicking approach based on the big virtual camera (BVC) in the double-camera system on the GaoFen2 optical remote sensing satellite (GF2). A big virtual camera can be built according to the rigorous imaging model of a single camera; then, each single image strip obtained by each TDI-CCD detector can be re-projected to the virtual detector of the big virtual camera coordinate system using forward-projection and backward-projection to obtain the corresponding single virtual image. After an on-orbit calibration and relative orientation, the complete final virtual image can be obtained by stitching the single virtual images together based on their coordinate information on the big virtual detector image plane. The paper subtly uses the concept of the big virtual camera to obtain a stitched image and the corresponding high accuracy rational function model (RFM) for concurrent post processing. Experiments verified that the proposed method can achieve seamless mosaicking while maintaining the geometric accuracy.
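
    The projection round-trip at the heart of the method can be outlined as follows. This is a rough sketch on invented interfaces: pixel_to_ground and ground_to_pixel stand in for the rigorous sensor models (orbit, attitude, interior orientation) calibrated for GF2, and nearest-neighbour resampling replaces whatever interpolation the authors actually use.

    ```python
    # Sketch of virtual-detector resampling with invented sensor-model interfaces.
    import numpy as np

    def reproject_to_virtual(real_image, real_model, virtual_model, out_shape):
        """Re-project one TDI-CCD strip onto the big virtual detector plane."""
        virtual_image = np.zeros(out_shape, dtype=real_image.dtype)
        for r in range(out_shape[0]):
            for c in range(out_shape[1]):
                # backward-projection: virtual pixel -> ground point (ellipsoid/DEM)
                ground = virtual_model.pixel_to_ground(r, c)
                # forward-projection: ground point -> pixel in the real camera
                rr, cc = real_model.ground_to_pixel(ground)
                if 0 <= rr < real_image.shape[0] and 0 <= cc < real_image.shape[1]:
                    virtual_image[r, c] = real_image[int(rr), int(cc)]  # nearest neighbour
        return virtual_image
    ```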

  20. A Survey of Scholarly Data: From Big Data Perspective

    DEFF Research Database (Denmark)

    Khan, Samiya; Liu, Xiufeng; Shakil, Kashish A.

    2017-01-01

    Recently, there has been a shifting focus of organizations and governments towards digitization of academic and technical documents, adding a new facet to the concept of digital libraries. The volume, variety and velocity of this generated data satisfies the big data definition, as a result of which this scholarly reserve is popularly referred to as big scholarly data. In order to facilitate data analytics for big scholarly data, architectures and services for the same need to be developed. The evolving nature of research problems has made them essentially interdisciplinary. As a result, there is a growing demand for scholarly applications like collaborator discovery, expert finding and research recommendation systems, in addition to several others. This research paper investigates the current trends and identifies the existing challenges in development of a big scholarly data platform...

  1. Big Data: More than Just Big and More than Just Data.

    Science.gov (United States)

    Spencer, Gregory A

    2017-01-01

    According to a report, 90 percent of the data in the world today were created in the past two years. This statistic is not surprising given the explosion of mobile phones and other devices that generate data, the Internet of Things (e.g., smart refrigerators), and metadata (data about data). While it might be a stretch to figure out how a healthcare organization can use data generated from an ice maker, data from a plethora of rich and useful sources, when combined with an organization's own data, can produce improved results. How can healthcare organizations leverage these rich and diverse data sources to improve patients' health and make their businesses more competitive? The authors of the two feature articles in this issue of Frontiers provide tangible examples of how their organizations are using big data to meaningfully improve healthcare. Sentara Healthcare and Carolinas HealthCare System both use big data in creative ways that differ because of different business situations, yet are also similar in certain respects.

  2. Real-time Medical Emergency Response System: Exploiting IoT and Big Data for Public Health.

    Science.gov (United States)

    Rathore, M Mazhar; Ahmad, Awais; Paul, Anand; Wan, Jiafu; Zhang, Daqiang

    2016-12-01

    Healthy people are important for any nation's development. Use of the Internet of Things (IoT)-based body area networks (BANs) is increasing for continuous monitoring and medical healthcare in order to perform real-time actions in case of emergencies. However, in the case of monitoring the health of all citizens or people in a country, the millions of sensors attached to human bodies generate a massive volume of heterogeneous data, called "Big Data." Processing Big Data and performing real-time actions in critical situations is a challenging task. Therefore, in order to address such issues, we propose a Real-time Medical Emergency Response System that involves IoT-based medical sensors deployed on the human body. Moreover, the proposed system consists of the data analysis building, called "Intelligent Building," depicted by the proposed layered architecture and implementation model, and it is responsible for analysis and decision-making. The data collected from millions of body-attached sensors is forwarded to Intelligent Building for processing and for performing necessary actions using various units such as collection, Hadoop Processing (HPU), and analysis and decision. The feasibility and efficiency of the proposed system are evaluated by implementing the system on Hadoop using an Ubuntu 14.04 LTS Core i5 machine. Various medical sensory datasets and real-time network traffic are considered for evaluating the efficiency of the system. The results show that the proposed system has the capability of efficiently processing WBAN sensory data from millions of users in order to perform real-time responses in case of emergencies.
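
    As a rough illustration of the collection/processing/decision split, the sketch below runs a toy MapReduce-style pass over body-sensor readings; the field names and thresholds are invented, and the real system executes this kind of logic on Hadoop over data from millions of users.

    ```python
    # Toy map/reduce pass over body-sensor readings; limits and fields are invented.
    from collections import defaultdict

    readings = [  # (user_id, sensor, value) as streamed from body-area networks
        ("u1", "heart_rate", 182), ("u2", "heart_rate", 74),
        ("u1", "spo2", 86),        ("u2", "spo2", 98),
    ]
    LIMITS = {"heart_rate": (40, 160), "spo2": (90, 100)}  # illustrative normal ranges

    def map_phase(stream):
        """Emit (user, 1) for every out-of-range reading (the 'map' step)."""
        for user, sensor, value in stream:
            lo, hi = LIMITS[sensor]
            if not lo <= value <= hi:
                yield user, 1

    def reduce_phase(pairs, alarm_at=2):
        """Count abnormal readings per user (the 'reduce' step)."""
        counts = defaultdict(int)
        for user, one in pairs:
            counts[user] += one
        return {u: n for u, n in counts.items() if n >= alarm_at}

    print(reduce_phase(map_phase(readings)))  # {'u1': 2} -> trigger emergency response
    ```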

  3. Modeling of the Nuclear Power Plant Life cycle for 'Big Data' Management System: A Systems Engineering Viewpoint

    Energy Technology Data Exchange (ETDEWEB)

    Ha, Bui Hoang; Khanh, Tran Quang Diep; Shakirah, Wan; Kahar, Wan Abdul; Jung, Jae Cheon [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)

    2012-10-15

    Together with the significant development of Internet and Web technologies, a rapid evolution of the 'Big Data' idea has been observed since the notion was first introduced in 1941 as an 'information explosion' (OED). Using the '3Vs' model proposed by Gartner, 'Big Data' can be defined as 'high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.' Big Data technologies and tools have been developed to address the way large quantities of data are stored, accessed and presented for manipulation or analysis. The idea also focuses on how users can easily access and extract the 'useful and right' data, information, or even knowledge from the 'Big Data'.

  4. The ethics of big data in big agriculture

    OpenAIRE

    Carbonell (Isabelle M.)

    2016-01-01

    This paper examines the ethics of big data in agriculture, focusing on the power asymmetry between farmers and large agribusinesses like Monsanto. Following the recent purchase of Climate Corp., Monsanto is currently the most prominent biotech agribusiness to buy into big data. With wireless sensors on tractors monitoring or dictating every decision a farmer makes, Monsanto can now aggregate large quantities of previously proprietary farming data, enabling a privileged position with unique in...

  5. Trait Emotional Intelligence and Personality: Gender-Invariant Linkages Across Different Measures of the Big Five.

    OpenAIRE

    Siegling, A. B.; Furnham, A.; Petrides, K. V.

    2015-01-01

    This study investigated if the linkages between trait emotional intelligence (trait EI) and the Five-Factor Model of personality were invariant between men and women. Five English-speaking samples (N = 307-685) of mostly undergraduate students each completed a different measure of the Big Five personality traits and either the full form or short form of the Trait Emotional Intelligence Questionnaire (TEIQue). Across samples, models predicting global TEIQue scores from the Big Five were invari...

  6. A Big Video Manifesto

    DEFF Research Database (Denmark)

    Mcilvenny, Paul Bruce; Davidsen, Jacob

    2017-01-01

    For the last few years, we have witnessed a hype about the potential results and insights that quantitative big data can bring to the social sciences. The wonder of big data has moved into education, traffic planning, and disease control with a promise of making things better with big numbers and beautiful visualisations. However, we also need to ask what the tools of big data can do both for the Humanities and for more interpretative approaches and methods. Thus, we prefer to explore how the power of computation, new sensor technologies and massive storage can also help with video-based qualitative...

  7. Big Data in Caenorhabditis elegans: quo vadis?

    Science.gov (United States)

    Hutter, Harald; Moerman, Donald

    2015-11-05

    A clear definition of what constitutes "Big Data" is difficult to identify, but we find it most useful to define Big Data as a data collection that is complete. By this criterion, researchers on Caenorhabditis elegans have a long history of collecting Big Data, since the organism was selected with the idea of obtaining a complete biological description and understanding of development. The complete wiring diagram of the nervous system, the complete cell lineage, and the complete genome sequence provide a framework to phrase and test hypotheses. Given this history, it might be surprising that the number of "complete" data sets for this organism is actually rather small--not because of lack of effort, but because most types of biological experiments are not currently amenable to complete large-scale data collection. Many are also not inherently limited, so that it becomes difficult to even define completeness. At present, we only have partial data on mutated genes and their phenotypes, gene expression, and protein-protein interaction--important data for many biological questions. Big Data can point toward unexpected correlations, and these unexpected correlations can lead to novel investigations; however, Big Data cannot establish causation. As a result, there is much excitement about Big Data, but there is also a discussion on just what Big Data contributes to solving a biological problem. Because of its relative simplicity, C. elegans is an ideal test bed to explore this issue and at the same time determine what is necessary to build a multicellular organism from a single cell. © 2015 Hutter and Moerman. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  8. Applications of Big Data in Education

    OpenAIRE

    Faisal Kalota

    2015-01-01

    Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners' needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in educa...

  9. The dominance of big pharma: power.

    Science.gov (United States)

    Edgar, Andrew

    2013-05-01

    The purpose of this paper is to provide a normative model for the assessment of the exercise of power by Big Pharma. By drawing on the work of Steven Lukes, it will be argued that while Big Pharma is overtly highly regulated, so that its power is indeed restricted in the interests of patients and the general public, the industry is still able to exercise what Lukes describes as a third dimension of power. This entails concealing the conflicts of interest and grievances that Big Pharma may have with the health care system, physicians and patients, crucially through rhetorical engagements with Patient Advocacy Groups that seek to shape public opinion, and also by marginalising certain groups, excluding them from debates over health care resource allocation. Three issues will be examined: the construction of a conception of the patient as expert patient or consumer; the phenomenon of disease mongering; the suppression or distortion of debates over resource allocation.

  10. Big Data Semantics

    NARCIS (Netherlands)

    Ceravolo, Paolo; Azzini, Antonia; Angelini, Marco; Catarci, Tiziana; Cudré-Mauroux, Philippe; Damiani, Ernesto; Mazak, Alexandra; van Keulen, Maurice; Jarrar, Mustafa; Santucci, Giuseppe; Sattler, Kai-Uwe; Scannapieco, Monica; Wimmer, Manuel; Wrembel, Robert; Zaraket, Fadi

    2018-01-01

    Big Data technology has discarded traditional data modeling approaches as no longer applicable to distributed data processing. It is, however, largely recognized that Big Data impose novel challenges in data and infrastructure management. Indeed, multiple components and procedures must be

  11. Big Data for Infectious Disease Surveillance and Modeling

    OpenAIRE

    Bansal, Shweta; Chowell, Gerardo; Simonsen, Lone; Vespignani, Alessandro; Viboud, Cécile

    2016-01-01

    We devote a special issue of the Journal of Infectious Diseases to review the recent advances of big data in strengthening disease surveillance, monitoring medical adverse events, informing transmission models, and tracking patient sentiments and mobility. We consider a broad definition of big data for public health, one encompassing patient information gathered from high-volume electronic health records and participatory surveillance systems, as well as mining of digital traces such as socia...

  12. The use of big data in transfusion medicine.

    Science.gov (United States)

    Pendry, K

    2015-06-01

    'Big data' refers to the huge quantities of digital information now available that describe much of human activity. The science of data management and analysis is rapidly developing to enable organisations to convert data into useful information and knowledge. Electronic health records and new developments in Pathology Informatics now support the collection of 'big laboratory and clinical data', and these digital innovations are now being applied to transfusion medicine. To use big data effectively, we must address concerns about confidentiality and the need for a change in culture and practice, remove barriers to adopting common operating systems and data standards and ensure the safe and secure storage of sensitive personal information. In the UK, the aim is to formulate a single set of data and standards for communicating test results and so enable pathology data to contribute to national datasets. In transfusion, big data has been used for benchmarking, detection of transfusion-related complications, determining patterns of blood use and definition of blood order schedules for surgery. More generally, rapidly available information can monitor compliance with key performance indicators for patient blood management and inventory management leading to better patient care and reduced use of blood. The challenges of enabling reliable systems and analysis of big data and securing funding in the restrictive financial climate are formidable, but not insurmountable. The promise is that digital information will soon improve the implementation of best practice in transfusion medicine and patient blood management globally. © 2015 British Blood Transfusion Society.

  13. Visualization of big data security: a case study on the KDD99 cup data set

    Directory of Open Access Journals (Sweden)

    Zichan Ruan

    2017-11-01

    Full Text Available Cyber security has been thrust into the limelight in the modern technological era because of an array of attacks often bypassing untrained intrusion detection systems (IDSs). Therefore, greater attention has been directed toward deciphering better methods for identifying attack types to train IDSs more effectively. Key cyber-attack insights exist in big data; however, an efficient approach is required to determine strong attack types to train IDSs to become more effective in key areas. Despite the rising growth in IDS research, there is a lack of studies involving big data visualization, which is key. The KDD99 data set has served as a strong benchmark since 1999; therefore, we utilized this data set in our experiment. In this study, we utilized a hash algorithm, a weight table, and a sampling method to deal with the inherent problems caused by analyzing big data: volume, variety, and velocity. By utilizing a visualization algorithm, we were able to gain insights into the KDD99 data set, with a clear identification of "normal" clusters and of distinct clusters of effective attacks.
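
    What combining a hash algorithm, a weight table, and sampling might look like is sketched below; the record fields, weights, and keep-policy are invented for illustration and do not reproduce the paper's algorithms.

    ```python
    # Sketch: hash-based bucketing plus weighted downsampling of connection records.
    # Rare attack classes get higher keep-weights so they survive the thinning.
    import hashlib
    import random

    random.seed(0)  # reproducible thinning
    WEIGHTS = {"normal": 0.01, "smurf": 0.05, "neptune": 0.05, "other": 1.0}

    def bucket(record, n_buckets=1000):
        """Stable hash of the flow key, so repeated flows land in the same bucket."""
        key = f"{record['src']}|{record['dst']}|{record['service']}".encode()
        return int(hashlib.md5(key).hexdigest(), 16) % n_buckets

    def sample(records):
        kept, seen = [], set()
        for rec in records:
            b = bucket(rec)
            w = WEIGHTS.get(rec["label"], WEIGHTS["other"])
            # always keep a bucket's first representative, then thin by label weight
            if b not in seen or random.random() < w:
                seen.add(b)
                kept.append(rec)
        return kept

    demo = [{"src": "10.0.0.1", "dst": "10.0.0.9", "service": "http", "label": "smurf"}] * 5
    print(len(sample(demo)))  # ~1: repeated flows collapse to their bucket representative
    ```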

  14. Big data in psychology: Introduction to the special issue.

    Science.gov (United States)

    Harlow, Lisa L; Oswald, Frederick L

    2016-12-01

    The introduction to this special issue on psychological research involving big data summarizes the highlights of 10 articles that address a number of important and inspiring perspectives, issues, and applications. Four common themes that emerge in the articles with respect to psychological research conducted in the area of big data are mentioned, including: (a) The benefits of collaboration across disciplines, such as those in the social sciences, applied statistics, and computer science. Doing so assists in grounding big data research in sound theory and practice, as well as in affording effective data retrieval and analysis. (b) Availability of large data sets on Facebook, Twitter, and other social media sites that provide a psychological window into the attitudes and behaviors of a broad spectrum of the population. (c) Identifying, addressing, and being sensitive to ethical considerations when analyzing large data sets gained from public or private sources. (d) The unavoidable necessity of validating predictive models in big data by applying a model developed on 1 dataset to a separate set of data or hold-out sample. Translational abstracts that summarize the articles in very clear and understandable terms are included in Appendix A, and a glossary of terms relevant to big data research discussed in the articles is presented in Appendix B. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
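
    Point (d), hold-out validation, is straightforward to make concrete; a minimal scikit-learn sketch on synthetic data (any model and dataset could stand in):

    ```python
    # Hold-out validation: develop on one split, report only on the held-out split.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)
    X_dev, X_hold, y_dev, y_hold = train_test_split(X, y, test_size=0.3, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
    print("development accuracy:", accuracy_score(y_dev, model.predict(X_dev)))
    print("hold-out accuracy:   ", accuracy_score(y_hold, model.predict(X_hold)))
    ```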

  15. Comparative validity of brief to medium-length Big Five and Big Six personality questionnaires

    NARCIS (Netherlands)

    Thalmayer, A.G.; Saucier, G.; Eigenhuis, A.

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five

  16. Military Simulation Big Data: Background, State of the Art, and Challenges

    Directory of Open Access Journals (Sweden)

    Xiao Song

    2015-01-01

    Full Text Available Big data technology has undergone rapid development and attained great success in the business field. Military simulation (MS) is another application domain producing massive datasets created by high-resolution models and large-scale simulations. It is used to study complicated problems such as weapon systems acquisition, combat analysis, and military training. This paper first reviewed several large-scale military simulations producing big data (MS big data) for a variety of uses and summarized the main characteristics of the resulting data. Then we looked at the technical details involving the generation, collection, processing, and analysis of MS big data. Two frameworks were also surveyed to trace the development of the underlying software platform. Finally, we identified some key challenges and proposed a framework as a basis for future work. This framework considers both simulation and big data management at the same time, based on layered and service-oriented architectures. The objective of this review is to help interested researchers learn the key points of MS big data and provide references for tackling the big data problem and performing further research.

  17. Databases in the documentation management for big industrial projects

    International Nuclear Information System (INIS)

    Cauchet, A.; Chevillard, F.; Parisot, Y.; Tirefort, C.

    1990-05-01

    The documentation management of a big industrial project involves a continuous update of information, both in the study and realization phases and in the operation phase. The organization of the technical documentation for big industrial projects requires complex information systems. The first part of this paper presents methods appropriate for the analysis of documentation management procedures; the second part presents the tools whose combination provides a documentation system for the user. The case of the documentation centres for the La Hague reprocessing plant is described.

  18. On the Power of Randomization in Big Data Analytics

    DEFF Research Database (Denmark)

    Pham, Ninh Dang

    We are experiencing a big data deluge, a result of not only the internetization and computerization of our society, but also the fast development of affordable and powerful data collection and storage devices. The explosively growing data, in both size and form, has posed a fundamental challenge for big data analytics: how to efficiently handle and analyze such big data in order to bridge the gap between data and information. In a wide range of application domains, data are represented as high-dimensional vectors in the Euclidean space in order to benefit from computationally advanced techniques from numerical linear algebra. The demands on the computational efficiency and scalability of such techniques have been growing, calling not only for novel platform system architectures but also for efficient and effective algorithms to address fast-paced big data needs. In the thesis we will tackle...

  19. Forget the hype or reality. Big data presents new opportunities in Earth Science.

    Science.gov (United States)

    Lee, T. J.

    2015-12-01

    Earth science is arguably one of the most mature science disciplines, constantly acquiring, curating, and utilizing a large volume of data with diverse variety. We dealt with big data before there was big data. For example, while developing the EOS program in the 1980s, the EOS data and information system (EOSDIS) was developed to manage the vast amount of data acquired by the EOS fleet of satellites. EOSDIS has been a shining example of modern science data systems for the past two decades. With the explosion of the internet, the usage of social media, and the provision of sensors everywhere, the big data era has brought new challenges. First, Google developed its search algorithm and a distributed data management system. The open source communities quickly followed up and developed the Hadoop file system to facilitate MapReduce workloads. The internet continues to generate tens of petabytes of data every day. There is a significant shortage of algorithms and knowledgeable manpower to mine the data. In response, the federal government developed big data programs that fund research and development projects and training programs to tackle these new challenges. Meanwhile, compared to the internet data explosion, the Earth science big data problem has become quite small. Nevertheless, the big data era presents an opportunity for Earth science to evolve. We have learned about MapReduce algorithms, in-memory data mining, machine learning, graph analysis, and semantic web technologies. How do we apply these new technologies to our discipline and bring the hype down to Earth? In this talk, I will discuss how we might apply some of the big data technologies to our discipline and solve many of our challenging problems. More importantly, I will propose a new Earth science data system architecture to enable new types of scientific inquiry.

  20. Computational red teaming risk analytics of big-data-to-decisions intelligent systems

    CERN Document Server

    Abbass, Hussein A

    2015-01-01

    Written to bridge the information needs of management and computational scientists, this book presents the first comprehensive treatment of Computational Red Teaming (CRT).  The author describes an analytics environment that blends human reasoning and computational modeling to design risk-aware and evidence-based smart decision making systems. He presents the Shadow CRT Machine, which shadows the operations of an actual system to think with decision makers, challenge threats, and design remedies. This is the first book to generalize red teaming (RT) outside the military and security domains and it offers coverage of RT principles, practical and ethical guidelines. The author utilizes Gilbert’s principles for introducing a science. Simplicity: where the book follows a special style to make it accessible to a wide range of  readers. Coherence:  where only necessary elements from experimentation, optimization, simulation, data mining, big data, cognitive information processing, and system thinking are blend...

  1. Anger and hostility from the perspective of the Big Five personality model.

    Science.gov (United States)

    Sanz, Jesús; García-Vera, María Paz; Magán, Inés

    2010-06-01

    This study was aimed at examining the relationships of the personality dimensions of the five-factor model or Big Five with trait anger and with two specific traits of hostility (mistrust and confrontational attitude), and identifying the similarities and differences between trait anger and hostility in the framework of the Big Five. In a sample of 353 male and female adults, the Big Five explained a significant percentage of individual differences in trait anger and hostility after controlling the effects due to the relationship between both constructs and content overlapping across scales. In addition, trait anger was primarily associated with neuroticism, whereas mistrust and confrontational attitude were principally related to low agreeableness. These findings are discussed in the context of the anger-hostility-aggression syndrome and the capability of the Big Five for organizing and clarifying related personality constructs.

  2. Nursing Knowledge: Big Data Science-Implications for Nurse Leaders.

    Science.gov (United States)

    Westra, Bonnie L; Clancy, Thomas R; Sensmeier, Joyce; Warren, Judith J; Weaver, Charlotte; Delaney, Connie W

    2015-01-01

    The integration of Big Data from electronic health records and other information systems within and across health care enterprises provides an opportunity to develop actionable predictive models that can increase the confidence of nursing leaders' decisions to improve patient outcomes and safety and control costs. As health care shifts to the community, mobile health applications add to the Big Data available. There is an evolving national action plan that includes nursing data in Big Data science, spearheaded by the University of Minnesota School of Nursing. For the past 3 years, diverse stakeholders from practice, industry, education, research, and professional organizations have collaborated through the "Nursing Knowledge: Big Data Science" conferences to create and act on recommendations for inclusion of nursing data, integrated with patient-generated, interprofessional, and contextual data. It is critical for nursing leaders to understand the value of Big Data science and the ways to standardize data and workflow processes to take advantage of newer cutting edge analytics to support analytic methods to control costs and improve patient quality and safety.

  3. Codimension-Two Big-Bang Bifurcation in a ZAD-Controlled Boost DC-DC Converter

    Science.gov (United States)

    Amador, A.; Casanova, S.; Granada, H. A.; Olivar, G.; Hurtado, J.

    In this paper, we study some nonlinear behaviors in a two-dimensional system defined by a Boost Converter controlled by CPWM (Centered Pulse-Width Modulation) and a ZAD (Zero Average Dynamics) strategy. The dynamics were analyzed using a discrete-time map, obtained by sampling the system at each switching cycle. The structure of the two-parameter space is characterized analytically. This makes it possible to prove the existence and stability of an infinite number of codimension-one curves that intersect at the same point in the two-parameter space. This phenomenon has been called a big-bang bifurcation.
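
    In outline, such sampled models take the following generic form; this is an illustrative reconstruction in standard notation, not the paper's exact equations.

    ```latex
    % Generic stroboscopic map for a ZAD-controlled converter (illustrative notation).
    \[
      x_{k+1} = P(x_k, d_k), \qquad
      d_k \ \text{chosen so that} \quad \frac{1}{T}\int_{kT}^{(k+1)T} s\bigl(x(t)\bigr)\,dt = 0,
    \]
    % x_k: converter state sampled at the start of switching period k;  T: period;
    % d_k \in [0,1]: duty cycle fixed by the ZAD condition;  s(.): switching surface
    % whose cycle average is driven to zero. Fixed points of P and their
    % codimension-one bifurcation curves are then tracked in the two-parameter space.
    ```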

  4. Big Data for Infectious Disease Surveillance and Modeling.

    Science.gov (United States)

    Bansal, Shweta; Chowell, Gerardo; Simonsen, Lone; Vespignani, Alessandro; Viboud, Cécile

    2016-12-01

    We devote a special issue of the Journal of Infectious Diseases to review the recent advances of big data in strengthening disease surveillance, monitoring medical adverse events, informing transmission models, and tracking patient sentiments and mobility. We consider a broad definition of big data for public health, one encompassing patient information gathered from high-volume electronic health records and participatory surveillance systems, as well as mining of digital traces such as social media, Internet searches, and cell-phone logs. We introduce nine independent contributions to this special issue and highlight several cross-cutting areas that require further research, including representativeness, biases, volatility, and validation, and the need for robust statistical and hypothesis-driven analyses. Overall, we are optimistic that the big-data revolution will vastly improve the granularity and timeliness of available epidemiological information, with hybrid systems augmenting rather than supplanting traditional surveillance systems, and better prospects for accurate infectious diseases models and forecasts. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  5. Assessing Big Data

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2015-01-01

    In recent years, big data has been one of the most controversially discussed technologies in terms of its possible positive and negative impact. Therefore, the need for technology assessments is obvious. This paper first provides, based on the results of a technology assessment study, an overview of the potential and challenges associated with big data and then describes the problems experienced during the study as well as methods found helpful to address them. The paper concludes with reflections on how the insights from the technology assessment study may have an impact on the future governance of big data.

  6. Creating value in health care through big data: opportunities and policy implications.

    Science.gov (United States)

    Roski, Joachim; Bo-Linn, George W; Andrews, Timothy A

    2014-07-01

    Big data has the potential to create significant value in health care by improving outcomes while lowering costs. Big data's defining features include the ability to handle massive data volume and variety at high velocity. New, flexible, and easily expandable information technology (IT) infrastructure, including so-called data lakes and cloud data storage and management solutions, make big-data analytics possible. However, most health IT systems still rely on data warehouse structures. Without the right IT infrastructure, analytic tools, visualization approaches, work flows, and interfaces, the insights provided by big data are likely to be limited. Big data's success in creating value in the health care sector may require changes in current polices to balance the potential societal benefits of big-data approaches and the protection of patients' confidentiality. Other policy implications of using big data are that many current practices and policies related to data use, access, sharing, privacy, and stewardship need to be revised. Project HOPE—The People-to-People Health Foundation, Inc.

  7. Big data, big responsibilities

    Directory of Open Access Journals (Sweden)

    Primavera De Filippi

    2014-01-01

    Full Text Available Big data refers to the collection and aggregation of large quantities of data produced by and about people, things or the interactions between them. With the advent of cloud computing, specialised data centres with powerful computational hardware and software resources can be used for processing and analysing a humongous amount of aggregated data coming from a variety of different sources. The analysis of such data is all the more valuable to the extent that it allows for specific patterns to be found and new correlations to be made between different datasets, so as to eventually deduce or infer new information, as well as to potentially predict behaviours or assess the likelihood for a certain event to occur. This article will focus specifically on the legal and moral obligations of online operators collecting and processing large amounts of data, to investigate the potential implications of big data analysis on the privacy of individual users and on society as a whole.

  8. Comparative validity of brief to medium-length Big Five and Big Six Personality Questionnaires.

    Science.gov (United States)

    Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie

    2011-12-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are faced with a variety of options as to inventory length. Furthermore, a 6-factor model has been proposed to extend and update the Big Five model, in part by adding a dimension of Honesty/Humility or Honesty/Propriety. In this study, 3 popular brief to medium-length Big Five measures (NEO Five Factor Inventory, Big Five Inventory [BFI], and International Personality Item Pool), and 3 six-factor measures (HEXACO Personality Inventory, Questionnaire Big Six Scales, and a 6-factor version of the BFI) were placed in competition to best predict important student life outcomes. The effect of test length was investigated by comparing brief versions of most measures (subsets of items) with original versions. Personality questionnaires were administered to undergraduate students (N = 227). Participants' college transcripts and student conduct records were obtained 6-9 months after data was collected. Six-factor inventories demonstrated better predictive ability for life outcomes than did some Big Five inventories. Additional behavioral observations made on participants, including their Facebook profiles and cell-phone text usage, were predicted similarly by Big Five and 6-factor measures. A brief version of the BFI performed surprisingly well; across inventory platforms, increasing test length had little effect on predictive validity. Comparative validity of the models and measures in terms of outcome prediction and parsimony is discussed.

  9. Big Machines and Big Science: 80 Years of Accelerators at Stanford

    Energy Technology Data Exchange (ETDEWEB)

    Loew, Gregory

    2008-12-16

    Longtime SLAC physicist Greg Loew will present a trip through SLAC's origins, highlighting its scientific achievements, and provide a glimpse of the lab's future in 'Big Machines and Big Science: 80 Years of Accelerators at Stanford.'

  10. Dual of big bang and big crunch

    International Nuclear Information System (INIS)

    Bak, Dongsu

    2007-01-01

    Starting from the Janus solution and its gauge theory dual, we obtain the dual gauge theory description of the cosmological solution by the procedure of double analytic continuation. The coupling is driven either to zero or to infinity at the big-bang and big-crunch singularities, which are shown to be related by the S-duality symmetry. In the dual Yang-Mills theory description, these are nonsingular as the coupling goes to zero in the N=4 super Yang-Mills theory. The cosmological singularities simply signal the failure of the supergravity description of the full type IIB superstring theory

  11. Ten aspects of the Big Five in the Personality Inventory for DSM-5.

    Science.gov (United States)

    DeYoung, Colin G; Carey, Bridget E; Krueger, Robert F; Ross, Scott R

    2016-04-01

    Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) includes a dimensional model of personality pathology, operationalized in the Personality Inventory for DSM-5 (PID-5), with 25 facets grouped into 5 higher order factors resembling the Big Five personality dimensions. The present study tested how well these 25 facets could be integrated with the 10-factor structure of traits within the Big Five that is operationalized by the Big Five Aspect Scales (BFAS). In 2 healthy adult samples, 10-factor solutions largely confirmed our hypothesis that each of the 10 BFAS would be the highest loading BFAS on 1 and only 1 factor. Varying numbers of PID-5 scales were additional markers of each factor, and the overall factor structure in the first sample was well replicated in the second. Our results allow Cybernetic Big Five Theory (CB5T) to be brought to bear on manifestations of personality disorder, because CB5T offers mechanistic explanations of the 10 factors measured by the BFAS. Future research, therefore, may begin to test hypotheses derived from CB5T regarding the mechanisms that are dysfunctional in specific personality disorders. (c) 2016 APA, all rights reserved).

  12. 10 Aspects of the Big Five in the Personality Inventory for DSM-5

    Science.gov (United States)

    DeYoung, Colin. G.; Carey, Bridget E.; Krueger, Robert F.; Ross, Scott R.

    2015-01-01

    DSM-5 includes a dimensional model of personality pathology, operationalized in the Personality Inventory for DSM-5 (PID-5), with 25 facets grouped into five higher-order factors resembling the Big Five personality dimensions. The present study tested how well these 25 facets could be integrated with the 10-factor structure of traits within the Big Five that is operationalized by the Big Five Aspect Scales (BFAS). In two healthy adult samples, 10-factor solutions largely confirmed our hypothesis that each of the 10 BFAS scales would be the highest loading BFAS scale on one and only one factor. Varying numbers of PID-5 scales were additional markers of each factor, and the overall factor structure in the first sample was well replicated in the second. Our results allow Cybernetic Big Five Theory (CB5T) to be brought to bear on manifestations of personality disorder, because CB5T offers mechanistic explanations of the 10 factors measured by the BFAS. Future research, therefore, may begin to test hypotheses derived from CB5T regarding the mechanisms that are dysfunctional in specific personality disorders. PMID:27032017

  13. Comparative Validity of Brief to Medium-Length Big Five and Big Six Personality Questionnaires

    Science.gov (United States)

    Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are…

  14. Big data for health.

    Science.gov (United States)

    Andreu-Perez, Javier; Poon, Carmen C Y; Merrifield, Robert D; Wong, Stephen T C; Yang, Guang-Zhong

    2015-07-01

    This paper provides an overview of recent developments in big data in the context of biomedical and health informatics. It outlines the key characteristics of big data and how medical and health informatics, translational bioinformatics, sensor informatics, and imaging informatics will benefit from an integrated approach of piecing together different aspects of personalized information from a diverse range of data sources, both structured and unstructured, covering genomics, proteomics, metabolomics, as well as imaging, clinical diagnosis, and long-term continuous physiological sensing of an individual. It is expected that recent advances in big data will expand our knowledge for testing new hypotheses about disease management from diagnosis to prevention to personalized treatment. The rise of big data, however, also raises challenges in terms of privacy, security, data ownership, data stewardship, and governance. This paper discusses some of the existing activities and future opportunities related to big data for health, outlining some of the key underlying issues that need to be tackled.

  15. Analysis of financing efficiency of big data industry in Guizhou province based on DEA models

    Science.gov (United States)

    Li, Chenggang; Pan, Kang; Luo, Cong

    2018-03-01

    Taking 20 listed enterprises of the big data industry in Guizhou province as samples, this paper uses the DEA method to evaluate the financing efficiency of the big data industry in Guizhou province. The results show that the pure technical efficiency of big data enterprises in Guizhou province is high, with a mean value of 0.925. The mean value of scale efficiency is 0.749, and the mean value of comprehensive efficiency is 0.693; the comprehensive financing efficiency is therefore low. Based on these results, this paper puts forward policy recommendations to improve the financing efficiency of the big data industry in Guizhou.
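
    For readers unfamiliar with DEA, the sketch below solves the input-oriented CCR envelopment model as one linear program per decision-making unit (DMU) using SciPy; the input/output figures are invented stand-ins for the financial indicators used in the paper.

    ```python
    # Input-oriented CCR DEA solved as a linear program per DMU (invented data).
    # For DMU o: minimize theta  s.t.  X @ lam <= theta * x_o,  Y @ lam >= y_o,  lam >= 0.
    import numpy as np
    from scipy.optimize import linprog

    X = np.array([[2.0, 3.0, 4.0],    # inputs: rows = input kinds, cols = enterprises (DMUs)
                  [1.0, 2.0, 1.5]])
    Y = np.array([[5.0, 8.0, 7.0]])   # outputs: rows = output kinds

    def ccr_efficiency(o):
        m, n = X.shape
        s = Y.shape[0]
        c = np.concatenate(([1.0], np.zeros(n)))      # variables: [theta, lam_1..lam_n]
        A_in = np.hstack([-X[:, [o]], X])             # X @ lam - theta * x_o <= 0
        A_out = np.hstack([np.zeros((s, 1)), -Y])     # -(Y @ lam) <= -y_o
        res = linprog(c,
                      A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.concatenate([np.zeros(m), -Y[:, o]]),
                      bounds=[(None, None)] + [(0.0, None)] * n)
        return res.x[0]                               # optimal theta = efficiency score

    for o in range(X.shape[1]):
        print(f"DMU {o}: efficiency = {ccr_efficiency(o):.3f}")
    ```

    An efficiency of 1.0 places a DMU on the efficient frontier; a value below 1.0 is the factor by which its inputs could be proportionally reduced while still producing its outputs.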

  16. Significance of Supply Logistics in Big Cities

    Directory of Open Access Journals (Sweden)

    Mario Šafran

    2012-10-01

    Full Text Available The paper considers the concept and importance of supply logistics as an element in improving storage, supply and transport of goods in big cities. There is always room for improvement in this segment of economic activities, and therefore continuous optimisation of the cargo flows from the manufacturer to the end user is important. Due to complex requirements in cargo supply and the "spoiled" end users, modern cities represent great difficulties and a big challenge for the supply organisers. The consumers' needs in big cities have developed over the recent years in such a way that they require supply of goods several times a day at precisely determined times (orders are received by e-mail, and the information transfer is therefore instantaneous). In order to successfully meet the consumers' needs in advanced economic systems, advanced methods of goods supply have been developed and improved, such as "just in time", "door-to-door", and "desk-to-desk". Regular operation of these systems requires supply logistics which includes the total throughput of materials, from receiving the raw materials or reproduction material to the delivery of final products to the end users.

  17. Personality correlates of pathological gambling derived from Big Three and Big Five personality models.

    Science.gov (United States)

    Miller, Joshua D; Mackillop, James; Fortune, Erica E; Maples, Jessica; Lance, Charles E; Keith Campbell, W; Goodie, Adam S

    2013-03-30

    Personality traits have proved to be consistent and important factors in a variety of externalizing behaviors including addiction, aggression, and antisocial behavior. Given the comorbidity of these behaviors with pathological gambling (PG), it is important to test the degree to which PG shares these trait correlates. In a large community sample of regular gamblers (N=354; 111 with diagnoses of pathological gambling), the relations between measures of two major models of personality - Big Three and Big Five - were examined in relation to PG symptoms derived from a semi-structured diagnostic interview. Across measures, traits related to the experience of strong negative emotions were the most consistent correlates of PG, regardless of whether they were analyzed using bivariate or multivariate analyses. In several instances, however, the relations between personality and PG were moderated by demographic variable such as gender, race, and age. It will be important for future empirical work of this nature to pay closer attention to potentially important moderators of these relations. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  18. Innovations and enhancements in neutronic analysis of the Big-10 university research and training reactors based on the AGENT code system

    International Nuclear Information System (INIS)

    Hursin, M.; Shanjie, X.; Burns, A.; Hopkins, J.; Satvat, N.; Gert, G.; Tsoukalas, L. H.; Jevremovic, T.

    2006-01-01

    Introduction. This paper summarizes salient aspects of the 'virtual' reactor system developed at Purdue University, emphasizing efficient neutronic modeling through AGENT (Arbitrary Geometry Neutron Transport), a deterministic neutron transport code. DOE's Big-10 Innovations in Nuclear Infrastructure and Education (INIE) Consortium was launched in 2002 to enhance scholarship activities pertaining to university research and training reactors (URTRs). Existing and next generation URTRs are powerful campus tools for nuclear engineering as well as a number of disciplines that include, but are not limited to, medicine, biology, material science, and food science. Advancing new computational environments for the analysis and configuration of URTRs is an important Big-10 INIE aim. Specifically, Big-10 INIE has pursued development of a 'virtual' reactor, an advanced computational environment to serve as a platform on which to build operations, utilization (research and education), and systemic analysis of URTR physics. The 'virtual' reactor computational system will integrate computational tools addressing the URTR core and near-core physics (transport, dynamics, fuel management and fuel configuration); thermal-hydraulics; beam line, in-core and near-core experiments; instrumentation and controls; and confinement/containment and security issues. Such an integrated computational environment does not currently exist. The 'virtual' reactor is designed to allow researchers and educators to configure and analyze their systems to optimize experiments, fuel locations for flux shaping, as well as detector selection and configuration. (authors)

  19. WISE TECHNOLOGY FOR HANDLING BIG DATA FEDERATIONS

    NARCIS (Netherlands)

    Valentijn, E; Begeman, Kornelis; Belikov, Andrey; Boxhoorn, Danny; Verdoes Kleijn, Gijs; McFarland, John; Vriend, Willem-Jan; Williams, Owen; Soille, P.; Marchetti, P.G.

    2014-01-01

    The effective use of Big Data in current and future scientific missions requires intelligent data handling systems which are able to interface the user to complicated distributed data collections. We review the WISE Concept of Scientific Information Systems and the WISE solutions for the storage and

  20. Perspectives on making big data analytics work for oncology.

    Science.gov (United States)

    El Naqa, Issam

    2016-12-01

    Oncology, with its unique combination of clinical, physical, technological, and biological data, provides an ideal case study for applying big data analytics to improve cancer treatment safety and outcomes. An oncology treatment course such as chemoradiotherapy can generate a large pool of information carrying the 5Vs hallmarks of big data. This data comprises a heterogeneous mixture of patient demographics, radiation/chemo dosimetry, multimodality imaging features, and biological markers generated over a treatment period that can span a few days to several weeks. Efforts using commercial and in-house tools are underway to facilitate data aggregation, ontology creation, sharing, visualization and varying analytics in a secure environment. However, open questions related to proper data structure representation and effective analytics tools to support oncology decision-making need to be addressed. It is recognized that oncology data constitutes a mix of structured (tabulated) and unstructured (electronic documents) data that need to be processed to facilitate searching and subsequent knowledge discovery from relational or NoSQL databases. In this context, methods based on advanced analytics and image feature extraction for oncology applications will be discussed. On the other hand, the classical p (variables) ≫ n (samples) inference problem of statistical learning is challenged in the Big data realm, and this is particularly true for oncology applications where p-omics is witnessing exponential growth while the number of cancer incidences has generally plateaued over the past 5 years, leading to a quasi-linear growth in samples per patient. Within the Big data paradigm, this kind of phenomenon may yield undesirable effects such as echo chamber anomalies, Yule-Simpson reversal paradox, or misleading ghost analytics. In this work, we will present these effects as they pertain to oncology and engage small thinking methodologies to counter these effects ranging from
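
    The p ≫ n regime mentioned above is commonly handled with sparsity-inducing regularization; a minimal synthetic sketch (not tied to any oncology dataset):

    ```python
    # p >> n: 100 samples, 5000 variables; an L1 penalty recovers a sparse signal.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, p = 100, 5000
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:5] = 2.0                        # only 5 variables truly matter
    y = X @ beta + rng.normal(size=n)

    model = Lasso(alpha=0.1).fit(X, y)
    print("nonzero coefficients:", int(np.sum(model.coef_ != 0)))  # small despite p >> n
    ```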

  1. Big data naturally rescaled

    International Nuclear Information System (INIS)

    Stoop, Ruedi; Kanders, Karlis; Lorimer, Tom; Held, Jenny; Albert, Carlo

    2016-01-01

    We propose that a handle could be put on big data by looking at the systems that actually generate the data, rather than the data itself, realizing that there may be only few generic processes involved in this, each one imprinting its very specific structures in the space of systems, the traces of which translate into feature space. From this, we propose a practical computational clustering approach, optimized for coping with such data, inspired by how the human cortex is known to approach the problem.

  2. Hydrogeochemical and stream sediment reconnaissance basic data for Big Delta Quadrangle, Alaska

    International Nuclear Information System (INIS)

    1981-01-01

    Field and laboratory data are presented for 1380 water samples from the Big Delta Quadrangle, Alaska. The samples were collected by Los Alamos Scientific Laboratory; laboratory analysis and data reporting were performed by the Uranium Resource Evaluation Project at Oak Ridge, Tennessee

  3. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness

    Science.gov (United States)

    Wong, Ho Ting; Chiang, Vico Chung Lim; Choi, Kup Sze; Loke, Alice Yuen

    2016-01-01

    The rapid development of technology has made enormous volumes of data available and achievable anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term “Big Data”, which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding on what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing. PMID:27763525

  4. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness

    Directory of Open Access Journals (Sweden)

    Ho Ting Wong

    2016-10-01

    Full Text Available The rapid development of technology has made enormous volumes of data available and achievable anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term “Big Data”, which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding on what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing.

  5. The Need for a Definition of Big Data for Nursing Science: A Case Study of Disaster Preparedness.

    Science.gov (United States)

    Wong, Ho Ting; Chiang, Vico Chung Lim; Choi, Kup Sze; Loke, Alice Yuen

    2016-10-17

    The rapid development of technology has made enormous volumes of data available and achievable anytime and anywhere around the world. Data scientists call this change a data era and have introduced the term "Big Data", which has drawn the attention of nursing scholars. Nevertheless, the concept of Big Data is quite fuzzy and there is no agreement on its definition among researchers of different disciplines. Without a clear consensus on this issue, nursing scholars who are relatively new to the concept may consider Big Data to be merely a dataset of a bigger size. Having a suitable definition for nurse researchers in their context of research and practice is essential for the advancement of nursing research. In view of the need for a better understanding on what Big Data is, the aim in this paper is to explore and discuss the concept. Furthermore, an example of a Big Data research study on disaster nursing preparedness involving six million patient records is used for discussion. The example demonstrates that a Big Data analysis can be conducted from many more perspectives than would be possible in traditional sampling, and is superior to traditional sampling. Experience gained from the process of using Big Data in this study will shed light on future opportunities for conducting evidence-based nursing research to achieve competence in disaster nursing.

  6. Big data based fraud risk management at Alibaba

    OpenAIRE

    Chen, Jidong; Tao, Ye; Wang, Haoran; Chen, Tao

    2015-01-01

    With the development of mobile internet and finance, fraud risk comes in all shapes and sizes. This paper introduces fraud risk management at Alibaba in the context of big data. Alibaba has built a fraud risk monitoring and management system based on real-time big data processing and intelligent risk models. It captures fraud signals directly from huge amounts of data on user behaviors and networks, analyzes them in real time using machine learning, and accurately predicts bad users and transactions....

  7. Astrophysical Li-7 as a product of big bang nucleosynthesis and galactic cosmic-ray spallation

    Science.gov (United States)

    Olive, Keith A.; Schramm, David N.

    1992-01-01

    The astrophysical Li-7 abundance is considered to be largely primordial, while the Be and B abundances are thought to be due to galactic cosmic ray (GCR) spallation reactions on top of a much smaller big bang component. But GCR spallation should also produce Li-7. As a consistency check on the combination of big bang nucleosynthesis and GCR spallation, the Be and B data from a sample of hot population II stars is used to subtract from the measured Li-7 abundance an estimate of the amount generated by GCR spallation for each star in the sample, and then to add to this baseline an estimate of the metallicity-dependent augmentation of Li-7 due to spallation. The singly reduced primordial Li-7 abundance is still consistent with big bang nucleosynthesis, and a single GCR spallation model can fit the Be, B, and corrected Li-7 abundances for all the stars in the sample.

  8. Big Data and Data Science in Critical Care.

    Science.gov (United States)

    Sanchez-Pinto, L Nelson; Luo, Yuan; Churpek, Matthew M

    2018-05-09

    The digitalization of the healthcare system has resulted in a deluge of clinical Big Data and has prompted the rapid growth of data science in medicine. Data science, which is the field of study dedicated to the principled extraction of knowledge from complex data, is particularly relevant in the critical care setting. The availability of large amounts of data in the intensive care unit, the need for better evidence-based care, and the complexity of critical illness makes the use of data science techniques and data-driven research particularly appealing to intensivists. Despite the increasing number of studies and publications in the field, so far there have been few examples of data science projects that have resulted in successful implementations of data-driven systems in the intensive care unit. However, given the expected growth in the field, intensivists should be familiar with the opportunities and challenges of Big Data and data science. In this paper, we review the definitions, types of algorithms, applications, challenges, and future of Big Data and data science in critical care. Copyright © 2018. Published by Elsevier Inc.

  9. A Web-Based Dashboard Monitoring System for the liteBIG Instant Messenger Service

    Directory of Open Access Journals (Sweden)

    Gigih Forda Nama

    2017-04-01

    Full Text Available Nearly all smartphone users today rely on instant messaging services as a communication medium, because instant messaging is more cost-effective than the short message service (SMS), requiring only an internet connection. Such a service must serve its users well, so that messages sent by a sender reach the recipient quickly and accurately. Its reliability must also be maintained to guarantee quality of service and to avoid inconveniencing users. A monitoring system is therefore needed, in the form of software for checking service status that can be accessed anytime and from anywhere. PT. Sandika Cahaya Mandiri offers an instant messaging product under the brand name liteBIG Messenger and requires software for monitoring the liteBIG Messenger service. With this monitoring software, operators can view in real time, through a web interface, the status of the main services on each server, resource usage (CPU, memory and hard disk), and statistics on new liteBIG Messenger users. Operators also receive notifications when a service problem occurs, so that problems can be detected earlier and downtime can be reduced.
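
    To make the monitoring idea concrete, the following is a minimal sketch - an illustration, not liteBIG's actual implementation - of sampling the server metrics such a dashboard displays, using the psutil library:

```python
# Illustrative only: collect the kind of resource metrics (CPU, memory,
# disk) a service-monitoring dashboard might display. This is not
# liteBIG's implementation; the metric names are our own.
import json
import time

import psutil

def sample_metrics() -> dict:
    """Return one snapshot of host resource usage."""
    return {
        "timestamp": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=1),
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

if __name__ == "__main__":
    # A real dashboard would push this snapshot to a web backend;
    # here we simply print it as JSON.
    print(json.dumps(sample_metrics()))
```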

  10. Could Blobs Fuel Storage-Based Convergence between HPC and Big Data?

    Energy Technology Data Exchange (ETDEWEB)

    Matri, Pierre; Alforov, Yevhen; Brandon, Alvaro; Kuhn, Michael; Carns, Philip; Ludwig, Thomas

    2017-09-05

    The increasingly growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to POSIX-IO-compliant file systems are simpler blobs (binary large objects), or object storage systems. Such systems offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered for replacing distributed file systems for big data analytics or as a base for storage abstractions such as key-value stores or time-series databases. This growing interest in such object storage on HPC and big data platforms raises the question: Are blobs the right level of abstraction to enable storage-based convergence between HPC and Big Data? In this paper we study the impact of blob-based storage for real-world applications on HPC and cloud environments. The results show that blob-based storage convergence is possible, leading to a significant performance improvement on both platforms.

  11. Big data-driven business how to use big data to win customers, beat competitors, and boost profits

    CERN Document Server

    Glass, Russell

    2014-01-01

    Get the expert perspective and practical advice on big data The Big Data-Driven Business: How to Use Big Data to Win Customers, Beat Competitors, and Boost Profits makes the case that big data is for real, and more than just big hype. The book uses real-life examples-from Nate Silver to Copernicus, and Apple to Blackberry-to demonstrate how the winners of the future will use big data to seek the truth. Written by a marketing journalist and the CEO of a multi-million-dollar B2B marketing platform that reaches more than 90% of the U.S. business population, this book is a comprehens

  12. The Big Mac Standard: A statistical Illustration

    OpenAIRE

    Yukinobu Kitamura; Hiroshi Fujiki

    2004-01-01

    We demonstrate a statistical procedure for selecting the most suitable empirical model to test an economic theory, using the example of the test for purchasing power parity based on the Big Mac Index. Our results show that supporting evidence for purchasing power parity, conditional on the Balassa-Samuelson effect, depends crucially on the selection of models, sample periods and economies used for estimations.

  13. Big Data Research in Italy: A Perspective

    Directory of Open Access Journals (Sweden)

    Sonia Bergamaschi

    2016-06-01

    Full Text Available The aim of this article is to synthetically describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article has the objective of offering a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.

  14. Big Game Reporting Stations

    Data.gov (United States)

    Vermont Center for Geographic Information — Point locations of big game reporting stations. Big game reporting stations are places where hunters can legally report harvested deer, bear, or turkey. These are...

  15. Big Five personality traits, job satisfaction and subjective wellbeing in China.

    Science.gov (United States)

    Zhai, Qingguo; Willis, Mike; O'Shea, Bob; Zhai, Yubo; Yang, Yuwen

    2013-01-01

    This paper examines the effect of the Big Five personality traits on job satisfaction and subjective wellbeing (SWB). The paper also examines the mediating role of job satisfaction on the Big Five-SWB relationship. Data were collected from a sample of 818 urban employees from five Chinese cities: Harbin, Changchun, Shenyang, Dalian, and Fushun. All the study variables were measured with well-established multi-item scales that have been validated both in English-speaking populations and in China. The study found only extraversion to have an effect on job satisfaction, suggesting that there could be cultural difference in the relationships between the Big Five and job satisfaction in China and in the West. The study found that three factors in the Big Five--extraversion, conscientiousness, and neuroticism--have an effect on SWB. This finding is similar to findings in the West, suggesting convergence in the relationship between the Big Five and SWB in different cultural contexts. The research found that only the relationship between extraversion and SWB is partially mediated by job satisfaction, implying that the effect of the Big Five on SWB is mainly direct, rather than indirect via job satisfaction. The study also found that extraversion was the strongest predictor of both job satisfaction and SWB. This finding implies that extraversion could be more important than other factors in the Big Five in predicting job satisfaction and SWB in a "high collectivism" and "high power distance" country such as China. The research findings are discussed in the Chinese cultural context. The study also offers suggestions on the directions for future research.

  16. Stalin's Big Fleet Program

    National Research Council Canada - National Science Library

    Mauner, Milan

    2002-01-01

    Although Dr. Milan Hauner's study 'Stalin's Big Fleet program' has focused primarily on the formation of Big Fleets during the Tsarist and Soviet periods of Russia's naval history, there are important lessons...

  17. Identify too big to fail banks and capital insurance: An equilibrium approach

    Directory of Open Access Journals (Sweden)

    Katerina Ivanov

    2017-09-01

    Full Text Available The objective of this paper is to develop a rational expectations equilibrium model of capital insurance to identify too big to fail banks. The main results of this model include: (1) too big to fail banks can be identified explicitly by a systemic risk measure, the loss beta, computed for all banks in the entire financial sector; (2) the too big to fail feature can be largely justified by a high level of loss beta; (3) the capital insurance proposal benefits market participants and reduces systemic risk; (4) the implicit guarantee subsidy can be estimated endogenously; and lastly, (5) the capital insurance proposal can be used to resolve the moral hazard issue. We implement this model and document that the too big to fail issue has been considerably reduced in the post-crisis period. As a result, the capital insurance proposal could be a useful macro-regulation policy innovation.

  18. Forecasting in an integrated surface water-ground water system: The Big Cypress Basin, South Florida

    Science.gov (United States)

    Butts, M. B.; Feng, K.; Klinting, A.; Stewart, K.; Nath, A.; Manning, P.; Hazlett, T.; Jacobsen, T.

    2009-04-01

    The South Florida Water Management District (SFWMD) manages and protects the state's water resources on behalf of 7.5 million South Floridians and is the lead agency in restoring America's Everglades - the largest environmental restoration project in US history. Many of the projects to restore and protect the Everglades ecosystem are part of the Comprehensive Everglades Restoration Plan (CERP). The region has a unique hydrological regime, with close connection between surface water and groundwater, and a complex managed drainage network with many structures. Added to the physical complexity are the conflicting needs of the ecosystem for protection and restoration, versus the substantial urban development with the accompanying water supply, water quality and flood control issues. In this paper a novel forecasting and real-time modelling system is presented for the Big Cypress Basin. The Big Cypress Basin includes 272 km of primary canals and 46 water control structures throughout the area that provide limited levels of flood protection, as well as water supply and environmental quality management. This system is linked to the South Florida Water Management District's extensive real-time (SCADA) data monitoring and collection system. Novel aspects of this system include the use of a fully distributed and integrated modeling approach and a new filter-based updating approach for accurately forecasting river levels. Because of the interaction between surface- and groundwater a fully integrated forecast modeling approach is required. Indeed, results for Tropical Storm Fay in 2008 show that groundwater levels respond extremely rapidly to heavy rainfall. Analysis of this storm also shows that updating levels in the river system can have a direct impact on groundwater levels.

  19. Five Big, Big Five Issues : Rationale, Content, Structure, Status, and Crosscultural Assessment

    NARCIS (Netherlands)

    De Raad, Boele

    1998-01-01

    This article discusses the rationale, content, structure, status, and crosscultural assessment of the Big Five trait factors, focusing on topics of dispute and misunderstanding. Taxonomic restrictions of the original Big Five forerunner, the "Norman Five," are discussed, and criticisms regarding the

  20. Intelligent Management System of Power Network Information Collection Under Big Data Storage

    Directory of Open Access Journals (Sweden)

    Qin Yingying

    2017-01-01

    Full Text Available With the development of the economy and society, big data storage has become a problem that enterprise management cannot ignore. How to better manage data and optimize the allocation of tasks is an important factor in the sustainable development of an enterprise. Intelligent management of enterprise information has become a hot topic of management models and concepts in the information age, presenting information to business managers in a more efficient, lower-cost and global form. The system uses the SG-UAP development tools, based on the Eclipse development environment, suited to the Windows operating system, with Oracle as the database platform and Tomcat as the application server. The system adopts an SOA service-oriented architecture, providing RESTful services over HTTP(S) with JSON as the data format. It is divided into a front end and a back end, and implements functions such as user login, registration, password recovery, management of internal personnel information, and display of internal data.

  1. Integrating Big Data into a Sustainable Mobility Policy 2.0 Planning Support System

    Directory of Open Access Journals (Sweden)

    Ivana Semanjski

    2016-11-01

    Full Text Available It is estimated that each of us, on a daily basis, produces a bit more than 1 GB of digital content through our mobile phone and social network activities, bank card payments, location-based positioning information, online activities, etc. However, the use of these large amounts of data in city asset planning systems still remains a rather abstract idea for several reasons, including the fact that practical examples are still strongly service-oriented and form a largely unexplored, interdisciplinary field, hence missing the cross-cutting dimension. In this paper, we describe the Policy 2.0 concept and integrate user-generated content into a Policy 2.0 platform for sustainable mobility planning. By means of a real-life example, we demonstrate the applicability of such a big data integration approach to the smart city planning process. Observed benefits range from improved timeliness of the data and a reduced duration of the planning cycle to more informed and agile decision making, on both the citizens' and the city planners' ends. The integration of big data into the planning process, at this stage, does not have a uniform impact across all levels of decision making and the planning process, and therefore should be performed gradually, with full awareness of existing limitations.

  2. Big data challenges

    DEFF Research Database (Denmark)

    Bachlechner, Daniel; Leimbach, Timo

    2016-01-01

    Although reports on big data success stories have been accumulating in the media, most organizations dealing with high-volume, high-velocity and high-variety information assets still face challenges. Only a thorough understanding of these challenges puts organizations into a position in which they can make an informed decision for or against big data, and, if the decision is positive, overcome the challenges smoothly. The combination of a series of interviews with leading experts from enterprises, associations and research institutions, and focused literature reviews allowed not only... framework are also relevant. For large enterprises and startups specialized in big data, it is typically easier to overcome the challenges than it is for other enterprises and public administration bodies....

  3. Big data managing in a landslide early warning system: experience from a ground-based interferometric radar application

    Directory of Open Access Journals (Sweden)

    E. Intrieri

    2017-10-01

    Full Text Available A big challenge in terms of landslide risk mitigation is represented by increasing the resiliency of society exposed to the risk. Among the possible strategies with which to reach this goal, there is the implementation of early warning systems. This paper describes a procedure to improve early warning activities in areas affected by high landslide risk, such as those classified as critical infrastructures for their central role in society. This research is part of the project LEWIS (Landslides Early Warning Integrated System): An Integrated System for Landslide Monitoring, Early Warning and Risk Mitigation along Lifelines. LEWIS is composed of a susceptibility assessment methodology providing information for single points and areal monitoring systems, a data transmission network and a data collecting and processing center (DCPC), where readings from all monitoring systems and mathematical models converge and which sets the basis for warning and intervention activities. The aim of this paper is to show how logistic issues linked to advanced monitoring techniques, such as big data transfer and storing, can be dealt with compatibly with an early warning system. Therefore, we focus on the interaction between an areal monitoring tool (a ground-based interferometric radar) and the DCPC. By converting complex data into ASCII strings, through appropriate data cropping and averaging, and by implementing an algorithm for line-of-sight correction, we managed to reduce the daily data output without compromising the capability for performing early warning.

  4. Big data managing in a landslide early warning system: experience from a ground-based interferometric radar application

    Science.gov (United States)

    Intrieri, Emanuele; Bardi, Federica; Fanti, Riccardo; Gigli, Giovanni; Fidolini, Francesco; Casagli, Nicola; Costanzo, Sandra; Raffo, Antonio; Di Massa, Giuseppe; Capparelli, Giovanna; Versace, Pasquale

    2017-10-01

    A big challenge in terms of landslide risk mitigation is represented by increasing the resiliency of society exposed to the risk. Among the possible strategies with which to reach this goal, there is the implementation of early warning systems. This paper describes a procedure to improve early warning activities in areas affected by high landslide risk, such as those classified as critical infrastructures for their central role in society. This research is part of the project LEWIS (Landslides Early Warning Integrated System): An Integrated System for Landslide Monitoring, Early Warning and Risk Mitigation along Lifelines. LEWIS is composed of a susceptibility assessment methodology providing information for single points and areal monitoring systems, a data transmission network and a data collecting and processing center (DCPC), where readings from all monitoring systems and mathematical models converge and which sets the basis for warning and intervention activities. The aim of this paper is to show how logistic issues linked to advanced monitoring techniques, such as big data transfer and storing, can be dealt with compatibly with an early warning system. Therefore, we focus on the interaction between an areal monitoring tool (a ground-based interferometric radar) and the DCPC. By converting complex data into ASCII strings, through appropriate data cropping and averaging, and by implementing an algorithm for line-of-sight correction, we managed to reduce the daily data output without compromising the capability for performing early warning.
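
    As a rough illustration of the data-reduction step described in these two records (cropping, block averaging, and conversion to compact ASCII rows), here is a sketch under assumed array shapes; it is not the project's actual pipeline:

```python
# Illustrative only: reduce a dense displacement map from a ground-based
# interferometric radar by block averaging, then serialize it as ASCII
# rows for transmission. Shapes and formats are assumptions.
import numpy as np

def reduce_and_serialize(displacement: np.ndarray, block: int = 4) -> str:
    """Average block x block pixel tiles and emit CSV-like ASCII lines."""
    rows, cols = displacement.shape
    # Crop so both dimensions are divisible by the block size.
    cropped = displacement[: rows - rows % block, : cols - cols % block]
    averaged = cropped.reshape(
        cropped.shape[0] // block, block, cropped.shape[1] // block, block
    ).mean(axis=(1, 3))
    lines = [",".join(f"{v:.3f}" for v in row) for row in averaged]
    return "\n".join(lines)

if __name__ == "__main__":
    frame = np.random.default_rng(0).normal(size=(512, 512))  # fake radar frame
    ascii_payload = reduce_and_serialize(frame)
    print(len(ascii_payload), "characters after reduction")
```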

  5. Big Data and HPC collocation: Using HPC idle resources for Big Data Analytics

    OpenAIRE

    MERCIER , Michael; Glesser , David; Georgiou , Yiannis; Richard , Olivier

    2017-01-01

    Executing Big Data workloads upon High Performance Computing (HPC) infrastructures has become an attractive way to improve their performance. However, the collocation of HPC and Big Data workloads is not an easy task, mainly because of their core concepts' differences. This paper focuses on the challenges related to the scheduling of both Big Data and HPC workloads on the same computing platform. In classic HPC workloads, the rigidity of jobs tends to create holes in ...

  6. Big Data technology in traffic: A case study of automatic counters

    Directory of Open Access Journals (Sweden)

    Janković Slađana R.

    2016-01-01

    Full Text Available Modern information and communication technologies together with intelligent devices provide a continuous inflow of large amounts of data that are used by traffic and transport systems. Collecting traffic data does not represent a challenge nowadays; the issues remain in storing and processing ever-increasing amounts of data. In this paper we investigate the possibilities of using Big Data technology to store and process data in the transport domain. The term Big Data refers to information resources whose volume, velocity and variety are far beyond the capabilities of commonly used software for storing, processing and managing data. In our case study, the Apache™ Hadoop® Big Data platform was used to process data collected from 10 automatic traffic counters set up in Novi Sad and its surroundings. Indicators of traffic load calculated on the Big Data platform were presented as tables and graphs in Microsoft Office Excel. The visualization and geolocation of the obtained indicators were performed using Microsoft Business Intelligence (BI) tools such as Excel Power View and Excel Power Map. This case study showed that Big Data technologies combined with BI tools can be used as reliable support in the monitoring of traffic management systems.
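
    For illustration, a minimal sketch - not taken from the study - of how raw counter readings might be aggregated into simple traffic-load indicators; the CSV layout and column names (counter_id, timestamp, vehicles) are assumptions:

```python
# Illustrative only: aggregate automatic-counter readings into simple
# traffic-load indicators. Column names are hypothetical, not taken
# from the study.
import pandas as pd

readings = pd.read_csv("counter_readings.csv", parse_dates=["timestamp"])

# Average daily traffic per counter (a common load indicator).
daily = (readings
         .groupby(["counter_id", readings["timestamp"].dt.date])["vehicles"]
         .sum()
         .rename("daily_volume")
         .reset_index())
adt = daily.groupby("counter_id")["daily_volume"].mean()

# Peak-hour volume per counter.
hourly = (readings
          .groupby(["counter_id", readings["timestamp"].dt.floor("h")])["vehicles"]
          .sum())
peak_hour = hourly.groupby("counter_id").max()

print(adt, peak_hour, sep="\n")
```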

  7. Nonuniversal scalar-tensor theories and big bang nucleosynthesis

    International Nuclear Information System (INIS)

    Coc, Alain; Olive, Keith A.; Uzan, Jean-Philippe; Vangioni, Elisabeth

    2009-01-01

    We investigate the constraints that can be set from big bang nucleosynthesis on two classes of models: extended quintessence and scalar-tensor theories of gravity in which the equivalence principle between standard matter and dark matter is violated. In the latter case, and for a massless dilaton with quadratic couplings, the phase space of theories is investigated. We delineate those theories where attraction toward general relativity occurs. It is shown that big bang nucleosynthesis sets more stringent constraints than those obtained from Solar System tests.

  8. Nonuniversal scalar-tensor theories and big bang nucleosynthesis

    Science.gov (United States)

    Coc, Alain; Olive, Keith A.; Uzan, Jean-Philippe; Vangioni, Elisabeth

    2009-05-01

    We investigate the constraints that can be set from big bang nucleosynthesis on two classes of models: extended quintessence and scalar-tensor theories of gravity in which the equivalence principle between standard matter and dark matter is violated. In the latter case, and for a massless dilaton with quadratic couplings, the phase space of theories is investigated. We delineate those theories where attraction toward general relativity occurs. It is shown that big bang nucleosynthesis sets more stringent constraints than those obtained from Solar System tests.

  9. Building a Smarter University: Big Data, Innovation, and Analytics. Critical Issues in Higher Education

    Science.gov (United States)

    Lane, Jason E., Ed.

    2014-01-01

    The Big Data movement and the renewed focus on data analytics are transforming everything from healthcare delivery systems to the way cities deliver services to residents. Now is the time to examine how this Big Data could help build smarter universities. While much of the cutting-edge research that is being done with Big Data is happening at…

  10. Research on the development of scientific and technological intelligence in big data environment

    Directory of Open Access Journals (Sweden)

    Wu Qiong

    2018-01-01

    development of scientific and technological intelligence and problems faced by the big data intelligence system. Finally, the author predicts the opportunities and challenges that big data poses for scientific and technological intelligence service agencies.

  11. Big Data as Governmentality

    DEFF Research Database (Denmark)

    Flyverbom, Mikkel; Madsen, Anders Koed; Rasche, Andreas

    This paper conceptualizes how large-scale data and algorithms condition and reshape knowledge production when addressing international development challenges. The concept of governmentality and four dimensions of an analytics of government are proposed as a theoretical framework to examine how big data is constituted as an aspiration to improve the data and knowledge underpinning development efforts. Based on this framework, we argue that big data’s impact on how relevant problems are governed is enabled by (1) new techniques of visualizing development issues, (2) linking aspects... shows that big data problematizes selected aspects of traditional ways to collect and analyze data for development (e.g. via household surveys). We also demonstrate that using big data analyses to address development challenges raises a number of questions that can deteriorate its impact....

  12. Boarding to Big data

    Directory of Open Access Journals (Sweden)

    Oana Claudia BRATOSIN

    2016-05-01

    Full Text Available Today Big data is an emerging topic, as the quantity of information grows exponentially, laying the foundation for its main challenge: the value of the information. That value is defined not only by extracting value from huge data sets, as quickly and optimally as possible, but also by extracting value from uncertain and inaccurate data in an innovative manner using Big data analytics. At this point, the main challenge for businesses that use Big data tools is to clearly define the scope and the necessary output of the business so that real value can be gained. This article aims to explain the Big data concept, its various classification criteria, its architecture, and its impact on processes worldwide.

  13. Big data - a 21st century science Maginot Line? No-boundary thinking: shifting from the big data paradigm.

    Science.gov (United States)

    Huang, Xiuzhen; Jennings, Steven F; Bruce, Barry; Buchan, Alison; Cai, Liming; Chen, Pengyin; Cramer, Carole L; Guan, Weihua; Hilgert, Uwe Kk; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Donald F; Nanduri, Bindu; Perkins, Andy; Rekepalli, Bhanu; Salem, Saeed; Specker, Jennifer; Walker, Karl; Wunsch, Donald; Xiong, Donghai; Zhang, Shuzhong; Zhang, Yu; Zhao, Zhongming; Moore, Jason H

    2015-01-01

    Whether your interests lie in scientific arenas, the corporate world, or in government, you have certainly heard the praises of big data: Big data will give you new insights, allow you to become more efficient, and/or will solve your problems. While big data has had some outstanding successes, many are now beginning to see that it is not the Silver Bullet that it has been touted to be. Here our main concern is the overall impact of big data; its current manifestation is constructing a Maginot Line in science in the 21st century. Big data is no longer "lots of data" as a phenomenon; the big data paradigm is putting the spirit of the Maginot Line into lots of data. Big data overall is disconnecting researchers from science challenges. We propose No-Boundary Thinking (NBT), applying no-boundary thinking to problem definition in order to address science challenges.

  14. Big Egos in Big Science

    DEFF Research Database (Denmark)

    Andersen, Kristina Vaarst; Jeppesen, Jacob

    In this paper we investigate the micro-mechanisms governing the structural evolution and performance of scientific collaboration. Scientific discovery tends not to be led by so-called lone "stars", or big egos, but instead by collaboration among groups of researchers, from a multitude of institutions...

  15. Big Data and Big Science

    OpenAIRE

    Di Meglio, Alberto

    2014-01-01

    Brief introduction to the challenges of big data in scientific research based on the work done by the HEP community at CERN and how the CERN openlab promotes collaboration among research institutes and industrial IT companies. Presented at the FutureGov 2014 conference in Singapore.

  16. Act-Frequency Signatures of the Big Five.

    Science.gov (United States)

    Chapman, Benjamin P; Goldberg, Lewis R

    2017-10-01

    The traditional focus of work on personality and behavior has tended toward "major outcomes" such as health or antisocial behavior, or small sets of behaviors observable over short periods in laboratories or in convenience samples. In a community sample, we examined a wide set (400) of mundane, incidental or "every day" behavioral acts, the frequencies of which were reported over the past year. Using an exploratory methodology similar to genomic approaches (relying on the False Discovery Rate) revealed 26 prototypical acts for Intellect, 24 acts for Extraversion, 13 for Emotional Stability, nine for Conscientiousness, and six for Agreeableness. Many links were consistent with general intuition - for instance, low Conscientiousness with work and procrastination. Some of the most robust associations, however, were for acts too specific for a priori hypotheses. For instance, Extraversion was strongly associated with telling dirty jokes, Intellect with "loung[ing] around [the] house without clothes on", and Agreeableness with singing in the shower. Frequency categories for these acts changed with marked non-linearity across Big Five Z-scores. The findings may help ground trait scores in emblematic acts, and enrich understanding of mundane or common behavioral signatures of the Big Five.
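
    For readers unfamiliar with FDR-based screening, here is a toy sketch of the general approach (correlate many act frequencies with a trait score, then apply the Benjamini-Hochberg procedure); the data are synthetic and this is not the authors' exact analysis:

```python
# Illustrative only: screen many act-trait correlations and control the
# False Discovery Rate with Benjamini-Hochberg. Synthetic data, not the
# study's dataset or exact method.
import numpy as np
from scipy.stats import pearsonr
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(42)
n_people, n_acts = 500, 400
acts = rng.poisson(3, size=(n_people, n_acts))   # fake act frequencies
trait = rng.normal(size=n_people)                # fake trait z-scores

# One p-value per act, then a single FDR correction over all 400 tests.
pvals = np.array([pearsonr(acts[:, j], trait)[1] for j in range(n_acts)])
reject, pvals_corrected, _, _ = multipletests(pvals, alpha=0.05,
                                              method="fdr_bh")
print(f"{reject.sum()} acts pass the FDR screen")
```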

  17. Will Big Data Close the Missing Heritability Gap?

    Science.gov (United States)

    Kim, Hwasoon; Grueneberg, Alexander; Vazquez, Ana I; Hsu, Stephen; de Los Campos, Gustavo

    2017-11-01

    Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a surface response relating prediction R-sq. with sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23-0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values that ranged from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Soon much larger data sets will become available. Using the estimated surface response, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed. Copyright © 2017 by the Genetics Society of America.
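
    The surface-response idea can be imitated with a toy saturating curve relating prediction R-sq. to training-set size; the functional form and all numbers below are illustrative assumptions, not the authors' Bayesian model:

```python
# Illustrative only: fit a saturating curve R2(n) = plateau * n / (n + n50)
# to hypothetical (sample size, prediction R-squared) pairs, then forecast
# R-squared at a larger sample size. Not the authors' Bayesian model.
import numpy as np
from scipy.optimize import curve_fit

def r2_curve(n, plateau, n50):
    # n50 is the sample size at which half the plateau R2 is reached.
    return plateau * n / (n + n50)

n_train = np.array([5_000, 10_000, 20_000, 40_000, 80_000])
r2_obs = np.array([0.08, 0.12, 0.17, 0.21, 0.24])  # hypothetical values

(plateau, n50), _ = curve_fit(r2_curve, n_train, r2_obs, p0=[0.4, 30_000])
print(f"estimated plateau R2 ~ {plateau:.2f}")
print(f"forecast at n=500,000: {r2_curve(500_000, plateau, n50):.2f}")
```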

  18. Functional connectomics from a "big data" perspective.

    Science.gov (United States)

    Xia, Mingrui; He, Yong

    2017-10-15

    In the last decade, explosive growth regarding functional connectome studies has been observed. Accumulating knowledge has significantly contributed to our understanding of the brain's functional network architectures in health and disease. With the development of innovative neuroimaging techniques, the establishment of large brain datasets and the increasing accumulation of published findings, functional connectomic research has begun to move into the era of "big data", which generates unprecedented opportunities for discovery in brain science and simultaneously encounters various challenging issues, such as data acquisition, management and analyses. Big data on the functional connectome exhibits several critical features: high spatial and/or temporal precision, large sample sizes, long-term recording of brain activity, multidimensional biological variables (e.g., imaging, genetic, demographic, cognitive and clinic) and/or vast quantities of existing findings. We review studies regarding functional connectomics from a big data perspective, with a focus on recent methodological advances in state-of-the-art image acquisition (e.g., multiband imaging), analysis approaches and statistical strategies (e.g., graph theoretical analysis, dynamic network analysis, independent component analysis, multivariate pattern analysis and machine learning), as well as reliability and reproducibility validations. We highlight the novel findings in the application of functional connectomic big data to the exploration of the biological mechanisms of cognitive functions, normal development and aging and of neurological and psychiatric disorders. We advocate the urgent need to expand efforts directed at the methodological challenges and discuss the direction of applications in this field. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Big data, smart cities and city planning.

    Science.gov (United States)

    Batty, Michael

    2013-11-01

    I define big data with respect to its size but pay particular attention to the fact that the data I am referring to is urban data, that is, data for cities that are invariably tagged to space and time. I argue that this sort of data are largely being streamed from sensors, and this represents a sea change in the kinds of data that we have about what happens where and when in cities. I describe how the growth of big data is shifting the emphasis from longer term strategic planning to short-term thinking about how cities function and can be managed, although with the possibility that over much longer periods of time, this kind of big data will become a source for information about every time horizon. By way of conclusion, I illustrate the need for new theory and analysis with respect to 6 months of smart travel card data of individual trips on Greater London's public transport systems.

  20. Big data is not a monolith

    CERN Document Server

    Ekbia, Hamid R; Mattioli, Michael

    2016-01-01

    Big data is ubiquitous but heterogeneous. Big data can be used to tally clicks and traffic on web pages, find patterns in stock trades, track consumer preferences, identify linguistic correlations in large corpuses of texts. This book examines big data not as an undifferentiated whole but contextually, investigating the varied challenges posed by big data for health, science, law, commerce, and politics. Taken together, the chapters reveal a complex set of problems, practices, and policies. The advent of big data methodologies has challenged the theory-driven approach to scientific knowledge in favor of a data-driven one. Social media platforms and self-tracking tools change the way we see ourselves and others. The collection of data by corporations and government threatens privacy while promoting transparency. Meanwhile, politicians, policy makers, and ethicists are ill-prepared to deal with big data's ramifications. The contributors look at big data's effect on individuals as it exerts social control throu...

  1. Big universe, big data

    DEFF Research Database (Denmark)

    Kremer, Jan; Stensbo-Smidt, Kristoffer; Gieseke, Fabian Cristian

    2017-01-01

    , modern astronomy requires big data know-how, in particular it demands highly efficient machine learning and image analysis algorithms. But scalability is not the only challenge: Astronomy applications touch several current machine learning research questions, such as learning from biased data and dealing..., and highlight some recent methodological advancements in machine learning and image analysis triggered by astronomical applications.

  2. Statistical model selection with “Big Data”

    Directory of Open Access Journals (Sweden)

    Jurgen A. Doornik

    2015-12-01

    Full Text Available Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem), using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem) while testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem), using a viable approach that resolves the computational problem of immense numbers of possible models.
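
    To convey the flavor of selection at tight significance levels (not Autometrics itself, which uses a far richer tree search), here is a naive backward-elimination sketch on synthetic data:

```python
# Illustrative only: naive backward elimination at a tight significance
# level, to convey the idea of strict selection over many candidate
# variables. Autometrics itself is a much more sophisticated procedure.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, k = 1_000, 30
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n)  # only 2 real effects

cols = list(range(k))
while cols:
    fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
    pvals = fit.pvalues[1:]          # skip the intercept
    worst = int(np.argmax(pvals))
    if pvals[worst] < 0.001:         # tight level to limit false positives
        break
    cols.pop(worst)                  # drop the least significant variable

print("retained columns:", cols)
```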

  3. Poker Player Behavior After Big Wins and Big Losses

    OpenAIRE

    Gary Smith; Michael Levere; Robert Kurtzman

    2009-01-01

    We find that experienced poker players typically change their style of play after winning or losing a big pot--most notably, playing less cautiously after a big loss, evidently hoping for lucky cards that will erase their loss. This finding is consistent with Kahneman and Tversky's (Kahneman, D., A. Tversky. 1979. Prospect theory: An analysis of decision under risk. Econometrica 47(2) 263-292) break-even hypothesis and suggests that when investors incur a large loss, it might be time to take ...

  4. Big Data and Chemical Education

    Science.gov (United States)

    Pence, Harry E.; Williams, Antony J.

    2016-01-01

    The amount of computerized information that organizations collect and process is growing so large that the term Big Data is commonly being used to describe the situation. Accordingly, Big Data is defined by a combination of the Volume, Variety, Velocity, and Veracity of the data being processed. Big Data tools are already having an impact in…

  5. Big data in Finnish financial services

    OpenAIRE

    Laurila, M. (Mikko)

    2017-01-01

    Abstract This thesis aims to explore the concept of big data, and create understanding of big data maturity in the Finnish financial services industry. The research questions of this thesis are “What kind of big data solutions are being implemented in the Finnish financial services sector?” and “Which factors impede faster implementation of big data solutions in the Finnish financial services sector?”. ...

  6. Unifying the aspects of the Big Five, the interpersonal circumplex, and trait affiliation.

    Science.gov (United States)

    DeYoung, Colin G; Weisberg, Yanna J; Quilty, Lena C; Peterson, Jordan B

    2013-10-01

    Two dimensions of the Big Five, Extraversion and Agreeableness, are strongly related to interpersonal behavior. Factor analysis has indicated that each of the Big Five contains two separable but related aspects. The present study examined the manner in which the aspects of Extraversion (Assertiveness and Enthusiasm) and Agreeableness (Compassion and Politeness) relate to interpersonal behavior and trait affiliation, with the hypothesis that these four aspects have a structure corresponding to the octants of the interpersonal circumplex. A second hypothesis was that measures of trait affiliation would fall between Enthusiasm and Compassion in the IPC. These hypotheses were tested in three demographically different samples (N = 469; 294; 409) using both behavioral frequency and trait measures of the interpersonal circumplex, in conjunction with the Big Five Aspect Scales (BFAS) and measures of trait affiliation. Both hypotheses were strongly supported. These findings provide a more thorough and precise mapping of the interpersonal traits within the Big Five and support the integration of the Big Five with models of interpersonal behavior and trait affiliation. © 2012 Wiley Periodicals, Inc.

  7. Big Five Traits and Inclusive Generalized Prejudice

    OpenAIRE

    Brandt, Mark; Crawford, Jarret

    2018-01-01

    Existing meta-analytic evidence finds that low levels of Openness and Agreeableness correlate with generalized prejudice. However, previous studies relied on restricted operationalizations of generalized prejudice that only assessed prejudice toward disadvantaged, low-status groups. Across four samples (total N = 7,543), we tested the associations between Big Five traits and generalized prejudice using an inclusive operationalization of generalized prejudice. A meta-analysis of these findings...

  8. Supporting diagnosis and treatment in medical care based on Big Data processing.

    Science.gov (United States)

    Lupşe, Oana-Sorina; Crişan-Vida, Mihaela; Stoicu-Tivadar, Lăcrămioara; Bernard, Elena

    2014-01-01

    With information and data in all domains growing every day, it is difficult to manage and extract useful knowledge for specific situations. This paper presents an integrated system architecture to support the activity of Ob-Gyn departments, with further developments in using new technology to manage Big Data processing - using Google BigQuery - in the medical domain. The data collected and processed with Google BigQuery come from different sources: two Obstetrics & Gynaecology Departments, the TreatSuggest application - an application for suggesting treatments - and a home foetal surveillance system. Data is uploaded to Google BigQuery from Bega Hospital Timişoara, Romania. The analysed data is useful for medical staff, researchers and statisticians in the public health domain. The current work describes the technological architecture and its processing possibilities, which will later be evaluated against quality criteria with the aim of supporting better decision processes in diagnosis and public health.
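
    As a point of reference, a minimal sketch of issuing an analytical query through the google-cloud-bigquery Python client; the project, dataset and table names are hypothetical, not those of the described system:

```python
# Illustrative only: run an aggregation query with the google-cloud-bigquery
# client. Project, dataset, and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-clinical-project")

sql = """
    SELECT diagnosis, COUNT(*) AS n_cases
    FROM `my-clinical-project.obgyn.admissions`
    GROUP BY diagnosis
    ORDER BY n_cases DESC
    LIMIT 10
"""

for row in client.query(sql).result():  # blocks until the job finishes
    print(row.diagnosis, row.n_cases)
```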

  9. Performance Isolation in Cloud-Based Big Data Architectures

    NARCIS (Netherlands)

    Tekinerdogan, B.; Oral, Alp

    2017-01-01

    Cloud-based big data systems usually have many different tenants that require access to the server's functionality. In a nonisolated cloud system, the different tenants can freely use the resources of the server. Hereby, disruptive tenants who exceed their limits can easily cause degradation of

  10. The BigBOSS Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schelgel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna

    2011-01-01

    BigBOSS will obtain observational constraints that will bear on three of the four 'science frontier' questions identified by the Astro2010 Cosmology and Fundamental Physics Panel of the Decadal Survey: Why is the universe accelerating, what is dark matter, and what are the properties of neutrinos? Indeed, the BigBOSS project was recommended for substantial immediate R and D support in the PASAG report. The second highest ground-based priority from the Astro2010 Decadal Survey was the creation of a funding line within the NSF to support a 'Mid-Scale Innovations' program, and it used BigBOSS as a 'compelling' example for support. This choice was the result of the Decadal Survey's Program Prioritization panels reviewing 29 mid-scale projects and recommending BigBOSS 'very highly'.

  11. Big game hunting practices, meanings, motivations and constraints: a survey of Oregon big game hunters

    Science.gov (United States)

    Suresh K. Shrestha; Robert C. Burns

    2012-01-01

    We conducted a self-administered mail survey in September 2009 with randomly selected Oregon hunters who had purchased big game hunting licenses/tags for the 2008 hunting season. Survey questions explored hunting practices, the meanings of and motivations for big game hunting, the constraints to big game hunting participation, and the effects of age, years of hunting...

  12. BIG: a Grid Portal for Biomedical Data and Images

    Directory of Open Access Journals (Sweden)

    Giovanni Aloisio

    2004-06-01

    Full Text Available Modern management of biomedical systems involves the use of many distributed resources, such as high performance computational resources to analyze biomedical data, mass storage systems to store them, medical instruments (microscopes, tomographs, etc.), and advanced visualization and rendering tools. Grids offer the computational power, security and availability needed by such novel applications. This paper presents BIG (Biomedical Imaging Grid), a Web-based Grid portal for management of biomedical information (data and images) in a distributed environment. BIG is an interactive environment that deals with complex user requests, regarding the acquisition of biomedical data and the "processing" and "delivering" of biomedical images, using the power and security of Computational Grids.

  13. Trait Emotional Intelligence and Personality: Gender-Invariant Linkages Across Different Measures of the Big Five.

    Science.gov (United States)

    Siegling, Alexander B; Furnham, Adrian; Petrides, K V

    2015-02-01

    This study investigated if the linkages between trait emotional intelligence (trait EI) and the Five-Factor Model of personality were invariant between men and women. Five English-speaking samples (N = 307-685) of mostly undergraduate students each completed a different measure of the Big Five personality traits and either the full form or short form of the Trait Emotional Intelligence Questionnaire (TEIQue). Across samples, models predicting global TEIQue scores from the Big Five were invariant between genders, with Neuroticism and Extraversion being the strongest trait EI correlates, followed by Conscientiousness, Agreeableness, and Openness. However, there was some evidence indicating that the gender-specific contributions of the Big Five to trait EI vary depending on the personality measure used, being more consistent for women. Discussion focuses on the validity of the TEIQue as a measure of trait EI and its psychometric properties, more generally.

  14. The Opportunity and Challenge of The Age of Big Data

    Science.gov (United States)

    Yunguo, Hong

    2017-11-01

    The arrival of the big data age has gradually expanded the scale of China's information industry, creating favorable conditions for the growth of information technology and computer networks. Computer system services built on big data are becoming ever more complete, and the efficiency of data processing within these systems is improving, providing an important guarantee for carrying out production plans across industries. At the same time, the rapid development of fields such as the Internet of Things, social tools and cloud computing, together with widening information channels, is increasing the amount of data and extending the reach of the big data age. We therefore need to face the opportunities and challenges of the big data age squarely and use data resources effectively. On this basis, this paper studies the opportunities and challenges of the era of big data.

  15. Big Data solutions on a small scale: Evaluating accessible high-performance computing for social research

    OpenAIRE

    Murthy, Dhiraj; Bowman, S. A.

    2014-01-01

    Though full of promise, Big Data research success is often contingent on access to the newest, most advanced, and often expensive hardware systems and the expertise needed to build and implement such systems. As a result, the accessibility of the growing number of Big Data-capable technology solutions has often been the preserve of business analytics. Pay as you store/process services like Amazon Web Services have opened up possibilities for smaller scale Big Data projects. There is high dema...

  16. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit

  17. Big data for dummies

    CERN Document Server

    Hurwitz, Judith; Halper, Fern; Kaufman, Marcia

    2013-01-01

    Find the right big data solution for your business or organization Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it m

  18. Phantom cosmology without Big Rip singularity

    Energy Technology Data Exchange (ETDEWEB)

    Astashenok, Artyom V. [Baltic Federal University of I. Kant, Department of Theoretical Physics, 236041, 14, Nevsky st., Kaliningrad (Russian Federation); Nojiri, Shin'ichi, E-mail: nojiri@phys.nagoya-u.ac.jp [Department of Physics, Nagoya University, Nagoya 464-8602 (Japan); Kobayashi-Maskawa Institute for the Origin of Particles and the Universe, Nagoya University, Nagoya 464-8602 (Japan); Odintsov, Sergei D. [Department of Physics, Nagoya University, Nagoya 464-8602 (Japan); Institucio Catalana de Recerca i Estudis Avancats - ICREA and Institut de Ciencies de l'Espai (IEEC-CSIC), Campus UAB, Facultat de Ciencies, Torre C5-Par-2a pl, E-08193 Bellaterra (Barcelona) (Spain); Tomsk State Pedagogical University, Tomsk (Russian Federation); Yurov, Artyom V. [Baltic Federal University of I. Kant, Department of Theoretical Physics, 236041, 14, Nevsky st., Kaliningrad (Russian Federation)

    2012-03-23

    We construct phantom energy models with the equation of state parameter w which is less than -1, w<-1, but finite-time future singularity does not occur. Such models can be divided into two classes: (i) energy density increases with time ('phantom energy' without 'Big Rip' singularity) and (ii) energy density tends to constant value with time ('cosmological constant' with asymptotically de Sitter evolution). The disintegration of bound structure is confirmed in Little Rip cosmology. Surprisingly, we find that such disintegration (on example of Sun-Earth system) may occur even in asymptotically de Sitter phantom universe consistent with observational data. We also demonstrate that non-singular phantom models admit wormhole solutions as well as possibility of Big Trip via wormholes.

  19. Phantom cosmology without Big Rip singularity

    International Nuclear Information System (INIS)

    Astashenok, Artyom V.; Nojiri, Shin'ichi; Odintsov, Sergei D.; Yurov, Artyom V.

    2012-01-01

    We construct phantom energy models with the equation of state parameter w which is less than -1, w<-1, but finite-time future singularity does not occur. Such models can be divided into two classes: (i) energy density increases with time (“phantom energy” without “Big Rip” singularity) and (ii) energy density tends to constant value with time (“cosmological constant” with asymptotically de Sitter evolution). The disintegration of bound structure is confirmed in Little Rip cosmology. Surprisingly, we find that such disintegration (on example of Sun-Earth system) may occur even in asymptotically de Sitter phantom universe consistent with observational data. We also demonstrate that non-singular phantom models admit wormhole solutions as well as possibility of Big Trip via wormholes.

  20. Exploring complex and big data

    Directory of Open Access Journals (Sweden)

    Stefanowski Jerzy

    2017-12-01

    Full Text Available This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start with defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture. Difficulties in managing data quality and provenance are also highlighted. The characteristics of big data imply also specific requirements and challenges for data mining algorithms, which we address as well. The links with related areas, including data streams and deep learning, are discussed. The common theme that naturally emerges from this characterization is complexity. All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data volume.

  1. Was there a big bang

    International Nuclear Information System (INIS)

    Narlikar, J.

    1981-01-01

    In discussing the viability of the big-bang model of the Universe, the relevant evidence is examined, including the discrepancies in the age of the big-bang Universe, the red shifts of quasars, the microwave background radiation, general-relativity aspects such as the change of the gravitational constant with time, and quantum theory considerations. It is felt that the arguments considered show that the big-bang picture is not as soundly established, either theoretically or observationally, as it is usually claimed to be, that the cosmological problem is still wide open, and that alternatives to the standard big-bang picture should be seriously investigated. (U.K.)

  2. BIG DATA-DRIVEN MARKETING: AN ABSTRACT

    OpenAIRE

    Suoniemi, Samppa; Meyer-Waarden, Lars; Munzel, Andreas

    2017-01-01

    Customer information plays a key role in managing successful relationships with valuable customers. Big data customer analytics use (BD use), i.e., the extent to which customer information derived from big data analytics guides marketing decisions, helps firms better meet customer needs for competitive advantage. This study addresses three research questions: What are the key antecedents of big data customer analytics use? How, and to what extent, does big data customer an...

  3. Big Data Analytics in Medicine and Healthcare.

    Science.gov (United States)

    Ristevski, Blagoj; Chen, Ming

    2018-05-10

    This paper surveys big data, highlighting big data analytics in medicine and healthcare. The big data characteristics (value, volume, velocity, variety, veracity and variability) are described. Big data analytics in medicine and healthcare covers the integration and analysis of large amounts of complex heterogeneous data such as various omics data (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenomics, diseasomics), biomedical data and electronic health records data. We underline the challenging issues of big data privacy and security. Regarding big data characteristics, some directions for using suitable and promising open-source distributed data-processing software platforms are given.

  4. The trashing of Big Green

    International Nuclear Information System (INIS)

    Felten, E.

    1990-01-01

    The Big Green initiative on California's ballot lost by a margin of 2-to-1. Green measures lost in five other states, shocking ecology-minded groups. According to the postmortem by environmentalists, Big Green was a victim of poor timing and big spending by the opposition. Now its supporters plan to break up the bill and try to pass some provisions in the Legislature

  5. The Big Bang Singularity

    Science.gov (United States)

    Ling, Eric

    The big bang theory is a model of the universe which makes the striking prediction that the universe began a finite amount of time in the past at the so-called "Big Bang singularity." We explore the physical and mathematical justification of this surprising result. After laying down the framework of the universe as a spacetime manifold, we combine physical observations with global symmetry assumptions to deduce the FRW cosmological models, which predict a big bang singularity. Next we prove a couple of theorems due to Stephen Hawking which show that the big bang singularity exists even if one removes the global symmetry assumptions. Lastly, we investigate the conditions one needs to impose on a spacetime if one wishes to avoid a singularity. The ideas and concepts used here to study spacetimes are similar to those used to study Riemannian manifolds; therefore we compare and contrast the two geometries throughout.
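
    For context, the FRW prediction mentioned above follows from the Friedmann equation; the matter-dominated flat case below is a standard worked example, not taken from this record:

        \[ H^2 = \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}, \qquad \rho \propto a^{-3} \ \Rightarrow\ a(t) \propto t^{2/3}, \]

    so the scale factor a(t) vanishes as t → 0 while the density and curvature diverge: the big bang singularity.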

  6. Bohmian quantization of the big rip

    International Nuclear Information System (INIS)

    Pinto-Neto, Nelson; Pantoja, Diego Moraes

    2009-01-01

    It is shown in this paper that minisuperspace quantization of homogeneous and isotropic geometries with phantom scalar fields, when examined in the light of the Bohm-de Broglie interpretation of quantum mechanics, does not eliminate, in general, the classical big rip singularity present in the classical model. For some values of the Hamilton-Jacobi separation constant present in a class of quantum state solutions of the Wheeler-DeWitt equation, the big rip can be either completely eliminated or may still constitute a future attractor for all expanding solutions. This is contrary to the conclusion presented in [M.P. Dabrowski, C. Kiefer, and B. Sandhofer, Phys. Rev. D 74, 044022 (2006)], using a different interpretation of the wave function, where the big rip singularity is completely eliminated ('smoothed out') through quantization, independently of such a separation constant and for all members of the above-mentioned class of solutions. This is an example of the very peculiar situation where different interpretations of the same quantum state of a system predict different physical facts, instead of just giving different descriptions of the same observable facts: in fact, there is nothing more observable than the fate of the whole Universe.

  7. Implementing the “Big Data” Concept in Official Statistics

    OpenAIRE

    О. V.

    2017-01-01

    Big data is a huge resource that needs to be used at all levels of economic planning. The article is devoted to the study of the development of the concept of “Big Data” in the world and its impact on the transformation of statistical simulation of economic processes. Statistics at the current stage should take into account the complex system of international economic relations, which functions under conditions of globalization and brings new forms of economic development in small open ec...

  8. Are Rural Costs of Living Lower? Evidence from a Big Mac Index Approach

    OpenAIRE

    Scott Loveridge; Dusan Paredes

    2015-01-01

    Rural leaders can point to low housing costs as a reason that their area should be competitive for business attraction. To what extent do rural housing costs offset transportation and other locational disadvantages in cost structures? The US lacks the information to systematically answer the question. We adapt a strategy employed by The Economist in exploring purchasing power parity: the Big Mac Index. We gather information on Big Mac prices with a random sample of restaurants across the contigu...

  9. Subscales of the Barratt Impulsiveness Scale differentially relate to the Big Five factors of personality.

    Science.gov (United States)

    Lange, Florian; Wagner, Adina; Müller, Astrid; Eggert, Frank

    2017-06-01

    The place of impulsiveness in multidimensional personality frameworks is still unclear. In particular, no consensus has yet been reached with regard to the relation of impulsiveness to Neuroticism and Extraversion. We aim to contribute to a clearer understanding of these relationships by accounting for the multidimensional structure of impulsiveness. In three independent studies, we related the subscales of the Barratt Impulsiveness Scale (BIS) to the Big Five factors of personality. Study 1 investigated the associations between the BIS subscales and the Big Five factors as measured by the NEO Five-Factor Inventory (NEO-FFI) in a student sample (N = 113). Selective positive correlations emerged between motor impulsiveness and Extraversion and between attentional impulsiveness and Neuroticism. This pattern of results was replicated in Study 2 (N = 132) using a 10-item short version of the Big Five Inventory. In Study 3, we analyzed BIS and NEO-FFI data obtained from a sample of patients with pathological buying (N = 68). In these patients, the relationship between motor impulsiveness and Extraversion was significantly weakened when compared to the non-clinical samples. At the same time, the relationship between attentional impulsiveness and Neuroticism was substantially stronger in the clinical sample. Our studies highlight the utility of the BIS subscales for clarifying the relationship between impulsiveness and the Big Five personality factors. We conclude that impulsiveness might occupy multiple places in multidimensional personality frameworks, which need to be specified to improve the interpretability of impulsiveness scales. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
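
    The core analysis described in this record is a set of subscale-factor correlations. A minimal sketch of that computation follows; the column names and the tiny data set are entirely hypothetical, and the actual BIS and NEO-FFI scoring is not reproduced here:

        import pandas as pd

        # Hypothetical scored data: one row per participant, with BIS subscale
        # totals and Big Five factor scores already computed.
        df = pd.DataFrame({
            "bis_attentional": [14, 17, 11, 20, 15],
            "bis_motor":       [19, 22, 16, 25, 18],
            "bis_nonplanning": [23, 21, 26, 28, 22],
            "neuroticism":     [2.1, 3.4, 1.8, 3.9, 2.6],
            "extraversion":    [3.5, 3.9, 2.8, 4.1, 3.2],
        })

        # Pearson correlations of each BIS subscale with each Big Five factor.
        subscales = ["bis_attentional", "bis_motor", "bis_nonplanning"]
        factors = ["neuroticism", "extraversion"]
        print(df[subscales + factors].corr().loc[subscales, factors])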

  10. Effects of Pros and Cons of Applying Big Data Analytics to Consumers’ Responses in an E-Commerce Context

    Directory of Open Access Journals (Sweden)

    Thi Mai Le

    2017-05-01

    Full Text Available The era of Big Data analytics has begun in most industries within developed countries. This new analytics tool has motivated experts and researchers to study its impact on business value and its challenges. However, studies that help to understand customers' views and their behavior towards the applications of Big Data analytics are lacking. This research aims to explore and determine the pros and cons of applying Big Data analytics that affect customers' responses in an e-commerce environment. Data analyses were conducted on a sample of 273 respondents from Vietnam. The findings show that information search, recommendation systems, dynamic pricing, and customer services had significant positive effects on customers' responses. Privacy and security, shopping addiction, and group influences were found to have significant negative effects on customers' responses. Customers' responses were measured at the intention and behavior stages. Moreover, positive and negative effects simultaneously exerted significant effects on customers' responses. Each dimension of the positive and negative factors had a different significant impact on customers' intention and behavior. Specifically, information search had a significant influence on customers' intention and improved customers' behavior. Shopping addiction showed a drastic change from intention to behavior compared to group influences and privacy and security. This study contributes to an improved understanding of customers' responses in the big data era. This could play an important role in developing a sustainable consumer market. E-vendors can rely on Big Data analytics, but overuse may have some negative consequences.

  11. A Hybrid Multilevel Storage Architecture for Electric Power Dispatching Big Data

    Science.gov (United States)

    Yan, Hu; Huang, Bibin; Hong, Bowen; Hu, Jing

    2017-10-01

    Electric power dispatching is the center of the whole power system. Over a long period of operation, the power dispatching center has accumulated a large amount of data. These data are now stored in different professional power systems and form many isolated islands of information. Integrating these data and performing comprehensive analysis can greatly improve the intelligence level of power dispatching. In this paper, a hybrid multilevel storage architecture for electric power dispatching big data is proposed. It combines a relational database and a NoSQL database to establish a panoramic power grid data center that effectively meets the storage needs of power dispatching big data, including the unified storage of structured and unstructured data, fast access to massive real-time data, data version management, and so on. It can serve as a solid foundation for follow-up in-depth analysis of power dispatching big data.
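
    As a toy illustration of the hybrid idea (structured records routed to a relational store, unstructured payloads to a NoSQL-style document store), here is a sketch using only the Python standard library; the schema and the in-memory dict standing in for the NoSQL side are assumptions, not the paper's design:

        import json
        import sqlite3

        sql = sqlite3.connect(":memory:")           # relational side: structured rows
        sql.execute("CREATE TABLE telemetry (ts TEXT, station TEXT, mw REAL)")
        documents = {}                              # stand-in for a NoSQL document store

        def store(record):
            """Route a dispatching record to the appropriate storage tier."""
            if {"ts", "station", "mw"} <= record.keys():   # structured: fixed schema
                sql.execute("INSERT INTO telemetry VALUES (?, ?, ?)",
                            (record["ts"], record["station"], record["mw"]))
            else:                                          # unstructured: keep as JSON
                documents[record["id"]] = json.dumps(record)

        store({"ts": "2017-10-01T12:00", "station": "S7", "mw": 412.5})
        store({"id": "log-001", "text": "operator note: breaker 4 reclosed"})
        print(sql.execute("SELECT COUNT(*) FROM telemetry").fetchone(), len(documents))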

  12. Big Data solutions on a small scale: Evaluating accessible high-performance computing for social research

    Directory of Open Access Journals (Sweden)

    Dhiraj Murthy

    2014-11-01

    Full Text Available Though full of promise, Big Data research success is often contingent on access to the newest, most advanced, and often expensive hardware systems and the expertise needed to build and implement such systems. As a result, the accessibility of the growing number of Big Data-capable technology solutions has often been the preserve of business analytics. Pay-as-you-store/process services like Amazon Web Services have opened up possibilities for smaller scale Big Data projects. There is high demand for this type of research in the digital humanities and digital sociology, for example. However, scholars are increasingly finding themselves at a disadvantage as available data sets of interest continue to grow in size and complexity. Without a large amount of funding or the ability to form interdisciplinary partnerships, only a select few find themselves in the position to successfully engage Big Data. This article identifies several notable and popular Big Data technologies typically implemented using large and extremely powerful cloud-based systems and investigates the feasibility and utility of developing Big Data analytics systems implemented using low-cost commodity hardware in basic and easily maintainable configurations for use within academic social research. Through our investigation and experimental case study (in the growing field of social Twitter analytics), we found that not only are solutions like Cloudera’s Hadoop feasible, but that they can also enable robust, deep, and fruitful research outcomes in a variety of use-case scenarios across the disciplines.
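
    The case study's workload (social Twitter analytics) reduces to MapReduce-style aggregation, which is the pattern Hadoop distributes across commodity nodes. A pure-Python, single-machine sketch of the map/shuffle/reduce pattern follows; the sample tweets are invented:

        from collections import Counter
        from itertools import chain

        tweets = ["big data on a budget", "commodity hardware big data", "hadoop on a budget"]

        # map: each record emits (word, 1) pairs; shuffle + reduce: sum per key.
        def map_phase(tweet):
            return [(word, 1) for word in tweet.split()]

        counts = Counter()
        for word, one in chain.from_iterable(map(map_phase, tweets)):  # shuffle
            counts[word] += one                                        # reduce
        print(counts.most_common(3))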

  13. Big Data Analytics and Its Applications

    Directory of Open Access Journals (Sweden)

    Mashooque A. Memon

    2017-10-01

    Full Text Available The term Big Data was coined to refer to the extensive heave of data that cannot be managed by traditional data-handling methods or techniques. The field of Big Data plays an indispensable role in various domains, such as agriculture, banking, data mining, education, chemistry, finance, cloud computing, marketing, health care, and stocks. Big data analytics is the method of examining big data to reveal hidden patterns, unknown correlations, and other important information that can be utilized to make better decisions. There has been perpetually expanding interest in big data because of its fast growth and because it covers many areas of application. Apache Hadoop, an open-source technology written in Java that runs on the Linux operating system, was used. The primary contribution of this work is to present an effective and free solution for big data applications in a distributed environment, with its advantages and its ease of use. Looking ahead, there appears to be a need for an analytical review of new developments in big data technology. Healthcare is one of the world's major concerns. Big data in healthcare refers to electronic health data sets related to patient healthcare and well-being. Data in the healthcare area is growing beyond the managing capacity of healthcare organizations and is expected to increase significantly in the coming years.

  14. Measuring the Promise of Big Data Syllabi

    Science.gov (United States)

    Friedman, Alon

    2018-01-01

    Growing interest in Big Data is leading industries, academics and governments to accelerate Big Data research. However, how teachers should teach Big Data has not been fully examined. This article suggests criteria for redesigning Big Data syllabi in public and private degree-awarding higher education establishments. The author conducted a survey…

  15. WE-H-BRB-02: Where Do We Stand in the Applications of Big Data in Radiation Oncology?

    International Nuclear Information System (INIS)

    Xing, L.

    2016-01-01

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data to impact patient quality of care and to enhance the potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: (1) to discuss current and future sources of big data for use in radiation oncology research; (2) to optimize our current data collection by adopting new strategies from outside radiation oncology; (3) to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing: NIH/NCI, Google Inc.

  16. WE-H-BRB-02: Where Do We Stand in the Applications of Big Data in Radiation Oncology?

    Energy Technology Data Exchange (ETDEWEB)

    Xing, L. [Stanford University School of Medicine (United States)

    2016-06-15

    Big Data in Radiation Oncology: (1) Overview of the NIH 2015 Big Data Workshop, (2) Where do we stand in the applications of big data in radiation oncology?, and (3) Learning Health Systems for Radiation Oncology: Needs and Challenges for Future Success. The overriding goal of this trio panel of presentations is to improve awareness of the wide-ranging opportunities for big data to impact patient quality of care and to enhance the potential for research and collaboration opportunities with NIH and a host of new big data initiatives. This presentation will also summarize the Big Data workshop that was held at the NIH Campus on August 13–14, 2015 and sponsored by AAPM, ASTRO, and NIH. The workshop included discussion of current Big Data cancer registry initiatives, safety and incident reporting systems, and other strategies that will have the greatest impact on radiation oncology research, quality assurance, safety, and outcomes analysis. Learning Objectives: (1) to discuss current and future sources of big data for use in radiation oncology research; (2) to optimize our current data collection by adopting new strategies from outside radiation oncology; (3) to determine what new knowledge big data can provide for clinical decision support for personalized medicine. L. Xing: NIH/NCI, Google Inc.

  17. Modelling the low-tar BIG gasification concept[Biomass Integrated gasification

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Lars; Elmegaard, B.; Qvale, B.; Henriksen, Ulrrik [Technical univ. of Denmark (Denmark); Bentzen, J.D.; Hummelshoej, R. [COWI A/S (Denmark)

    2007-07-01

    A low-tar, highly efficient biomass gasification concept for medium- to large-scale power plants has been designed. The concept is named 'Low-Tar BIG' (BIG = Biomass Integrated Gasification). The concept is based on separate pyrolysis and gasification units. The volatile gases from the pyrolysis (containing tar) are partially oxidised in a separate chamber, which dramatically reduces the tar content. Thus, the investment and running costs of a gas cleaning system can be reduced, and the reliability can be increased. Both the pyrolysis and gasification chambers are bubbling fluid beds, fluidised with steam. For moist fuels, the gasifier can be integrated with a steam-drying process, where the produced steam is used in the pyrolysis/gasification chamber. In this paper, mathematical models and results from initial tests of a laboratory Low-Tar BIG gasifier are presented. Two types of models are presented: 1. The gasifier-dryer applied in different power plant systems: gas engine, simple-cycle gas turbine, recuperated gas turbine, and Integrated Gasification Combined Cycle (IGCC). The paper determines the differences in efficiency of these systems and shows that the gasifier will be applicable to very different fuels with different moisture contents, depending on the system. 2. A thermodynamic Low-Tar BIG model. This model is based on mass and heat balances between four reactors: pyrolysis, partial oxidation, gasification, and gas-solid mixing. The paper describes the results from this study and compares the results to actual laboratory tests. The study shows that the Low-Tar BIG process can use very wet fuels (up to 65-70% moisture) and still produce heat and power with a remarkably high electric efficiency. Hereby the process offers the unique combination of large-scale gasification, low-cost gas cleaning, and use of low-cost fuels, which very likely is the combination needed for a breakthrough of gasification technology. (au)

  18. 77 FR 27245 - Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN

    Science.gov (United States)

    2012-05-09

    ... DEPARTMENT OF THE INTERIOR Fish and Wildlife Service [FWS-R3-R-2012-N069; FXRS1265030000S3-123-FF03R06000] Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN AGENCY: Fish and... plan (CCP) and environmental assessment (EA) for Big Stone National Wildlife Refuge (Refuge, NWR) for...

  19. Big data in oncologic imaging.

    Science.gov (United States)

    Regge, Daniele; Mazzetti, Simone; Giannini, Valentina; Bracco, Christian; Stasi, Michele

    2017-06-01

    Cancer is a complex disease and unfortunately understanding how the components of the cancer system work does not help understand the behavior of the system as a whole. In the words of the Greek philosopher Aristotle, "the whole is greater than the sum of its parts." To date, thanks to improved information technology infrastructures, it is possible to store data from each single cancer patient, including clinical data, medical images, laboratory tests, and pathological and genomic information. Indeed, medical archive storage constitutes approximately one-third of total global storage demand, and a large part of the data are in the form of medical images. The opportunity now is to draw insight from the whole for the benefit of each individual patient. In the oncologic patient, big data analysis is in its early stages, but several useful applications can be envisaged, including the development of imaging biomarkers to predict disease outcome, assessing the risk of X-ray dose exposure or of renal damage following the administration of contrast agents, and tracking and optimizing patient workflow. The aim of this review is to present current evidence of how big data derived from medical images may impact on the diagnostic pathway of the oncologic patient.

  20. The BigBoss Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schlegel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; Bebek, C.; Becerril, S.; Blanton, M.; Bolton, A.; Bromley, B.; Cahn, R.; Carton, P.-H.; Cervantes-Cota, J.L.; Chu, Y.; Cortes, M.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna / /IAC, Mexico / / /Madrid, IFT /Marseille, Lab. Astrophys. / / /New York U. /Valencia U.

    2012-06-07

    BigBOSS is a Stage IV ground-based dark energy experiment to study baryon acoustic oscillations (BAO) and the growth of structure with a wide-area galaxy and quasar redshift survey over 14,000 square degrees. It has been conditionally accepted by NOAO in response to a call for major new instrumentation and a high-impact science program for the 4-m Mayall telescope at Kitt Peak. The BigBOSS instrument is a robotically-actuated, fiber-fed spectrograph capable of taking 5000 simultaneous spectra over a wavelength range from 340 nm to 1060 nm, with a resolution R = λ/Δλ = 3000-4800. Using data from imaging surveys that are already underway, spectroscopic targets are selected that trace the underlying dark matter distribution. In particular, targets include luminous red galaxies (LRGs) up to z = 1.0, extending the BOSS LRG survey in both redshift and survey area. To probe the universe out to even higher redshift, BigBOSS will target bright [OII] emission line galaxies (ELGs) up to z = 1.7. In total, 20 million galaxy redshifts are obtained to measure the BAO feature, trace the matter power spectrum at smaller scales, and detect redshift space distortions. BigBOSS will provide additional constraints on early dark energy and on the curvature of the universe by measuring the Ly-alpha forest in the spectra of over 600,000 2.2 < z < 3.5 quasars. BigBOSS galaxy BAO measurements combined with an analysis of the broadband power, including the Ly-alpha forest in BigBOSS quasar spectra, achieves a FOM of 395 with Planck plus Stage III priors. This FOM is based on conservative assumptions for the analysis of broad band power (k_max = 0.15), and could grow to over 600 if current work allows us to push the analysis to higher wave numbers (k_max = 0.3). BigBOSS will also place constraints on theories of modified gravity and inflation, and will measure the sum of neutrino masses to 0.024 eV accuracy.
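
    To unpack the quoted resolution figure: R = λ/Δλ, so at, say, λ = 700 nm and R = 4000 (a wavelength and resolution inside the stated ranges; the specific values are illustrative only):

        \[ \Delta\lambda = \frac{\lambda}{R} = \frac{700\ \mathrm{nm}}{4000} \approx 0.18\ \mathrm{nm}, \]

    i.e. the spectrograph can separate spectral features a fraction of a nanometre apart.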

  1. Toward a Learning Health-care System - Knowledge Delivery at the Point of Care Empowered by Big Data and NLP.

    Science.gov (United States)

    Kaggal, Vinod C; Elayavilli, Ravikumar Komandur; Mehrabi, Saeed; Pankratz, Joshua J; Sohn, Sunghwan; Wang, Yanshan; Li, Dingcheng; Rastegar, Majid Mojarad; Murphy, Sean P; Ross, Jason L; Chaudhry, Rajeev; Buntrock, James D; Liu, Hongfang

    2016-01-01

    The concept of optimizing health care by understanding and generating knowledge from previous evidence, i.e., the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and in real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of the big data environment with two other environments. The big data infrastructure significantly outperformed the other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future.

  2. Technology Evaluation for the Big Spring Water Treatment System at the Y-12 National Security Complex, Oak Ridge, Tennessee

    International Nuclear Information System (INIS)

    Bechtel Jacobs Company LLC

    2002-01-01

    The Y-12 National Security Complex (Y-12 Complex) is an active manufacturing and developmental engineering facility that is located on the U.S. Department of Energy (DOE) Oak Ridge Reservation. Building 9201-2 was one of the first process buildings constructed at the Y-12 Complex. Construction involved relocating and straightening of the Upper East Fork Poplar Creek (UEFPC) channel, adding large quantities of fill material to level areas along the creek, and pumping of concrete into sinkholes and solution cavities present within the limestone bedrock. Flow from a large natural spring designated as ''Big Spring'' on the original 1943 Stone and Webster Building 9201-2 Field Sketch FS6003 was captured and directed to UEFPC through a drainpipe designated Outfall 51. The building was used from 1953 to 1955 for pilot plant operations for an industrial process that involved the use of large quantities of elemental mercury. Past operations at the Y-12 Complex led to the release of mercury to the environment. Significant environmental media at the site were contaminated by accidental releases of mercury from the building process facilities piping and sumps associated with Y-12 Complex mercury handling facilities. Releases to the soil surrounding the buildings have resulted in significant levels of mercury in these areas of contamination, which is ultimately transported to UEFPC, its streambed, and off-site. Bechtel Jacobs Company LLC (BJC) is the DOE-Oak Ridge Operations prime contractor responsible for conducting environmental restoration activities at the Y-12 Complex. In order to mitigate the mercury being released to UEFPC, the Big Spring Water Treatment System will be designed and constructed as a Comprehensive Environmental Response, Compensation, and Liability Act action. This facility will treat the combined flow from Big Spring feeding Outfall 51 and the inflow now being processed at the East End Mercury Treatment System (EEMTS). Both discharge to UEFPC adjacent to

  3. Big-Data-Based Thermal Runaway Prognosis of Battery Systems for Electric Vehicles

    Directory of Open Access Journals (Sweden)

    Jichao Hong

    2017-07-01

    Full Text Available A thermal runaway prognosis scheme for battery systems in electric vehicles is proposed, based on a big data platform and the entropy method. It realizes simultaneous diagnosis and prognosis of thermal runaway caused by temperature faults, by monitoring battery temperature during vehicle operation. A vast quantity of real-time voltage monitoring data is derived from the National Service and Management Center for Electric Vehicles (NSMC-EV) in Beijing. Furthermore, a thermal security management strategy for thermal runaway is presented under the Z-score approach. The abnormity coefficient is introduced to provide real-time precautions against temperature abnormity. The results illustrate that the proposed method can accurately forecast both the time and location of a temperature fault within battery packs. The presented method is flexible for all disordered systems and possesses widespread application potential, not only in electric vehicles but also in other areas with complex, abnormally fluctuating environments.
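
    A minimal sketch of the Z-score screening step described above, with invented temperature readings and an assumed |z| threshold; the record does not specify the paper's actual abnormity-coefficient definition:

        import statistics

        def z_scores(temps):
            mu = statistics.mean(temps)
            sigma = statistics.stdev(temps)
            return [(t - mu) / sigma for t in temps]

        # Temperatures (degrees C) of cells in one pack at one instant (invented).
        pack = [31.2, 31.5, 30.9, 31.1, 38.4, 31.3]
        THRESHOLD = 2.0                      # assumed |z| cut-off for "abnormal"
        for cell, z in enumerate(z_scores(pack)):
            if abs(z) > THRESHOLD:
                print(f"cell {cell}: z = {z:.2f} -> potential thermal fault")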

  4. Big data and educational research

    OpenAIRE

    Beneito-Montagut, Roser

    2017-01-01

    Big data and data analytics offer the promise to enhance teaching and learning, improve educational research, and advance education governance. This chapter aims to contribute to the conceptual and methodological understanding of big data and analytics within educational research. It describes the opportunities and challenges that big data and analytics bring to education and critically explores the perils of applying a data-driven approach to education. Despite the claimed value of the...

  5. Based Real Time Remote Health Monitoring Systems: A Review on Patients Prioritization and Related "Big Data" Using Body Sensors Information and Communication Technology.

    Science.gov (United States)

    Kalid, Naser; Zaidan, A A; Zaidan, B B; Salman, Omar H; Hashim, M; Muzammil, H

    2017-12-29

    The growing worldwide population has increased the need for technologies, computerised software algorithms and smart devices that can monitor and assist patients anytime and anywhere and thus enable them to lead independent lives. The real-time remote monitoring of patients is an important issue in telemedicine. In the provision of healthcare services, patient prioritisation poses a significant challenge because of the complex decision-making process it involves when patients are considered 'big data'. To our knowledge, no study has highlighted the link between 'big data' characteristics and real-time remote healthcare monitoring in the patient prioritisation process, as well as the inherent challenges involved. Thus, we present comprehensive insights into the elements of big data characteristics according to the six 'Vs': volume, velocity, variety, veracity, value and variability. Each of these elements is presented and connected to a related part in the study of the connection between patient prioritisation and real-time remote healthcare monitoring systems. Then, we determine the weak points and recommend solutions as potential future work. This study makes the following contributions. (1) The link between big data characteristics and real-time remote healthcare monitoring in the patient prioritisation process is described. (2) The open issues and challenges for big data used in the patient prioritisation process are emphasised. (3) As a recommended solution, decision making using multiple criteria, such as vital signs and chief complaints, is utilised to prioritise the big data of patients with chronic diseases on the basis of the most urgent cases.
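
    The recommended solution is multi-criteria decision making over vital signs and chief complaints. A toy weighted-scoring sketch follows; the criteria, weights, and patients are all invented for illustration:

        # Each patient: criterion scores already normalised to 0-1 (1 = most urgent).
        patients = {
            "P1": {"heart_rate": 0.9, "spo2": 0.7, "complaint": 0.8},
            "P2": {"heart_rate": 0.3, "spo2": 0.2, "complaint": 0.4},
            "P3": {"heart_rate": 0.6, "spo2": 0.9, "complaint": 0.5},
        }
        weights = {"heart_rate": 0.4, "spo2": 0.4, "complaint": 0.2}  # assumed

        def urgency(scores):
            return sum(weights[c] * v for c, v in scores.items())

        queue = sorted(patients, key=lambda p: urgency(patients[p]), reverse=True)
        print(queue)   # most urgent patient first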

  6. Big Data access and infrastructure for modern biology: case studies in data repository utility.

    Science.gov (United States)

    Boles, Nathan C; Stone, Tyler; Bergeron, Charles; Kiehl, Thomas R

    2017-01-01

    Big Data is no longer solely the purview of big organizations with big resources. Today's routine tools and experimental methods can generate large slices of data. For example, high-throughput sequencing can quickly interrogate biological systems for the expression levels of thousands of different RNAs, examine epigenetic marks throughout the genome, and detect differences in the genomes of individuals. Multichannel electrophysiology platforms produce gigabytes of data in just a few minutes of recording. Imaging systems generate videos capturing biological behaviors over the course of days. Thus, any researcher now has access to a veritable wealth of data. However, the ability of any given researcher to utilize that data is limited by her/his own resources and skills for downloading, storing, and analyzing the data. In this paper, we examine the necessary resources required to engage Big Data, survey the state of modern data analysis pipelines, present a few data repository case studies, and touch on current institutions and programs supporting the work that relies on Big Data. © 2016 New York Academy of Sciences.

  7. Identifying the potential of changes to blood sample logistics using simulation

    DEFF Research Database (Denmark)

    Jørgensen, Pelle Morten Thomas; Jacobsen, Peter; Poulsen, Jørgen Hjelm

    2013-01-01

    of the simulation was to evaluate changes made to the transportation of blood samples between wards and the laboratory. The average (AWT) and maximum waiting times (MWT) from when a blood sample was drawn at the ward until it was received at the laboratory, and the distribution of arrivals of blood samples... Each of the scenarios was tested in terms of what amount of resources would give the optimal result. The simulations showed a big improvement potential in implementing a new technology/means for transporting the blood samples. The pneumatic tube system showed the biggest potential, lowering the AWT...
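
    A stripped-down version of such a transport comparison (a batched porter round versus an on-demand pneumatic tube) is sketched below; all rates and delays are invented, not taken from the study:

        import random

        random.seed(1)
        draws = sorted(random.uniform(0, 480) for _ in range(200))  # draw times, minutes

        def waits_porter(times, interval=60):
            # Wait until the next hourly porter pickup (transit ignored for simplicity).
            return [interval * (int(t // interval) + 1) - t for t in times]

        def waits_tube(times, transit=5):
            # Dispatched immediately; the wait is just a fixed tube transit time.
            return [transit] * len(times)

        for name, w in [("porter", waits_porter(draws)), ("tube", waits_tube(draws))]:
            print(f"{name}: AWT={sum(w)/len(w):.1f} min, MWT={max(w):.1f} min")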

  8. Big bang is not needed

    Energy Technology Data Exchange (ETDEWEB)

    Allen, A.D.

    1976-02-01

    Recent computer simulations indicate that a system of n gravitating masses breaks up, even when the total energy is negative. As a result, almost any initial phase-space distribution results in a universe that eventually expands under the Hubble law. Hence Hubble expansion implies little regarding an initial cosmic state. In particular, it does not imply the singularly dense superposed state used in the big bang model.

  9. A study on decision-making of food supply chain based on big data

    OpenAIRE

    Ji, Guojun; Hu, Limei; Tan, Kim Hua

    2017-01-01

    As more and more companies capture and analyze huge volumes of data to improve the performance of their supply chains, this paper develops a big data harvest model that uses big data as input to make more informed production decisions in the food supply chain. By introducing a Bayesian network method, this paper integrates sample data and identifies cause-and-effect relationships in the data to predict market demand. Then the deduction graph model translates product demand into processes and divides...
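
    The Bayesian step can be illustrated in a few lines: given a prior over demand levels and the likelihood of an observed market signal under each level, the posterior updates the demand forecast (all numbers invented):

        # Prior over demand levels and likelihood of observing a "strong sales"
        # signal under each level (hypothetical values).
        prior = {"low": 0.3, "medium": 0.5, "high": 0.2}
        likelihood_strong = {"low": 0.1, "medium": 0.4, "high": 0.8}

        # Bayes' rule: P(demand | signal) is proportional to P(signal | demand) * P(demand).
        unnorm = {d: likelihood_strong[d] * p for d, p in prior.items()}
        total = sum(unnorm.values())
        posterior = {d: v / total for d, v in unnorm.items()}
        print(posterior)   # probability mass shifts toward "high" demand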

  10. Big Data's Role in Precision Public Health.

    Science.gov (United States)

    Dolley, Shawn

    2018-01-01

    Precision public health is an emerging practice to more granularly predict and understand public health risks and customize treatments for more specific and homogeneous subpopulations, often using new data, technologies, and methods. Big data is one element that has consistently helped to achieve these goals, through its ability to deliver to practitioners a volume and variety of structured or unstructured data not previously possible. Big data has enabled more widespread and specific research and trials of stratifying and segmenting populations at risk for a variety of health problems. Examples of success using big data are surveyed in surveillance and signal detection, predicting future risk, targeted interventions, and understanding disease. Using novel big data or big data approaches has risks that remain to be resolved. The continued growth in volume and variety of available data, decreased costs of data capture, and emerging computational methods mean big data success will likely be a required pillar of precision public health into the future. This review article aims to identify the precision public health use cases where big data has added value, identify classes of value that big data may bring, and outline the risks inherent in using big data in precision public health efforts.

  11. Interventions for treating osteoarthritis of the big toe joint.

    Science.gov (United States)

    Zammit, Gerard V; Menz, Hylton B; Munteanu, Shannon E; Landorf, Karl B; Gilheany, Mark F

    2010-09-08

    Osteoarthritis affecting the big toe joint of the foot (hallux limitus or rigidus) is a common and painful condition. Although several treatments have been proposed, few have been adequately evaluated. To identify controlled trials evaluating interventions for osteoarthritis of the big toe joint and to determine the optimum intervention(s). Literature searches were conducted across the following electronic databases: CENTRAL; MEDLINE; EMBASE; CINAHL; and PEDro (to 14th January 2010). No language restrictions were applied. Randomised controlled trials, quasi-randomised trials, or controlled clinical trials that assessed treatment outcomes for osteoarthritis of the big toe joint. Participants of any age or gender with osteoarthritis of the big toe joint (defined either radiographically or clinically) were included. Two authors examined the list of titles and abstracts identified by the literature searches. One content area expert and one methodologist independently applied the pre-determined inclusion and exclusion criteria to the full text of identified trials. To minimise error and reduce potential bias, data were extracted independently by two content experts. Only one trial satisfactorily fulfilled the inclusion criteria and was included in this review. This trial evaluated the effectiveness of two physical therapy programs in 20 individuals with osteoarthritis of the big toe joint. Assessment outcomes included pain levels, big toe joint range of motion and plantar flexion strength of the hallux. Mean differences at four weeks' follow-up were 3.80 points (95% CI 2.74 to 4.86) for self-reported pain, 28.30 degrees (95% CI 21.37 to 35.23) for big toe joint range of motion, and 2.80 kg (95% CI 2.13 to 3.47) for muscle strength. Although differences in outcomes between treatment and control groups were reported, the risk of bias was high. The trial failed to employ appropriate randomisation or adequate allocation concealment, used a relatively small sample and

  12. Big Data, indispensable today

    Directory of Open Access Journals (Sweden)

    Radu-Ioan ENACHE

    2015-10-01

    Full Text Available Big data is, and will increasingly be, used as a tool for everything that happens both online and offline. Of course, the online world is its natural habitat: Big Data is found in this medium, offering many advantages and being a real help for all consumers. In this paper we discussed Big Data as a plus in developing new applications, by gathering useful information about users and their behaviour. We have also presented the key aspects of real-time monitoring and the architecture principles of this technology. The most important benefit covered in this paper is presented in the cloud section.

  13. Antigravity and the big crunch/big bang transition

    Science.gov (United States)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-08-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  14. Antigravity and the big crunch/big bang transition

    Energy Technology Data Exchange (ETDEWEB)

    Bars, Itzhak [Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089-2535 (United States); Chen, Shih-Hung [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada); Department of Physics and School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287-1404 (United States); Steinhardt, Paul J., E-mail: steinh@princeton.edu [Department of Physics and Princeton Center for Theoretical Physics, Princeton University, Princeton, NJ 08544 (United States); Turok, Neil [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada)

    2012-08-29

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  15. Antigravity and the big crunch/big bang transition

    International Nuclear Information System (INIS)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-01-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  16. Big data: een zoektocht naar instituties

    NARCIS (Netherlands)

    van der Voort, H.G.; Crompvoets, J

    2016-01-01

    Big data is a well-known phenomenon, even a buzzword nowadays. It refers to an abundance of data and new possibilities to process and use them. Big data is the subject of many publications. Some pay attention to the many possibilities of big data, others warn us of their consequences. This special

  17. Data, Data, Data : Big, Linked & Open

    NARCIS (Netherlands)

    Folmer, E.J.A.; Krukkert, D.; Eckartz, S.M.

    2013-01-01

    The entire business and IT world is currently talking about Big Data, a trend that overtook Cloud Computing in mid-2013 (based on Google Trends). Policymakers are also actively engaged with Big Data. Neelie Kroes, Vice-President of the European Commission, speaks of the ‘Big Data

  18. Design compliance matrix waste sample container filling system for nested, fixed-depth sampling system

    International Nuclear Information System (INIS)

    BOGER, R.M.

    1999-01-01

    This design compliance matrix document provides specific design-related functional characteristics, constraints, and requirements for the container filling system that is part of the nested, fixed-depth sampling system. This document addresses the performance, external interface, ALARA, Authorization Basis, environmental, and design code requirements for the container filling system. The container filling system will interface with the waste stream from the fluidic pumping channels of the nested, fixed-depth sampling system and will fill containers with waste that meets the Resource Conservation and Recovery Act (RCRA) criteria for waste containing volatile and semi-volatile organic materials. The specifications for the nested, fixed-depth sampling system are described in a Level 2 Specification document (HNF-3483, Rev. 1). The basis for this design compliance matrix document is the Tank Waste Remediation System (TWRS) desk instructions for design compliance matrix documents (PI-CP-008-00, Rev. 0).

  19. Classroom Questioning Tendencies from the Perspective of Big Data

    Science.gov (United States)

    Lu, Wang; Rongxiao, Cai

    2016-01-01

    We are now living in an era of big data. From the perspective of data analysis of classroom questioning, the paper chooses three districts in City B that have significant differences. These are educationally developed District D, less developed District F, and developing District M. The study uses the stratified sampling method to choose from…

  20. Big Red: A Development Environment for Bigraphs

    DEFF Research Database (Denmark)

    Faithfull, Alexander John; Perrone, Gian David; Hildebrandt, Thomas

    2013-01-01

    We present Big Red, a visual editor for bigraphs and bigraphical reactive systems, based upon Eclipse. The editor integrates with several existing bigraph tools to permit simulation and model-checking of bigraphical models. We give a brief introduction to the bigraphs formalism, and show how...

  1. Challenges and potential solutions for big data implementations in developing countries.

    Science.gov (United States)

    Luna, D; Mayan, J C; García, M J; Almerares, A A; Househ, M

    2014-08-15

    The volume of data, the velocity with which they are generated, and their variety and lack of structure hinder their use. This creates the need to change the way information is captured, stored, processed, and analyzed, leading to the paradigm shift called Big Data. To describe the challenges and possible solutions for developing countries when implementing Big Data projects in the health sector. A non-systematic review of the literature was performed in PubMed and Google Scholar. The following keywords were used: "big data", "developing countries", "data mining", "health information systems", and "computing methodologies". A thematic review of selected articles was performed. There are challenges when implementing any Big Data program, including the exponential growth of data, special infrastructure needs, the need for a trained workforce, the need to agree on interoperability standards, privacy and security issues, and the need to include people, processes, and policies to ensure their adoption. Developing countries have particular characteristics that hinder further development of these projects. The advent of Big Data promises great opportunities for the healthcare field. In this article, we attempt to describe the challenges developing countries would face and enumerate the options to be used to achieve successful implementations of Big Data programs.

  2. Heuristics of the algorithm: Big Data, user interpretation and institutional translation

    Directory of Open Access Journals (Sweden)

    Göran Bolin

    2015-10-01

    Full Text Available Intelligence on mass media audiences was founded on representative statistical samples, analysed by statisticians at the market departments of media corporations. The techniques for aggregating user data in the age of pervasive and ubiquitous personal media (e.g. laptops, smartphones, credit cards/swipe cards and radio-frequency identification) build on large aggregates of information (Big Data) analysed by algorithms that transform data into commodities. While the former technologies were built on socio-economic variables such as age, gender, ethnicity, education, media preferences (i.e. categories recognisable to media users and industry representatives alike), Big Data technologies register consumer choice, geographical position, web movement, and behavioural information in technologically complex ways that for most lay people are too abstract to appreciate the full consequences of. The data mined for pattern recognition privileges relational rather than demographic qualities. We argue that the agency of interpretation at the bottom of market decisions within media companies nevertheless introduces a ‘heuristics of the algorithm’, where the data inevitably become translated into social categories. In the paper we argue that although the promise of algorithmically generated data is often implemented in automated systems where human agency gets increasingly distanced from the data collected (it is our technological gadgets that are being surveyed, rather than us as social beings), one can observe a felt need among media users and among industry actors to ‘translate back’ the algorithmically produced relational statistics into ‘traditional’ social parameters. The tenacious social structures within the advertising industries work against the techno-economically driven tendencies within the Big Data economy.

  3. Methods and tools for big data visualization

    OpenAIRE

    Zubova, Jelena; Kurasova, Olga

    2015-01-01

    In this paper, methods and tools for big data visualization have been investigated. Challenges faced by the big data analysis and visualization have been identified. Technologies for big data analysis have been discussed. A review of methods and tools for big data visualization has been done. Functionalities of the tools have been demonstrated by examples in order to highlight their advantages and disadvantages.

  4. A cost-benefit analysis of preventative management for zebra and quagga mussels in the Colorado-Big Thompson System

    Science.gov (United States)

    Thomas, Catherine M.

    2010-01-01

    net benefits of preventative management strategies. This study builds a bioeconomic simulation model to predict and compare the expected economic costs of the CDOW boat inspection program to the benefits of reduced expected control costs to water conveyance systems, hydropower generation stations, and municipal water treatment facilities. The model is based on a case-study water delivery and storage system, the Colorado-Big Thompson system. The Colorado-Big Thompson system is an excellent example of water systems in the Rocky Mountain West. The system is nearly entirely man-made, with all of its reservoirs and delivery points connected via pipelines, tunnels, and canals. The structures and hydropower systems of the Colorado-Big Thompson system are common to other western storage and delivery systems, making the methods and insight developed from this case study transferable to other western systems. The model developed in this study contributes to the bioeconomic literature in several ways. Foremost, the model predicts the spread of dreissena mussels and the associated damage costs for a connected water system in the Rocky Mountain West. Very few zebra mussel studies have focused on western water systems. Another distinguishing factor is the simultaneous consideration of spread from propagules introduced by boats and by flows. Most zebra mussel dispersal models consider boater movement patterns combined with limnological characteristics as predictors of spread. A separate set of studies has addressed mussel spread via downstream flows. To the author's knowledge, this is the first study that builds a zebra mussel spread model that specifically accounts for propagule pressure from both boat introductions and downstream flow introductions. By modeling an entire connected system, the study highlights how the spatial layout of a system, and the risk of invasion within a system, affect the benefits of preventative management. This report is presented in five chapters. The first
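
    The underlying comparison is a standard expected-value calculation: prevention pays when program cost plus residual expected damages falls below expected damages without the program. A toy version with invented figures:

        # Hypothetical annual figures for a connected water system.
        p_invasion_no_program = 0.20     # chance mussels establish, no inspections
        p_invasion_with_program = 0.05   # reduced chance with boat inspections
        damage_cost = 10_000_000         # control costs if establishment occurs ($)
        program_cost = 500_000           # annual inspection program cost ($)

        ev_no = p_invasion_no_program * damage_cost
        ev_with = program_cost + p_invasion_with_program * damage_cost
        print(f"expected cost without program: ${ev_no:,.0f}")
        print(f"expected cost with program:    ${ev_with:,.0f}")
        print("prevention pays off" if ev_with < ev_no else "prevention too costly")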

  5. The Big bang and the Quantum

    Science.gov (United States)

    Ashtekar, Abhay

    2010-06-01

    General relativity predicts that space-time comes to an end and physics comes to a halt at the big bang. Recent developments in loop quantum cosmology have shown that these predictions cannot be trusted. Quantum geometry effects can resolve singularities, thereby opening new vistas. Examples are: the big bang is replaced by a quantum bounce; the 'horizon problem' disappears; immediately after the big bounce, there is a super-inflationary phase with its own phenomenological ramifications; and, in the presence of a standard inflation potential, initial conditions are naturally set for a long, slow-roll inflation independently of what happens in the pre-big bang branch. As in my talk at the conference, I will first discuss the foundational issues and then the implications of the new Planck scale physics near the Big Bang.

  6. A cosmogonical analogy between the Big Bang and a supernova

    International Nuclear Information System (INIS)

    Brown, W.K.

    1981-01-01

    The Big Bang may be discussed most easily in analogy with an expanding spherical shell. An expanding spherical shell, in turn, is quite similar to an ejected supernova shell. In both the Big Bang and the supernova, fragmentation is postulated to occur, where each fragment of the universe becomes a galaxy, and each fragment of the supernova shell becomes a solar system. By supposing the presence of shearing flow at the time of fragmentation, a model has been constructed to examine the results in both cases. It has been shown that the model produces a good description of reality on both the galactic and solar system scales. (Auth.)

  7. Big Bang baryosynthesis

    International Nuclear Information System (INIS)

    Turner, M.S.; Chicago Univ., IL

    1983-01-01

    In these lectures I briefly review Big Bang baryosynthesis. In the first lecture I discuss the evidence which exists for the BAU, the failure of non-GUT symmetrical cosmologies, the qualitative picture of baryosynthesis, and numerical results of detailed baryosynthesis calculations. In the second lecture I discuss the requisite CP violation in some detail, as well as the statistical mechanics of baryosynthesis, possible complications to the simplest scenario, and one cosmological implication of Big Bang baryosynthesis. (orig./HSI)

  8. Diomres (k,m): An efficient method based on Krylov subspaces to solve big, dispersed, unsymmetrical linear systems

    Energy Technology Data Exchange (ETDEWEB)

    de la Torre Vega, E. [Instituto de Investigaciones Electricas, Cuernavaca (Mexico); Cesar Suarez Arriaga, M. [Universidad Michoacana SNH, Michoacan (Mexico)

    1995-03-01

    In geothermal simulation processes, MULKOM uses integrated finite differences to solve the corresponding partial differential equations. This method requires efficiently solving big, sparse, non-symmetric linear systems at each time iteration. The order of the system is usually greater than one thousand, and its solution can represent around 80% of the total CPU time, so reducing the time spent on this class of linear systems notably shortens the numerical simulation. When the matrix is big (N ≥ 500) and sparse, it is inefficient to handle all of the system's elements, because the system is fully described by its nonzero elements, whose number is far smaller than N². In this setting, iterative methods have advantages over Gaussian elimination methods, because the latter fill in matrices that have no special distribution of their nonzero elements and because they make no use of available solution estimates. The iterative methods of the Conjugate Gradient family, based on Krylov subspaces, have the advantage that convergence speed can be improved by means of preconditioning techniques. The DIOMRES(k,m) method guarantees a monotonic decrease of the residual norm without incurring division by zero. This technique converges in at most N iterations if the system matrix is symmetric, does not require much memory to converge, and updates the approximation immediately by using incomplete orthogonalization and adequate restarting. A preconditioned version of DIOMRES was applied to problems involving non-symmetric systems with 1000 unknowns and fewer than five terms per equation. We found that this technique can notably reduce the time needed to find the solution without increasing memory requirements. The coupling of this method to geothermal versions of MULKOM is in progress.
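
    DIOMRES itself uses incomplete orthogonalization and adequate restarting, details this record does not give; as a generic illustration of the restarted Krylov-subspace family it belongs to, here is a minimal GMRES(m)-style sketch in NumPy:

        import numpy as np

        def gmres_restarted(A, b, m=20, tol=1e-8, max_restarts=50):
            """Restarted GMRES(m): minimise the residual over a Krylov subspace,
            then restart from the improved iterate (generic sketch, not DIOMRES)."""
            n = b.size
            x = np.zeros(n)
            for _ in range(max_restarts):
                r = b - A @ x
                beta = np.linalg.norm(r)
                if beta < tol:
                    break
                Q = np.zeros((n, m + 1))      # Arnoldi basis of the Krylov subspace
                H = np.zeros((m + 1, m))      # upper Hessenberg projection of A
                Q[:, 0] = r / beta
                k = m
                for j in range(m):
                    w = A @ Q[:, j]
                    for i in range(j + 1):    # modified Gram-Schmidt orthogonalization
                        H[i, j] = Q[:, i] @ w
                        w -= H[i, j] * Q[:, i]
                    H[j + 1, j] = np.linalg.norm(w)
                    if H[j + 1, j] < 1e-14:   # happy breakdown: exact solution found
                        k = j + 1
                        break
                    Q[:, j + 1] = w / H[j + 1, j]
                e1 = np.zeros(k + 1)
                e1[0] = beta
                y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)  # small least squares
                x = x + Q[:, :k] @ y
            return x

        rng = np.random.default_rng(0)        # toy non-symmetric test system
        A = 4 * np.eye(100) + rng.normal(scale=0.1, size=(100, 100))
        b = rng.normal(size=100)
        x = gmres_restarted(A, b)
        print(np.linalg.norm(b - A @ x))      # residual norm, should be below 1e-8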

  9. Exploiting big data for critical care research.

    Science.gov (United States)

    Docherty, Annemarie B; Lone, Nazir I

    2015-10-01

    Over recent years the digitalization, collection and storage of vast quantities of data, in combination with advances in data science, has opened up a new era of big data. In this review, we define big data, identify examples of critical care research using big data, discuss the limitations and ethical concerns of using these large datasets and finally consider scope for future research. Big data refers to datasets whose size, complexity and dynamic nature are beyond the scope of traditional data collection and analysis methods. The potential benefits to critical care are significant, with faster progress in improving health and better value for money. Although not replacing clinical trials, big data can improve their design and advance the field of precision medicine. However, there are limitations to analysing big data using observational methods. In addition, there are ethical concerns regarding maintaining confidentiality of patients who contribute to these datasets. Big data have the potential to improve medical care and reduce costs, both by individualizing medicine, and bringing together multiple sources of data about individual patients. As big data become increasingly mainstream, it will be important to maintain public confidence by safeguarding data security, governance and confidentiality.

  10. Big Data in Health: a Literature Review from the Year 2005.

    Science.gov (United States)

    de la Torre Díez, Isabel; Cosgaya, Héctor Merino; Garcia-Zapirain, Begoña; López-Coronado, Miguel

    2016-09-01

    The information stored in healthcare systems has increased over the last ten years, leading it to be considered Big Data. There is a wealth of health information ready to be analysed. However, the sheer volume raises a challenge for traditional methods. The aim of this article is to conduct a cutting-edge study on Big Data in healthcare from 2005 to the present. This literature review will help researchers to know how Big Data has developed in the health industry and open up new avenues for research. Information searches have been made on various scientific databases such as Pubmed, Science Direct, Scopus and Web of Science for Big Data in healthcare. The search criteria were "Big Data" and "health" with a date range from 2005 to the present. A total of 9724 articles were found on the databases. 9515 articles were discarded as duplicates or for not having a title of interest to the study. 209 articles were read, with the resulting decision that 46 were useful for this study. 52.6 % of the articles used were found in Science Direct, 23.7 % in Pubmed, 22.1 % through Scopus and the remaining 2.6 % through the Web of Science. Big Data has undergone extremely high growth since 2011 and its use is becoming compulsory in developed nations and in an increasing number of developing nations. Big Data is a step forward and a cost reducer for public and private healthcare.

  11. Empathy and the Big Five

    OpenAIRE

    Paulus, Christoph

    2016-01-01

    Del Barrio et al. (2004) attempted more than 10 years ago to establish a direct relationship between empathy and the Big Five. On average, the women in their sample had higher scores on empathy and on the Big Five factors, with the exception of the factor neuroticism. They found associations with empathy in the domains of openness, agreeableness, conscientiousness and extraversion. In our data, women score significantly higher on both empathy and the Big Five ...

  12. On Study of Application of Big Data and Cloud Computing Technology in Smart Campus

    Science.gov (United States)

    Tang, Zijiao

    2017-12-01

    We live in an era of networks and information in which we produce and face a great deal of data every day. Databases in the traditional sense cannot easily store, process and analyze such mass data, and so big data was born at the right moment. Meanwhile, the development and operation of big data rest on cloud computing, which provides sufficient space and resources to process and analyze data with big data technology. The proposal of smart campus construction aims at improving the informatization of colleges and universities; it is therefore necessary to consider combining big data and cloud computing technology in the construction of a smart campus, so that the campus database system and campus management system are integrated rather than isolated, and so that smart campus construction is served by integrating, storing, processing and analyzing mass data.

  13. Planetary Sample Caching System Design Options

    Science.gov (United States)

    Collins, Curtis; Younse, Paulo; Backes, Paul

    2009-01-01

    Potential Mars Sample Return missions would aspire to collect small core and regolith samples using a rover with a sample acquisition tool and sample caching system. Samples would need to be stored in individual sealed tubes in a canister that could be transferred to a Mars ascent vehicle and returned to Earth. A sample handling, encapsulation and containerization system (SHEC) has been developed as part of an integrated system for acquiring and storing core samples for application to future potential MSR and other potential sample return missions. Requirements and design options for the SHEC system were studied and a recommended design concept developed. Two families of solutions were explored: 1) transfer of a raw sample from the tool to the SHEC subsystem and 2) transfer of a tube containing the sample to the SHEC subsystem. The recommended design utilizes sample tool bit changeout as the mechanism for transferring tubes to, and samples in tubes from, the tool. The SHEC subsystem design, called the Bit Changeout Caching (BiCC) design, is intended for operations on a MER-class rover.

  14. Relationship Between Big Five Personality Traits, Emotional Intelligence and Self-esteem Among College Students

    OpenAIRE

    Fauzia Nazir, Anam Azam, Muhammad Rafiq, Sobia Nazir, Sophia Nazir, Shazia Tasleem

    2015-01-01

    The current research study examined the “Relationship between Big Five Personality Traits & Emotional Intelligence and Self-esteem among the College Students”. This work is based on a cross-sectional survey research design. A convenience sample of 170 female students was used, studying in the 3rd and 4th year of the degree program at Government College Kotla Arab Ali Khan, Gujrat, Pakistan. The study variables were measured using the Big Five Inventory Scale by Goldberg (1993), the Emotional Intell...

  15. Big domains are novel Ca²+-binding modules: evidences from big domains of Leptospira immunoglobulin-like (Lig) proteins.

    Science.gov (United States)

    Raman, Rajeev; Rajanikanth, V; Palaniappan, Raghavan U M; Lin, Yi-Pin; He, Hongxuan; McDonough, Sean P; Sharma, Yogendra; Chang, Yung-Fu

    2010-12-29

    Many bacterial surface exposed proteins mediate the host-pathogen interaction more effectively in the presence of Ca²+. Leptospiral immunoglobulin-like (Lig) proteins, LigA and LigB, are surface exposed proteins containing Bacterial immunoglobulin-like (Big) domains. The function of proteins which contain the Big fold is not known. Based on the possible similarities of immunoglobulin and βγ-crystallin folds, we here explore the important question whether Ca²+ binds to a Big domain, which would provide a novel functional role of the proteins containing the Big fold. We selected six individual Big domains for this study (three from the conserved part of LigA and LigB, denoted as Lig A3, Lig A4, and LigBCon5; two from the variable region of LigA, i.e., the 9th (Lig A9) and 10th (Lig A10) repeats; and one from the variable region of LigB, i.e., LigBCen2). We have also studied the conserved regions covering the three and six repeats (LigBCon1-3 and LigCon). All these proteins bind the calcium-mimic dye Stains-all. All four selected domains bind Ca²+ with dissociation constants of 2-4 µM. Lig A9 and Lig A10 domains fold well with moderate thermal stability, have β-sheet conformation and form homodimers. Fluorescence spectra of Big domains show a specific doublet (at 317 and 330 nm), probably due to Trp interaction with a Phe residue. Equilibrium unfolding of selected Big domains is similar and follows a two-state model, suggesting the similarity in their fold. We demonstrate that the Lig proteins are Ca²+-binding proteins, with Big domains harbouring the binding motif. We conclude that despite differences in sequence, a Big motif binds Ca²+. This work thus sets up a strong possibility for classifying the proteins containing Big domains as a novel family of Ca²+-binding proteins. Since the Big domain is a part of many proteins in the bacterial kingdom, we suggest a possible function of these proteins via Ca²+ binding.
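
    For orientation, the reported dissociation constants can be read through the standard single-site (1:1) binding isotherm; the model below is a textbook assumption of this note, not an equation taken from the paper:

      \theta = \frac{[\mathrm{Ca^{2+}}]}{K_d + [\mathrm{Ca^{2+}}]}

    With K_d ≈ 2-4 µM, the domains are half-loaded at [Ca²⁺] = K_d and roughly 91% loaded at [Ca²⁺] = 10 K_d.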

  16. Core sampling system spare parts assessment

    International Nuclear Information System (INIS)

    Walter, E.J.

    1995-01-01

    Soon, there will be 4 independent core sampling systems obtaining samples from the underground tanks. It is desirable that these systems be available for sampling during the next 2 years. This assessment was prepared to evaluate the adequacy of the spare parts identified for the core sampling system and to provide recommendations that may remediate overages or inadequacies of spare parts

  17. Semantic Web Technologies and Big Data Infrastructures: SPARQL Federated Querying of Heterogeneous Big Data Stores

    OpenAIRE

    Konstantopoulos, Stasinos; Charalambidis, Angelos; Mouchakis, Giannis; Troumpoukis, Antonis; Jakobitsch, Jürgen; Karkaletsis, Vangelis

    2016-01-01

    The ability to cross-link large scale data with each other and with structured Semantic Web data, and the ability to uniformly process Semantic Web and other data adds value to both the Semantic Web and to the Big Data community. This paper presents work in progress towards integrating Big Data infrastructures with Semantic Web technologies, allowing for the cross-linking and uniform retrieval of data stored in both Big Data infrastructures and Semantic Web data. The technical challenges invo...

  18. Quantum fields in a big-crunch-big-bang spacetime

    International Nuclear Information System (INIS)

    Tolley, Andrew J.; Turok, Neil

    2002-01-01

    We consider quantum field theory on a spacetime representing the big-crunch-big-bang transition postulated in ekpyrotic or cyclic cosmologies. We show via several independent methods that an essentially unique matching rule holds connecting the incoming state, in which a single extra dimension shrinks to zero, to the outgoing state in which it reexpands at the same rate. For free fields in our construction there is no particle production from the incoming adiabatic vacuum. When interactions are included the particle production for fixed external momentum is finite at the tree level. We discuss a formal correspondence between our construction and quantum field theory on de Sitter spacetime

  19. Turning big bang into big bounce: II. Quantum dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Malkiewicz, Przemyslaw; Piechocki, Wlodzimierz, E-mail: pmalk@fuw.edu.p, E-mail: piech@fuw.edu.p [Theoretical Physics Department, Institute for Nuclear Studies, Hoza 69, 00-681 Warsaw (Poland)

    2010-11-21

    We analyze the big bounce transition of the quantum Friedmann-Robertson-Walker model in the setting of the nonstandard loop quantum cosmology (LQC). Elementary observables are used to quantize composite observables. The spectrum of the energy density operator is bounded and continuous. The spectrum of the volume operator is bounded from below and discrete. It has equally distant levels defining a quantum of the volume. The discreteness may imply a foamy structure of spacetime at a semiclassical level which may be detected in astro-cosmo observations. The nonstandard LQC method has a free parameter that should be fixed in some way to specify the big bounce transition.

  20. The ethics of big data in big agriculture

    Directory of Open Access Journals (Sweden)

    Isabelle M. Carbonell

    2016-03-01

    Full Text Available This paper examines the ethics of big data in agriculture, focusing on the power asymmetry between farmers and large agribusinesses like Monsanto. Following the recent purchase of Climate Corp., Monsanto is currently the most prominent biotech agribusiness to buy into big data. With wireless sensors on tractors monitoring or dictating every decision a farmer makes, Monsanto can now aggregate large quantities of previously proprietary farming data, enabling a privileged position with unique insights on a field-by-field basis into a third or more of the US farmland. This power asymmetry may be rebalanced through open-sourced data, and publicly-funded data analytic tools which rival Climate Corp. in complexity and innovation for use in the public domain.

  1. MLBCD: a machine learning tool for big clinical data.

    Science.gov (United States)

    Luo, Gang

    2015-01-01

    Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.
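
    The hyper-parameter hurdle the authors describe is exactly what automated model selection is meant to remove. The paper does not expose MLBCD's actual interface, so the scikit-learn sketch below is a generic stand-in; the dataset and parameter grid are placeholders.

      # Sketch: automated hyper-parameter search, the step MLBCD aims to
      # hide from healthcare researchers. Data and grid are synthetic.
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import GridSearchCV

      X, y = make_classification(n_samples=500, n_features=20, random_state=0)

      grid = {"n_estimators": [100, 300], "max_depth": [None, 5, 10]}
      search = GridSearchCV(RandomForestClassifier(random_state=0),
                            grid, cv=5, scoring="roc_auc")
      search.fit(X, y)
      print(search.best_params_, round(search.best_score_, 3))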

  2. Big Data Analytics: Challenges And Applications For Text, Audio, Video, And Social Media Data

    OpenAIRE

    Jai Prakash Verma; Smita Agrawal; Bankim Patel; Atul Patel

    2016-01-01

    All types of machine-automated systems generate large amounts of data in different forms (statistical, text, audio, video, sensor and biometric data), giving rise to the term Big Data. In this paper we discuss the issues, challenges and applications of these types of Big Data with consideration of the big data dimensions. Here we discuss social media data analytics, content-based analytics, text data analytics, audio and video data analytics, their issues and expected applica...

  3. Homogeneous and isotropic big rips?

    CERN Document Server

    Giovannini, Massimo

    2005-01-01

    We investigate the way big rips are approached in a fully inhomogeneous description of the space-time geometry. If the pressure and energy densities are connected by a (supernegative) barotropic index, the spatial gradients and the anisotropic expansion decay as the big rip is approached. This behaviour is contrasted with the usual big-bang singularities. A similar analysis is performed in the case of sudden (quiescent) singularities and it is argued that the spatial gradients may well be non-negligible in the vicinity of pressure singularities.
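
    For readers outside cosmology: in the homogeneous textbook case (a standard result quoted here for context, not derived in this paper), a constant barotropic index w < -1 in p = wρ makes the scale factor diverge at a finite time t_rip:

      a(t) \propto \left(t_{\mathrm{rip}} - t\right)^{\frac{2}{3(1+w)}}, \qquad w < -1

    Since 1 + w < 0 the exponent is negative, so a(t) → ∞ as t → t_rip; the paper's point is that spatial gradients and anisotropic expansion decay as this limit is approached.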

  4. Rate Change Big Bang Theory

    Science.gov (United States)

    Strickland, Ken

    2013-04-01

    The Rate Change Big Bang Theory redefines the birth of the universe with a dramatic shift in energy direction and a new vision of the first moments. With rate change graph technology (RCGT) we can look back 13.7 billion years and experience every step of the big bang through geometrical intersection technology. The analysis of the Big Bang includes a visualization of the first objects, their properties, the astounding event that created space and time as well as a solution to the mystery of anti-matter.

  5. Big Data, Internet of Things and Cloud Convergence--An Architecture for Secure E-Health Applications.

    Science.gov (United States)

    Suciu, George; Suciu, Victor; Martian, Alexandru; Craciunescu, Razvan; Vulpe, Alexandru; Marcu, Ioana; Halunga, Simona; Fratu, Octavian

    2015-11-01

    Big data storage and processing are considered as one of the main applications for cloud computing systems. Furthermore, the development of the Internet of Things (IoT) paradigm has advanced the research on Machine to Machine (M2M) communications and enabled novel tele-monitoring architectures for E-Health applications. However, there is a need for converging current decentralized cloud systems, general software for processing big data and IoT systems. The purpose of this paper is to analyze existing components and methods of securely integrating big data processing with cloud M2M systems based on Remote Telemetry Units (RTUs) and to propose a converged E-Health architecture built on Exalead CloudView, a search based application. Finally, we discuss the main findings of the proposed implementation and future directions.

  6. Intelligent Test Mechanism Design of Worn Big Gear

    Directory of Open Access Journals (Sweden)

    Hong-Yu LIU

    2014-10-01

    Full Text Available With the continuous development of the national economy, big gears are widely applied in the metallurgy and mining domains, where they play an important role. In practical production, abrasion and breakage of big gears often take place, affecting normal production and causing unnecessary economic loss. An intelligent test method for worn big gears is put forward, aimed mainly at the constraints of high production cost, long production cycle and high-intensity manual repair welding work. The measurement equations of the involute spur gear are transformed from their original polar coordinate form into rectangular coordinates. The measuring principle for big gear abrasion is introduced, a detection principle diagram is given, and the realization of the detection route is described. An OADM12 laser sensor was selected, and detection of the big gear abrasion area was realized by the detection mechanism. Measured data of the unworn and worn gear are fed into a calculation program written in Visual Basic, from which the gear abrasion quantity is obtained. This provides a feasible method for intelligent testing and intelligent repair welding of worn big gears.
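
    The polar-to-rectangular transformation mentioned in the abstract is presumably that of the standard involute profile; the paper's exact equations are not quoted, so the textbook Cartesian parametrization for a base circle of radius r_b and roll angle φ is given here as an assumed reference:

      x = r_b(\cos\varphi + \varphi\sin\varphi), \qquad
      y = r_b(\sin\varphi - \varphi\cos\varphi)

    Comparing laser-measured (x, y) points against this nominal curve would yield the local abrasion depth.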

  7. [Big data in medicine and healthcare].

    Science.gov (United States)

    Rüping, Stefan

    2015-08-01

    Healthcare is one of the business fields with the highest Big Data potential. According to the prevailing definition, Big Data refers to the fact that data today is often too large and heterogeneous and changes too quickly to be stored, processed, and transformed into value by previous technologies. The technological trends drive Big Data: business processes are more and more executed electronically, consumers produce more and more data themselves - e.g. in social networks - and finally ever increasing digitalization. Currently, several new trends towards new data sources and innovative data analysis appear in medicine and healthcare. From the research perspective, omics-research is one clear Big Data topic. In practice, the electronic health records, free open data and the "quantified self" offer new perspectives for data analytics. Regarding analytics, significant advances have been made in the information extraction from text data, which unlocks a lot of data from clinical documentation for analytics purposes. At the same time, medicine and healthcare is lagging behind in the adoption of Big Data approaches. This can be traced to particular problems regarding data complexity and organizational, legal, and ethical challenges. The growing uptake of Big Data in general and first best-practice examples in medicine and healthcare in particular, indicate that innovative solutions will be coming. This paper gives an overview of the potentials of Big Data in medicine and healthcare.

  8. From Big Data to Big Business

    DEFF Research Database (Denmark)

    Lund Pedersen, Carsten

    2017-01-01

    Idea in Brief: Problem: There is an enormous profit potential for manufacturing firms in big data, but one of the key barriers to obtaining data-driven growth is the lack of knowledge about which capabilities are needed to extract value and profit from data. Solution: We (BDBB research group at C...

  9. The big five personality traits and environmental concern: the moderating roles of individualism/collectivism and gender

    OpenAIRE

    Abbas Abdollahi; Simin Hosseinian; Samaneh Karbalaei; Ahmad Beh‐Pajooh; Yousef Kesh; Mahmoud Najafi

    2017-01-01

    Environmental pollution has become a serious challenge for humanity and the environment. Therefore, this study aims to examine the relationships between the Big Five personality traits, individualism, collectivism, participants' age, and environmental concern, and to test the moderating roles of individualism/collectivism and gender in the relationship between the Big Five personality traits and environmental concern. In this quantitative study, the multi-stage cluster random sampling method ...

  10. Review of Big Gods: How Religion Transformed Cooperation and Conflict

    Directory of Open Access Journals (Sweden)

    Thomas Joseph Coleman III

    2014-05-01

    Full Text Available 'Big Gods: How Religion Transformed Cooperation and Conflict', presents an empirically grounded rational reconstruction detailing the role that belief in “big gods” (i.e., omniscient, omnipresent, and omnipotent gods) has played in the formation of society from a cultural-evolutionary perspective. Ara Norenzayan’s primary thesis is neatly summed up in the title of the book: religion has historically served—and perhaps still serves—as a building block and maintenance system in societies around the world.

  11. Making big sense from big data in toxicology by read-across.

    Science.gov (United States)

    Hartung, Thomas

    2016-01-01

    Modern information technologies have made big data available in safety sciences, i.e., extremely large data sets that may be analyzed only computationally to reveal patterns, trends and associations. This happens by (1) compilation of large sets of existing data, e.g., as a result of the European REACH regulation, (2) the use of omics technologies and (3) systematic robotized testing in a high-throughput manner. All three approaches and some other high-content technologies leave us with big data--the challenge is now to make big sense of these data. Read-across, i.e., the local similarity-based intrapolation of properties, is gaining momentum with increasing data availability and consensus on how to process and report it. It is predominantly applied to in vivo test data as a gap-filling approach, but can similarly complement other incomplete datasets. Big data are first of all repositories for finding similar substances and ensure that the available data is fully exploited. High-content and high-throughput approaches similarly require focusing on clusters, in this case formed by underlying mechanisms such as pathways of toxicity. The closely connected properties, i.e., structural and biological similarity, create the confidence needed for predictions of toxic properties. Here, a new web-based tool under development called REACH-across, which aims to support and automate structure-based read-across, is presented among others.

  12. Bringing Big Data to the Forefront of Healthcare Delivery: The Experience of Carolinas HealthCare System.

    Science.gov (United States)

    Dulin, Michael F; Lovin, Carol A; Wright, Jean A

    2017-01-01

    The use of big data to transform care delivery is rapidly becoming a reality. To deliver on the promise of value-based care, providers must know the key drivers of wellness at the patient and community levels, as well as understand resource constraints and opportunities to improve efficiency in the health-care system itself. Data are the linchpin. By gathering the right data and finding innovative ways to glean knowledge, we can improve clinical care, advance the health of our communities, improve the lives of our patients, and operate more efficiently. At Carolinas HealthCare System-one of the nation's largest health-care systems, with nearly 12 million patient encounters annually at more than 900 care locations-we have made substantial investments to establish a centralized data and analytics infrastructure that is transforming the way we deliver care across the continuum. Although the impetus and vision for our program have evolved over the past decade, our efforts coalesced into a strategic, centralized initiative with the launch of the Dickson Advanced Analytics (DA2) group in 2012. DA2 has yielded significant gains in our ability to use data, not only for reporting purposes and understanding our business but also for predicting outcomes and informing action.While these efforts have been successful, the path has not been easy. Effectively harnessing big data requires navigating myriad technological, cultural, operational, and other hurdles. Building a program that is feasible, effective, and sustainable takes concerted effort and a rigorous process of continuous self-evaluation and strategic adaptation.

  13. Bringing Big Data to the Forefront of Healthcare Delivery: The Experience of Carolinas HealthCare System.

    Science.gov (United States)

    Dulin, Michael F; Lovin, Carol A; Wright, Jean A

    2016-01-01

    The use of big data to transform care delivery is rapidly becoming a reality. To deliver on the promise of value-based care, providers must know the key drivers of wellness at the patient and community levels, as well as understand resource constraints and opportunities to improve efficiency in the healthcare system itself. Data are the linchpin. By gathering the right data and finding innovative ways to glean knowledge, we can improve clinical care, advance the health of our communities, improve the lives of our patients, and operate more efficiently. At Carolinas HealthCare System-one of the nation's largest healthcare systems, with nearly 12 million patient encounters annually at more than 900 care locations-we have made substantial investments to establish a centralized data and analytics infrastructure that is transforming the way we deliver care across the continuum. Although the impetus and vision for our program have evolved over the past decade, our efforts coalesced into a strategic, centralized initiative with the launch of the Dickson Advanced Analytics (DA2) group in 2012. DA2 has yielded significant gains in our ability to use data, not only for reporting purposes and understanding our business but also for predicting outcomes and informing action.While these efforts have been successful, the path has not been easy. Effectively harnessing big data requires navigating myriad technological, cultural, operational, and other hurdles. Building a program that is feasible, effective, and sustainable takes concerted effort and a rigorous process of continuous self-evaluation and strategic adaptation.

  14. Entity information life cycle for big data master data management and information integration

    CERN Document Server

    Talburt, John R

    2015-01-01

    Entity Information Life Cycle for Big Data walks you through the ins and outs of managing entity information so you can successfully achieve master data management (MDM) in the era of big data. This book explains big data's impact on MDM and the critical role of entity information management system (EIMS) in successful MDM. Expert authors Dr. John R. Talburt and Dr. Yinle Zhou provide a thorough background in the principles of managing the entity information life cycle and provide practical tips and techniques for implementing an EIMS, strategies for exploiting distributed processing to hand

  15. Research in Big Data Warehousing using Hadoop

    OpenAIRE

    Abderrazak Sebaa; Fatima Chikh; Amina Nouicer; Abdelkamel Tari

    2017-01-01

    Traditional data warehouses have played a key role in decision support systems until the recent past. However, the rapid growth of data generated by current applications requires new data warehousing systems able to handle the volume and format of collected datasets, data source variety, integration of unstructured data and powerful analytical processing. In the age of Big Data, it is important to keep this pace and adapt the existing warehouse systems to overcome the new issues and challenges. ...

  16. Internet of Things and big data technologies for next generation healthcare

    CERN Document Server

    Dey, Nilanjan; Ashour, Amira

    2017-01-01

    This comprehensive book focuses on better big-data security for healthcare organizations. Following an extensive introduction to the Internet of Things (IoT) in healthcare including challenging topics and scenarios, it offers an in-depth analysis of medical body area networks with the 5th generation of IoT communication technology along with its nanotechnology. It also describes a novel strategic framework and computationally intelligent model to measure possible security vulnerabilities in the context of e-health. Moreover, the book addresses healthcare systems that handle large volumes of data driven by patients’ records and health/personal information, including big-data-based knowledge management systems to support clinical decisions. Several of the issues faced in storing/processing big data are presented along with the available tools, technologies and algorithms to deal with those problems as well as a case study in healthcare analytics. Addressing trust, privacy, and security issues as well as the I...

  17. Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, Wucherl [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Koo, Michelle [Univ. of California, Berkeley, CA (United States); Cao, Yu [California Inst. of Technology (CalTech), Pasadena, CA (United States); Sim, Alex [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Nugent, Peter [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Wu, Kesheng [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2016-09-17

    Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze performance when terabytes or petabytes of workflow data, or of measurement data from the executions, are produced by complex workflows over a large number of nodes with multiple parallel task executions. To help identify performance bottlenecks or debug performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework using state-of-the-art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply sophisticated statistical tools and data mining methods to the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze a large amount of different types of logs and measurements. To illustrate the functionality of the big data analysis framework, we conduct case studies on the workflows from an astronomy project known as the Palomar Transient Factory (PTF) and the job logs from a genome analysis scientific cluster. Our study processed many terabytes of system logs and application performance measurements collected on the HPC systems at NERSC. The implementation of our tool is generic enough to be used for analyzing the performance of other HPC systems and Big Data workflows.
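
    The abstract does not show the framework's code, so as a toy stand-in for its first step, ingesting logs and extracting a performance feature, the pandas sketch below computes per-node runtime statistics; the file name and column names are invented for illustration.

      # Sketch: extract a per-job runtime feature from scheduler logs.
      # The log layout ("job_log.csv" with start/end/node columns) is hypothetical.
      import pandas as pd

      logs = pd.read_csv("job_log.csv", parse_dates=["start", "end"])
      logs["runtime_s"] = (logs["end"] - logs["start"]).dt.total_seconds()

      # Per-node aggregates help spot stragglers, a common bottleneck signal.
      summary = (logs.groupby("node")["runtime_s"]
                     .agg(["count", "median", "max"])
                     .sort_values("max", ascending=False))
      print(summary.head())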

  18. Toward a Learning Health-care System – Knowledge Delivery at the Point of Care Empowered by Big Data and NLP

    Science.gov (United States)

    Kaggal, Vinod C.; Elayavilli, Ravikumar Komandur; Mehrabi, Saeed; Pankratz, Joshua J.; Sohn, Sunghwan; Wang, Yanshan; Li, Dingcheng; Rastegar, Majid Mojarad; Murphy, Sean P.; Ross, Jason L.; Chaudhry, Rajeev; Buntrock, James D.; Liu, Hongfang

    2016-01-01

    The concept of optimizing health care by understanding and generating knowledge from previous evidence, i.e., the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and in real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future. PMID:27385912

  19. Big-Leaf Mahogany on CITES Appendix II: Big Challenge, Big Opportunity

    Science.gov (United States)

    JAMES GROGAN; PAULO BARRETO

    2005-01-01

    On 15 November 2003, big-leaf mahogany (Swietenia macrophylla King, Meliaceae), the most valuable widely traded Neotropical timber tree, gained strengthened regulatory protection from its listing on Appendix II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). CITES is a United Nations-chartered agreement signed by 164...

  20. Big Data as Information Barrier

    Directory of Open Access Journals (Sweden)

    Victor Ya. Tsvetkov

    2014-07-01

    Full Text Available The article analyses ‘Big Data’, which has been discussed over the last 10 years. The reasons and factors behind the issue are revealed. It is shown that the factors creating the ‘Big Data’ issue have existed for quite a long time and, from time to time, have caused informational barriers. Such barriers were successfully overcome through science and technology. The analysis identifies the ‘Big Data’ issue as a form of informational barrier. Framed this way, the issue may be solved correctly, and it encourages the development of scientific and computational methods.

  1. Big Data in Space Science

    OpenAIRE

    Barmby, Pauline

    2018-01-01

    It seems like “big data” is everywhere these days. In planetary science and astronomy, we’ve been dealing with large datasets for a long time. So how “big” is our data? How does it compare to the big data that a bank or an airline might have? What new tools do we need to analyze big datasets, and how can we make better use of existing tools? What kinds of science problems can we address with these? I’ll address these questions with examples including ESA’s Gaia mission, ...

  2. Modelling quality dynamics on business value and firm performance in big data analytics environment

    OpenAIRE

    Ji-fan Ren, S; Fosso Wamba, S; Akter, S; Dubey, R; Childe, SJ

    2017-01-01

    Big data analytics have become an increasingly important component for firms across advanced economies. This paper examines the quality dynamics in the big data environment that are linked with enhancing business value and firm performance. The study identifies that system quality (i.e., system reliability, accessibility, adaptability, integration, response time and privacy) and information quality (i.e., completeness, accuracy, format and currency) are key to enhance business value and firm perf...

  3. Frameworks for management, storage and preparation of large data volumes Big Data

    Directory of Open Access Journals (Sweden)

    Marco Antonio Almeida Pamiño

    2017-05-01

    Full Text Available Weather systems like the World Meteorological Organization's Global Information System need to store different kinds of images, data and files. Big Data and its 3V paradigm can provide a suitable solution to this problem. This tutorial presents some concepts around the Hadoop framework, the de facto standard implementation of Big Data, and shows how to store semi-structured data generated by automatic weather stations using this framework. Finally, a formal method to generate weather reports using frameworks from Hadoop's ecosystem is presented.
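
    As a toy illustration of the Hadoop-style processing the tutorial covers, the following mrjob sketch computes the maximum temperature per weather station from CSV records; the input layout is assumed here, not taken from the tutorial.

      # Sketch: classic MapReduce over "station,temperature" CSV lines.
      # Runs locally or on a Hadoop cluster via the mrjob library.
      from mrjob.job import MRJob

      class MaxTempPerStation(MRJob):
          def mapper(self, _, line):
              station, temp = line.split(",")[:2]   # assumed record layout
              yield station, float(temp)

          def reducer(self, station, temps):
              yield station, max(temps)

      if __name__ == "__main__":
          MaxTempPerStation.run()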

  4. A SWOT Analysis of Big Data

    Science.gov (United States)

    Ahmadi, Mohammad; Dileepan, Parthasarati; Wheatley, Kathleen K.

    2016-01-01

    This is the decade of data analytics and big data, but not everyone agrees with the definition of big data. Some researchers see it as the future of data analysis, while others consider it as hype and foresee its demise in the near future. No matter how it is defined, big data for the time being is having its glory moment. The most important…

  5. A survey of big data research

    Science.gov (United States)

    Fang, Hua; Zhang, Zhaoyang; Wang, Chanpaul Jin; Daneshmand, Mahmoud; Wang, Chonggang; Wang, Honggang

    2015-01-01

    Big data create values for business and research, but pose significant challenges in terms of networking, storage, management, analytics and ethics. Multidisciplinary collaborations from engineers, computer scientists, statisticians and social scientists are needed to tackle, discover and understand big data. This survey presents an overview of big data initiatives, technologies and research in industries and academia, and discusses challenges and potential solutions. PMID:26504265

  6. Big Data in Action for Government : Big Data Innovation in Public Services, Policy, and Engagement

    OpenAIRE

    World Bank

    2017-01-01

    Governments have an opportunity to harness big data solutions to improve productivity, performance and innovation in service delivery and policymaking processes. In developing countries, governments have an opportunity to adopt big data solutions and leapfrog traditional administrative approaches

  7. An overview of big data and data science education at South African universities

    Directory of Open Access Journals (Sweden)

    Eduan Kotzé

    2016-02-01

    Full Text Available Man and machine are generating data electronically at an astronomical speed and in such a way that society is experiencing cognitive challenges to analyse this data meaningfully. Big data firms, such as Google and Facebook, identified this problem several years ago and are continuously developing new technologies or improving existing technologies in order to facilitate the cognitive analysis process of these large data sets. The purpose of this article is to contribute to our theoretical understanding of the role that big data might play in creating new training opportunities for South African universities. The article investigates emerging literature on the characteristics and main components of big data, together with the Hadoop application stack as an example of big data technology. Due to the rapid development of big data technology, a paradigm shift of human resources is required to analyse these data sets; therefore, this study examines the state of big data teaching at South African universities. This article also provides an overview of possible big data sources for South African universities, as well as relevant big data skills that data scientists need. The study also investigates existing academic programs in South Africa, where the focus is on teaching advanced database systems. The study found that big data and data science topics are introduced to students on a postgraduate level, but that the scope is very limited. This article contributes by proposing important theoretical topics that could be introduced as part of the existing academic programs. More research is required, however, to expand these programs in order to meet the growing demand for data scientists with big data skills.

  8. 78 FR 3911 - Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN; Final Comprehensive...

    Science.gov (United States)

    2013-01-17

    ... DEPARTMENT OF THE INTERIOR Fish and Wildlife Service [FWS-R3-R-2012-N259; FXRS1265030000-134-FF03R06000] Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN; Final Comprehensive... significant impact (FONSI) for the environmental assessment (EA) for Big Stone National Wildlife Refuge...

  9. Big domains are novel Ca²+-binding modules: evidences from big domains of Leptospira immunoglobulin-like (Lig) proteins.

    Directory of Open Access Journals (Sweden)

    Rajeev Raman

    Full Text Available BACKGROUND: Many bacterial surface exposed proteins mediate the host-pathogen interaction more effectively in the presence of Ca²+. Leptospiral immunoglobulin-like (Lig) proteins, LigA and LigB, are surface exposed proteins containing Bacterial immunoglobulin-like (Big) domains. The function of proteins which contain the Big fold is not known. Based on the possible similarities of immunoglobulin and βγ-crystallin folds, we here explore the important question whether Ca²+ binds to a Big domain, which would provide a novel functional role of the proteins containing the Big fold. PRINCIPAL FINDINGS: We selected six individual Big domains for this study (three from the conserved part of LigA and LigB, denoted as Lig A3, Lig A4, and LigBCon5; two from the variable region of LigA, i.e., the 9th (Lig A9) and 10th (Lig A10) repeats; and one from the variable region of LigB, i.e., LigBCen2). We have also studied the conserved regions covering the three and six repeats (LigBCon1-3 and LigCon). All these proteins bind the calcium-mimic dye Stains-all. All four selected domains bind Ca²+ with dissociation constants of 2-4 µM. Lig A9 and Lig A10 domains fold well with moderate thermal stability, have β-sheet conformation and form homodimers. Fluorescence spectra of Big domains show a specific doublet (at 317 and 330 nm), probably due to Trp interaction with a Phe residue. Equilibrium unfolding of selected Big domains is similar and follows a two-state model, suggesting the similarity in their fold. CONCLUSIONS: We demonstrate that the Lig proteins are Ca²+-binding proteins, with Big domains harbouring the binding motif. We conclude that despite differences in sequence, a Big motif binds Ca²+. This work thus sets up a strong possibility for classifying the proteins containing Big domains as a novel family of Ca²+-binding proteins. Since the Big domain is a part of many proteins in the bacterial kingdom, we suggest a possible function of these proteins via Ca²+ binding.

  10. [Big data in imaging].

    Science.gov (United States)

    Sewerin, Philipp; Ostendorf, Benedikt; Hueber, Axel J; Kleyer, Arnd

    2018-04-01

    Until now, most major medical advancements have been achieved through hypothesis-driven research within the scope of clinical trials. However, due to a multitude of variables, only a certain number of research questions could be addressed during a single study, thus rendering these studies expensive and time consuming. Big data acquisition enables a new data-based approach in which large volumes of data can be used to investigate all variables, thus opening new horizons. Due to universal digitalization of the data as well as ever-improving hard- and software solutions, imaging would appear to be predestined for such analyses. Several small studies have already demonstrated that automated analysis algorithms and artificial intelligence can identify pathologies with high precision. Such automated systems would also seem well suited for rheumatology imaging, since a method for individualized risk stratification has long been sought for these patients. However, despite all the promising options, the heterogeneity of the data and highly complex regulations covering data protection in Germany would still render a big data solution for imaging difficult today. Overcoming these boundaries is challenging, but the enormous potential advances in clinical management and science render pursuit of this goal worthwhile.

  11. Application of Workflow Technology for Big Data Analysis Service

    Directory of Open Access Journals (Sweden)

    Bin Zhang

    2018-04-01

    Full Text Available This study presents a lightweight representational state transfer-based cloud workflow system for constructing a big data intelligent software-as-a-service (SaaS) platform. The system supports the dynamic construction and operation of intelligent data analysis applications, and realizes rapid development and flexible deployment of business analysis processes, improving their interactivity and response time. The proposed system integrates offline-batch and online-streaming analysis models, allowing users to conduct batch and streaming computing simultaneously. Users can rent cloud capabilities and customize a set of big data analysis applications in the form of workflow processes. This study elucidates the architecture and the application modeling, customization, dynamic construction, and scheduling of the cloud workflow system. A chain workflow foundation mechanism is proposed to combine several analysis components into a chain component, which promotes efficiency. Four practical application cases are provided to verify the analysis capability of the system. Experimental results show that the proposed system can support multiple users accessing the system concurrently and effectively uses data analysis algorithms. The proposed SaaS workflow system has been used by network operators and has achieved good results.
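
    The chain-workflow mechanism, folding several analysis components into one chain component, is essentially function composition. The system's real API is not given in the abstract, so this minimal Python sketch only illustrates the idea; the components are hypothetical.

      # Sketch: combine analysis components into a single chain component.
      from functools import reduce

      def chain(*components):
          """Compose components left-to-right into one callable."""
          return lambda data: reduce(lambda d, f: f(d), components, data)

      # Hypothetical components of a small analysis process.
      clean     = lambda rows: [r for r in rows if r is not None]
      normalize = lambda rows: [r / max(rows) for r in rows]
      summarize = lambda rows: {"n": len(rows), "mean": sum(rows) / len(rows)}

      pipeline = chain(clean, normalize, summarize)
      print(pipeline([4.0, None, 2.0, 6.0]))   # {'n': 3, 'mean': 0.666...}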

  12. The Review of Visual Analysis Methods of Multi-modal Spatio-temporal Big Data

    Directory of Open Access Journals (Sweden)

    ZHU Qing

    2017-10-01

    Full Text Available The visual analysis of spatio-temporal big data is not only a state-of-the-art research direction of both big data analysis and data visualization, but also a core module of the pan-spatial information system. This paper reviews existing visual analysis methods at three levels: descriptive visual analysis, explanatory visual analysis and exploratory visual analysis, focusing on the multi-source, multi-granularity, multi-modal and complex-association characteristics of spatio-temporal big data. The technical difficulties and development tendencies of multi-modal feature selection, innovative human-computer interaction analysis and exploratory visual reasoning in the visual analysis of spatio-temporal big data are discussed. Research shows that the study of descriptive visual analysis for data visualization is relatively mature, that explanatory visual analysis has become the focus of big data analysis, mainly based on interactive data mining in a visual environment to diagnose the implicit causes of problems, and that exploratory visual analysis methods still need a major breakthrough.

  13. New 'bigs' in cosmology

    International Nuclear Information System (INIS)

    Yurov, Artyom V.; Martin-Moruno, Prado; Gonzalez-Diaz, Pedro F.

    2006-01-01

    This paper contains a detailed discussion of new cosmic solutions describing the early and late evolution of a universe that is filled with a kind of dark energy that may or may not satisfy the energy conditions. The main distinctive property of the resulting space-times is that the single singular events predicted by the corresponding quintessential (phantom) models appear twice, in a manner that can be made symmetric with respect to the origin of cosmic time. Thus, the big bang and big rip singularities are shown to take place twice, once on the positive branch of time and once on the negative one. We have also considered dark energy and phantom energy accretion onto black holes and wormholes in the context of these new cosmic solutions. It is seen that the space-times of these holes would then undergo swelling processes leading to big trip and big hole events taking place at distinct epochs along the evolution of the universe. In this way, the possibility is considered that the past and future are connected in a non-paradoxical manner in the universes described by means of the new symmetric solutions.

  14. Genomic sequencing: assessing the health care system, policy, and big-data implications.

    Science.gov (United States)

    Phillips, Kathryn A; Trosman, Julia R; Kelley, Robin K; Pletcher, Mark J; Douglas, Michael P; Weldon, Christine B

    2014-07-01

    New genomic sequencing technologies enable the high-speed analysis of multiple genes simultaneously, including all of those in a person's genome. Sequencing is a prominent example of a "big data" technology because of the massive amount of information it produces and its complexity, diversity, and timeliness. Our objective in this article is to provide a policy primer on sequencing and illustrate how it can affect health care system and policy issues. Toward this end, we developed an easily applied classification of sequencing based on inputs, methods, and outputs. We used it to examine the implications of sequencing for three health care system and policy issues: making care more patient-centered, developing coverage and reimbursement policies, and assessing economic value. We conclude that sequencing has great promise but that policy challenges include how to optimize patient engagement as well as privacy, develop coverage policies that distinguish research from clinical uses and account for bioinformatics costs, and determine the economic value of sequencing through complex economic models that take into account multiple findings and downstream costs. Project HOPE—The People-to-People Health Foundation, Inc.

  15. 2nd INNS Conference on Big Data

    CERN Document Server

    Manolopoulos, Yannis; Iliadis, Lazaros; Roy, Asim; Vellasco, Marley

    2017-01-01

    The book offers a timely snapshot of neural network technologies as a significant component of big data analytics platforms. It promotes new advances and research directions in efficient and innovative algorithmic approaches to analyzing big data (e.g. deep networks, nature-inspired and brain-inspired algorithms); implementations on different computing platforms (e.g. neuromorphic, graphics processing units (GPUs), clouds, clusters); and big data analytics applications to solve real-world problems (e.g. weather prediction, transportation, energy management). The book, which reports on the second edition of the INNS Conference on Big Data, held on October 23–25, 2016, in Thessaloniki, Greece, depicts an interesting collaborative adventure of neural networks with big data and other learning technologies.

  16. Data Management and Preservation Planning for Big Science

    Directory of Open Access Journals (Sweden)

    Juan Bicarregui

    2013-06-01

    Full Text Available ‘Big Science’ – that is, science which involves large collaborations with dedicated facilities, large data volumes and multinational investments – is often seen as different when it comes to data management and preservation planning. Big Science handles its data differently from other disciplines and has data management problems that are qualitatively different from other disciplines. In part, these differences arise from the quantities of data involved, but possibly more importantly from the cultural, organisational and technical distinctiveness of these academic cultures. Consequently, the data management systems are typically, and rationally, bespoke, but this means that the planning for data management and preservation (DMP) must also be bespoke. These differences are such that ‘just read and implement the OAIS specification’ is reasonable Data Management and Preservation (DMP) advice, but this bald prescription can and should be usefully supported by a methodological ‘toolkit’, including overviews, case studies and costing models, to provide guidance on developing best practice in DMP policy and infrastructure for these projects, as well as considering OAIS validation, audit and cost modelling. In this paper, we build on previous work with the LIGO collaboration to consider the role of DMP planning within these big science scenarios, and discuss how to apply current best practice. We discuss the results of the MaRDI-Gross project (Managing Research Data Infrastructures – Big Science), which has been developing a toolkit to provide guidelines on the application of best practice in DMP planning within big science projects. This is targeted primarily at projects’ engineering managers, but is also intended to help funders collaborate on DMP plans which satisfy the requirements imposed on them.

  17. The ethics of biomedical big data

    CERN Document Server

    Mittelstadt, Brent Daniel

    2016-01-01

    This book presents cutting edge research on the new ethical challenges posed by biomedical Big Data technologies and practices. ‘Biomedical Big Data’ refers to the analysis of aggregated, very large datasets to improve medical knowledge and clinical care. The book describes the ethical problems posed by aggregation of biomedical datasets and re-use/re-purposing of data, in areas such as privacy, consent, professionalism, power relationships, and ethical governance of Big Data platforms. Approaches and methods are discussed that can be used to address these problems to achieve the appropriate balance between the social goods of biomedical Big Data research and the safety and privacy of individuals. Seventeen original contributions analyse the ethical, social and related policy implications of the analysis and curation of biomedical Big Data, written by leading experts in the areas of biomedical research, medical and technology ethics, privacy, governance and data protection. The book advances our understan...

  18. Scalable privacy-preserving big data aggregation mechanism

    Directory of Open Access Journals (Sweden)

    Dapeng Wu

    2016-08-01

    Full Text Available As the massive sensor data generated by large-scale Wireless Sensor Networks (WSNs) have recently become an indispensable part of ‘Big Data’, the collection, storage, transmission and analysis of big sensor data attract considerable attention from researchers. Targeting the privacy requirements of large-scale WSNs and focusing on the energy-efficient collection of big sensor data, a Scalable Privacy-preserving Big Data Aggregation (Sca-PBDA) method is proposed in this paper. Firstly, according to the pre-established gradient topology structure, sensor nodes in the network are divided into clusters. Secondly, sensor data are modified by each node according to the privacy-preserving configuration message received from the sink. Subsequently, intra- and inter-cluster data aggregation is employed during the big sensor data reporting phase to reduce energy consumption. Lastly, aggregated results are recovered by the sink to complete the privacy-preserving big data aggregation. Simulation results validate the efficacy and scalability of Sca-PBDA and show that the big sensor data generated by large-scale WSNs are efficiently aggregated to reduce network resource consumption and that sensor data privacy is effectively protected to meet ever-growing application requirements.
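
    A toy version of the recoverable-perturbation idea, in which each node masks its reading and the sink removes the aggregate mask, can be sketched as follows; the real Sca-PBDA protocol is cluster-based and more involved, so everything below is illustrative.

      # Sketch: additive masking for privacy-preserving aggregation.
      # Each node perturbs its reading with a seeded pseudorandom mask;
      # the sink, which knows the seeds, subtracts the total mask and
      # recovers the true sum without seeing any raw individual reading.
      import random

      readings = [21.3, 19.8, 22.1, 20.5]        # raw sensor values
      seeds = range(len(readings))               # distributed by the sink

      def mask(seed):
          return random.Random(seed).uniform(-100, 100)

      perturbed = [x + mask(s) for x, s in zip(readings, seeds)]
      aggregate = sum(perturbed)                 # all relay nodes see only this

      true_sum = aggregate - sum(mask(s) for s in seeds)   # sink-side recovery
      print(round(true_sum, 2), round(sum(readings), 2))   # identical values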

  19. Big Data Technologies: New Opportunities for Diabetes Management.

    Science.gov (United States)

    Bellazzi, Riccardo; Dagliati, Arianna; Sacchi, Lucia; Segagni, Daniele

    2015-04-24

    The so-called big data revolution provides substantial opportunities for diabetes management. At least 3 important directions are currently of great interest. First, the integration of different sources of information, from primary and secondary care to administrative information, may allow a novel view of patients' care processes and of individual patients' behaviors to be depicted, taking into account the multifaceted nature of chronic care. Second, the availability of novel diabetes technologies, able to gather large amounts of real-time data, requires the implementation of distributed platforms for data analysis and decision support. Finally, the inclusion of geographical and environmental information into such complex IT systems may further increase the capability of interpreting the gathered data and extracting new knowledge from them. This article reviews the main concepts and definitions related to big data, presents some efforts in health care, and discusses the potential role of big data in diabetes care. Finally, as an example, it describes the research efforts carried out in the MOSAIC project, funded by the European Commission. © 2015 Diabetes Technology Society.

  20. Big data based fraud risk management at Alibaba

    Directory of Open Access Journals (Sweden)

    Jidong Chen

    2015-12-01

    With the development of mobile internet and finance, fraud risk comes in all shapes and sizes. This paper introduces fraud risk management at Alibaba in the era of big data. Alibaba has built a fraud risk monitoring and management system based on real-time big data processing and intelligent risk models. It captures fraud signals directly from huge amounts of user-behavior and network data, analyzes them in real time using machine learning, and accurately predicts bad users and transactions. To extend this fraud prevention ability to external customers, Alibaba also built a big-data-based fraud prevention product called AntBuckler. AntBuckler aims to identify and prevent all flavors of malicious behavior, with flexibility and intelligence, for online merchants and banks. By combining Alibaba's data with customers' data at large scale, AntBuckler uses the RAIN score engine to quantify the risk levels of users or transactions for fraud prevention. It also has a user-friendly visualization UI with risk scores, top reasons and fraud connections.
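
    The RAIN score engine is proprietary and its internals are not described in the abstract. Purely as a toy illustration of the pattern it sketches, behavioral features feeding a model that quantifies transaction risk in real time, consider the following Python fragment; every feature, weight and threshold is invented:

        from dataclasses import dataclass

        @dataclass
        class Txn:
            user_id: str
            amount: float
            new_device: bool
            txns_last_hour: int

        def risk_score(t: Txn) -> float:
            """Toy stand-in for a learned model: map behavior features to [0, 1]."""
            score = 0.4 if t.new_device else 0.0
            score += min(t.txns_last_hour, 10) * 0.04  # velocity signal
            score += 0.2 if t.amount > 5_000 else 0.0  # large-amount signal
            return min(score, 1.0)

        def decide(t: Txn, block_at: float = 0.7) -> str:
            s = risk_score(t)
            return "block" if s >= block_at else "review" if s >= 0.4 else "allow"

        print(decide(Txn("u1", 9_000.0, True, 6)))  # block

    In production the hand-tuned weights above would be replaced by a model learned from labeled fraud data.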

  1. Ethische aspecten van big data

    NARCIS (Netherlands)

    N. (Niek) van Antwerpen; Klaas Jan Mollema

    2017-01-01

    Big data has not only led to challenging technical questions; it is also accompanied by all kinds of new ethical and moral issues. To handle big data responsibly, these issues must be considered as well, because poor use of data can have adverse consequences for

  2. Epidemiology in wonderland: Big Data and precision medicine.

    Science.gov (United States)

    Saracci, Rodolfo

    2018-03-01

    Big Data and precision medicine, two major contemporary challenges for epidemiology, are critically examined from two different angles. In Part 1, Big Data collected for research purposes (Big research Data) and Big Data used for research although collected for other primary purposes (Big secondary Data) are discussed in the light of the fundamental common requirement of data validity, which prevails over "bigness". Precision medicine is treated by developing the key point that high relative risks are as a rule required to make a variable, or combination of variables, suitable for predicting disease occurrence, outcome or response to treatment; the commercial proliferation of allegedly predictive tests of unknown or poor validity is commented upon. Part 2 proposes a "wise epidemiology" approach to: (a) choosing, in a context imprinted by Big Data and precision medicine, epidemiological research projects actually relevant to population health; (b) training epidemiologists; (c) investigating the impact on clinical practices and the doctor-patient relation of the influx of Big Data and computerized medicine; and (d) clarifying whether today "health" may be redefined, as some maintain, in purely technological terms.

  3. Big Data and Analytics in Healthcare.

    Science.gov (United States)

    Tan, S S-L; Gao, G; Koch, S

    2015-01-01

    This editorial is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". The amount of data being generated in the healthcare industry is growing at a rapid rate. This has generated immense interest in leveraging the availability of healthcare data (and "big data") to improve health outcomes and reduce costs. However, the nature of healthcare data, and especially big data, presents unique challenges for processing and analysis. This Focus Theme aims to disseminate some novel approaches to address these challenges. More specifically, approaches ranging from efficient methods of processing large clinical data to predictive models that could generate better predictions from healthcare data are presented.

  4. Big Data Knowledge in Global Health Education.

    Science.gov (United States)

    Olayinka, Olaniyi; Kekeh, Michele; Sheth-Chandra, Manasi; Akpinar-Elci, Muge

    The ability to synthesize and analyze massive amounts of data is critical to the success of organizations, including those that involve global health. As countries become highly interconnected, increasing the risk for pandemics and outbreaks, the demand for big data is likely to increase. This requires a global health workforce that is trained in the effective use of big data. To assess implementation of big data training in global health, we conducted a pilot survey of members of the Consortium of Universities of Global Health. More than half the respondents did not have a big data training program at their institution. Additionally, the majority agreed that big data training programs will improve global health deliverables, among other favorable outcomes. Given the observed gap and benefits, global health educators may consider investing in big data training for students seeking a career in global health. Copyright © 2017 Icahn School of Medicine at Mount Sinai. Published by Elsevier Inc. All rights reserved.

  5. Big data for bipolar disorder.

    Science.gov (United States)

    Monteith, Scott; Glenn, Tasha; Geddes, John; Whybrow, Peter C; Bauer, Michael

    2016-12-01

    The delivery of psychiatric care is changing with a new emphasis on integrated care, preventative measures, population health, and the biological basis of disease. Fundamental to this transformation are big data and advances in the ability to analyze these data. The impact of big data on the routine treatment of bipolar disorder today and in the near future is discussed, with examples that relate to health policy, the discovery of new associations, and the study of rare events. The primary sources of big data today are electronic medical records (EMR), claims, and registry data from providers and payers. In the near future, data created by patients from active monitoring, passive monitoring of Internet and smartphone activities, and from sensors may be integrated with the EMR. Diverse data sources from outside of medicine, such as government financial data, will be linked for research. Over the long term, genetic and imaging data will be integrated with the EMR, and there will be more emphasis on predictive models. Many technical challenges remain when analyzing big data that relates to size, heterogeneity, complexity, and unstructured text data in the EMR. Human judgement and subject matter expertise are critical parts of big data analysis, and the active participation of psychiatrists is needed throughout the analytical process.

  6. Big Data Challenges

    Directory of Open Access Journals (Sweden)

    Alexandru Adrian TOLE

    2013-10-01

    The amount of data traveling across the internet today is not only large but complex as well. Companies, institutions, healthcare systems and others all rely on piles of data, which are further used to create reports that ensure the continuity of the services they have to offer. The processing behind the results these entities request represents a challenge for software developers and the companies that provide IT infrastructure. The challenge is how to manipulate an impressive volume of data that has to be securely delivered through the internet and reach its destination intact. This paper treats the challenges that Big Data creates.

  7. BIG DATA IN TAMIL: OPPORTUNITIES, BENEFITS AND CHALLENGES

    OpenAIRE

    R.S. Vignesh Raj; Babak Khazaei; Ashik Ali

    2015-01-01

    This paper gives an overall introduction to big data and has tried to introduce Big Data in Tamil. It discusses the potential opportunities, benefits and likely challenges from a very Tamil and Tamil Nadu perspective. The paper has also made an original contribution by proposing terminology for ‘big data’ in Tamil. The paper further suggests a few areas to explore using big data in Tamil, on the lines of the Tamil Nadu Government ‘Vision 2023’. Whilst big data has something to offer everyone, it ...

  8. Big data in biomedicine.

    Science.gov (United States)

    Costa, Fabricio F

    2014-04-01

    The increasing availability and growth rate of biomedical information, also known as 'big data', provides an opportunity for future personalized medicine programs that will significantly improve patient care. Recent advances in information technology (IT) applied to biomedicine are changing the landscape of privacy and personal information, with patients getting more control of their health information. Conceivably, big data analytics is already impacting health decisions and patient care; however, specific challenges need to be addressed to integrate current discoveries into medical practice. In this article, I will discuss the major breakthroughs achieved in combining omics and clinical health data in terms of their application to personalized medicine. I will also review the challenges associated with using big data in biomedicine and translational science. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Big Data’s Role in Precision Public Health

    Science.gov (United States)

    Dolley, Shawn

    2018-01-01

    Precision public health is an emerging practice to more granularly predict and understand public health risks and customize treatments for more specific and homogeneous subpopulations, often using new data, technologies, and methods. Big data is one element that has consistently helped to achieve these goals, through its ability to deliver to practitioners a volume and variety of structured or unstructured data not previously possible. Big data has enabled more widespread and specific research and trials of stratifying and segmenting populations at risk for a variety of health problems. Examples of success using big data are surveyed in surveillance and signal detection, predicting future risk, targeted interventions, and understanding disease. Using novel big data or big data approaches has risks that remain to be resolved. The continued growth in volume and variety of available data, decreased costs of data capture, and emerging computational methods mean big data success will likely be a required pillar of precision public health into the future. This review article aims to identify the precision public health use cases where big data has added value, identify classes of value that big data may bring, and outline the risks inherent in using big data in precision public health efforts. PMID:29594091

  10. Big inquiry

    Energy Technology Data Exchange (ETDEWEB)

    Wynne, B [Lancaster Univ. (UK)]

    1979-06-28

    The recently published report entitled 'The Big Public Inquiry' from the Council for Science and Society and the Outer Circle Policy Unit is considered, with especial reference to any future enquiry which may take place into the first commercial fast breeder reactor. Proposals embodied in the report include stronger rights for objectors and an attempt is made to tackle the problem that participation in a public inquiry is far too late to be objective. It is felt by the author that the CSS/OCPU report is a constructive contribution to the debate about big technology inquiries but that it fails to understand the deeper currents in the economic and political structure of technology which so influence the consequences of whatever formal procedures are evolved.

  11. Intelligent Control of Micro Grid: A Big Data-Based Control Center

    Science.gov (United States)

    Liu, Lu; Wang, Yanping; Liu, Li; Wang, Zhiseng

    2018-01-01

    In this paper, a structure of micro grid system with a big data-based control center is introduced. Energy data from distributed generation, storage and load are analyzed through the control center, and from the results new trends will be predicted and applied as feedback to optimize the control. Therefore, each step proceeding in the micro grid can be adjusted and organized in a form of comprehensive management. A framework of real-time data collection, data processing and data analysis will be proposed by employing big data technology. Consequently, an integrated distributed generation and an optimized energy storage and transmission process can be implemented in the micro grid system.
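
    The paper stays at the framework level, so the following is only an illustration of the predict-and-feed-back loop it describes: a toy controller that forecasts net load with a moving average over recent samples and dispatches storage accordingly (all names and numbers invented):

        from collections import deque

        class MicroGridController:
            """Toy feedback loop: predict net load from recent data, dispatch storage."""

            def __init__(self, window: int = 12):
                self.history = deque(maxlen=window)  # recent net-load samples (kW)

            def ingest(self, generation_kw: float, load_kw: float) -> None:
                self.history.append(load_kw - generation_kw)

            def predict_net_load(self) -> float:
                # Simplest possible "trend" model: a moving average.
                return sum(self.history) / len(self.history) if self.history else 0.0

            def dispatch(self) -> str:
                forecast = self.predict_net_load()
                if forecast > 0:
                    return f"discharge storage ({forecast:.1f} kW shortfall expected)"
                return f"charge storage ({-forecast:.1f} kW surplus expected)"

        ctrl = MicroGridController()
        for gen, load in [(50.0, 40.0), (48.0, 52.0), (45.0, 55.0)]:
            ctrl.ingest(gen, load)
        print(ctrl.dispatch())  # discharge storage (1.3 kW shortfall expected)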

  12. A Study on SE Methodology for Design of Big Data Pilot Platform to Improve Nuclear Power Plant Safety

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Junguk; Cha, Jae-Min; Kim, Jun-Young; Park, Sung-Ho; Yeom, Choong-Sub [Institute for Advanced Engineering (IAE), Yongin (Korea, Republic of)]

    2016-10-15

    The big data concept is expected to have a large impact on the safety of nuclear power plants (NPPs) from the beginning of the big data era. Though there is high interest in NPP safety with big data, almost no studies on the logical and physical structures and the systematic design methods of a big data platform for NPP safety have been conducted. In the current study, a new big data pilot platform for NPP safety is designed, with the main focus on health monitoring and early warning systems, and a tailored design process based on systems engineering (SE) approaches is proposed to manage the inherent high complexity of the platform design. The development process based on the SE approach is presented along with the design results produced by following it. Implementation of the individual modules and their integration is currently in progress.

  13. A Study on SE Methodology for Design of Big Data Pilot Platform to Improve Nuclear Power Plant Safety

    International Nuclear Information System (INIS)

    Shin, Junguk; Cha, Jae-Min; Kim, Jun-Young; Park, Sung-Ho; Yeom, Choong-Sub

    2016-01-01

    The big data concept is expected to have a large impact on the safety of nuclear power plants (NPPs) from the beginning of the big data era. Though there is high interest in NPP safety with big data, almost no studies on the logical and physical structures and the systematic design methods of a big data platform for NPP safety have been conducted. In the current study, a new big data pilot platform for NPP safety is designed, with the main focus on health monitoring and early warning systems, and a tailored design process based on systems engineering (SE) approaches is proposed to manage the inherent high complexity of the platform design. The development process based on the SE approach is presented along with the design results produced by following it. Implementation of the individual modules and their integration is currently in progress.

  14. Big data analytics with R and Hadoop

    CERN Document Server

    Prajapati, Vignesh

    2013-01-01

    Big Data Analytics with R and Hadoop is a tutorial-style book that focuses on all the powerful big data tasks that can be achieved by integrating R and Hadoop. This book is ideal for R developers who are looking for a way to perform big data analytics with Hadoop. It is also aimed at those who know Hadoop and want to build some intelligent applications over big data with R packages. It would be helpful if readers have basic knowledge of R.

  15. Big data in forensic science and medicine.

    Science.gov (United States)

    Lefèvre, Thomas

    2018-07-01

    In less than a decade, big data in medicine has become quite a phenomenon, and many biomedical disciplines have gained their own tribune on the topic. Perspectives and debates are flourishing while there is still no consensual definition of big data. The 3Vs paradigm is frequently evoked to define the big data principles and stands for Volume, Variety and Velocity. Even according to this paradigm, genuine big data studies are still scarce in medicine and may not meet all expectations. On one hand, techniques usually presented as specific to big data, such as machine learning techniques, are supposed to support the ambition of personalized, predictive and preventive medicine. These techniques are mostly far from new; the most ancient are more than 50 years old. On the other hand, several issues closely related to the properties of big data and inherited from other scientific fields, such as artificial intelligence, are often underestimated if not ignored. Besides, a few papers temper the almost unanimous big data enthusiasm and are worth attention since they delineate what is at stake. In this context, forensic science is still awaiting its position papers as well as a comprehensive outline of what kind of contribution big data could bring to the field. The present situation calls for definitions and actions to rationally guide research and practice in big data. It is an opportunity for grounding a truly interdisciplinary approach in forensic science and medicine that is mainly based on evidence. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  16. Characteristics of business intelligence and big data in e-government

    DEFF Research Database (Denmark)

    Gaardboe, Rikke; Jonasen, Tanja Svarre; Kanstrup, Anne Marie

    2015-01-01

    Business intelligence and big data represent two different technologies within decision support systems. The present paper concerns the two concepts within the context of e-government. Thus, the purpose of the paper is to present preliminary findings regarding publication patterns and topic coverage within the two technologies by conducting a comparative literature review. A total of 281 papers published in the years 2005–2014 were included in the analysis. A rapid increase in papers regarding big data was identified, the majority being journal papers. As regards business intelligence, researchers publish in conference proceedings to a greater extent. Further, big data journal papers are published within a broader range of journal topics compared to business intelligence journal papers. The paper concludes by pointing to further analyses that will be carried out within the 281 selected...

  17. Surface urban heat island across 419 global big cities.

    Science.gov (United States)

    Peng, Shushi; Piao, Shilong; Ciais, Philippe; Friedlingstein, Pierre; Ottle, Catherine; Bréon, François-Marie; Nan, Huijuan; Zhou, Liming; Myneni, Ranga B

    2012-01-17

    Urban heat island is among the most evident aspects of human impacts on the earth system. Here we assess the diurnal and seasonal variation of surface urban heat island intensity (SUHII), defined as the surface temperature difference between urban and suburban areas, measured from MODIS. Differences in SUHII are analyzed across 419 global big cities, and we assess several potential biophysical and socio-economic driving factors. Across the big cities, we show that the average annual daytime SUHII (1.5 ± 1.2 °C) is higher than the annual nighttime SUHII (1.1 ± 0.5 °C) (P < 0.001). But no correlation is found between daytime and nighttime SUHII across big cities (P = 0.84), suggesting different driving mechanisms between day and night. The distribution of nighttime SUHII correlates positively with the difference in albedo and nighttime light between urban and suburban areas, while the distribution of daytime SUHII correlates negatively across cities with the difference in vegetation cover and activity between urban and suburban areas. Our results emphasize the key role of vegetation feedbacks in attenuating the SUHII of big cities during the day, in particular during the growing season, further highlighting that increasing urban vegetation cover could be one effective way to mitigate the urban heat island effect.
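
    The definition reduces to a difference of means over masked land-surface-temperature (LST) pixels. A minimal sketch, with hypothetical arrays standing in for MODIS LST retrievals and an urban/suburban mask:

        import numpy as np

        def suhii(lst: np.ndarray, urban_mask: np.ndarray) -> float:
            """SUHII: mean urban LST minus mean suburban LST, in the same units."""
            return float(lst[urban_mask].mean() - lst[~urban_mask].mean())

        # Hypothetical 3x3 LST scene (degrees C) with an urban core in one corner.
        lst = np.array([[31.0, 30.5, 28.9],
                        [31.8, 32.2, 29.1],
                        [29.4, 29.0, 28.7]])
        mask = np.array([[True, True, False],
                         [True, True, False],
                         [False, False, False]])
        print(f"{suhii(lst, mask):.2f} C")  # ~2.35 C for this toy scene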

  18. Big-deep-smart data in imaging for guiding materials design

    Science.gov (United States)

    Kalinin, Sergei V.; Sumpter, Bobby G.; Archibald, Richard K.

    2015-10-01

    Harnessing big data, deep data, and smart data from state-of-the-art imaging might accelerate the design and realization of advanced functional materials. Here we discuss new opportunities in materials design enabled by the availability of big data in imaging and data analytics approaches, including their limitations, in material systems of practical interest. We specifically focus on how these tools might help realize new discoveries in a timely manner. Such methodologies are particularly appropriate to explore in light of continued improvements in atomistic imaging, modelling and data analytics methods.

  19. Revisiting sample size: are big trials the answer?

    Science.gov (United States)

    Lurati Buse, Giovanna A L; Botto, Fernando; Devereaux, P J

    2012-07-18

    The superiority of the evidence generated in randomized controlled trials over observational data is conditional not only on randomization. Randomized controlled trials require proper design and implementation to provide a reliable effect estimate. Adequate random sequence generation, allocation implementation, analyses based on the intention-to-treat principle, and sufficient power are crucial to the quality of a randomized controlled trial. Power, the probability that the trial will detect a difference when a real difference between treatments exists, strongly depends on sample size. The quality of orthopaedic randomized controlled trials is frequently threatened by a limited sample size. This paper reviews basic concepts and pitfalls in sample-size estimation and focuses on the importance of large trials in the generation of valid evidence.
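
    The abstract stops short of formulas, but the dependence of sample size on effect size is easy to make concrete. For comparing two proportions, the usual normal-approximation sample size per arm is n = (z(1-alpha/2) + z(power))^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2; the numbers below are illustrative, not from the paper:

        import math
        from statistics import NormalDist

        def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
            """Normal-approximation sample size per arm for comparing two proportions."""
            z = NormalDist()
            z_a = z.inv_cdf(1 - alpha / 2)  # two-sided significance
            z_b = z.inv_cdf(power)          # desired power
            var = p1 * (1 - p1) + p2 * (1 - p2)
            return math.ceil((z_a + z_b) ** 2 * var / (p1 - p2) ** 2)

        # Detecting a drop in an event rate from 10% to 7% already needs about
        # 1,353 patients per arm; small effects demand big trials.
        print(n_per_arm(0.10, 0.07))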

  20. New algorithms for processing time-series big EEG data within mobile health monitoring systems.

    Science.gov (United States)

    Serhani, Mohamed Adel; Menshawy, Mohamed El; Benharref, Abdelghani; Harous, Saad; Navaz, Alramzana Nujum

    2017-10-01

    Recent advances in miniature biomedical sensors, mobile smartphones, wireless communications, and distributed computing technologies provide promising techniques for developing mobile health systems. Such systems are capable of reliably monitoring epileptic seizures, which are classified as a chronic disease. Three challenging issues arise in this context with regard to the transformation, compression, storage, and visualization of the big data that results from continuous recording of epileptic seizures using mobile devices. In this paper, we address the above challenges by developing three new algorithms to process and analyze big electroencephalography data in a rigorous and efficient manner. The first algorithm is responsible for transforming the standard European Data Format (EDF) into the standard JavaScript Object Notation (JSON) and compressing the transformed JSON data, to decrease the size and transfer time and to increase the network transfer rate. The second algorithm focuses on collecting and storing the compressed files generated by the transformation and compression algorithm. The collection process is performed on the fly, after decompressing the files. The third algorithm provides relevant real-time interaction with signal data by prospective users. It particularly features the following capabilities: visualization of single or multiple signal channels on a smartphone device and querying of data segments. We tested and evaluated the effectiveness of our approach through a software architecture model implementing a mobile health system to monitor epileptic seizures. The experimental findings from 45 experiments are promising and efficiently satisfy the approach's objectives at the price of linearity. Moreover, the size of compressed JSON files and transfer times are reduced by 10% and 20%, respectively, while the average total time is remarkably reduced by 67% across all performed experiments. Our approach
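
    The paper's JSON schema is not reproduced in the abstract; the following sketch only illustrates the idea of the first algorithm (serialize channel samples from a parsed EDF file to JSON, then compress before transfer), and all field names are invented:

        import gzip
        import json

        def edf_like_to_json(labels, sample_rate_hz, samples) -> bytes:
            """Serialize EEG channel data (as parsed from an EDF file) to UTF-8 JSON."""
            record = {
                "format": "JSON-EEG",  # invented schema tag, not the paper's
                "sample_rate_hz": sample_rate_hz,
                "channels": [{"label": l, "samples": s} for l, s in zip(labels, samples)],
            }
            return json.dumps(record, separators=(",", ":")).encode("utf-8")

        def compress_for_transfer(payload: bytes) -> bytes:
            # Compression shrinks the payload before it leaves the mobile device.
            return gzip.compress(payload)

        raw = edf_like_to_json(["Fp1", "Fp2"], 256,
                               [[1, 2, 3, 2] * 64, [0, 1, 0, -1] * 64])
        gz = compress_for_transfer(raw)
        print(len(raw), "->", len(gz), "bytes")  # repetitive signals compress well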

  1. Biometric Attendance and Big Data Analysis for Optimizing Work Processes.

    Science.gov (United States)

    Verma, Neetu; Xavier, Teenu; Agrawal, Deepak

    2016-01-01

    Although biometric attendance management is available, large healthcare organizations have difficulty with big data analysis for optimization of work processes. The aim of this project was to assess the implementation of a biometric attendance system and its utility following big data analysis. In this prospective study, the implementation of the biometric system was evaluated over a 3-month period at our institution. Software integration with other existing systems for data analysis was also evaluated. Implementation of the biometric system could be successfully completed over a two-month period, with enrollment of 10,000 employees into the system. However, generating reports and taking action for this large number of staff was a challenge. For this purpose, software was developed to capture the duty roster of each employee, integrate it with the biometric system, and add an SMS gateway. This helped in automating the process of sending SMSs to each employee who had not signed in. Standalone biometric systems have limited functionality in large organizations unless they are meshed with the employee duty roster.
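
    As a rough illustration of that roster integration (the institution's actual software is not described, and the identifiers and SMS call here are placeholders):

        from datetime import date

        def absentees(roster: dict, signed_in: set, today: date) -> list:
            """Employees rostered for today with no biometric sign-in record."""
            return [emp for emp, duty_dates in roster.items()
                    if today in duty_dates and emp not in signed_in]

        def notify(emp_id: str) -> None:
            # Placeholder for the SMS-gateway call the project added.
            print(f"SMS -> {emp_id}: no sign-in recorded for today's shift")

        roster = {"E001": {date(2016, 3, 1)}, "E002": {date(2016, 3, 1)}, "E003": set()}
        for emp in absentees(roster, signed_in={"E001"}, today=date(2016, 3, 1)):
            notify(emp)  # prints one SMS line, for E002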

  2. Small data in the era of big data

    OpenAIRE

    Kitchin, Rob; Lauriault, Tracey P.

    2015-01-01

    Academic knowledge building has progressed for the past few centuries using small data studies characterized by sampled data generated to answer specific questions. It is a strategy that has been remarkably successful, enabling the sciences, social sciences and humanities to advance in leaps and bounds. This approach is presently being challenged by the development of big data. Small data studies will, however, we argue, continue to be popular and valuable in the fut...

  3. m-Health 2.0: New perspectives on mobile health, Machine Learning and Big Data Analytics.

    Science.gov (United States)

    Istepanian, Robert S H; Al-Anzi, Turki

    2018-06-08

    Mobile health (m-Health) has been repeatedly called the biggest technological breakthrough of our modern times. Similarly, the concept of big data in the context of healthcare is considered one of the transformative drivers for intelligent healthcare delivery systems. In recent years, big data has become increasingly synonymous with mobile health; however, key challenges of 'Big Data and mobile health' remain largely untackled. This is becoming particularly important with the continued deluge of structured and unstructured data sets generated on a daily basis by the proliferation of mobile health applications within different healthcare systems and products globally. The aim of this paper is twofold. First, we present the relevant big data issues from the mobile health (m-Health) perspective, in particular from the technological areas and building blocks (communications, sensors and computing) of mobile health and the newly defined m-Health 2.0 concept. Second, we present the relevant rapprochement issues of big m-Health data analytics with m-Health. Further, we also present the current and future roles of machine and deep learning within the current smartphone-centric m-Health model. The critical balance between these two important areas will depend on how different stakeholders, from patients, clinicians, healthcare providers and medical and m-Health market businesses to regulators, perceive these developments. These new perspectives are essential for better understanding the fine balance between new insights into how intelligent and connected future mobile health systems will look and the inherent risks and clinical complexities associated with the big data sets and analytical tools used in these systems. These topics will be the subject of extensive work and investigation in the foreseeable future in the areas of data analytics, computational and artificial intelligence methods applied to mobile health

  4. Traffic information computing platform for big data

    Energy Technology Data Exchange (ETDEWEB)

    Duan, Zongtao; Li, Ying; Zheng, Xibin; Liu, Yan; Dai, Jiting; Kang, Jun (Chang'an University School of Information Engineering, Xi'an, China and Shaanxi Engineering and Technical Research Center for Road and Traffic Detection, Xi'an, China), E-mail: ztduan@chd.edu.cn

    2014-10-06

    The big data environment creates the data conditions for improving the quality of traffic information services. The target of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technical characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee safe and efficient traffic operation, and more intelligent and personalized traffic information services can be provided to traffic information users.

  5. Traffic information computing platform for big data

    International Nuclear Information System (INIS)

    Duan, Zongtao; Li, Ying; Zheng, Xibin; Liu, Yan; Dai, Jiting; Kang, Jun

    2014-01-01

    The big data environment creates the data conditions for improving the quality of traffic information services. The target of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technical characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee safe and efficient traffic operation, and more intelligent and personalized traffic information services can be provided to traffic information users.

  6. Fremtidens landbrug bliver big business

    DEFF Research Database (Denmark)

    Hansen, Henning Otte

    2016-01-01

    Agriculture's external environment and competitive conditions are changing, and this will necessitate a development towards "big business", in which farms become even larger, more industrialized and more concentrated. Big business will be a dominant development in Danish agriculture - but not the only one...

  7. RESEARCH ON THE CONSTRUCTION OF REMOTE SENSING AUTOMATIC INTERPRETATION SYMBOL BIG DATA

    Directory of Open Access Journals (Sweden)

    Y. Gao

    2018-04-01

    Remote sensing automatic interpretation symbol (RSAIS) is an inexpensive and fast method of providing precise in-situ information for image interpretation and accuracy assessment. This study designed a scientific and precise RSAIS data characterization method, as well as a distributed, cloud-architecture massive data storage method. Additionally, it introduced an offline and online data update mode and a dynamic data evaluation mechanism, with the aim of creating an efficient approach for RSAIS big data construction. Finally, a national RSAIS database with more than 3 million samples covering 86 land types was constructed during 2013–2015 based on the National Geographic Conditions Monitoring Project of China and has been updated annually since 2016. The RSAIS big data has proven to be a good method for large-scale image interpretation and field validation. It is also notable that it has the potential to support automatic image interpretation with the assistance of deep learning technology in the remote sensing big data era.

  8. Research on the Construction of Remote Sensing Automatic Interpretation Symbol Big Data

    Science.gov (United States)

    Gao, Y.; Liu, R.; Liu, J.; Cheng, T.

    2018-04-01

    Remote sensing automatic interpretation symbol (RSAIS) is an inexpensive and fast method of providing precise in-situ information for image interpretation and accuracy assessment. This study designed a scientific and precise RSAIS data characterization method, as well as a distributed, cloud-architecture massive data storage method. Additionally, it introduced an offline and online data update mode and a dynamic data evaluation mechanism, with the aim of creating an efficient approach for RSAIS big data construction. Finally, a national RSAIS database with more than 3 million samples covering 86 land types was constructed during 2013-2015 based on the National Geographic Conditions Monitoring Project of China and has been updated annually since 2016. The RSAIS big data has proven to be a good method for large-scale image interpretation and field validation. It is also notable that it has the potential to support automatic image interpretation with the assistance of deep learning technology in the remote sensing big data era.

  9. Quantum nature of the big bang.

    Science.gov (United States)

    Ashtekar, Abhay; Pawlowski, Tomasz; Singh, Parampreet

    2006-04-14

    Some long-standing issues concerning the quantum nature of the big bang are resolved in the context of homogeneous isotropic models with a scalar field. Specifically, the known results on the resolution of the big-bang singularity in loop quantum cosmology are significantly extended as follows: (i) the scalar field is shown to serve as an internal clock, thereby providing a detailed realization of the "emergent time" idea; (ii) the physical Hilbert space, Dirac observables, and semiclassical states are constructed rigorously; (iii) the Hamiltonian constraint is solved numerically to show that the big bang is replaced by a big bounce. Thanks to the nonperturbative, background independent methods, unlike in other approaches the quantum evolution is deterministic across the deep Planck regime.
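
    The abstract states the headline result without formulas. In the loop quantum cosmology literature, and quoted here from that literature rather than from this paper, the bounce is commonly summarized by the effective Friedmann equation, with rho_c the critical density at which the bounce occurs:

        H^{2} = \left(\frac{\dot a}{a}\right)^{2}
              = \frac{8\pi G}{3}\,\rho\left(1 - \frac{\rho}{\rho_{c}}\right),
        \qquad \rho_{c} \approx 0.41\,\rho_{\mathrm{Pl}}

    Since H vanishes as rho approaches rho_c, a contracting universe re-expands instead of reaching the classical singularity, which is the big bounce referred to above.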

  10. Mentoring in Schools: An Impact Study of Big Brothers Big Sisters School-Based Mentoring

    Science.gov (United States)

    Herrera, Carla; Grossman, Jean Baldwin; Kauh, Tina J.; McMaken, Jennifer

    2011-01-01

    This random assignment impact study of Big Brothers Big Sisters School-Based Mentoring involved 1,139 9- to 16-year-old students in 10 cities nationwide. Youth were randomly assigned to either a treatment group (receiving mentoring) or a control group (receiving no mentoring) and were followed for 1.5 school years. At the end of the first school…

  11. Adoption of geodemographic and ethno-cultural taxonomies for analysing Big Data

    Directory of Open Access Journals (Sweden)

    Richard James Webber

    2015-05-01

    This paper is intended to contribute to the discussion of the differential level of adoption of Big Data among research communities. Recognising the impracticality of conducting an audit across all forms and uses of Big Data, we have restricted our enquiry to one very specific form of Big Data, namely general-purpose taxonomies, of which Mosaic, Acorn and Origins are examples, that rely on data from a variety of Big Data feeds. The intention of these taxonomies is to enable the records of consumers and citizens held in Big Data datasets to be coded according to type of residential neighbourhood or ethno-cultural heritage without any use of questionnaires. Based on our respective experience in the academic social sciences, in government and in the design and marketing of these taxonomies, we identify the features of these classifications which appear to render them attractive or problematic to different categories of potential user or researcher, depending on how the relationship is conceived. We conclude by identifying seven categories of user or potential user who, on account of their background, current position and future career expectations, tend to respond in different ways to the opportunity to adopt these generic systems as aids for understanding social processes.

  12. Intelligent Decision Support and Big Data for Logistics and Supply Chain Management

    DEFF Research Database (Denmark)

    Voss, Stefan; Sebastian, Hans-Jürgen; Pahl, Julia

    2017-01-01

    “Intelligent Decision Support and Big Data for Logistics and Supply Chain Management” features theoretical developments, real-world applications and information systems related to solving decision problems in logistics and supply chain management. Methods include optimization, heuristics, metaheuristics and matheuristics, simulation, agent technologies, and descriptive methods. In a sense, we were and are representing the future of logistics over the years.

  13. Big data processing in the cloud - Challenges and platforms

    Science.gov (United States)

    Zhelev, Svetoslav; Rozeva, Anna

    2017-12-01

    Choosing the appropriate architecture and technologies for a big data project is a difficult task, which requires extensive knowledge of both the problem domain and the big data landscape. The paper analyzes the main big data architectures and the most widely implemented technologies used for processing and persisting big data. Clouds provide dynamic resource scaling, which makes them a natural fit for big data applications. Basic cloud computing service models are presented. Two architectures for processing big data are discussed, the Lambda and Kappa architectures. Technologies for big data persistence are presented and analyzed. Stream processing, as the most important and most difficult aspect to manage, is outlined. The paper highlights the main advantages of the cloud and potential problems.
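
    As a caricature (not from the paper): a Lambda architecture serves queries by merging a periodically recomputed batch view with a real-time view covering events not yet batched, while a Kappa architecture drops the batch path and replays a single stream instead. A minimal sketch of the Lambda serving layer:

        # Batch view: recomputed periodically over the full history.
        batch_view = {"page/a": 1_000, "page/b": 750}
        # Real-time view: updated per event, covers only not-yet-batched data.
        realtime_view = {"page/a": 12, "page/c": 3}

        def query(key: str) -> int:
            # Serving layer: merge the accurate-but-stale batch result with
            # the approximate-but-fresh real-time delta.
            return batch_view.get(key, 0) + realtime_view.get(key, 0)

        print(query("page/a"))  # 1012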

  14. Ethics and Epistemology in Big Data Research.

    Science.gov (United States)

    Lipworth, Wendy; Mason, Paul H; Kerridge, Ian; Ioannidis, John P A

    2017-12-01

    Biomedical innovation and translation are increasingly emphasizing research using "big data." The hope is that big data methods will both speed up research and make its results more applicable to "real-world" patients and health services. While big data research has been embraced by scientists, politicians, industry, and the public, numerous ethical, organizational, and technical/methodological concerns have also been raised. With respect to technical and methodological concerns, there is a view that these will be resolved through sophisticated information technologies, predictive algorithms, and data analysis techniques. While such advances will likely go some way towards resolving technical and methodological issues, we believe that the epistemological issues raised by big data research have important ethical implications and raise questions about the very possibility of big data research achieving its goals.

  15. Victoria Stodden: Scholarly Communication in the Era of Big Data and Big Computation

    OpenAIRE

    Stodden, Victoria

    2015-01-01

    Victoria Stodden gave the keynote address for Open Access Week 2015. "Scholarly communication in the era of big data and big computation" was sponsored by the University Libraries, Computational Modeling and Data Analytics, the Department of Computer Science, the Department of Statistics, the Laboratory for Interdisciplinary Statistical Analysis (LISA), and the Virginia Bioinformatics Institute. Victoria Stodden is an associate professor in the Graduate School of Library and Information Scien...

  16. Smart Cities in Taiwan: A Perspective on Big Data Applications

    Directory of Open Access Journals (Sweden)

    Shiann Ming Wu

    2018-01-01

    In this paper, we discuss the concept of a smart city based on information and communication technology (ICT), analyze the objectives of smart city development in Taiwan, and explain the supporting technologies that make such development possible. Subsequently, we propose a hierarchical framework of smart city systems, with levels of complexity ranging from low to high, and with interconnections and interactive relationships in five dimensions: the Internet of Things (IoT), cloud computing, Big Data, mobile networks, and smart business. We integrate each key resource of the cities' core operation systems to promote innovative city operation and further optimize city development. We then propose a Big Data platform data-flow framework that uses information from ubiquitous sensor networks and information equipment to analyze the Big Data application process of smart cities and determine the resulting advantages and challenges. Additionally, we analyze the current state of development of smart cities in Taiwan. Finally, we discuss a new philosophy of smart city development and provide a practical blueprint for the formation, operation, and development of smart cities, with the aim of creating a bright future for the smart cities of Taiwan.

  17. Big data analytics a management perspective

    CERN Document Server

    Corea, Francesco

    2016-01-01

    This book is about innovation, big data, and data science seen from a business perspective. Big data is a buzzword nowadays, and there is a growing need among practitioners to understand the phenomenon better, starting from a clearly stated definition. This book aims to be a first read for executives who want (and need) to keep pace with the technological breakthrough introduced by new analytical techniques and piles of data. Common myths about big data will be explained, and a series of different strategic approaches will be provided. By browsing the book, it will be possible to learn how to implement a big data strategy and how to use a maturity framework to monitor the progress of the data science team, as well as how to move forward from one stage to the next. Crucial challenges related to big data will be discussed, where some of them are more general - such as ethics, privacy, and ownership - while others concern more specific business situations (e.g., initial public offering, growth st...

  18. Big Data and HPC: A Happy Marriage

    KAUST Repository

    Mehmood, Rashid

    2016-01-25

    International Data Corporation (IDC) defines Big Data technologies as “a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data produced every day, by enabling high velocity capture, discovery, and/or analysis”. High Performance Computing (HPC) most generally refers to “the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business”. Big data platforms are built primarily considering the economics and capacity of the system for dealing with the 4V characteristics of data. HPC has traditionally been more focussed on the speed of digesting (computing) the data. For these reasons, the two domains (HPC and Big Data) have developed their own paradigms and technologies. However, recently, these two have grown fond of each other. HPC technologies are needed by Big Data to deal with the ever-increasing Vs of data in order to forecast and extract insights from existing and new domains, faster, and with greater accuracy. Increasingly more data is being produced by scientific experiments from areas such as bioscience, physics, and climate, and therefore, HPC needs to adopt data-driven paradigms. Moreover, there are synergies between them with unimaginable potential for developing new computing paradigms, solving long-standing grand challenges, and making new explorations and discoveries. Therefore, they must get married to each other. In this talk, we will trace the HPC and big data landscapes through time, including their respective technologies, paradigms and major application areas. Subsequently, we will present the factors that are driving the convergence of the two technologies, the synergies between them, as well as the benefits of their convergence to the biosciences field. The opportunities and challenges of the

  19. Long-Term Developmental Changes in Children's Lower-Order Big Five Personality Facets

    NARCIS (Netherlands)

    A.D. de Haan (Amaranta); S.S.W. de Pauw (Sarah); A.L. van den Akker (Alithe); M. Deković (Maja); P.J. Prinzie (Peter)

    2016-01-01

    Objective: This study examined long-term developmental changes in mother-rated lower-order facets of children's Big Five dimensions. Method: Two independent community samples covering early childhood (2-4.5 years; N=365, 39% girls) and middle childhood to the end of middle

  20. Human factors in Big Data

    NARCIS (Netherlands)

    Boer, J. de

    2016-01-01

    Since 2014 I have been involved in various (research) projects that try to make the hype around Big Data more concrete and tangible for industry and government. Big Data is about multiple sources of (real-time) data that can be analysed, transformed into information, and used to make 'smart' decisions.