WorldWideScience

Sample records for big genomes facilitate

  1. BIG DATA TECHNOLOGY ACCELERATE GENOMICS PRECISION MEDICINE

    Directory of Open Access Journals (Sweden)

    HAO LI

    2017-01-01

    In genomics and life science research, the volume of data handled by whole-genome and life science algorithms keeps growing and is now measured in TB, PB, or EB. The key problem is how to store and analyze the data in an optimized way. This paper demonstrates how Intel Big Data Technology and Architecture help to facilitate and accelerate genomics life science research in data storage and utilization. Intel defines the high-performance GenomicsDB for variant call data query and the Lustre filesystem with Hierarchical Storage Management for genomics data storage. Based on these technologies, Intel defines a genomics knowledge sharing and exchange architecture, which has been deployed and validated at BGI China and Shanghai Children Hospital with very positive feedback. These big data technologies can be scaled to many more genomics life science partners around the world.

  2. Big Data: Astronomical or Genomical?

    Science.gov (United States)

    Stephens, Zachary D; Lee, Skylar Y; Faghri, Faraz; Campbell, Roy H; Zhai, Chengxiang; Efron, Miles J; Iyer, Ravishankar; Schatz, Michael C; Sinha, Saurabh; Robinson, Gene E

    2015-07-01

    Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a "four-headed beast"--it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discuss aspects of new technologies that will need to be developed to rise up and meet the computational challenges that genomics poses for the near future. Now is the time for concerted, community-wide planning for the "genomical" challenges of the next decade.

  3. Big Data: Astronomical or Genomical?

    Directory of Open Access Journals (Sweden)

    Zachary D Stephens

    2015-07-01

    Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a "four-headed beast"--it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discuss aspects of new technologies that will need to be developed to rise up and meet the computational challenges that genomics poses for the near future. Now is the time for concerted, community-wide planning for the "genomical" challenges of the next decade.

  4. Big Data Analytics for Genomic Medicine.

    Science.gov (United States)

    He, Karen Y; Ge, Dongliang; He, Max M

    2017-02-15

    Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large-scale various data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure exhibit challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. Additionally, we also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.

  5. Big Data Analytics for Genomic Medicine

    Science.gov (United States)

    He, Karen Y.; Ge, Dongliang; He, Max M.

    2017-01-01

    Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients’ genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large-scale various data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure exhibit challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. Additionally, we also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs. PMID:28212287

  6. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417 + 940 + 2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by an analysis of each of the three data sets alone. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  7. 'Big data', Hadoop and cloud computing in genomics.

    Science.gov (United States)

    O'Driscoll, Aisling; Daugelaite, Jurate; Sleator, Roy D

    2013-10-01

    Since the completion of the Human Genome project at the turn of the Century, there has been an unprecedented proliferation of genomic sequence data. A consequence of this is that the medical discoveries of the future will largely depend on our ability to process and analyse large genomic data sets, which continue to expand as the cost of sequencing decreases. Herein, we provide an overview of cloud computing and big data technologies, and discuss how such expertise can be used to deal with biology's big data sets. In particular, big data technologies such as the Apache Hadoop project, which provides distributed and parallelised data processing and analysis of petabyte (PB) scale data sets will be discussed, together with an overview of the current usage of Hadoop within the bioinformatics community.

  8. Tools for Extracting Actionable Medical Knowledge from Genomic Big Data

    OpenAIRE

    Goldstein, Theodore Charles

    2013-01-01

    Cancer is an ideal target for personal genomics-based medicine that uses high-throughput genome assays such as DNA sequencing, RNA sequencing, and expression analysis (collectively called omics); however, researchers and physicians are overwhelmed by the quantities of big data from these assays and cannot interpret this information accurately without specialized tools. To address this problem, I have created software methods and tools called OCCAM (OmiC data Cancer Analytic Model) and DIPSC (...

  9. The Human Genome Project: big science transforms biology and medicine.

    Science.gov (United States)

    Hood, Leroy; Rowen, Lee

    2013-01-01

    The Human Genome Project has transformed biology through its integrated big science approach to deciphering a reference human genome sequence along with the complete sequences of key model organisms. The project exemplifies the power, necessity and success of large, integrated, cross-disciplinary efforts - so-called 'big science' - directed towards complex major objectives. In this article, we discuss the ways in which this ambitious endeavor led to the development of novel technologies and analytical tools, and how it brought the expertise of engineers, computer scientists and mathematicians together with biologists. It established an open approach to data sharing and open-source software, thereby making the data resulting from the project accessible to all. The genome sequences of microbes, plants and animals have revolutionized many fields of science, including microbiology, virology, infectious disease and plant biology. Moreover, deeper knowledge of human sequence variation has begun to alter the practice of medicine. The Human Genome Project has inspired subsequent large-scale data acquisition initiatives such as the International HapMap Project, 1000 Genomes, and The Cancer Genome Atlas, as well as the recently announced Human Brain Project and the emerging Human Proteome Project.

  10. Genome-Facilitated Analyses of Geomicrobial Processes

    Energy Technology Data Exchange (ETDEWEB)

    Kenneth H. Nealson

    2012-05-02

    This project had the goal(s) of understanding the mechanism(s) of extracellular electron transport (EET) in the microbe Shewanella oneidensis MR-1, and a number of other strains and species in the genus Shewanella. The major accomplishments included sequencing, annotation, and analysis of more than 20 Shewanella genomes. The comparative genomics enabled the beginning of a systems biology approach to this genus. Another major contribution involved the study of gene regulation, primarily in the model organism, MR-1. As part of this work, we took advantage of special facilities at the DOE: e.g., the synchrotron radiation facility at ANL, where we successfully used this system for elemental characterization of single cells in different metabolic states (1). We began work with purified enzymes, and identification of partially purified enzymes, leading to initial characterization of several of the 42 c-type cytochromes from MR-1 (2). As the genome became annotated, we began experiments on transcriptome analysis under different conditions of growth, the first step towards systems biology (3,4). Conductive appendages of Shewanella, called bacterial nanowires were identified and characterized during this work (5, 11, 20,21). For the first time, it was possible to measure the electron transfer rate between single cells and a solid substrate (20), a rate that has been confirmed by several other laboratories. We also showed that MR-1 cells preferentially attach to cells at a given charge, and are not attracted, or even repelled by other charges. The interaction with the charged surfaces begins with a stimulation of motility (called electrokinesis), and eventually leads to attachment and growth. One of the things that genomics allows is the comparative analysis of the various Shewanella strains, which led to several important insights. 
    First, while the genomes predicted that none of the strains looked like they should be able to degrade N-acetyl glucosamine (NAG), the monomer

  11. Heterogeneous Cloud Framework for Big Data Genome Sequencing.

    Science.gov (United States)

    Wang, Chao; Li, Xi; Chen, Peng; Wang, Aili; Zhou, Xuehai; Yu, Hong

    2015-01-01

    Next-generation genome sequencing with short (long) reads is an emerging field in numerous scientific and big data research domains. However, data sizes and ease of access for scientific researchers are growing, and most current methodologies rely on a single acceleration approach and so cannot meet the requirements imposed by explosive data scales and complexities. In this paper, we propose a novel FPGA-based acceleration solution with a MapReduce framework on multiple hardware accelerators. The combination of hardware acceleration and the MapReduce execution flow could greatly accelerate the task of aligning short reads to a known reference genome. To evaluate the performance and other metrics, we conducted a theoretical speedup analysis on a MapReduce programming platform, which demonstrates that our proposed architecture has the potential to improve the speedup for large-scale genome sequencing applications. Also, as a practical study, we have built a hardware prototype on a real Xilinx FPGA chip. Significant metrics on speedup, sensitivity, mapping quality, error rate, and hardware cost are evaluated, respectively. Experimental results demonstrate that the proposed platform could efficiently accelerate the next-generation sequencing problem with satisfactory accuracy and acceptable hardware cost.

  12. Opportunities and challenges of big data for the social sciences: The case of genomic data.

    Science.gov (United States)

    Liu, Hexuan; Guo, Guang

    2016-09-01

    In this paper, we draw attention to one unique and valuable source of big data, genomic data, by demonstrating the opportunities they provide to social scientists. We discuss different types of large-scale genomic data and recent advances in statistical methods and computational infrastructure used to address challenges in managing and analyzing such data. We highlight how these data and methods can be used to benefit social science research.

  13. Nucleotide sequence and genomic organization of an ophiovirus associated with lettuce big-vein disease

    NARCIS (Netherlands)

    Wilk, van der F.; Dullemans, A.M.; Verbeek, M.; Heuvel, van den J.F.J.M.

    2002-01-01

    The complete nucleotide sequence of an ophiovirus associated with lettuce big-vein disease has been elucidated. The genome consisted of four RNA molecules of approximately 7.8, 1.7, 1.5 and 1.4 kb. Virus particles were shown to contain nearly equimolar amounts of RNA molecules of both polarities. Th

  14. Mapping our genes: The genome projects: How big, how fast

    Energy Technology Data Exchange (ETDEWEB)

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  15. Mapping Our Genes: The Genome Projects: How Big, How Fast

    Science.gov (United States)

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  16. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify location of interest within the genome and evaluate potential mis-annotations.

  17. Personalized medicine beyond genomics: alternative futures in big data-proteomics, environtome and the social proteome.

    Science.gov (United States)

    Özdemir, Vural; Dove, Edward S; Gürsoy, Ulvi K; Şardaş, Semra; Yıldırım, Arif; Yılmaz, Şenay Görücü; Ömer Barlas, I; Güngör, Kıvanç; Mete, Alper; Srivastava, Sanjeeva

    2017-01-01

    No field in science and medicine today remains untouched by Big Data, and psychiatry is no exception. Proteomics is a Big Data technology and a next generation biomarker, supporting novel system diagnostics and therapeutics in psychiatry. Proteomics technology is, in fact, much older than genomics and dates to the 1970s, well before the launch of the international Human Genome Project. While the genome has long been framed as the master or "elite" executive molecule in cell biology, the proteome by contrast is humble. Yet the proteome is critical for life: it ensures the daily functioning of cells and whole organisms. In short, proteins are the blue-collar workers of biology, the down-to-earth molecules that we cannot live without. Since 2010, proteomics has found renewed meaning and international attention with the launch of the Human Proteome Project and the growing interest in Big Data technologies such as proteomics. This article presents an interdisciplinary technology foresight analysis and conceptualizes the terms "environtome" and "social proteome". We define "environtome" as the entire complement of elements external to the human host, from microbiome, ambient temperature and weather conditions to government innovation policies, stock market dynamics, human values, political power and social norms that collectively shape the human host spatially and temporally. The "social proteome" is the subset of the environtome that influences the transition of proteomics technology to innovative applications in society. The social proteome encompasses, for example, new reimbursement schemes and business innovation models for proteomics diagnostics that depart from the "once-a-life-time" genotypic tests and the anticipated hype attendant to context and time sensitive proteomics tests. Building on the "nesting principle" for governance of complex systems as discussed by Elinor Ostrom, we propose here a 3-tiered organizational architecture for Big Data science such as

  18. Figure 2 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Grouping and sorting genomic data in IGV. The IGV user interface displaying 202 glioblastoma samples from TCGA. Samples are grouped by tumor subtype (second annotation column) and data type (first annotation column) and sorted by copy number of the EGFR locus (middle column). Adapted from Figure 1; Robinson et al. 2011

  19. Figure 4 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Gene-list view of genomic data. The gene-list view allows users to compare data across a set of loci. The data in this figure includes copy number, mutation, and clinical data from 202 glioblastoma samples from TCGA. Adapted from Figure 7; Thorvaldsdottir H et al. 2012

  20. Figure 5 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    Split-Screen View. The split-screen view is useful for exploring relationships of genomic features that are independent of chromosomal location. Color is used here to indicate mate pairs that map to different chromosomes, chromosomes 1 and 6, suggesting a translocation event. Adapted from Figure 8; Thorvaldsdottir H et al. 2012

  1. Big data, open science and the brain: lessons learned from genomics

    Directory of Open Access Journals (Sweden)

    Suparna Choudhury

    2014-05-01

    The BRAIN Initiative aims to break new ground in the scale and speed of data collection in neuroscience, requiring tools to handle data in the magnitude of yottabytes (10^24). The scale, investment and organization of it are being compared to the Human Genome Project (HGP), which has exemplified ‘big science’ for biology. In line with the trend towards Big Data in genomic research, the promise of the BRAIN Initiative, as well as the European Human Brain Project, rests on the possibility to amass vast quantities of data to model the complex interactions between the brain and behaviour and inform the diagnosis and prevention of neurological disorders and psychiatric disease. Advocates of this ‘data driven’ paradigm in neuroscience argue that harnessing the large quantities of data generated across laboratories worldwide has numerous methodological, ethical and economic advantages, but it requires the neuroscience community to adopt a culture of data sharing and open access to benefit from them. In this article, we examine the rationale for data sharing among advocates and briefly exemplify these in terms of new ‘open neuroscience’ projects. Then, drawing on the frequently invoked model of data sharing in genomics, we go on to demonstrate the complexities of data sharing, shedding light on the sociological and ethical challenges within the realms of institutions, researchers and participants, namely dilemmas around public/private interests in data, (lack of) motivation to share in the academic community, and potential loss of participant anonymity. Our paper serves to highlight some foreseeable tensions around data sharing relevant to the emergent ‘open neuroscience’ movement.

  2. Big data, open science and the brain: lessons learned from genomics.

    Science.gov (United States)

    Choudhury, Suparna; Fishman, Jennifer R; McGowan, Michelle L; Juengst, Eric T

    2014-01-01

    The BRAIN Initiative aims to break new ground in the scale and speed of data collection in neuroscience, requiring tools to handle data in the magnitude of yottabytes (10^24). The scale, investment and organization of it are being compared to the Human Genome Project (HGP), which has exemplified "big science" for biology. In line with the trend towards Big Data in genomic research, the promise of the BRAIN Initiative, as well as the European Human Brain Project, rests on the possibility to amass vast quantities of data to model the complex interactions between the brain and behavior and inform the diagnosis and prevention of neurological disorders and psychiatric disease. Advocates of this "data driven" paradigm in neuroscience argue that harnessing the large quantities of data generated across laboratories worldwide has numerous methodological, ethical and economic advantages, but it requires the neuroscience community to adopt a culture of data sharing and open access to benefit from them. In this article, we examine the rationale for data sharing among advocates and briefly exemplify these in terms of new "open neuroscience" projects. Then, drawing on the frequently invoked model of data sharing in genomics, we go on to demonstrate the complexities of data sharing, shedding light on the sociological and ethical challenges within the realms of institutions, researchers and participants, namely dilemmas around public/private interests in data, (lack of) motivation to share in the academic community, and potential loss of participant anonymity. Our paper serves to highlight some foreseeable tensions around data sharing relevant to the emergent "open neuroscience" movement.

  3. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  4. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  5. Barriers and Facilitators to Adoption of Genomic Services for Colorectal Care within the Veterans Health Administration

    Directory of Open Access Journals (Sweden)

    Nina R. Sperber

    2016-04-01

    We examined facilitators and barriers to adoption of genomic services for colorectal care, one of the first genomic medicine applications, within the Veterans Health Administration to shed light on areas for practice change. We conducted semi-structured interviews with 58 clinicians to understand use of the following genomic services for colorectal care: family health history documentation, molecular and genetic testing, and genetic counseling. Data collection and analysis were informed by two conceptual frameworks, the Greenhalgh Diffusion of Innovation and Andersen Behavioral Model, to allow for concurrent examination of both access and innovation factors. Specialists were more likely than primary care clinicians to obtain family history to investigate hereditary colorectal cancer (CRC), but with limited detail; clinicians suggested templates to facilitate retrieval and documentation of family history according to guidelines. Clinicians identified advantage of molecular tumor analysis prior to genetic testing, but tumor testing was infrequently used due to perceived low disease burden. Support from genetic counselors was regarded as facilitative for considering hereditary basis of CRC diagnosis, but there was variability in awareness of and access to this expertise. Our data suggest the need for tools and policies to establish and disseminate well-defined processes for accessing services and adhering to guidelines.

  6. Barriers and Facilitators to Adoption of Genomic Services for Colorectal Care within the Veterans Health Administration.

    Science.gov (United States)

    Sperber, Nina R; Andrews, Sara M; Voils, Corrine I; Green, Gregory L; Provenzale, Dawn; Knight, Sara

    2016-04-28

    We examined facilitators and barriers to adoption of genomic services for colorectal care, one of the first genomic medicine applications, within the Veterans Health Administration to shed light on areas for practice change. We conducted semi-structured interviews with 58 clinicians to understand use of the following genomic services for colorectal care: family health history documentation, molecular and genetic testing, and genetic counseling. Data collection and analysis were informed by two conceptual frameworks, the Greenhalgh Diffusion of Innovation and Andersen Behavioral Model, to allow for concurrent examination of both access and innovation factors. Specialists were more likely than primary care clinicians to obtain family history to investigate hereditary colorectal cancer (CRC), but with limited detail; clinicians suggested templates to facilitate retrieval and documentation of family history according to guidelines. Clinicians identified advantage of molecular tumor analysis prior to genetic testing, but tumor testing was infrequently used due to perceived low disease burden. Support from genetic counselors was regarded as facilitative for considering hereditary basis of CRC diagnosis, but there was variability in awareness of and access to this expertise. Our data suggest the need for tools and policies to establish and disseminate well-defined processes for accessing services and adhering to guidelines.

  7. Crowd-funded micro-grants for genomics and "big data": an actionable idea connecting small (artisan) science, infrastructure science, and citizen philanthropy.

    Science.gov (United States)

    Özdemir, Vural; Badr, Kamal F; Dove, Edward S; Endrenyi, Laszlo; Geraci, Christy Jo; Hotez, Peter J; Milius, Djims; Neves-Pereira, Maria; Pang, Tikki; Rotimi, Charles N; Sabra, Ramzi; Sarkissian, Christineh N; Srivastava, Sanjeeva; Tims, Hesther; Zgheib, Nathalie K; Kickbusch, Ilona

    2013-04-01

    Biomedical science in the 21st century is embedded in, and draws from, a digital commons and "Big Data" created by high-throughput Omics technologies such as genomics. Classic Edisonian metaphors of science and scientists (i.e., "the lone genius" or other narrow definitions of expertise) are ill equipped to harness the vast promises of the 21st century digital commons. Moreover, in medicine and life sciences, experts often under-appreciate the important contributions made by citizen scholars and lead users of innovations to design innovative products and co-create new knowledge. We believe there are a large number of users waiting to be mobilized so as to engage with Big Data as citizen scientists, but only if some funding were available. Yet many of these scholars may not meet the meta-criteria used to judge expertise, such as a track record in obtaining large research grants or a traditional academic curriculum vitae. This innovation research article describes a novel idea and action framework: micro-grants, each worth $1000, for genomics and Big Data. Though a relatively small amount at first glance, this far exceeds the annual income of the "bottom one billion" (the 1.4 billion people living below the extreme poverty level defined by the World Bank, $1.25/day). We describe two types of micro-grants. Type 1 micro-grants can be awarded through established funding agencies and philanthropies that create micro-granting programs to fund a broad and highly diverse array of small artisan labs and citizen scholars to connect genomics and Big Data with new models of discovery such as open user innovation. Type 2 micro-grants can be funded by existing or new science observatories and citizen think tanks through crowd-funding mechanisms described herein. Type 2 micro-grants would also facilitate global health diplomacy by co-creating crowd-funded micro-granting programs across nation-states in regions facing political and financial instability, while sharing similar disease

  8. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    Directory of Open Access Journals (Sweden)

    Peterson Elena S

    2012-04-01

    Full Text Available Background: The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results: VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to show the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. Conclusions: VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis

  9. The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease.

    Science.gov (United States)

    Eppig, Janan T; Blake, Judith A; Bult, Carol J; Kadin, James A; Richardson, Joel E

    2015-01-01

    The Mouse Genome Database (MGD, http://www.informatics.jax.org) serves the international biomedical research community as the central resource for integrated genomic, genetic and biological data on the laboratory mouse. To facilitate use of mouse as a model in translational studies, MGD maintains a core of high-quality curated data and integrates experimentally and computationally generated data sets. MGD maintains a unified catalog of genes and genome features, including functional RNAs, QTL and phenotypic loci. MGD curates and provides functional and phenotype annotations for mouse genes using the Gene Ontology and Mammalian Phenotype Ontology. MGD integrates phenotype data and associates mouse genotypes to human diseases, providing critical mouse-human relationships and access to repositories holding mouse models. MGD is the authoritative source of nomenclature for genes, genome features, alleles and strains following guidelines of the International Committee on Standardized Genetic Nomenclature for Mice. A new addition to MGD, the Human-Mouse: Disease Connection, allows users to explore gene-phenotype-disease relationships between human and mouse. MGD has also updated search paradigms for phenotypic allele attributes, incorporated incidental mutation data, added a module for display and exploration of genes and microRNA interactions and adopted the JBrowse genome browser. MGD resources are freely available to the scientific community.

  10. The Naked Mole Rat Genome Resource: facilitating analyses of cancer and longevity-related adaptations

    Science.gov (United States)

    Keane, Michael; Craig, Thomas; Alföldi, Jessica; Berlin, Aaron M.; Johnson, Jeremy; Seluanov, Andrei; Gorbunova, Vera; Di Palma, Federica; Lindblad-Toh, Kerstin; Church, George M.; de Magalhães, João Pedro

    2014-01-01

    Motivation: The naked mole rat (Heterocephalus glaber) is an exceptionally long-lived and cancer-resistant rodent native to East Africa. Although its genome was previously sequenced, here we report a new assembly sequenced by us with substantially higher N50 values for scaffolds and contigs. Results: We analyzed the annotation of this new improved assembly and identified candidate genomic adaptations which may have contributed to the evolution of the naked mole rat’s extraordinary traits, including in regions of p53, and the hyaluronan receptors CD44 and HMMR (RHAMM). Furthermore, we developed a freely available web portal, the Naked Mole Rat Genome Resource (http://www.naked-mole-rat.org), featuring the data and results of our analysis, to assist researchers interested in the genome and genes of the naked mole rat, and also to facilitate further studies on this fascinating species. Availability and implementation: The Naked Mole Rat Genome Resource is freely available online at http://www.naked-mole-rat.org. This resource is open source and the source code is available at https://github.com/maglab/naked-mole-rat-portal. Contact: jp@senescence.info PMID:25172923
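
    The abstract above compares assemblies by their N50 values. As a reminder of what that statistic measures, the following is a minimal sketch of the usual N50 definition (the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size). The function name and the toy length lists are illustrative assumptions and are not taken from the naked mole rat assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Two made-up assemblies with the same total size (110 units).
fragmented = [40, 30, 20, 10, 5, 5]
contiguous = [80, 20, 5, 5]

print(n50(fragmented), n50(contiguous))
```

    With the same total size, the assembly made of fewer, longer pieces has the higher N50, which is the sense in which a higher N50 indicates a more contiguous assembly.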

  11. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    Science.gov (United States)

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2016-06-08

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named LTM (logical transformation of model) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named as gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.
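
    To illustrate why reaction-level knockout predictions can be awkward to realize at the gene level, the sketch below evaluates Boolean gene-reaction rules under a set of gene knockouts. It is neither LTM nor gModel; the reaction names, gene names, and rules are hypothetical, and the nested-tuple rule encoding is an assumption made only for this example.

```python
def is_active(rule, knocked_out):
    """Evaluate a gene-reaction rule given a set of knocked-out genes.

    A rule is either a gene name (str) or a tuple ("and" | "or", *subrules).
    """
    if isinstance(rule, str):
        return rule not in knocked_out
    op, *parts = rule
    results = [is_active(part, knocked_out) for part in parts]
    return all(results) if op == "and" else any(results)


# Hypothetical rules: R1 needs the complex g1+g2 or the isozyme g3; R2 needs g1.
GPR = {
    "R1": ("or", ("and", "g1", "g2"), "g3"),
    "R2": "g1",
}


def active_reactions(knocked_out):
    """Return the set of reactions that remain active after the knockouts."""
    return {name for name, rule in GPR.items() if is_active(rule, knocked_out)}


print(active_reactions({"g1"}))        # R2 is lost, R1 survives through g3
print(active_reactions({"g2", "g3"}))  # R1 is lost, R2 survives
```

    In this toy case, removing R1 alone requires two gene deletions, while deleting g1 removes a different reaction than intended, which is the kind of mismatch between reaction-level and gene-level designs the abstract refers to.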

  12. Big Data

    Directory of Open Access Journals (Sweden)

    Prachi More

    2013-05-01

    Full Text Available Growing demand for, and a surge in, the collection and accumulation of data have given rise to the new term “Big Data”. Accidentally, incidentally, and through the interaction of people, information (so-called data) is generated on a massive scale. This Big Data is to be used smartly and effectively. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists and a wide variety of the intelligentsia debate the potential benefits and costs of analysing information from Twitter, Google, Facebook, Wikipedia and every space where large groups of people leave digital traces and deposit data. Given the rise of Big Data as both a phenomenon and a methodological persuasion, it is time to start critically interrogating this phenomenon, its assumptions and its biases. Big data, which refers to data sets that are too big to be handled using existing database management tools, is emerging in many important applications, such as Internet search, business informatics, social networks, social media, genomics, and meteorology. Big data presents a grand challenge for database and data analytics research. This paper is a blend of non-technical and introductory-level technical detail, ideal for the novice. We conclude with some technical challenges as well as solutions that can be applied to these challenges. Big Data differs from other data in five characteristics: volume, variety, value, velocity and complexity. The article will focus on some current and future cases and causes for Big Data.

  13. High-throughput genome editing and phenotyping facilitated by high resolution melting curve analysis.

    Directory of Open Access Journals (Sweden)

    Holly R Thomas

    Full Text Available With the goal to generate and characterize the phenotypes of null alleles in all genes within an organism and the recent advances in custom nucleases, genome editing limitations have moved from mutation generation to mutation detection. We previously demonstrated that High Resolution Melting (HRM) analysis is a rapid and efficient means of genotyping known zebrafish mutants. Here we establish optimized conditions for HRM based detection of novel mutant alleles. Using these conditions, we demonstrate that HRM is highly efficient at mutation detection across multiple genome editing platforms (ZFNs, TALENs, and CRISPRs); we observed nuclease generated HRM positive targeting in 1 of 6 (16%) open pool derived ZFNs, 14 of 23 (60%) TALENs, and 58 of 77 (75%) CRISPR nucleases. Successful targeting, based on HRM of G0 embryos, correlates well with successful germline transmission (46 of 47 nucleases); yet, surprisingly, mutations in the somatic tail DNA weakly correlate with mutations in the germline F1 progeny DNA. This suggests that analysis of G0 tail DNA is a good indicator of the efficiency of the nuclease, but not necessarily a good indicator of germline alleles that will be present in the F1s. However, we demonstrate that small amplicon HRM curve profiles of F1 progeny DNA can be used to differentiate between specific mutant alleles, facilitating rare allele identification and isolation; and that HRM is a powerful technique for screening possible off-target mutations that may be generated by the nucleases. Our data suggest that micro-homology based alternative NHEJ repair is primarily utilized in the generation of CRISPR mutant alleles and allows us to predict likelihood of generating a null allele. Lastly, we demonstrate that HRM can be used to quickly distinguish genotype-phenotype correlations within F1 embryos derived from G0 intercrosses. Together these data indicate that custom nucleases, in conjunction with the ease and speed of HRM, will
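
    The targeting rates quoted in the abstract above can be recomputed from the reported counts. The short sketch below does only that arithmetic; the dictionary layout and helper name are illustrative assumptions. The quoted percentages appear to be truncated to whole numbers rather than rounded.

```python
# Counts (HRM-positive nucleases, nucleases tested) as reported in the abstract.
hrm_positive = {
    "ZFN": (1, 6),
    "TALEN": (14, 23),
    "CRISPR": (58, 77),
}
# Nucleases with germline transmission among those that were HRM positive in G0.
germline = (46, 47)


def percent(hits, total):
    """Return hits/total as a percentage."""
    return 100 * hits / total


for platform, (hits, total) in hrm_positive.items():
    rate = percent(hits, total)
    print(f"{platform}: {hits}/{total} = {rate:.1f}% (truncated: {int(rate)}%)")

print(f"germline transmission: {percent(*germline):.1f}%")
```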

  14. Big Data in Plant Science: Resources and Data Mining Tools for Plant Genomics and Proteomics.

    Science.gov (United States)

    Popescu, George V; Noutsos, Christos; Popescu, Sorina C

    2016-01-01

    In modern plant biology, progress is increasingly defined by the scientists' ability to gather and analyze data sets of high volume and complexity, otherwise known as "big data". Arguably, the largest increase in the volume of plant data sets over the last decade is a consequence of the application of the next-generation sequencing and mass-spectrometry technologies to the study of experimental model and crop plants. The increase in quantity and complexity of biological data brings challenges, mostly associated with data acquisition, processing, and sharing within the scientific community. Nonetheless, big data in plant science create unique opportunities in advancing our understanding of complex biological processes at a level of accuracy without precedence, and establish a base for the plant systems biology. In this chapter, we summarize the major drivers of big data in plant science and big data initiatives in life sciences with a focus on the scope and impact of iPlant, a representative cyberinfrastructure platform for plant science.

  15. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    Science.gov (United States)

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources.

  16. LLNL's Big Science Capabilities Help Spur Over $796 Billion in U.S. Economic Activity Sequencing the Human Genome

    Energy Technology Data Exchange (ETDEWEB)

    Stewart, Jeffrey S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-07-28

    LLNL’s successful history of taking on big science projects spans beyond national security and has helped create billions of dollars per year in new economic activity. One example is LLNL’s role in helping sequence the human genome. Over $796 billion in new economic activity in over half a dozen fields has been documented since LLNL successfully completed this Grand Challenge.

  17. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle

    DEFF Research Database (Denmark)

    Daetwyler, Hans D; Capitan, Aurélien; Pausch, Hubert

    2014-01-01

    The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls. In the first phase of the 1000 bull genomes p...

  18. Bridging the gap between Big Genome Data Analysis and Database Management Systems

    NARCIS (Netherlands)

    Cijvat, C.P.

    2014-01-01

    The bioinformatics field has encountered a data deluge over the last years, due to increasing speed and decreasing cost of DNA sequencing technology. Today, sequencing the DNA of a single genome only takes about a week, and it can result in up to a terabyte of data. The sequencing data are usual

  19. High Capsid–Genome Correlation Facilitates Creation of AAV Libraries for Directed Evolution

    Science.gov (United States)

    Nonnenmacher, Mathieu; van Bakel, Harm; Hajjar, Roger J; Weber, Thomas

    2015-01-01

    Directed evolution of adeno-associated virus (AAV) through successive rounds of phenotypic selection is a powerful method to isolate variants with improved properties from large libraries of capsid mutants. Importantly, AAV libraries used for directed evolution are based on the “natural” AAV genome organization where the capsid proteins are encoded in cis from replicating genomes. This is necessary to allow the recovery of the capsid DNA after each step of phenotypic selection. For directed evolution to be used successfully, it is essential to minimize the random mixing of capsomers and the encapsidation of nonmatching viral genomes during the production of the viral libraries. Here, we demonstrate that multiple AAV capsid variants expressed from Rep/Cap containing viral genomes result in near-homogeneous capsids that display an unexpectedly high capsid–DNA correlation. Next-generation sequencing of AAV progeny generated by bulk transfection of a semi-random peptide library showed a strong counter-selection of capsid variants encoding premature stop codons, which further supports a strong capsid–genome identity correlation. Overall, our observations demonstrate that production of “natural” AAVs results in low capsid mosaicism and high capsid–genome correlation. These unique properties allow the production of highly diverse AAV libraries in a one-step procedure with a minimal loss in phenotype–genotype correlation. PMID:25586687

  20. High capsid-genome correlation facilitates creation of AAV libraries for directed evolution.

    Science.gov (United States)

    Nonnenmacher, Mathieu; van Bakel, Harm; Hajjar, Roger J; Weber, Thomas

    2015-04-01

    Directed evolution of adeno-associated virus (AAV) through successive rounds of phenotypic selection is a powerful method to isolate variants with improved properties from large libraries of capsid mutants. Importantly, AAV libraries used for directed evolution are based on the "natural" AAV genome organization where the capsid proteins are encoded in cis from replicating genomes. This is necessary to allow the recovery of the capsid DNA after each step of phenotypic selection. For directed evolution to be used successfully, it is essential to minimize the random mixing of capsomers and the encapsidation of nonmatching viral genomes during the production of the viral libraries. Here, we demonstrate that multiple AAV capsid variants expressed from Rep/Cap containing viral genomes result in near-homogeneous capsids that display an unexpectedly high capsid-DNA correlation. Next-generation sequencing of AAV progeny generated by bulk transfection of a semi-random peptide library showed a strong counter-selection of capsid variants encoding premature stop codons, which further supports a strong capsid-genome identity correlation. Overall, our observations demonstrate that production of "natural" AAVs results in low capsid mosaicism and high capsid-genome correlation. These unique properties allow the production of highly diverse AAV libraries in a one-step procedure with a minimal loss in phenotype-genotype correlation.

  1. From functional genomics to functional immunomics: new challenges, old problems, big rewards.

    Directory of Open Access Journals (Sweden)

    Ulisses M Braga-Neto

    2006-07-01

    Full Text Available The development of DNA microarray technology a decade ago led to the establishment of functional genomics as one of the most active and successful scientific disciplines today. With the ongoing development of immunomic microarray technology (a spatially addressable, large-scale technology for measurement of specific immunological response), the new challenge of functional immunomics is emerging, which bears similarities to but is also significantly different from functional genomics. Immunomic data has been successfully used to identify biological markers involved in autoimmune diseases, allergies, viral infections such as human immunodeficiency virus (HIV), influenza, diabetes, and responses to cancer vaccines. This review intends to provide a coherent vision of this nascent scientific field, and speculate on future research directions. We discuss at some length issues such as epitope prediction, immunomic microarray technology and its applications, and computation and statistical challenges related to functional immunomics. Based on the recent discovery of regulation mechanisms in T cell responses, we envision the use of immunomic microarrays as a tool for advances in systems biology of cellular immune responses, by means of immunomic regulatory network models.

  2. Single-Cell-Genomics-Facilitated Read Binning of Candidate Phylum EM19 Genomes from Geothermal Spring Metagenomes.

    Science.gov (United States)

    Becraft, Eric D; Dodsworth, Jeremy A; Murugapiran, Senthil K; Ohlsson, J Ingemar; Briggs, Brandon R; Kanbar, Jad; De Vlaminck, Iwijn; Quake, Stephen R; Dong, Hailiang; Hedlund, Brian P; Swingley, Wesley D

    2015-12-04

    The vast majority of microbial life remains uncatalogued due to the inability to cultivate these organisms in the laboratory. This "microbial dark matter" represents a substantial portion of the tree of life and of the populations that contribute to chemical cycling in many ecosystems. In this work, we leveraged an existing single-cell genomic data set representing the candidate bacterial phylum "Calescamantes" (EM19) to calibrate machine learning algorithms and define metagenomic bins directly from pyrosequencing reads derived from Great Boiling Spring in the U.S. Great Basin. Compared to other assembly-based methods, taxonomic binning with a read-based machine learning approach yielded final assemblies with the highest predicted genome completeness of any method tested. Read-first binning subsequently was used to extract Calescamantes bins from all metagenomes with abundant Calescamantes populations, including metagenomes from Octopus Spring and Bison Pool in Yellowstone National Park and Gongxiaoshe Spring in Yunnan Province, China. Metabolic reconstruction suggests that Calescamantes are heterotrophic, facultative anaerobes, which can utilize oxidized nitrogen sources as terminal electron acceptors for respiration in the absence of oxygen and use proteins as their primary carbon source. Despite their phylogenetic divergence, the geographically separate Calescamantes populations were highly similar in their predicted metabolic capabilities and core gene content, respiring O2, or oxidized nitrogen species for energy conservation in distant but chemically similar hot springs.

  3. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    NARCIS (Netherlands)

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F

    2005-01-01

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences la

  4. The genome sequence of Barbarea vulgaris facilitates the study of ecological biochemistry

    DEFF Research Database (Denmark)

    Byrne, Stephen L.; Erthmann, Pernille Østerbye; Agerbirk, Niels

    2017-01-01

    The genus Barbarea has emerged as a model for evolution and ecology of plant defense compounds, due to its unusual glucosinolate profile and production of saponins, unique to the Brassicaceae. One species, B. vulgaris, includes two ‘types’, G-type and P-type that differ in trichome density, and t...... deter larvae to the extent that they die. The B. vulgaris genome will promote the study of mechanisms in ecological biochemistry to benefit crop resistance breeding....

  5. Human CST Facilitates Genome-wide RAD51 Recruitment to GC-Rich Repetitive Sequences in Response to Replication Stress.

    Science.gov (United States)

    Chastain, Megan; Zhou, Qing; Shiva, Olga; Whitmore, Leanne; Jia, Pingping; Dai, Xueyu; Huang, Chenhui; Fadri-Moskwik, Maria; Ye, Ping; Chai, Weihang

    2016-08-01

    The telomeric CTC1/STN1/TEN1 (CST) complex has been implicated in promoting replication recovery under replication stress at genomic regions, yet its precise role is unclear. Here, we report that STN1 is enriched at GC-rich repetitive sequences genome-wide in response to hydroxyurea (HU)-induced replication stress. STN1 deficiency exacerbates the fragility of these sequences under replication stress, resulting in chromosome fragmentation. We find that upon fork stalling, CST proteins form distinct nuclear foci that colocalize with RAD51. Furthermore, replication stress induces physical association of CST with RAD51 in an ATR-dependent manner. Strikingly, CST deficiency diminishes HU-induced RAD51 foci formation and reduces RAD51 recruitment to telomeres and non-telomeric GC-rich fragile sequences. Collectively, our findings establish that CST promotes RAD51 recruitment to GC-rich repetitive sequences in response to replication stress to facilitate replication restart, thereby providing insights into the mechanism underlying genome stability maintenance.

  6. Characterizing Big Data Management

    Directory of Open Access Journals (Sweden)

    Rogério Rossi

    2015-06-01

    Full Text Available Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. However, technological resources, people and processes are crucial to facilitate the management of big data in any kind of organization, allowing information and knowledge from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: technology, people and processes. Hence, this article discusses these dimensions: the technological dimension, which relates to storage, analytics and visualization of big data; the human aspects of big data; and the process management dimension, which addresses big data management from both a technological and a business perspective.

  7. Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya; Zucker, Jeremy D.; Brislawn, Colin J.; Nicora, Carrie D.; Fansler, Sarah J.; Glaesemann, Kurt R.; Glass, Kevin; Jansson, Janet K.; Langille, Morgan

    2016-06-28

    ABSTRACT

    Soil metagenomics has been touted as the “grand challenge” for metagenomics, as the high microbial diversity and spatial heterogeneity of soils make them unamenable to current assembly platforms. Here, we aimed to improve soil metagenomic sequence assembly by applying the Moleculo synthetic long-read sequencing technology. In total, we obtained 267 Gbp of raw sequence data from a native prairie soil; these data included 109.7 Gbp of short-read data (~100 bp) from the Joint Genome Institute (JGI), an additional 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp (>1.5 kbp) from Moleculo sequencing. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and over 95% of the unassembled reads mapped to contigs of >1.5 kbp. Hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. We mapped three replicate metatranscriptomes derived from the same parent soil to the Moleculo subassembly and found that 95% of the predicted genes, based on their assignments to Enzyme Commission (EC) numbers, were expressed. The Moleculo subassembly also enabled binning of >100 microbial genome bins. We obtained via direct binning the first complete genome, that of “Candidatus Pseudomonas sp. strain JKJ-1” from a native soil metagenome. By mapping metatranscriptome sequence reads back to the bins, we found that several bins corresponding to low-relative-abundance Acidobacteria were highly transcriptionally active, whereas bins corresponding to high-relative-abundance Verrucomicrobia were not. These results demonstrate that Moleculo sequencing provides a significant advance for resolving complex soil microbial communities.

    IMPORTANCESoil microorganisms carry out key processes for life on our planet, including cycling of carbon and other nutrients and supporting growth of plants. However, there is poor molecular-level understanding of their
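    The sequencing yields quoted in the abstract of record 7 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary keys and variable names are assumptions made here, and only the three Gbp figures and the 267 Gbp total come from the record itself.

```python
# Sequencing yields in Gbp, copied from the abstract of record 7.
# The labels are descriptive names chosen for this sketch, not the authors' terms.
yields_gbp = {
    "JGI short reads (~100 bp)": 109.7,
    "rapid-mode reads (~250 bp)": 87.7,
    "Moleculo synthetic long reads (>1.5 kbp)": 69.6,
}

# The three data types should add up to the reported 267 Gbp of raw sequence.
total_gbp = round(sum(yields_gbp.values()), 1)

# Percentage of the total contributed by each data type.
shares = {name: round(100 * gbp / total_gbp, 1) for name, gbp in yields_gbp.items()}

print(f"total: {total_gbp} Gbp")
for name, pct in shares.items():
    print(f"{name}: {pct}% of raw data")
```

    The three figures sum to the stated total, and the Moleculo long reads make up roughly a quarter of the raw data by volume.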

  8. Recombination and evolution of duplicate control regions in the mitochondrial genome of the Asian big-headed turtle, Platysternon megacephalum.

    Directory of Open Access Journals (Sweden)

    Chenfei Zheng

    Full Text Available Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the completed sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss the micro-evolutionary analysis of the evolution of duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3' end, using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that any two CRs within individuals were more genetically distant from orthologous genes in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P

  9. The draft genome sequence of the ferret (Mustela putorius furo) facilitates study of human respiratory disease.

    Science.gov (United States)

    Peng, Xinxia; Alföldi, Jessica; Gori, Kevin; Eisfeld, Amie J; Tyler, Scott R; Tisoncik-Go, Jennifer; Brawand, David; Law, G Lynn; Skunca, Nives; Hatta, Masato; Gasper, David J; Kelly, Sara M; Chang, Jean; Thomas, Matthew J; Johnson, Jeremy; Berlin, Aaron M; Lara, Marcia; Russell, Pamela; Swofford, Ross; Turner-Maier, Jason; Young, Sarah; Hourlier, Thibaut; Aken, Bronwen; Searle, Steve; Sun, Xingshen; Yi, Yaling; Suresh, M; Tumpey, Terrence M; Siepel, Adam; Wisely, Samantha M; Dessimoz, Christophe; Kawaoka, Yoshihiro; Birren, Bruce W; Lindblad-Toh, Kerstin; Di Palma, Federica; Engelhardt, John F; Palermo, Robert E; Katze, Michael G

    2014-12-01

    The domestic ferret (Mustela putorius furo) is an important animal model for multiple human respiratory diseases. It is considered the 'gold standard' for modeling human influenza virus infection and transmission. Here we describe the 2.41 Gb draft genome assembly of the domestic ferret, constituting 2.28 Gb of sequence plus gaps. We annotated 19,910 protein-coding genes on this assembly using RNA-seq data from 21 ferret tissues. We characterized the ferret host response to two influenza virus infections by RNA-seq analysis of 42 ferret samples from influenza time-course data and showed distinct signatures in ferret trachea and lung tissues specific to 1918 or 2009 human pandemic influenza virus infections. Using microarray data from 16 ferret samples reflecting cystic fibrosis disease progression, we showed that transcriptional changes in the CFTR-knockout ferret lung reflect pathways of early disease that cannot be readily studied in human infants with cystic fibrosis disease.

  10. Big Data: Big Confusion? Big Challenges?

    Science.gov (United States)

    2015-05-01

    12th Annual Acquisition Research Symposium. Big Data: Big Confusion? Big Challenges? Mary Maureen... 90% of the data in the world today was created in the last two years. Big Data growth from

  11. A Drosophila melanogaster cell line (S2) facilitates post-genome functional analysis of receptors and ion channels.

    Science.gov (United States)

    Towers, Paula R; Sattelle, David B

    2002-11-01

    The complete sequencing of the genome of the fruit fly Drosophila melanogaster offers the prospect of detailed functional analysis of the extensive gene families in this genetic model organism. Comprehensive functional analysis of family members is facilitated by access to a robust, stable and inducible expression system in a fly cell line. Here we show how the Schneider S2 cell line, derived from the Drosophila embryo, provides such an expression system, with the bonus that radioligand binding studies, second messenger assays, ion imaging, patch-clamp electrophysiology and gene silencing can readily be applied. Drosophila is also ideal for the study of new control strategies for insect pests since the receptors and ion channels that many new animal health drugs and crop protection chemicals target can be expressed in this cell line. In addition, many useful orthologues of human disease genes are emerging from the Drosophila genome and the study of their functions and interactions is another area for postgenome applications of S2 cell lines.

  12. Accurate Dna Assembly And Direct Genome Integration With Optimized Uracil Excision Cloning To Facilitate Engineering Of Escherichia Coli As A Cell Factory

    DEFF Research Database (Denmark)

    Cavaleiro, Mafalda; Kim, Se Hyeuk; Nørholm, Morten

    2015-01-01

    Plants produce a vast diversity of valuable compounds with medical properties, but these are often difficult to purify from the natural source or produce by organic synthesis. An alternative is to transfer the biosynthetic pathways to an efficient production host like the bacterium Escherichia co......-excision-based cloning and combining it with a genome-engineering approach to allow direct integration of whole metabolic pathways into the genome of E. coli, to facilitate the advanced engineering of cell factories....

  13. Machine learning for Big Data analytics in plants.

    Science.gov (United States)

    Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

    2014-12-01

    Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences.

  14. Big data for health.

    Science.gov (United States)

    Andreu-Perez, Javier; Poon, Carmen C Y; Merrifield, Robert D; Wong, Stephen T C; Yang, Guang-Zhong

    2015-07-01

    This paper provides an overview of recent developments in big data in the context of biomedical and health informatics. It outlines the key characteristics of big data and how medical and health informatics, translational bioinformatics, sensor informatics, and imaging informatics will benefit from an integrated approach of piecing together different aspects of personalized information from a diverse range of data sources, both structured and unstructured, covering genomics, proteomics, metabolomics, as well as imaging, clinical diagnosis, and long-term continuous physiological sensing of an individual. It is expected that recent advances in big data will expand our knowledge for testing new hypotheses about disease management from diagnosis to prevention to personalized treatment. The rise of big data, however, also raises challenges in terms of privacy, security, data ownership, data stewardship, and governance. This paper discusses some of the existing activities and future opportunities related to big data for health, outlining some of the key underlying issues that need to be tackled.

  15. Big data, big governance

    NARCIS (Netherlands)

    Reep, Frans van der

    2016-01-01

    “Of course it is nice that my refrigerator orders milk by itself on the basis of data-related patterns. Deep learning based on big data holds great promise,” says Frans van der Reep of Inholland. No wonder that this will be a main theme at the Hannover Messe during the Wissenstag of ScienceGuide

  16. Big data, big responsibilities

    Directory of Open Access Journals (Sweden)

    Primavera De Filippi

    2014-01-01

    Full Text Available Big data refers to the collection and aggregation of large quantities of data produced by and about people, things or the interactions between them. With the advent of cloud computing, specialised data centres with powerful computational hardware and software resources can be used for processing and analysing a humongous amount of aggregated data coming from a variety of different sources. The analysis of such data is all the more valuable to the extent that it allows for specific patterns to be found and new correlations to be made between different datasets, so as to eventually deduce or infer new information, as well as to potentially predict behaviours or assess the likelihood for a certain event to occur. This article will focus specifically on the legal and moral obligations of online operators collecting and processing large amounts of data, to investigate the potential implications of big data analysis on the privacy of individual users and on society as a whole.

  17. Big, Fat World of Lipids

    Science.gov (United States)

    ... Science Home Page The Big, Fat World of Lipids By Emily Carlson Posted August 9, 2012 Cholesterol ... ways to diagnose and treat lipid-related conditions. Lipid Encyclopedia Just as genomics and proteomics spurred advances ...

  18. Clinical research of traditional Chinese medicine in big data era.

    Science.gov (United States)

    Zhang, Junhua; Zhang, Boli

    2014-09-01

    With the advent of the big data era, our thinking, technology and methodology are being transformed. Data-intensive scientific discovery based on big data, named "The Fourth Paradigm," has become a new paradigm of scientific research. Along with the development and application of Internet information technology in the field of healthcare, individual health records, clinical data of diagnosis and treatment, and genomic data have accumulated dramatically, generating big data in the medical field for clinical research and assessment. With the support of big data, the defects and weaknesses in the methodology of conventional clinical evaluation based on sampling may be overcome. Our research target shifts from "causality inference" to "correlativity analysis." This not only facilitates the evaluation of individualized treatment, disease prediction, prevention and prognosis, but is also suitable for the practice of preventive healthcare and symptom pattern differentiation for treatment in terms of traditional Chinese medicine (TCM), and for the post-marketing evaluation of Chinese patent medicines. To conduct clinical studies involving big data in the TCM domain, top-level design is needed and should be performed in an orderly manner. Fundamental construction and innovation studies should be strengthened in the areas of data platform creation, data analysis technology, and the fostering and training of big-data professionals.

  19. Big Data in industry

    Science.gov (United States)

    Latinović, T. S.; Preradović, D. M.; Barz, C. R.; Latinović, M. T.; Petrica, P. P.; Pop-Vadean, A.

    2016-08-01

    The amount of data at the global level has grown exponentially. Along with this phenomenon, we need new units of measure such as the exabyte, the zettabyte, and, last, the yottabyte to express the amount of data. This growth creates a situation in which the classic systems for the collection, storage, processing, and visualization of data are losing the battle against the large amount, speed, and variety of data that is generated continuously. Much of this data is created by the Internet of Things, IoT (cameras, satellites, cars, GPS navigation, etc.). It is our challenge to come up with new technologies and tools for the management and exploitation of these large amounts of data. Big Data has been a hot topic in IT circles in recent years. However, Big Data is also recognized in the business world, and increasingly in public administration. This paper proposes an ontology of big data analytics and examines how to enhance business intelligence through big data analytics as a service by presenting a big data analytics services-oriented architecture. This paper also discusses the interrelationship between business intelligence and big data analytics. The proposed approach in this paper might facilitate the research and development of business analytics, big data analytics, and business intelligence as well as intelligent agents.

  20. Expression of IMP1 enhances production of murine leukemia virus vector by facilitating viral genomic RNA packaging.

    Directory of Open Access Journals (Sweden)

    Yun Mai

    Full Text Available Murine leukemia virus (MLV)-based retroviral vector is widely used for gene transfer. Efficient packaging of the genomic RNA is critical for production of high-titer virus. Here, we report that expression of the insulin-like growth factor II mRNA binding protein 1 (IMP1) enhanced the production of infectious MLV vector. Overexpression of IMP1 increased the stability of viral genomic RNA in virus producer cells and packaging of the RNA into progeny virus in a dose-dependent manner. Downregulation of IMP1 in virus producer cells resulted in reduced production of the retroviral vector. These results indicate that IMP1 plays a role in regulating the packaging of MLV genomic RNA and can be used for improving production of retroviral vectors.

  1. Expression of IMP1 enhances production of murine leukemia virus vector by facilitating viral genomic RNA packaging.

    Science.gov (United States)

    Mai, Yun; Gao, Guangxia

    2010-12-29

    Murine leukemia virus (MLV)-based retroviral vector is widely used for gene transfer. Efficient packaging of the genomic RNA is critical for production of high-titer virus. Here, we report that expression of the insulin-like growth factor II mRNA binding protein 1 (IMP1) enhanced the production of infectious MLV vector. Overexpression of IMP1 increased the stability of viral genomic RNA in virus producer cells and packaging of the RNA into progeny virus in a dose-dependent manner. Downregulation of IMP1 in virus producer cells resulted in reduced production of the retroviral vector. These results indicate that IMP1 plays a role in regulating the packaging of MLV genomic RNA and can be used for improving production of retroviral vectors.

  2. Human telomeres that carry an integrated copy of human herpesvirus 6 are often short and unstable, facilitating release of the viral genome from the chromosome.

    Science.gov (United States)

    Huang, Yan; Hidalgo-Bravo, Alberto; Zhang, Enjie; Cotton, Victoria E; Mendez-Bermudez, Aaron; Wig, Gunjan; Medina-Calzada, Zahara; Neumann, Rita; Jeffreys, Alec J; Winney, Bruce; Wilson, James F; Clark, Duncan A; Dyer, Martin J; Royle, Nicola J

    2014-01-01

    Linear chromosomes are stabilized by telomeres, but the presence of short dysfunctional telomeres triggers cellular senescence in human somatic tissues, thus contributing to ageing. Approximately 1% of the population inherits a chromosomally integrated copy of human herpesvirus 6 (CI-HHV-6), but the consequences of integration for the virus and for the telomere with the insertion are unknown. Here we show that the telomere on the distal end of the integrated virus is frequently the shortest measured in somatic cells but not the germline. The telomere carrying the CI-HHV-6 is also prone to truncations that result in the formation of a short telomere at a novel location within the viral genome. We detected extra-chromosomal circular HHV-6 molecules, some surprisingly comprising the entire viral genome with a single fully reconstituted direct repeat region (DR) with both terminal cleavage and packaging elements (PAC1 and PAC2). Truncated CI-HHV-6 and extra-chromosomal circular molecules are likely reciprocal products that arise through excision of a telomere-loop (t-loop) formed within the CI-HHV-6 genome. In summary, we show that the CI-HHV-6 genome disrupts stability of the associated telomere and this facilitates the release of viral sequences as circular molecules, some of which have the potential to become fully functioning viruses.

  3. Big universe, big data

    DEFF Research Database (Denmark)

    Kremer, Jan; Stensbo-Smidt, Kristoffer; Gieseke, Fabian Cristian

    2017-01-01

    Astrophysics and cosmology are rich with data. The advent of wide-area digital cameras on large aperture telescopes has led to ever more ambitious surveys of the sky. Data volumes of entire surveys a decade ago can now be acquired in a single night and real-time analysis is often desired. Thus, modern astronomy requires big data know-how, in particular it demands highly efficient machine learning and image analysis algorithms. But scalability is not the only challenge: Astronomy applications touch several current machine learning research questions, such as learning from biased data and dealing with label and measurement noise. We argue that this makes astronomy a great domain for computer science research, as it pushes the boundaries of data analysis. In the following, we will present this exciting application area for data scientists. We will focus on exemplary results, discuss main challenges...

  4. Can fat explain the human brain's big bang evolution?-Horrobin's leads for comparative and functional genomics.

    Science.gov (United States)

    Erren, T C; Erren, M

    2004-04-01

    When David Horrobin suggested that phospholipid and fatty acid metabolism played a major role in human evolution, his 'fat utilization hypothesis' unified intriguing work from paleoanthropology, evolutionary biology, genetic and nervous system research in a novel and coherent lipid-related context. Interestingly, unlike most other evolutionary concepts, the hypothesis allows specific predictions which can be empirically tested in the near future. This paper summarizes some of Horrobin's intriguing propositions and suggests as to how approaches of comparative genomics published in Cell, Nature, Science and elsewhere since 1997 may be used to examine his evolutionary hypothesis. Indeed, systematic investigations of the genomic clock in the species' mitochondrial DNA, the Y and autosomal chromosomes as evidence of evolutionary relationships and distinctions can help to scrutinize associated predictions for their validity, namely that key mutations which differentiate us from Neanderthals and from great apes are in the genes coding for proteins which regulate fat metabolism, and particularly the phospholipid metabolism of the synapses of the brain. It is concluded that beyond clues to humans' relationships with living primates and to the Neanderthals' cognitive performance and their disappearance, the suggested molecular clock analyses may provide crucial insights into the biochemical evolution-and means of possible manipulation-of our brain.

  5. Facilitating the indirect detection of genomic DNA in an electrochemical DNA biosensor using magnetic nanoparticles and DNA ligase

    Directory of Open Access Journals (Sweden)

    Roozbeh Hushiarian

    2015-12-01

    This technique was found to be reliably repeatable. The indirect detection of genomic DNA using this method is significantly improved and showed high efficiency in small amounts of samples with the detection limit of 5.37 × 10−14 M.

  6. Test balloons? Small signs of big events: a qualitative study on circumstances facilitating adults' awareness of children's first signs of sexual abuse.

    Science.gov (United States)

    Flåm, Anna Margrete; Haugstvedt, Eli

    2013-09-01

    This research examined caregivers' awareness of children's first signs of sexual abuse. The aim was to explore circumstances that facilitate adults' awareness of first signs in everyday natural settings. Data were obtained from a Norwegian university hospital's outpatient specialty mental health clinic. Included were all cases (N=20) referred during a two-year period for treatment after the disclosure of sexual abuse that was reported to the police and child protective service. Nonabusing caregivers' awareness of first signs were recollected in hindsight as part of therapy. Qualitative analysis was conducted to capture caregivers' experiences. As identified by caregivers, all children gave signs. Thereafter, children either stopped, delayed, or immediately disclosed sexual abuse. At first signs, each child had time and attention from trusted adults, connection to the abuser, and exhibited signs of reservation against that person or related activities. Then, if met with closed answers, first signs were rebuffed as once-occurring events. If met with open answers and follow-up questions, children continued to tell. Unambiguous messages were prompted only in settings with intimate bodily activity or sexual abuse related content. In sum, when trusted adults provided door-openings, children used them; when carefully prompted, children talked; when thoughtfully asked, children told. The study suggests that children's signs of sexual abuse can be understood as "test balloons" to explore understanding and whether anything is to be done. A disclosing continuation hinges on the trusted adult's dialogical attunement and supplementary door-openings. Divergent from an idea of behavioural markers, or purposeful versus accidental disclosures, this study calls for a broader attention: Moments of first signs are embedded in dialogue. A uniqueness at moments of first signs appears: Both to form such moments and to transform them into moments of meeting for joint exploration and

  7. The Drosophila melanogaster PeptideAtlas facilitates the use of peptide data for improved fly proteomics and genome annotation

    Directory of Open Access Journals (Sweden)

    King Nichole L

    2009-02-01

    Full Text Available Abstract. Background: Crucial foundations of any quantitative systems biology experiment are correct genome and proteome annotations. Protein databases compiled from high quality empirical protein identifications that are in turn based on correct gene models increase the correctness, sensitivity, and quantitative accuracy of systems biology genome-scale experiments. Results: In this manuscript, we present the Drosophila melanogaster PeptideAtlas, a fly proteomics and genomics resource of unsurpassed depth. Based on peptide mass spectrometry data collected in our laboratory the portal http://www.drosophila-peptideatlas.org allows querying fly protein data observed with respect to gene model confirmation and splice site verification as well as for the identification of proteotypic peptides suited for targeted proteomics studies. Additionally, the database provides consensus mass spectra for observed peptides along with qualitative and quantitative information about the number of observations of a particular peptide and the sample(s) in which it was observed. Conclusion: PeptideAtlas is an open access database for the Drosophila community that has several features and applications that support (1) reduction of the complexity inherently associated with performing targeted proteomic studies, (2) designing and accelerating shotgun proteomics experiments, (3) confirming or questioning gene models, and (4) adjusting gene models such that they are in line with observed Drosophila peptides. While the database consists of proteomic data it is not required that the user is a proteomics expert.

  8. Big data

    DEFF Research Database (Denmark)

    Madsen, Anders Koed; Flyverbom, Mikkel; Hilbert, Martin

    2016-01-01

    The claim that big data can revolutionize strategy and governance in the context of international relations is increasingly hard to ignore. Scholars of international political sociology have mainly discussed this development through the themes of security and surveillance. The aim of this paper is to outline a research agenda that can be used to raise a broader set of sociological and practice-oriented questions about the increasing datafication of international relations and politics. First, it proposes a way of conceptualizing big data that is broad enough to open fruitful investigations into the emerging use of big data in these contexts. This conceptualization includes the identification of three moments contained in any big data practice. Second, it suggests a research agenda built around a set of subthemes that each deserve dedicated scrutiny when studying the interplay between big data...

  9. Big data

    DEFF Research Database (Denmark)

    Madsen, Anders Koed; Ruppert, Evelyn; Flyverbom, Mikkel

    2016-01-01

    The claim that big data can revolutionize strategy and governance in the context of international relations is increasingly hard to ignore. Scholars of international political sociology have mainly discussed this development through the themes of security and surveillance. The aim of this paper is to outline a research agenda that can be used to raise a broader set of sociological and practice-oriented questions about the increasing datafication of international relations and politics. First, it proposes a way of conceptualizing big data that is broad enough to open fruitful investigations into the emerging use of big data in these contexts. This conceptualization includes the identification of three moments that are contained in any big data practice. Secondly, it suggests a research agenda built around a set of sub-themes that each deserve dedicated scrutiny when studying the interplay between big...

  10. ZINC-INDUCED FACILITATOR-LIKE family in plants: lineage-specific expansion in monocotyledons and conserved genomic and expression features among rice (Oryza sativa) paralogs

    Directory of Open Access Journals (Sweden)

    Lopes Karina L

    2011-01-01

    Full Text Available Abstract. Background: Duplications are very common in the evolution of plant genomes, explaining the high number of members in plant gene families. New genes born after duplication can undergo pseudogenization, neofunctionalization or subfunctionalization. Rice is a model for functional genomics research, an important crop for human nutrition and a target for biofortification. Increased zinc and iron content in the rice grain could be achieved by manipulation of metal transporters. Here, we describe the ZINC-INDUCED FACILITATOR-LIKE (ZIFL) gene family in plants, and characterize the genomic structure and expression of rice paralogs, which are highly affected by segmental duplication. Results: Sequences of sixty-eight ZIFL genes, from nine plant species, were comparatively analyzed. Although related to MSF_1 proteins, ZIFL protein sequences consistently grouped separately. Specific ZIFL sequence signatures were identified. Monocots harbor a larger number of ZIFL genes in their genomes than dicots, probably a result of a lineage-specific expansion. The rice ZIFL paralogs were named OsZIFL1 to OsZIFL13 and characterized. The genomic organization of the rice ZIFL genes seems to be highly influenced by segmental and tandem duplications and concerted evolution, as rice genome contains five highly similar ZIFL gene pairs. Most rice ZIFL promoters are enriched for the core sequence of the Fe-deficiency-related box IDE1. Gene expression analyses of different plant organs, growth stages and treatments, both from our qPCR data and from microarray databases, revealed that the duplicated ZIFL gene pairs are mostly co-expressed. Transcripts of OsZIFL4, OsZIFL5, OsZIFL7, and OsZIFL12 accumulate in response to Zn-excess and Fe-deficiency in roots, two stresses with partially overlapping responses. Conclusions: We suggest that ZIFL genes have different evolutionary histories in monocot and dicot lineages. In rice, concerted evolution affected ZIFL duplicated genes

  11. Big Egos in Big Science

    DEFF Research Database (Denmark)

    Jeppesen, Jacob; Vaarst Andersen, Kristina; Lauto, Giancarlo;

    and locations, having a diverse knowledge set and capable of tackling more and more complex problems. This poses the question of whether Big Egos continue to dominate in this rising paradigm of big science. Using a dataset consisting of full bibliometric coverage from a Large Scale Research Facility, we utilize...

  12. Big Data

    OpenAIRE

    2013-01-01

    The demand for, and surge in, the collection and accumulation of data has given rise to a new term: “Big Data”. Data are generated massively, whether accidentally, incidentally, or through the interaction of people. This big data needs to be used smartly and effectively. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists and many other members of the intelligentsia debate the potential benefits and costs of analysing information from Twitter, Google,...

  13. Big Surveys, Big Data Centres

    Science.gov (United States)

    Schade, D.

    2016-06-01

    Well-designed astronomical surveys are powerful and have consistently been keystones of scientific progress. The Byurakan Surveys using a Schmidt telescope with an objective prism produced a list of about 3000 UV-excess Markarian galaxies but these objects have stimulated an enormous amount of further study and appear in over 16,000 publications. The CFHT Legacy Surveys used a wide-field imager to cover thousands of square degrees and those surveys are mentioned in over 1100 publications since 2002. Both ground and space-based astronomy have been increasing their investments in survey work. Survey instrumentation strives toward fair samples and large sky coverage and therefore strives to produce massive datasets. Thus we are faced with the "big data" problem in astronomy. Survey datasets require specialized approaches to data management. Big data places additional challenging requirements for data management. If the term "big data" is defined as data collections that are too large to move then there are profound implications for the infrastructure that supports big data science. The current model of data centres is obsolete. In the era of big data the central problem is how to create architectures that effectively manage the relationship between data collections, networks, processing capabilities, and software, given the science requirements of the projects that need to be executed. A stand alone data silo cannot support big data science. I'll describe the current efforts of the Canadian community to deal with this situation and our successes and failures. I'll talk about how we are planning in the next decade to try to create a workable and adaptable solution to support big data science.

  14. Big Opportunities and Big Concerns of Big Data in Education

    Science.gov (United States)

    Wang, Yinying

    2016-01-01

    Against the backdrop of the ever-increasing influx of big data, this article examines the opportunities and concerns over big data in education. Specifically, this article first introduces big data, followed by delineating the potential opportunities of using big data in education in two areas: learning analytics and educational policy. Then, the…

  15. Big Dreams

    Science.gov (United States)

    Benson, Michael T.

    2015-01-01

    The Keen Johnson Building is symbolic of Eastern Kentucky University's historic role as a School of Opportunity. It is a place that has inspired generations of students, many from disadvantaged backgrounds, to dream big dreams. The construction of the Keen Johnson Building was inspired by a desire to create a student union facility that would not…

  16. The complete mitochondrial genome of the cryptic "lineage A" big-fin reef squid, Sepioteuthis lessoniana (Cephalopoda: Loliginidae) in Indo-West Pacific.

    Science.gov (United States)

    Hsiao, Chung-Der; Shen, Kang-Ning; Ching, Tzu-Yun; Wang, Ya-Hsien; Ye, Jeng-Jia; Tsai, Shiou-Yi; Wu, Shan-Chun; Chen, Ching-Hung; Wang, Chia-Hui

    2016-07-01

    In this study, the complete mitogenome of the cryptic "lineage A" big-fin reef squid, Sepioteuthis lessoniana (Cephalopoda: Loliginidae), was sequenced by next-generation sequencing. The assembled mitogenome consists of 16,605 bp and includes 13 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. The overall base composition of "lineage A" S. lessoniana is 37.5% A, 17.4% C, 9.1% G, and 35.9% T, and the sequence shows 87% identity to "lineage C" S. lessoniana. It is also notable for its high T + A content (73.4%) and for two non-coding regions with TA tandem repeats. The complete mitogenome of the cryptic "lineage A" S. lessoniana provides essential DNA molecular data for further phylogeographic and evolutionary analysis of the big-fin reef squid species complex.
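    The reported percentages can be cross-checked with a few lines of arithmetic. The sketch below uses only the four base percentages quoted in the abstract; the function names are hypothetical, and the skew formula (x - y) / (x + y) is the conventional strand-asymmetry definition, which is an assumption here because the abstract itself does not report skew values.

```python
# Base composition (percent) as quoted in the abstract above.
composition = {"A": 37.5, "C": 17.4, "G": 9.1, "T": 35.9}


def at_content(comp):
    """Return the combined A + T percentage."""
    return comp["A"] + comp["T"]


def gc_content(comp):
    """Return the combined G + C percentage."""
    return comp["G"] + comp["C"]


def skew(x, y):
    """Conventional skew (x - y) / (x + y); assumed, not taken from the abstract."""
    return (x - y) / (x + y)


total = sum(composition.values())   # slightly below 100 because each value is rounded
at = at_content(composition)        # should reproduce the reported T + A content
gc = gc_content(composition)
at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"total = {total:.1f}%, A+T = {at:.1f}%, G+C = {gc:.1f}%")
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

    The A + T sum agrees with the 73.4% stated in the abstract, and the four values add up to slightly under 100%, which is consistent with rounding each percentage to one decimal place.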

  17. Transforming Big Data into cancer-relevant insight: An initial, multi-tier approach to assess reproducibility and relevance | Office of Cancer Genomics

    Science.gov (United States)

    The Cancer Target Discovery and Development (CTD^2) Network was established to accelerate the transformation of "Big Data" into novel pharmacological targets, lead compounds, and biomarkers for rapid translation into improved patient outcomes. It rapidly became clear in this collaborative network that a key central issue was to define what constitutes sufficient computational or experimental evidence to support a biologically or clinically relevant finding.

  18. The BIG Data Center: from deposition to integration to translation.

    Science.gov (United States)

    2017-01-04

    Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn.

  19. The BIG Data Center: from deposition to integration to translation

    Science.gov (United States)

    2017-01-01

    Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658

  20. Networking for big data

    CERN Document Server

    Yu, Shui; Misic, Jelena; Shen, Xuemin (Sherman)

    2015-01-01

    Networking for Big Data supplies an unprecedented look at cutting-edge research on the networking and communication aspects of Big Data. Starting with a comprehensive introduction to Big Data and its networking issues, it offers deep technical coverage of both theory and applications. The book is divided into four sections: introduction to Big Data, networking theory and design for Big Data, networking security for Big Data, and platforms and systems for Big Data applications. Focusing on key networking issues in Big Data, the book explains network design and implementation for Big Data. It exa

  1. Systems biology in the context of big data and networks.

    Science.gov (United States)

    Altaf-Ul-Amin, Md; Afendi, Farit Mochamad; Kiboi, Samuel Kuria; Kanaya, Shigehiko

    2014-01-01

    Science is going through two rapidly changing phenomena: one is the increasing capabilities of the computers and software tools from terabytes to petabytes and beyond, and the other is the advancement in high-throughput molecular biology producing piles of data related to genomes, transcriptomes, proteomes, metabolomes, interactomes, and so on. Biology has become a data intensive science and as a consequence biology and computer science have become complementary to each other bridged by other branches of science such as statistics, mathematics, physics, and chemistry. The combination of versatile knowledge has caused the advent of big-data biology, network biology, and other new branches of biology. Network biology for instance facilitates the system-level understanding of the cell or cellular components and subprocesses. It is often also referred to as systems biology. The purpose of this field is to understand organisms or cells as a whole at various levels of functions and mechanisms. Systems biology is now facing the challenges of analyzing big molecular biological data and huge biological networks. This review gives an overview of the progress in big-data biology, and data handling and also introduces some applications of networks and multivariate analysis in systems biology.

  2. Big Data Analytics in Healthcare.

    Science.gov (United States)

    Belle, Ashwin; Thiagarajan, Raghuram; Soroushmehr, S M Reza; Navidi, Fatemeh; Beard, Daniel A; Najarian, Kayvan

    2015-01-01

    The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has been recently applied towards aiding the process of care delivery and disease exploration. However, the adoption rate and research development in this space is still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image, signal, and genomics based analytics. Recent research which targets utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field which have the ability to provide meaningful impact on healthcare delivery are also examined.

  3. Big Data Analytics in Healthcare

    Directory of Open Access Journals (Sweden)

    Ashwin Belle

    2015-01-01

    Full Text Available The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has been recently applied towards aiding the process of care delivery and disease exploration. However, the adoption rate and research development in this space is still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image, signal, and genomics based analytics. Recent research which targets utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field which have the ability to provide meaningful impact on healthcare delivery are also examined.

  4. Big Data

    DEFF Research Database (Denmark)

    Aaen, Jon; Nielsen, Jeppe Agger

    2016-01-01

    Big Data presents itself as one of the most hyped technological innovations of our time, proclaimed to hold the seeds of new, valuable operational insights for private companies and public organisations. While the optimistic pronouncements are many, research on Big Data in the public sector...... has so far been limited. This article examines how the public health sector can reuse and exploit an ever-growing volume of data while taking public values into account. The article builds on a case study of the use of large amounts of health data in the Danish General Practice Database (Dansk AlmenMedicinsk Database, DAMD......). The analysis shows that the (re)use of data in new contexts is a multifaceted trade-off involving not only economic rationales and quality considerations, but also control over sensitive personal data and ethical implications for the citizen. In the DAMD case, data are on the one hand used “in the service of a good cause” to...

  5. Big Man

    Institute of Scientific and Technical Information of China (English)

    郑秀文

    2012-01-01

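    Edmond (梁炳) says that after his concert he will go travelling with his wife. Wherever on Earth the plane lands, having a companion at your side is happiness. His concert is called "Big Man"; at first I misread it as the "Big Mac" concert and wondered why it would be a big-hamburger concert. Ha! Only later did I realise my mistake. But on reflection, who has not, on the road to growing up, lived like a silly piece of bread: a lump of dough exposed to this vast world, with time and all kinds of life experience as the yeast, so that over the years, months, and days you and I all ferment and grow. Friendship, too, is yeast that spurs each other's growth. Seeing that he has long since gone from boy to man, I find that I, too, have long been unable to call myself a "girl". In my eyes he has changed a great deal; his fun-loving, outgoing personality has narrowed. The two of us now,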

  6. Big data analytics turning big data into big money

    CERN Document Server

    Ohlhorst, Frank J

    2012-01-01

    Unique insights to implement big data analytics and reap big returns to your bottom line Focusing on the business and financial value of big data analytics, respected technology journalist Frank J. Ohlhorst shares his insights on the newly emerging field of big data analytics in Big Data Analytics. This breakthrough book demonstrates the importance of analytics, defines the processes, highlights the tangible and intangible values and discusses how you can turn a business liability into actionable material that can be used to redefine markets, improve profits and identify new business opportuni

  7. Big Egos in Big Science

    DEFF Research Database (Denmark)

    Jeppesen, Jacob; Vaarst Andersen, Kristina; Lauto, Giancarlo

    In this paper we investigate the micro-mechanisms governing the structural evolution of a scientific collaboration. Empirical evidence indicates that we have transcended into a new paradigm with a new modus operandi where scientific discovery is not led by so-called lone "stars", or big egos......, but instead by a group of people, from a multitude of institutions, having a diverse knowledge set and capable of operating more and more complex instrumentation. Using a dataset consisting of full bibliometric coverage from a Large Scale Research Facility, we utilize a stochastic actor oriented model...

  8. Big Data Application in Biomedical Research and Health Care: A Literature Review.

    Science.gov (United States)

    Luo, Jake; Wu, Min; Gopukumar, Deepika; Zhao, Yiqing

    2016-01-01

    Big data technologies are increasingly used for biomedical and health-care informatics research. Large amounts of biological and clinical data have been generated and collected at an unprecedented speed and scale. For example, the new generation of sequencing technologies enables the processing of billions of DNA sequence data per day, and the application of electronic health records (EHRs) is documenting large amounts of patient data. The cost of acquiring and analyzing biomedical data is expected to decrease dramatically with the help of technology upgrades, such as the emergence of new sequencing machines, the development of novel hardware and software for parallel computing, and the extensive expansion of EHRs. Big data applications present new opportunities to discover new knowledge and create novel methods to improve the quality of health care. The application of big data in health care is a fast-growing field, with many new discoveries and methodologies published in the last five years. In this paper, we review and discuss big data application in four major biomedical subdisciplines: (1) bioinformatics, (2) clinical informatics, (3) imaging informatics, and (4) public health informatics. Specifically, in bioinformatics, high-throughput experiments facilitate the research of new genome-wide association studies of diseases, and with clinical informatics, the clinical field benefits from the vast amount of collected patient data for making intelligent decisions. Imaging informatics is now more rapidly integrated with cloud platforms to share medical image data and workflows, and public health informatics leverages big data techniques for predicting and monitoring infectious disease outbreaks, such as Ebola. In this paper, we review the recent progress and breakthroughs of big data applications in these health-care domains and summarize the challenges, gaps, and opportunities to improve and advance big data applications in health care.

  9. Big Data Application in Biomedical Research and Health Care: A Literature Review

    Science.gov (United States)

    Luo, Jake; Wu, Min; Gopukumar, Deepika; Zhao, Yiqing

    2016-01-01

    Big data technologies are increasingly used for biomedical and health-care informatics research. Large amounts of biological and clinical data have been generated and collected at an unprecedented speed and scale. For example, the new generation of sequencing technologies enables the processing of billions of DNA sequence data per day, and the application of electronic health records (EHRs) is documenting large amounts of patient data. The cost of acquiring and analyzing biomedical data is expected to decrease dramatically with the help of technology upgrades, such as the emergence of new sequencing machines, the development of novel hardware and software for parallel computing, and the extensive expansion of EHRs. Big data applications present new opportunities to discover new knowledge and create novel methods to improve the quality of health care. The application of big data in health care is a fast-growing field, with many new discoveries and methodologies published in the last five years. In this paper, we review and discuss big data application in four major biomedical subdisciplines: (1) bioinformatics, (2) clinical informatics, (3) imaging informatics, and (4) public health informatics. Specifically, in bioinformatics, high-throughput experiments facilitate the research of new genome-wide association studies of diseases, and with clinical informatics, the clinical field benefits from the vast amount of collected patient data for making intelligent decisions. Imaging informatics is now more rapidly integrated with cloud platforms to share medical image data and workflows, and public health informatics leverages big data techniques for predicting and monitoring infectious disease outbreaks, such as Ebola. In this paper, we review the recent progress and breakthroughs of big data applications in these health-care domains and summarize the challenges, gaps, and opportunities to improve and advance big data applications in health care. PMID:26843812

  10. Big data analytics methods and applications

    CERN Document Server

    Rao, BLS; Rao, SB

    2016-01-01

    This book has a collection of articles written by Big Data experts to describe some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; bench-marking of systems; and others. In addition to different applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.

  11. Databases and web tools for cancer genomics study.

    Science.gov (United States)

    Yang, Yadong; Dong, Xunong; Xie, Bingbing; Ding, Nan; Chen, Juan; Li, Yongjun; Zhang, Qian; Qu, Hongzhu; Fang, Xiangdong

    2015-02-01

    Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repository and analysis tools; and we hope such introduction will promote the awareness and facilitate the usage of these resources in the cancer research community.

  12. Databases and Web Tools for Cancer Genomics Study

    Institute of Scientific and Technical Information of China (English)

    Yadong Yang; Xunong Dong; Bingbing Xie; Nan Ding; Juan Chen; Yongjun Li; Qian Zhang; Hongzhu Qu; Xiangdong Fang

    2015-01-01

    Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repository and analysis tools; and we hope such introduction will promote the awareness and facilitate the usage of these resources in the cancer research community.

  13. Big data are coming to psychiatry: a general introduction.

    Science.gov (United States)

    Monteith, Scott; Glenn, Tasha; Geddes, John; Bauer, Michael

    2015-12-01

    Big data are coming to the study of bipolar disorder and all of psychiatry. Data are coming from providers and payers (including EMR, imaging, insurance claims and pharmacy data), from omics (genomic, proteomic, and metabolomic data), and from patients and non-providers (data from smart phone and Internet activities, sensors and monitoring tools). Analysis of the big data will provide unprecedented opportunities for exploration, descriptive observation, hypothesis generation, and prediction, and the results of big data studies will be incorporated into clinical practice. Technical challenges remain in the quality, analysis and management of big data. This paper discusses some of the fundamental opportunities and challenges of big data for psychiatry.

  14. The big bang of genome editing technology: development and application of the CRISPR/Cas9 system in disease animal models.

    Science.gov (United States)

    Shao, Ming; Xu, Tian-Rui; Chen, Ce-Shi

    2016-07-18

    Targeted genome editing technology has been widely used in biomedical studies. The CRISPR-associated RNA-guided endonuclease Cas9 has become a versatile genome editing tool. The CRISPR/Cas9 system is useful for studying gene function through efficient knock-out, knock-in or chromatin modification of the targeted gene loci in various cell types and organisms. It can be applied in a number of fields, such as genetic breeding, disease treatment and gene functional investigation. In this review, we introduce the most recent developments and applications, the challenges, and future directions of Cas9 in generating disease animal model. Derived from the CRISPR adaptive immune system of bacteria, the development trend of Cas9 will inevitably fuel the vital applications from basic research to biotechnology and bio-medicine.

  15. On Big Data Benchmarking

    OpenAIRE

    Han, Rui; Lu, Xiaoyi

    2014-01-01

    Big data systems address the challenges of capturing, storing, managing, analyzing, and visualizing big data. Within this context, developing benchmarks to evaluate and compare big data systems has become an active topic for both research and industry communities. To date, most of the state-of-the-art big data benchmarks are designed for specific types of systems. Based on our experience, however, we argue that considering the complexity, diversity, and rapid evolution of big data systems, fo...

  16. Big data=Big marketing?!

    Institute of Scientific and Technical Information of China (English)

    肖明超

    2012-01-01

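    When the Internet was just emerging, a saying was popular: "On the Internet, nobody knows you're a dog." More than 20 years later, however, that saying has long since been thrown on the rubbish heap of history. Driven by technology, and with the rapid development of the mobile Internet, social networks, e-commerce, and the like, consumers' "whereabouts" have become ever easier to track: their attention, behavioural traces, conversations, preferences, shopping experiences, and more on the Internet can all be captured, and consumers have entered an "Age of Big Data" in which they live almost transparently. Not only are data becoming more available; advances in artificial intelligence (AI) technologies, including natural language processing, pattern recognition, and machine learning, are making data easier for computers to understand,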

  17. True Randomness from Big Data

    Science.gov (United States)

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-09-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.
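    To make the idea of randomness extraction concrete, here is a minimal sketch of the classical von Neumann extractor. This is explicitly not the method of the paper above, which the abstract does not specify; it is a much older technique that only works under a strong statistical assumption (independent bits with a fixed bias), which is the kind of assumption the abstract says earlier approaches depend on. The helper `biased_bits` and its parameters are hypothetical and exist only to simulate a biased source.

```python
import random


def von_neumann_extract(bits):
    """Classical von Neumann extractor.

    Reads the input in non-overlapping pairs: (0, 1) emits 0, (1, 0) emits 1,
    and equal pairs are discarded. If the input bits are independent with a
    fixed bias, the output bits are unbiased.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out


def biased_bits(n, p_one, seed):
    """Hypothetical stand-in for a biased source: n independent bits, P(1) = p_one."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


raw = biased_bits(100_000, 0.8, seed=0)
extracted = von_neumann_extract(raw)

print("fraction of ones in raw input:", sum(raw) / len(raw))
print("fraction of ones after extraction:", sum(extracted) / len(extracted))
print("bits kept:", len(extracted), "of", len(raw))
```

    The extractor discards most of the input, and it gives no guarantee when the source bits are correlated, which is typical of real data sets.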

  18. True Randomness from Big Data

    Science.gov (United States)

    Papakonstantinou, Periklis A.; Woodruff, David P.; Yang, Guang

    2016-01-01

    Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests. PMID:27666514

  19. Facilitating Transfers

    DEFF Research Database (Denmark)

    Kjær, Poul F.

    that the essential functional and normative purpose of regulatory governance is to facilitate, stabilise and justify the transfer of condensed social components (such as economic capital and products, political decisions, legal judgements, religious beliefs and scientific knowledge) from one social context...

  20. Indian microchip for Big Bang research in Geneva

    CERN Document Server

    Bhabani, Soudhriti

    2007-01-01

    "A premier nuclear physics institute here has come up with India's first indigenously designed microchip that will facilitate research on the Big Bang theory in Geneva's CERN, the world's largest particle physics laboratory." (1 page)

  1. Big Game Reporting Stations

    Data.gov (United States)

    Vermont Center for Geographic Information — Point locations of big game reporting stations. Big game reporting stations are places where hunters can legally report harvested deer, bear, or turkey. These are...

  2. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  3. Five Big Ideas

    Science.gov (United States)

    Morgan, Debbie

    2012-01-01

    Designing quality continuing professional development (CPD) for those teaching mathematics in primary schools is a challenge. If the CPD is to be built on the scaffold of five big ideas in mathematics, what might be these five big ideas? Might it just be a case of, if you tell me your five big ideas, then I'll tell you mine? Here, there is…

  4. Big data computing

    CERN Document Server

    Akerkar, Rajendra

    2013-01-01

    Due to market forces and technological evolution, Big Data computing is developing at an increasing rate. A wide variety of novel approaches and tools have emerged to tackle the challenges of Big Data, creating both more opportunities and more challenges for students and professionals in the field of data computation and analysis. Presenting a mix of industry cases and theory, Big Data Computing discusses the technical and practical issues related to Big Data in intelligent information management. Emphasizing the adoption and diffusion of Big Data tools and technologies in industry, the book i

  5. Microsoft big data solutions

    CERN Document Server

    Jorgensen, Adam; Welch, John; Clark, Dan; Price, Christopher; Mitchell, Brian

    2014-01-01

    Tap the power of Big Data with Microsoft technologies Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with HortonWorks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and HortonWorks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies. Best of all,

  6. Transcriptome analysis of tetraploid cells identifies cyclin D2 as a facilitator of adaptation to genome doubling in the presence of p53.

    Science.gov (United States)

    Potapova, Tamara A; Seidel, Christopher W; Box, Andrew C; Rancati, Giulia; Li, Rong

    2016-10-15

    Tetraploidization, or genome doubling, is a prominent event in tumorigenesis, primarily because cell division in polyploid cells is error-prone and produces aneuploid cells. This study investigates changes in gene expression evoked in acute and adapted tetraploid cells and their effect on cell-cycle progression. Acute polyploidy was generated by knockdown of the essential regulator of cytokinesis anillin, which resulted in cytokinesis failure and formation of binucleate cells, or by chemical inhibition of Aurora kinases, causing abnormal mitotic exit with formation of single cells with aberrant nuclear morphology. Transcriptome analysis of these acute tetraploid cells revealed common signatures of activation of the tumor-suppressor protein p53. Suppression of proliferation in these cells was dependent on p53 and its transcriptional target, CDK inhibitor p21. Rare proliferating tetraploid cells can emerge from acute polyploid populations. Gene expression analysis of single cell-derived, adapted tetraploid clones showed up-regulation of several p53 target genes and cyclin D2, the activator of CDK4/6/2. Overexpression of cyclin D2 in diploid cells strongly potentiated the ability to proliferate with increased DNA content despite the presence of functional p53. These results indicate that p53-mediated suppression of proliferation of polyploid cells can be averted by increased levels of oncogenes such as cyclin D2, elucidating a possible route for tetraploidy-mediated genomic instability in carcinogenesis.

  7. Evaluation of a Phylogenetic Marker Based on Genomic Segment B of Infectious Bursal Disease Virus: Facilitating a Feasible Incorporation of this Segment to the Molecular Epidemiology Studies for this Viral Agent.

    Directory of Open Access Journals (Sweden)

    Abdulahi Alfonso-Morales

    Full Text Available Infectious bursal disease (IBD) is a highly contagious and acute viral disease, which has caused high mortality rates in birds and considerable economic losses in different parts of the world for more than two decades, and it still represents a considerable threat to poultry. The current study was designed to rigorously measure the reliability of a phylogenetic marker included in segment B. This marker can facilitate molecular epidemiology studies, incorporating this segment of the viral genome, to better explain the links between emergence, spreading and maintenance of the very virulent IBD virus (vvIBDV) strains worldwide. Sequences of the segment B gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank Database; Cuban sequences were obtained in the current work. A phylogenetic marker named B-marker was assessed by different phylogenetic principles such as saturation of substitution, phylogenetic noise and high consistency. This last parameter is based on the ability of B-marker to reconstruct the same topology as the complete segment B of the viral genome. From the results obtained from B-marker, the demographic history of both main lineages of IBDV regarding segment B was inferred by Bayesian skyline plot analysis. Phylogenetic analysis for both segments of the IBDV genome was also performed, revealing the presence of a natural reassortant strain with segment A from vvIBDV strains and segment B from non-vvIBDV strains within the Cuban IBDV population. This study contributes to a better understanding of the emergence of vvIBDV strains, describing the molecular epidemiology of IBDV using state-of-the-art methodology for phylogenetic reconstruction. This study also revealed the presence of a novel natural reassortant strain as a possible manifestation of change in the genetic structure and stability of the vvIBDV strains. Therefore, it highlights the need to obtain information about both genome segments of IBDV for

  8. Big data and visual analytics in anaesthesia and health care.

    Science.gov (United States)

    Simpao, A F; Ahumada, L M; Rehman, M A

    2015-09-01

    Advances in computer technology, patient monitoring systems, and electronic health record systems have enabled rapid accumulation of patient data in electronic form (i.e. big data). Organizations such as the Anesthesia Quality Institute and Multicenter Perioperative Outcomes Group have spearheaded large-scale efforts to collect anaesthesia big data for outcomes research and quality improvement. Analytics--the systematic use of data combined with quantitative and qualitative analysis to make decisions--can be applied to big data for quality and performance improvements, such as predictive risk assessment, clinical decision support, and resource management. Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces, and it can facilitate performance of cognitive activities involving big data. Ongoing integration of big data and analytics within anaesthesia and health care will increase demand for anaesthesia professionals who are well versed in both the medical and the information sciences.

  9. Big Data: Survey, Technologies, Opportunities, and Challenges

    Directory of Open Access Journals (Sweden)

    Nawsher Khan

    2014-01-01

    Full Text Available Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at light speed on optic fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in Big Data domination. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data.

  10. HARNESSING BIG DATA VOLUMES

    Directory of Open Access Journals (Sweden)

    Bogdan DINU

    2014-04-01

    Full Text Available Big Data can revolutionize humanity. Hidden within the huge amounts and variety of the data we are creating we may find information, facts, social insights and benchmarks that were once virtually impossible to find or were simply nonexistent. Large volumes of data allow organizations to tap in real time the full potential of all the internal or external information they possess. Big data calls for quick decisions and innovative ways to assist customers and society as a whole. Big data platforms and product portfolios will help customers fully harness the value of big data volumes. This paper deals with technical and technological issues related to handling big data volumes in the Big Data environment.

  11. Summary big data

    CERN Document Server

    2014-01-01

    This work offers a summary of the book "Big Data: A Revolution That Will Transform How we Live, Work, and Think" by Viktor Mayer-Schonberg and Kenneth Cukier. Summary of the ideas in Viktor Mayer-Schonberg's and Kenneth Cukier's book: "Big Data" explains that big data is where we use huge quantities of data to make better predictions based on the fact we identify patterns in the data rather than trying to understand the underlying causes in more detail. This summary highlights that big data will be a source of new economic value and innovation in the future. Moreover, it shows that it will

  12. Plant Genome Duplication Database.

    Science.gov (United States)

    Lee, Tae-Ho; Kim, Junah; Robertson, Jon S; Paterson, Andrew H

    2017-01-01

    Genome duplication, widespread in flowering plants, is a driving force in evolution. Genome alignments between/within genomes facilitate identification of homologous regions and individual genes to investigate evolutionary consequences of genome duplication. PGDD (the Plant Genome Duplication Database), a public web service database, provides intra- or interplant genome alignment information. At present, PGDD contains information for 47 plants whose genome sequences have been released. Here, we describe methods for identification and estimation of dates of genome duplication and speciation by functions of PGDD. The database is freely available at http://chibba.agtec.uga.edu/duplication/.

  13. Bliver big data til big business?

    DEFF Research Database (Denmark)

    Ritter, Thomas

    2015-01-01

    Denmark has a digital infrastructure, a culture of registration, and IT-competent employees and customers, which make a leading position possible, but only if companies get ready for the next big data wave.

  14. Big Boss Interval Games

    NARCIS (Netherlands)

    Alparslan-Gok, S.Z.; Brânzei, R.; Tijs, S.H.

    2008-01-01

    In this paper big boss interval games are introduced and various characterizations are given. The structure of the core of a big boss interval game is explicitly described and plays an important role relative to interval-type bi-monotonic allocation schemes for such games. Specifically, each element

  15. Big Ideas in Art

    Science.gov (United States)

    Day, Kathleen

    2008-01-01

    In this article, the author shares how she was able to discover some big ideas about art education. She relates how she found great ideas to improve her teaching from the book "Rethinking Curriculum in Art." She also shares how she designed a "Big Idea" unit in her class.

  16. Big data integration: scalability and sustainability

    KAUST Repository

    Zhang, Zhang

    2016-01-26

    Integration of various types of omics data is critically indispensable for addressing most important and complex biological questions. In the era of big data, however, data integration becomes increasingly tedious, time-consuming and expensive, posing a significant obstacle to fully exploit the wealth of big biological data. Here we propose a scalable and sustainable architecture that integrates big omics data through community-contributed modules. Community modules are contributed and maintained by different committed groups and each module corresponds to a specific data type, deals with data collection, processing and visualization, and delivers data on-demand via web services. Based on this community-based architecture, we build Information Commons for Rice (IC4R; http://ic4r.org), a rice knowledgebase that integrates a variety of rice omics data from multiple community modules, including genome-wide expression profiles derived entirely from RNA-Seq data, resequencing-based genomic variations obtained from re-sequencing data of thousands of rice varieties, plant homologous genes covering multiple diverse plant species, post-translational modifications, rice-related literatures, and community annotations. Taken together, such architecture achieves integration of different types of data from multiple community-contributed modules and accordingly features scalable, sustainable and collaborative integration of big data as well as low costs for database update and maintenance, thus helpful for building IC4R into a comprehensive knowledgebase covering all aspects of rice data and beneficial for both basic and translational researches.

  17. Modeling genomic regulatory networks with big data.

    Science.gov (United States)

    Bolouri, Hamid

    2014-05-01

    High-throughput sequencing, large-scale data generation projects, and web-based cloud computing are changing how computational biology is performed, who performs it, and what biological insights it can deliver. I review here the latest developments in available data, methods, and software, focusing on the modeling and analysis of the gene regulatory interactions in cells. Three key findings are: (i) although sophisticated computational resources are increasingly available to bench biologists, tailored ongoing education is necessary to avoid the erroneous use of these resources. (ii) Current models of the regulation of gene expression are far too simplistic and need updating. (iii) Integrative computational analysis of large-scale datasets is becoming a fundamental component of molecular biology. I discuss current and near-term opportunities and challenges related to these three points.

  18. Big Data in Caenorhabditis elegans: quo vadis?

    Science.gov (United States)

    Hutter, Harald; Moerman, Donald

    2015-11-05

    A clear definition of what constitutes "Big Data" is difficult to identify, but we find it most useful to define Big Data as a data collection that is complete. By this criterion, researchers on Caenorhabditis elegans have a long history of collecting Big Data, since the organism was selected with the idea of obtaining a complete biological description and understanding of development. The complete wiring diagram of the nervous system, the complete cell lineage, and the complete genome sequence provide a framework to phrase and test hypotheses. Given this history, it might be surprising that the number of "complete" data sets for this organism is actually rather small--not because of lack of effort, but because most types of biological experiments are not currently amenable to complete large-scale data collection. Many are also not inherently limited, so that it becomes difficult to even define completeness. At present, we only have partial data on mutated genes and their phenotypes, gene expression, and protein-protein interaction--important data for many biological questions. Big Data can point toward unexpected correlations, and these unexpected correlations can lead to novel investigations; however, Big Data cannot establish causation. As a result, there is much excitement about Big Data, but there is also a discussion on just what Big Data contributes to solving a biological problem. Because of its relative simplicity, C. elegans is an ideal test bed to explore this issue and at the same time determine what is necessary to build a multicellular organism from a single cell.

  19. Big data, big knowledge: big data for personalized healthcare.

    Science.gov (United States)

    Viceconti, Marco; Hunter, Peter; Hose, Rod

    2015-07-01

    The idea that the purely phenomenological knowledge that we can extract by analyzing large amounts of data can be useful in healthcare seems to contradict the desire of VPH researchers to build detailed mechanistic models for individual patients. But in practice no model is ever entirely phenomenological or entirely mechanistic. We propose in this position paper that big data analytics can be successfully combined with VPH technologies to produce robust and effective in silico medicine solutions. In order to do this, big data technologies must be further developed to cope with some specific requirements that emerge from this application. Such requirements are: working with sensitive data; analytics of complex and heterogeneous data spaces, including nontextual information; distributed data management under security and performance constraints; specialized analytics to integrate bioinformatics and systems biology information with clinical observations at tissue, organ and organisms scales; and specialized analytics to define the "physiological envelope" during the daily life of each patient. These domain-specific requirements suggest a need for targeted funding, in which big data technologies for in silico medicine becomes the research priority.

  20. Big data a primer

    CERN Document Server

    Bhuyan, Prachet; Chenthati, Deepak

    2015-01-01

    This book is a collection of chapters written by experts on various aspects of big data. The book aims to explain what big data is and how it is stored and used. The book starts from the fundamentals and builds up from there. It is intended to serve as a review of the state-of-the-practice in the field of big data handling. The traditional framework of relational databases can no longer provide appropriate solutions for handling big data and making it available and useful to users scattered around the globe. The study of big data covers a wide range of issues including management of heterogeneous data, big data frameworks, change management, finding patterns in data usage and evolution, data as a service, service-generated data, service management, privacy and security. All of these aspects are touched upon in this book. It also discusses big data applications in different domains. The book will prove useful to students, researchers, and practicing database and networking engineers.

  1. Malaria Genome Sequencing Project

    Science.gov (United States)

    2004-01-01

    ...facts have stimulated efforts to develop an international, coordinated strategy for malaria research and control. Development of new drugs and... facilitate the development of new drugs and vaccines, the genome... Gardner, M. I. & Tettelin, H. Interpolated Markov models for eukaryotic gene finding. Genomics 59, 24-31 (1999).

  2. Recht voor big data, big data voor recht

    NARCIS (Netherlands)

    Lafarre, Anne

    2016-01-01

    Big data is a phenomenon that can no longer be imagined away from our society. It is past the hype cycle, and the first implementations of big data techniques are being carried out. But what exactly is big data? What do the five V's that are often mentioned in relation to big data entail? By way of introduction to this

  3. Big data in oncologic imaging.

    Science.gov (United States)

    Regge, Daniele; Mazzetti, Simone; Giannini, Valentina; Bracco, Christian; Stasi, Michele

    2016-09-13

    Cancer is a complex disease and unfortunately understanding how the components of the cancer system work does not help understand the behavior of the system as a whole. In the words of the Greek philosopher Aristotle "the whole is greater than the sum of parts." To date, thanks to improved information technology infrastructures, it is possible to store data from each single cancer patient, including clinical data, medical images, laboratory tests, and pathological and genomic information. Indeed, medical archive storage constitutes approximately one-third of total global storage demand and a large part of the data are in the form of medical images. The opportunity is now to draw insight on the whole to the benefit of each individual patient. In the oncologic patient, big data analysis is at the beginning but several useful applications can be envisaged including development of imaging biomarkers to predict disease outcome, assessing the risk of X-ray dose exposure or of renal damage following the administration of contrast agents, and tracking and optimizing patient workflow. The aim of this review is to present current evidence of how big data derived from medical images may impact on the diagnostic pathway of the oncologic patient.

  4. Assessing Big Data

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2015-01-01

    In recent years, big data has been one of the most controversially discussed technologies in terms of its possible positive and negative impact. Therefore, the need for technology assessments is obvious. This paper first provides, based on the results of a technology assessment study, an overview of the potential and challenges associated with big data and then describes the problems experienced during the study as well as methods found helpful to address them. The paper concludes with reflections on how the insights from the technology assessment study may have an impact on the future governance of big data.

  5. Inhomogeneous Big Bang Cosmology

    CERN Document Server

    Wagh, S M

    2002-01-01

    In this letter, we outline an inhomogeneous model of the Big Bang cosmology. For the inhomogeneous spacetime used here, the universe originates in the infinite past as the one dominated by vacuum energy and ends in the infinite future as the one consisting of "hot and relativistic" matter. The spatial distribution of matter in the considered inhomogeneous spacetime is arbitrary. Hence, observed structures can arise in this cosmology from suitable "initial" density contrast. Different problems of the standard model of Big Bang cosmology are also resolved in the present inhomogeneous model. This inhomogeneous model of the Big Bang Cosmology predicts "hot death" for the universe.

  6. Big data for dummies

    CERN Document Server

    Hurwitz, Judith; Halper, Fern; Kaufman, Marcia

    2013-01-01

    Find the right big data solution for your business or organization. Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it m

  7. Big Data in der Cloud

    DEFF Research Database (Denmark)

    Leimbach, Timo; Bachlechner, Daniel

    2014-01-01

    Technology assessment of big data, in particular cloud based big data services, for the Office for Technology Assessment at the German federal parliament (Bundestag).

  8. Big data opportunities and challenges

    CERN Document Server

    2014-01-01

    This ebook aims to give practical guidance for all those who want to understand big data better and learn how to make the most of it. Topics range from big data analysis, mobile big data and managing unstructured data to technologies, governance and intellectual property and security issues surrounding big data.

  9. Reframing Open Big Data

    DEFF Research Database (Denmark)

    Marton, Attila; Avital, Michel; Jensen, Tina Blegind

    2013-01-01

    Recent developments in the techniques and technologies of collecting, sharing and analysing data are challenging the field of information systems (IS) research let alone the boundaries of organizations and the established practices of decision-making. Coined ‘open data’ and ‘big data’, these developments introduce an unprecedented level of societal and organizational engagement with the potential of computational data to generate new insights and information. Based on the commonalities shared by open data and big data, we develop a research framework that we refer to as open big data (OBD) by employing the dimensions of ‘order’ and ‘relationality’. We argue that these dimensions offer a viable approach for IS research on open and big data because they address one of the core value propositions of IS; i.e. how to support organizing with computational data. We contrast these dimensions with two...

  10. Big Data Revisited

    DEFF Research Database (Denmark)

    Kallinikos, Jannis; Constantiou, Ioanna

    2015-01-01

    We elaborate on key issues of our paper New games, new rules: big data and the changing context of strategy as a means of addressing some of the concerns raised by the paper’s commentators. We initially deal with the issue of social data and the role it plays in the current data revolution and the technological recording of facts. We further discuss the significance of the very mechanisms by which big data is produced as distinct from the very attributes of big data, often discussed in the literature. In the final section of the paper, we qualify the alleged importance of algorithms and claim that the structures of data capture and the architectures in which data generation is embedded are fundamental to the phenomenon of big data.

  11. Big Data as Governmentality

    DEFF Research Database (Denmark)

    Flyverbom, Mikkel; Klinkby Madsen, Anders; Rasche, Andreas

    This paper conceptualizes how large-scale data and algorithms condition and reshape knowledge production when addressing international development challenges. The concept of governmentality and four dimensions of an analytics of government are proposed as a theoretical framework to examine how big data is constituted as an aspiration to improve the data and knowledge underpinning development efforts. Based on this framework, we argue that big data’s impact on how relevant problems are governed is enabled by (1) new techniques of visualizing development issues, (2) linking aspects... shows that big data problematizes selected aspects of traditional ways to collect and analyze data for development (e.g. via household surveys). We also demonstrate that using big data analyses to address development challenges raises a number of questions that can deteriorate its impact.

  12. Boarding to Big data

    Directory of Open Access Journals (Sweden)

    Oana Claudia BRATOSIN

    2016-05-01

    Full Text Available Today Big data is an emerging topic, as the quantity of information grows exponentially, laying the foundation for its main challenge, the value of the information. The information value is not only defined by the value extraction from huge data sets, as fast and optimal as possible, but also by the value extraction from uncertain and inaccurate data, in an innovative manner using Big data analytics. At this point, the main challenge of the businesses that use Big data tools is to clearly define the scope and the necessary output of the business so that the real value can be gained. This article aims to explain the Big data concept, its various classification criteria and architecture, as well as its impact on worldwide processes.

  13. The Big Bang Singularity

    Science.gov (United States)

    Ling, Eric

    The big bang theory is a model of the universe which makes the striking prediction that the universe began a finite amount of time in the past at the so called "Big Bang singularity." We explore the physical and mathematical justification of this surprising result. After laying down the framework of the universe as a spacetime manifold, we combine physical observations with global symmetrical assumptions to deduce the FRW cosmological models which predict a big bang singularity. Next we prove a couple theorems due to Stephen Hawking which show that the big bang singularity exists even if one removes the global symmetrical assumptions. Lastly, we investigate the conditions one needs to impose on a spacetime if one wishes to avoid a singularity. The ideas and concepts used here to study spacetimes are similar to those used to study Riemannian manifolds, therefore we compare and contrast the two geometries throughout.

  14. Big Data Analytics

    Indian Academy of Sciences (India)

    2016-08-01

    The volume and variety of data being generated using computers is doubling every two years. It is estimated that in 2015, 8 Zettabytes (Zetta = 10^21) were generated which consisted mostly of unstructured data such as emails, blogs, Twitter, Facebook posts, images, and videos. This is called big data. It is possible to analyse such huge data collections with clusters of thousands of inexpensive computers to discover patterns in the data that have many applications. But analysing massive amounts of data available in the Internet has the potential of impinging on our privacy. Inappropriate analysis of big data can lead to misleading conclusions. In this article, we explain what is big data, how it is analysed, and give some case studies illustrating the potentials and pitfalls of big data analytics.

  15. Big Creek Pit Tags

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The BCPITTAGS database is used to store data from an Oncorhynchus mykiss (steelhead/rainbow trout) population dynamics study in Big Creek, a coastal stream along the...

  16. Conociendo Big Data

    Directory of Open Access Journals (Sweden)

    Juan José Camargo-Vega

    2014-12-01

    Full Text Available Taking into account the importance that the term Big Data has acquired, this research sought to study and analyze exhaustively the state of the art of Big Data; in addition, as a second objective, it analyzed the characteristics, tools, technologies, models, and standards related to Big Data; and finally it sought to identify the most relevant characteristics in the management of Big Data, so that everything concerning the central topic of the research can be known. The methodology used included reviewing the state of the art of Big Data and presenting its current situation; learning about Big Data technologies; presenting some of the NoSQL databases, which are the ones that allow data in unstructured formats to be processed; and showing the data models and the technologies for analyzing them, ending with some benefits of Big Data. The methodological design used for the research was non-experimental, since no variables are manipulated, and exploratory, because this research begins to explore the Big Data environment.

  17. Minsky on "Big Government"

    Directory of Open Access Journals (Sweden)

    Daniel de Santana Vasconcelos

    2014-03-01

    Full Text Available The objective of this paper is to assess, in light of the main works of Minsky, his view and analysis of what he called the "Big Government" as that huge institution which, in parallel with the "Big Bank", was capable of ensuring stability in the capitalist system and regulating its inherently unstable financial system in the mid-20th century. In this work, we analyze how Minsky proposes an active role for the government in a complex economic system flawed by financial instability.

  18. ANALYTICS OF BIG DATA

    OpenAIRE

    Asst. Prof. Shubhada Talegaon

    2014-01-01

    Big Data analytics has started to impact all types of organizations, as it carries the potential power to extract embedded knowledge from large amounts of data and react to it in real time. Current technology enables us to efficiently store and query large datasets; the focus is now on techniques that make use of the complete data set, instead of sampling. This has tremendous implications in areas like machine learning, pattern recognition and classification, senti...

  19. Big data need big theory too.

    Science.gov (United States)

    Coveney, Peter V; Dougherty, Edward R; Highfield, Roger R

    2016-11-13

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut to reveal correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches with particular focus on biology and medicine, which fail to provide conceptual accounts for the processes to which they are applied. No matter their 'depth' and the sophistication of data-driven methods, such as artificial neural nets, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote 'blind' big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare. This article is part of the themed issue 'Multiscale modelling at the physics-chemistry-biology interface'.

  20. Big data need big theory too

    OpenAIRE

    Coveney, Peter V.; Dougherty, Edward R; Highfield, Roger R.

    2016-01-01

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, ma...

  1. Big data need big theory too.

    OpenAIRE

    Coveney, P. V.; Dougherty, E. R.; Highfield, R. R.

    2016-01-01

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, ma...

  2. Big data need big theory too

    Science.gov (United States)

    Dougherty, Edward R.; Highfield, Roger R.

    2016-01-01

    The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut to reveal correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches with particular focus on biology and medicine, which fail to provide conceptual accounts for the processes to which they are applied. No matter their ‘depth’ and the sophistication of data-driven methods, such as artificial neural nets, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote ‘blind’ big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare. This article is part of the themed issue ‘Multiscale modelling at the physics–chemistry–biology interface’. PMID:27698035

  3. Focus : big data, little questions?

    OpenAIRE

    Uprichard, Emma

    2013-01-01

    Big data. Little data. Deep data. Surface data. Noisy, unstructured data. Big. The world of data has gone from being analogue and digital, qualitative and quantitative, transactional and a by-product, to, simply, BIG. It is as if we couldn’t quite deal with its omnipotence and just ran out of adjectives. BIG. With all the data power it is supposedly meant to entail, one might have thought that a slightly better descriptive term might have been latched onto. But, no. BIG. Just BIG.

  4. Big data challenges

    DEFF Research Database (Denmark)

    Bachlechner, Daniel; Leimbach, Timo

    2016-01-01

    Although reports on big data success stories have been accumulating in the media, most organizations dealing with high-volume, high-velocity and high-variety information assets still face challenges. Only a thorough understanding of these challenges puts organizations into a position in which they can make an informed decision for or against big data, and, if the decision is positive, overcome the challenges smoothly. The combination of a series of interviews with leading experts from enterprises, associations and research institutions, and focused literature reviews allowed not only... framework are also relevant. For large enterprises and startups specialized in big data, it is typically easier to overcome the challenges than it is for other enterprises and public administration bodies....

  5. Big and Small

    CERN Document Server

    Ekers, R D

    2010-01-01

    Technology leads discovery in astronomy, as in all other areas of science, so growth in technology leads to the continual stream of new discoveries which makes our field so fascinating. Derek de Solla Price had analysed the discovery process in science in the 1960s and he introduced the terms 'Little Science' and 'Big Science' as part of his discussion of the role of exponential growth in science. I will show how the development of astronomical facilities has followed this same trend from 'Little Science' to 'Big Science' as a field matures. We can see this in the discoveries resulting in Nobel Prizes in astronomy. A more detailed analysis of discoveries in radio astronomy shows the same effect. I include a digression to look at how science progresses, comparing the roles of prediction, serendipity, measurement and explanation. Finally I comment on the differences between the 'Big Science' culture in Physics and in Astronomy.

  6. Big Data and Cycling

    NARCIS (Netherlands)

    Romanillos, Gustavo; Zaltz Austwick, Martin; Ettema, Dick; De Kruijf, Joost

    2016-01-01

    Big Data has begun to create significant impacts in urban and transport planning. This paper covers the explosion in data-driven research on cycling, most of which has occurred in the last ten years. We review the techniques, objectives and findings of a growing number of studies we have classified

  7. Big data in history

    CERN Document Server

    Manning, Patrick

    2013-01-01

    Big Data in History introduces the project to create a world-historical archive, tracing the last four centuries of historical dynamics and change. Chapters address the archive's overall plan, how to interpret the past through a global archive, the missions of gathering records, linking local data into global patterns, and exploring the results.

  8. The big bang

    Science.gov (United States)

    Silk, Joseph

    Our universe was born billions of years ago in a hot, violent explosion of elementary particles and radiation - the big bang. What do we know about this ultimate moment of creation, and how do we know it? Drawing upon the latest theories and technology, this new edition of The Big Bang is a sweeping, lucid account of the event that set the universe in motion. Joseph Silk begins his story with the first microseconds of the big bang and continues through the evolution of stars, galaxies, clusters of galaxies, and quasars, and into the distant future of our universe. He also explores the fascinating evidence for the big bang model and recounts the history of cosmological speculation. Revised and updated, this new edition features all the most recent astronomical advances, including: photos and measurements from the Hubble Space Telescope, Cosmic Background Explorer Satellite (COBE), and Infrared Space Observatory; the latest estimates of the age of the universe; new ideas in string and superstring theory; recent experiments on neutrino detection; new theories about the presence of dark matter in galaxies; new developments in the theory of the formation and evolution of galaxies; the latest ideas about black holes, worm holes, quantum foam, and multiple universes.

  9. The Big Bang

    CERN Multimedia

    Moods, Patrick

    2006-01-01

    How did the Universe begin? The favoured theory is that everything - space, time, matter - came into existence at the same moment, around 13.7 thousand million years ago. This event was scornfully referred to as the "Big Bang" by Sir Fred Hoyle, who did not believe in it and maintained that the Universe had always existed.

  10. The Big Sky inside

    Science.gov (United States)

    Adams, Earle; Ward, Tony J.; Vanek, Diana; Marra, Nancy; Hester, Carolyn; Knuth, Randy; Spangler, Todd; Jones, David; Henthorn, Melissa; Hammill, Brock; Smith, Paul; Salisbury, Rob; Reckin, Gene; Boulafentis, Johna

    2009-01-01

    The University of Montana (UM)-Missoula has implemented a problem-based program in which students perform scientific research focused on indoor air pollution. The Air Toxics Under the Big Sky program (Jones et al. 2007; Adams et al. 2008; Ward et al. 2008) provides a community-based framework for understanding the complex relationship between poor…

  11. Big Java late objects

    CERN Document Server

    Horstmann, Cay S

    2012-01-01

    Big Java: Late Objects is a comprehensive introduction to Java and computer programming, which focuses on the principles of programming, software engineering, and effective learning. It is designed for a two-semester first course in programming for computer science students.

  12. Big Data ethics

    NARCIS (Netherlands)

    Zwitter, Andrej

    2014-01-01

    The speed of development in Big Data and associated phenomena, such as social media, has surpassed the capacity of the average consumer to understand his or her actions and their knock-on effects. We are moving towards changes in how ethics has to be perceived: away from individual decisions with sp

  13. A Big Bang Lab

    Science.gov (United States)

    Scheider, Walter

    2005-01-01

    The February 2005 issue of The Science Teacher (TST) reminded everyone that by learning how scientists study stars, students gain an understanding of how science measures things that cannot be set up in a lab, either because they are too big, too far away, or happened in the very distant past. The authors of "How Far are the Stars?" show how the…

  14. Space big book

    CERN Document Server

    Homer, Charlene

    2007-01-01

    Our combined resource includes all necessary areas of Space for grades five to eight. Get the big picture about the Solar System, Galaxies and the Universe as your students become fascinated by the interesting information about the Sun, Earth, Moon, Comets, Asteroids, Meteoroids, Stars and Constellations. Also, thrill your young astronomers as they connect Earth and space cycles with their daily life.

  15. Mouse Genome Informatics (MGI)

    Data.gov (United States)

    U.S. Department of Health & Human Services — MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human...

  16. Identifying Dwarfs Workloads in Big Data Analytics

    OpenAIRE

    Gao, Wanling; Luo, Chunjie; Zhan, Jianfeng; Ye, Hainan; He, Xiwen; Wang, Lei; Zhu, Yuqing; Tian, Xinhui

    2015-01-01

    Big data benchmarking is particularly important and provides applicable yardsticks for evaluating booming big data systems. However, wide coverage and great complexity of big data computing impose big challenges on big data benchmarking. How can we construct a benchmark suite using a minimum set of units of computation to represent diversity of big data analytics workloads? Big data dwarfs are abstractions of extracting frequently appearing operations in big data computing. One dwarf represen...

  17. Big Data and Chemical Education

    Science.gov (United States)

    Pence, Harry E.; Williams, Antony J.

    2016-01-01

    The amount of computerized information that organizations collect and process is growing so large that the term Big Data is commonly being used to describe the situation. Accordingly, Big Data is defined by a combination of the Volume, Variety, Velocity, and Veracity of the data being processed. Big Data tools are already having an impact in…

  18. Ontology for Genome Comparison and Genomic Rearrangements

    Directory of Open Access Journals (Sweden)

    Anil Wipat

    2006-04-01

    Full Text Available We present an ontology for describing genomes, genome comparisons, their evolution and biological function. This ontology will support the development of novel genome comparison algorithms and aid the community in discussing genomic evolution. It provides a framework for communication about comparative genomics, and a basis upon which further automated analysis can be built. The nomenclature defined by the ontology will foster clearer communication between biologists, and also standardize terms used by data publishers in the results of analysis programs. The overriding aim of this ontology is the facilitation of consistent annotation of genomes through computational methods, rather than human annotators. To this end, the ontology includes definitions that support computer analysis and automated transfer of annotations between genomes, rather than relying upon human mediation.

  19. The British Library Big Data Experiment: Experimental Interfaces, Experimental Teaching

    OpenAIRE

    2015-01-01

    The British Library Big Data Experiment is an ongoing collaboration between British Library Digital Research and UCL Department of Computer Science (UCLCS), facilitated by UCL Centre for Digital Humanities (UCLDH), engaging computer science students with humanities research and digital libraries as part of their core assessed work.

  20. Business and Science - Big Data, Big Picture

    Science.gov (United States)

    Rosati, A.

    2013-12-01

    Data Science is more than the creation, manipulation, and transformation of data. It is more than Big Data. The business world seems to have a hold on the term 'data science' and, for now, they define what it means. But business is very different than science. In this talk, I address how large datasets, Big Data, and data science are conceptually different in business and science worlds. I focus on the types of questions each realm asks, the data needed, and the consequences of findings. Gone are the days of datasets being created or collected to serve only one purpose or project. The trick with data reuse is to become familiar enough with a dataset to be able to combine it with other data and extract accurate results. As a Data Curator for the Advanced Cooperative Arctic Data and Information Service (ACADIS), my specialty is communication. Our team enables Arctic sciences by ensuring datasets are well documented and can be understood by reusers. Previously, I served as a data community liaison for the North American Regional Climate Change Assessment Program (NARCCAP). Again, my specialty was communicating complex instructions and ideas to a broad audience of data users. Before entering the science world, I was an entrepreneur. I have a bachelor's degree in economics and a master's degree in environmental social science. I am currently pursuing a Ph.D. in Geography. Because my background has embraced both the business and science worlds, I would like to share my perspectives on data, data reuse, data documentation, and the presentation or communication of findings. My experiences show that each can inform and support the other.

  1. How Big Are "Martin's Big Words"? Thinking Big about the Future.

    Science.gov (United States)

    Gardner, Traci

    "Martin's Big Words: The Life of Dr. Martin Luther King, Jr." tells of King's childhood determination to use "big words" through biographical information and quotations. In this lesson, students in grades 3 to 5 explore information on Dr. King to think about his "big" words, then they write about their own…

  2. Big Data and reality

    Directory of Open Access Journals (Sweden)

    Ryan Shaw

    2015-11-01

    Full Text Available DNA sequencers, Twitter, MRIs, Facebook, particle accelerators, Google Books, radio telescopes, Tumblr: what do these things have in common? According to the evangelists of “data science,” all of these are instruments for observing reality at unprecedentedly large scales and fine granularities. This perspective ignores the social reality of these very different technological systems, ignoring how they are made, how they work, and what they mean in favor of an exclusive focus on what they generate: Big Data. But no data, big or small, can be interpreted without an understanding of the process that generated them. Statistical data science is applicable to systems that have been designed as scientific instruments, but is likely to lead to confusion when applied to systems that have not. In those cases, a historical inquiry is preferable.

  3. Big Data-Survey

    Directory of Open Access Journals (Sweden)

    P.S.G. Aruna Sri

    2016-03-01

    Full Text Available Big data is the term for any collection of data sets so large and complex that it becomes hard to process using traditional data-processing applications. The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. To spot business patterns, anticipate diseases, conflict, and so on, we require larger data sets rather than smaller ones. Big data is hard to work with using most relational database management systems and desktop statistics and visualization packages; it requires instead massively parallel software running on tens, hundreds, or even larger numbers of servers. This paper surveys the Hadoop architecture, the different tools used for big data, and its security issues.

  4. Big Data as Governmentality

    DEFF Research Database (Denmark)

    Flyverbom, Mikkel; Madsen, Anders Koed; Rasche, Andreas

    This paper conceptualizes how large-scale data and algorithms condition and reshape knowledge production when addressing international development challenges. The concept of governmentality and four dimensions of an analytics of government are proposed as a theoretical framework to examine how big data is constituted as an aspiration to improve the data and knowledge underpinning development efforts. Based on this framework, we argue that big data’s impact on how relevant problems are governed is enabled by (1) new techniques of visualizing development issues, (2) linking aspects of the international development agenda to algorithms that synthesize large-scale data, (3) novel ways of rationalizing knowledge claims that underlie development policies, and (4) shifts in professional and organizational identities of those concerned with producing and processing data for development. Our discussion...

  5. Really big numbers

    CERN Document Server

    Schwartz, Richard Evan

    2014-01-01

    In the American Mathematical Society's first-ever book for kids (and kids at heart), mathematician and author Richard Evan Schwartz leads math lovers of all ages on an innovative and strikingly illustrated journey through the infinite number system. By means of engaging, imaginative visuals and endearing narration, Schwartz manages the monumental task of presenting the complex concept of Big Numbers in fresh and relatable ways. The book begins with small, easily observable numbers before building up to truly gigantic ones, like a nonillion, a tredecillion, a googol, and even ones too huge for names! Any person, regardless of age, can benefit from reading this book. Readers will find themselves returning to its pages for a very long time, perpetually learning from and growing with the narrative as their knowledge deepens. Really Big Numbers is a wonderful enrichment for any math education program and is enthusiastically recommended to every teacher, parent and grandparent, student, child, or other individual i...

  6. ANALYTICS OF BIG DATA

    Directory of Open Access Journals (Sweden)

    Prof. Shubhada Talegaon

    2015-10-01

    Full Text Available Big Data analytics has started to impact all types of organizations, as it carries the potential to extract embedded knowledge from large amounts of data and react to it in real time. Because current technology enables us to efficiently store and query large datasets, the focus is now on techniques that make use of the complete data set instead of sampling. This has tremendous implications in areas like machine learning, pattern recognition and classification, sentiment analysis, and social networking analysis, to name a few. Therefore, there are a number of requirements for moving beyond standard data mining techniques. The purpose of this paper is to understand various techniques for analyzing data.

  7. Finding the big bang

    CERN Document Server

    Page, Lyman A; Partridge, R Bruce

    2009-01-01

    Cosmology, the study of the universe as a whole, has become a precise physical science, the foundation of which is our understanding of the cosmic microwave background radiation (CMBR) left from the big bang. The story of the discovery and exploration of the CMBR in the 1960s is recalled for the first time in this collection of 44 essays by eminent scientists who pioneered the work. Two introductory chapters put the essays in context, explaining the general ideas behind the expanding universe and fossil remnants from the early stages of the expanding universe. The last chapter describes how the confusion of ideas and measurements in the 1960s grew into the present tight network of tests that demonstrate the accuracy of the big bang theory. This book is valuable to anyone interested in how science is done, and what it has taught us about the large-scale nature of the physical universe.

  8. DARPA's Big Mechanism program.

    Science.gov (United States)

    Cohen, Paul R

    2015-07-16

    Reductionist science produces causal models of small fragments of complicated systems. Causal models of entire systems can be hard to construct because what is known of them is distributed across a vast amount of literature. The Big Mechanism program aims to have machines read the literature and assemble the causal fragments found in individual papers into huge causal models, automatically. The current domain of the program is cell signalling associated with Ras-driven cancers.

  9. DARPA's Big Mechanism program

    Science.gov (United States)

    Cohen, Paul R.

    2015-07-01

    Reductionist science produces causal models of small fragments of complicated systems. Causal models of entire systems can be hard to construct because what is known of them is distributed across a vast amount of literature. The Big Mechanism program aims to have machines read the literature and assemble the causal fragments found in individual papers into huge causal models, automatically. The current domain of the program is cell signalling associated with Ras-driven cancers.

  10. Big Bang 8

    CERN Document Server

    Apolin, Martin

    2008-01-01

    Physics should be understandable and fun! That is why every chapter in Big Bang begins with a motivating overview and guiding questions, and then moves from fundamentals to applications, from the simple to the complex. Throughout, the language remains simple, grounded in everyday life, and literary. Volume 8 conveys, in an accessible way, the theory of relativity, nuclear and particle physics (and their applications in cosmology and astrophysics), nanotechnology, and bionics.

  11. Big Bang 6

    CERN Document Server

    Apolin, Martin

    2008-01-01

    Physics should be understandable and fun! That is why every chapter in Big Bang begins with a motivating overview and guiding questions, and then moves from fundamentals to applications, from the simple to the complex. Throughout, the language remains simple, grounded in everyday life, and literary. Volume 6 RG covers gravitation, oscillations and waves, thermodynamics, and an introduction to electricity, using everyday examples and cross-links to other disciplines.

  12. Big Bang 5

    CERN Document Server

    Apolin, Martin

    2007-01-01

    Physics should be understandable and fun! That is why every chapter in Big Bang begins with a motivating overview and guiding questions, and then moves from fundamentals to applications, from the simple to the complex. Throughout, the language remains simple, grounded in everyday life, and literary. Volume 5 RG covers the fundamentals (system of units, orders of magnitude) and mechanics (translation, rotation, force, conservation laws).

  13. Big Bang 7

    CERN Document Server

    Apolin, Martin

    2008-01-01

    Physics should be understandable and fun! That is why every chapter in Big Bang begins with a motivating overview and guiding questions, and then moves from fundamentals to applications, from the simple to the complex. Throughout, the language remains simple, grounded in everyday life, and literary. In addition to an introduction, Volume 7 covers many topical aspects of quantum mechanics (e.g. beaming) and electrodynamics (e.g. electrosmog), as well as climate issues and chaos theory.

  14. Big Data Knowledge Mining

    Directory of Open Access Journals (Sweden)

    Huda Umar Banuqitah

    2016-11-01

    Full Text Available The Big Data (BD) era has arrived. Big data applications have risen as data accumulation has grown beyond the ability of current software tools to capture, manage, and process it within a tolerable time. Volume is not the only characteristic that defines big data; velocity, variety, and value do as well. Many resources contain BD that should be processed. The biomedical research literature is one of many domains that hide rich knowledge. MEDLINE is a huge biomedical research database that remains a significantly underutilized source of biological information. Discovering useful knowledge in such a huge corpus raises many problems related to the type of information, such as the related concepts of the domain of the texts and the semantic relationships associated with them. In this paper, a two-level agent-based system for self-supervised relation extraction from MEDLINE using the Unified Medical Language System (UMLS) Knowledgebase is proposed. The model uses a self-supervised approach to Relation Extraction (RE), constructing enhanced training examples using information from UMLS with hybrid text features. The model incorporates the Apache Spark and HBase BD technologies with multiple data mining and machine learning techniques within a Multi Agent System (MAS). The system shows better results than the current state of the art and a naïve approach in terms of Accuracy, Precision, Recall and F-score.

  15. The NOAA Big Data Project

    Science.gov (United States)

    de la Beaujardiere, J.

    2015-12-01

    The US National Oceanic and Atmospheric Administration (NOAA) is a Big Data producer, generating tens of terabytes per day from hundreds of sensors on satellites, radars, aircraft, ships, and buoys, and from numerical models. These data are of critical importance and value for NOAA's mission to understand and predict changes in climate, weather, oceans, and coasts. In order to facilitate extracting additional value from this information, NOAA has established Cooperative Research and Development Agreements (CRADAs) with five Infrastructure-as-a-Service (IaaS) providers — Amazon, Google, IBM, Microsoft, Open Cloud Consortium — to determine whether hosting NOAA data in publicly-accessible Clouds alongside on-demand computational capability stimulates the creation of new value-added products and services and lines of business based on the data, and if the revenue generated by these new applications can support the costs of data transmission and hosting. Each IaaS provider is the anchor of a "Data Alliance" which organizations or entrepreneurs can join to develop and test new business or research avenues. This presentation will report on progress and lessons learned during the first 6 months of the 3-year CRADAs.

  16. Disaggregating asthma: Big investigation versus big data.

    Science.gov (United States)

    Belgrave, Danielle; Henderson, John; Simpson, Angela; Buchan, Iain; Bishop, Christopher; Custovic, Adnan

    2017-02-01

    We are facing a major challenge in bridging the gap between identifying subtypes of asthma to understand causal mechanisms and translating this knowledge into personalized prevention and management strategies. In recent years, "big data" has been sold as a panacea for generating hypotheses and driving new frontiers of health care; the idea that the data must and will speak for themselves is fast becoming a new dogma. One of the dangers of ready accessibility of health care data and computational tools for data analysis is that the process of data mining can become uncoupled from the scientific process of clinical interpretation, understanding the provenance of the data, and external validation. Although advances in computational methods can be valuable for using unexpected structure in data to generate hypotheses, there remains a need for testing hypotheses and interpreting results with scientific rigor. We argue for combining data- and hypothesis-driven methods in a careful synergy, and the importance of carefully characterized birth and patient cohorts with genetic, phenotypic, biological, and molecular data in this process cannot be overemphasized. The main challenge on the road ahead is to harness bigger health care data in ways that produce meaningful clinical interpretation and to translate this into better diagnoses and properly personalized prevention and treatment plans. There is a pressing need for cross-disciplinary research with an integrative approach to data science, whereby basic scientists, clinicians, data analysts, and epidemiologists work together to understand the heterogeneity of asthma.

  17. Vertical landscraping, a big regionalism for Dubai.

    Science.gov (United States)

    Wilson, Matthew

    2010-01-01

    Dubai's ecologic and economic complications are exacerbated by six years of accelerated expansion, a fixed top-down approach to urbanism and the construction of iconic single-phase mega-projects. With recent construction delays, project cancellations and growing landscape issues, Dubai's tower typologies have been unresponsive to changing environmental, socio-cultural and economic patterns (BBC, 2009; Gillet, 2009; Lewis, 2009). In this essay, a theory of "Big Regionalism" guides an argument for an economically and ecologically linked tower typology called the Condenser. This phased "box-to-tower" typology is part of a greater Landscape Urbanist strategy called Vertical Landscraping. Within this strategy, the Condenser's role is to densify the city, facilitating the creation of ecologic voids that order the urban region. Delineating "Big Regional" principles, the Condenser provides a time-based, global-local urban growth approach that weaves Bigness into a series of urban-regional, economic and ecological relationships, builds upon the environmental performance of the city's regional architecture and planning, promotes a continuity of Dubai's urban history, and responds to its landscape issues while condensing development. These speculations permit consideration of the overlooked opportunities embedded within Dubai's mega-projects and their long-term impact on the urban morphology.

  18. Navigating a Sea of Big Data

    Science.gov (United States)

    Kinkade, D.; Chandler, C. L.; Groman, R. C.; Shepherd, A.; Allison, M. D.; Rauch, S.; Wiebe, P. H.; Glover, D. M.

    2014-12-01

    Oceanographic research is evolving rapidly. New technologies, strategies, and related infrastructures have catalyzed a change in the nature of oceanographic data. Heterogeneous and complex data types can be produced and transferred at great speeds. This shift in volume, variety, and velocity of data produced has led to increased challenges in managing these Big Data. In addition, distributed research communities have greater needs for data quality control, discovery and public accessibility, and seamless integration for interdisciplinary study. Organizations charged with curating oceanographic data must also evolve to meet these needs and challenges, by employing new technologies and strategies. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created in 2006, to fulfill the data management needs of investigators funded by the NSF Ocean Sciences Biological and Chemical Sections and Polar Programs Antarctic Organisms and Ecosystems Program. Since its inception, the Office has had to modify internal systems and operations to address Big Data challenges to meet the needs of the ever-evolving oceanographic research community. Some enhancements include automated procedures replacing labor-intensive manual tasks, adoption of metadata standards facilitating machine client access, a geospatial interface and the use of Semantic Web technologies to increase data discovery and interoperability. This presentation will highlight some of the BCO-DMO advances that enable us to successfully fulfill our mission in a Big Data world.

  19. Leveraging big data to transform target selection and drug discovery

    Science.gov (United States)

    Butte, AJ

    2016-01-01

    The advances of genomics, sequencing, and high throughput technologies have led to the creation of large volumes of diverse datasets for drug discovery. Analyzing these datasets to better understand disease and discover new drugs is becoming more common. Recent open data initiatives in basic and clinical research have dramatically increased the types of data available to the public. The past few years have witnessed successful use of big data in many sectors across the whole drug discovery pipeline. In this review, we will highlight the state of the art in leveraging big data to identify new targets, drug indications, and drug response biomarkers in this era of precision medicine. PMID:26659699

  20. Learning facilitating leadership

    DEFF Research Database (Denmark)

    Rasmussen, Lauge Baungaard; Hansen, Mette Sanne

    2016-01-01

    in teaching facilitation and the literature. These types of skills are most effectively acquired by combining conceptual lectures, classroom exercises and the facilitation of groups in a real-life context. The paper also reflects certain ‘shadow sides’ related to facilitation observed by the students...

  1. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

    Science.gov (United States)

    Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol

    2016-01-01

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
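
    The k-mer-based distances named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy sequences, the choice k = 4, and the function names are assumptions made for illustration, and exact k-mer sets are used where the Mash tool would estimate the Jaccard index from MinHash sketches. The last comparison shows how a truncated sequence with no nucleotide differences still receives a non-zero k-mer distance, which is the "absent data treated as informative" effect the abstract describes.

```python
import math


def kmer_set(sequence: str, k: int) -> set[str]:
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Jaccard distance: 1 - |A & B| / |A | B| (0.0 for identical profiles)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mash_distance(jaccard_index: float, k: int) -> float:
    """Mash-style distance from a Jaccard index j: -(1/k) * ln(2j / (1 + j))."""
    if jaccard_index <= 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    return -math.log(2.0 * jaccard_index / (1.0 + jaccard_index)) / k


# Hypothetical toy sequences (assumption: real inputs would be whole genomes).
K = 4
seq_a = "ACGTACGTGA"
seq_b = "ACGTACGTGC"   # differs from seq_a at the last base only
seq_c = seq_a[:7]      # truncated copy of seq_a, no nucleotide differences

d_ab = jaccard_distance(kmer_set(seq_a, K), kmer_set(seq_b, K))
d_ac = jaccard_distance(kmer_set(seq_a, K), kmer_set(seq_c, K))
print(f"Jaccard distance a-b: {d_ab:.4f}")
print(f"Jaccard distance a-c (truncation only): {d_ac:.4f}")
print(f"Mash-style distance a-b: {mash_distance(1.0 - d_ab, K):.4f}")
```

    In this sketch the truncated copy ends up farther from the original (distance 1/3) than the sequence carrying a real substitution (distance 2/7), which is why a site-based comparison can be preferable when assemblies are incomplete.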

  2. Single-cell Transcriptome Study as Big Data

    Institute of Scientific and Technical Information of China (English)

    Pingjian Yu; Wei Lin

    2016-01-01

    The rapid growth of single-cell RNA-seq studies (scRNA-seq) demands efficient data storage, processing, and analysis. Big-data technology provides a framework that facilitates the comprehensive discovery of biological signals from inter-institutional scRNA-seq datasets. The strategies to solve the stochastic and heterogeneous single-cell transcriptome signal are discussed in this article. After extensively reviewing the available big-data applications of next-generation sequencing (NGS)-based studies, we propose a workflow that accounts for the unique characteristics of scRNA-seq data and primary objectives of single-cell studies.

  3. Transforming business models through big data in the textile industry

    DEFF Research Database (Denmark)

    Aagaard, Annabeth

    as stressed by Zott et al. (2011), Weill et al. (2011) and David J. Teece (2010: 174), who states that: “the concept of a business model lacks theoretical grounding in economics or in business studies”. With the acceleration of digitization and use of big data analytics quality data are accessible......, such as textile, and have led to disruption of established business models (Westerman et al., 2014; Weill and Woerner, 2015). Yet, little is known of the managerial process and facilitation of the digital transformation of business models through big data (McAfee and Brynjolfsson, 2012; Markus and Loebbecke, 2013)....

  4. Single-cell Transcriptome Study as Big Data

    Science.gov (United States)

    Yu, Pingjian; Lin, Wei

    2016-01-01

    The rapid growth of single-cell RNA-seq studies (scRNA-seq) demands efficient data storage, processing, and analysis. Big-data technology provides a framework that facilitates the comprehensive discovery of biological signals from inter-institutional scRNA-seq datasets. The strategies to solve the stochastic and heterogeneous single-cell transcriptome signal are discussed in this article. After extensively reviewing the available big-data applications of next-generation sequencing (NGS)-based studies, we propose a workflow that accounts for the unique characteristics of scRNA-seq data and primary objectives of single-cell studies. PMID:26876720

  5. Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics

    Directory of Open Access Journals (Sweden)

    Ming-Hua Chung

    2015-04-01

    Full Text Available The advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet, researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information of a database has not been fully utilized, particularly the information embedded in the interactions between data points, which are largely ignored and buried. For the past ten years, probabilistic topic modeling has been recognized as an effective machine learning algorithm to annotate the hidden thematic structure of massive collections of documents. The analogy between text corpus and large-scale genomic data enables the application of text mining tools, like probabilistic topic models, to explore hidden patterns of genomic data and to the extension of altered biological functions. In this paper, we developed a generalized probabilistic topic model to analyze a toxicogenomics dataset that consists of a large number of gene expression data from rat livers treated with drugs at multiple doses and time-points. We discovered the hidden patterns in gene expression associated with the effect of doses and time-points of treatment. Finally, we illustrated the ability of our model to identify the evidence of potential reduction of animal use.

  6. How Big is Earth?

    Science.gov (United States)

    Thurber, Bonnie B.

    2015-08-01

    How Big is Earth celebrates the Year of Light. Using only the sunlight striking the Earth and a wooden dowel, students meet each other and then measure the circumference of the earth. Eratosthenes did it over 2,000 years ago. In Cosmos, Carl Sagan shared the process by which Eratosthenes measured the angle of the shadow cast at local noon when sunlight strikes a stick positioned perpendicular to the ground. By comparing his measurement to another made a distance away, Eratosthenes was able to calculate the circumference of the earth. How Big is Earth provides an online learning environment where students do science the same way Eratosthenes did. A notable project in which this was done was The Eratosthenes Project, conducted in 2005 as part of the World Year of Physics; in fact, we will be drawing on the teacher's guide developed by that project. How Big Is Earth? expands on the Eratosthenes project by providing an online learning environment provided by the iCollaboratory, www.icollaboratory.org, where teachers and students from Sweden, China, Nepal, Russia, Morocco, and the United States collaborate, share data, and reflect on their learning of science and astronomy. They are sharing their information and discussing their ideas/brainstorming the solutions in a discussion forum. There is an ongoing database of student measurements and another database to collect data on both teacher and student learning from surveys, discussions, and self-reflection done online. We will share our research about the kinds of learning that takes place only in global collaborations. The entrance address for the iCollaboratory is http://www.icollaboratory.org.
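
    The shadow measurement described in this abstract reduces to a short calculation, sketched below. The function names are hypothetical, and the sketch assumes the two sites lie roughly on the same north-south line and that both shadows are measured at local noon on the same day. The sample figures (7.2 degrees and 5,000 stadia) are the values traditionally attributed to Eratosthenes, not data from the project.

```python
import math


def shadow_angle_deg(stick_height: float, shadow_length: float) -> float:
    """Angle of the Sun from the vertical, from a vertical stick and its noon shadow."""
    return math.degrees(math.atan2(shadow_length, stick_height))


def circumference(angle_a_deg: float, angle_b_deg: float, distance: float) -> float:
    """Scale the north-south distance between two sites up to a full circle.

    The difference in shadow angles is the share of 360 degrees spanned by
    the arc between the sites, so circumference = distance * 360 / difference.
    """
    delta = abs(angle_a_deg - angle_b_deg)
    if delta == 0:
        raise ValueError("the two shadow angles must differ")
    return distance * 360.0 / delta


# Traditional figures: no shadow at one site, 7.2 degrees at the other,
# and 5,000 stadia between them.
print(circumference(7.2, 0.0, 5000.0))

# A classroom-style measurement: a 1.00 m dowel casting a 0.25 m shadow.
print(shadow_angle_deg(1.00, 0.25))
```

    The result is expressed in whatever unit the distance between the sites was measured in, so two schools that know their north-south separation in kilometres obtain the circumference in kilometres.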

  7. Privacy and Big Data

    CERN Document Server

    Craig, Terence

    2011-01-01

    Much of what constitutes Big Data is information about us. Through our online activities, we leave an easy-to-follow trail of digital footprints that reveal who we are, what we buy, where we go, and much more. This eye-opening book explores the raging privacy debate over the use of personal data, with one undeniable conclusion: once data's been collected, we have absolutely no control over who uses it or how it is used. Personal data is the hottest commodity on the market today-truly more valuable than gold. We are the asset that every company, industry, non-profit, and government wants. Pri

  8. Big Data Challenges

    Directory of Open Access Journals (Sweden)

    Alexandru Adrian TOLE

    2013-10-01

    Full Text Available The amount of data traveling across the internet today is not only large but complex as well. Companies, institutions, healthcare systems and others all use piles of data, which are further used to create reports in order to ensure the continuity of the services they offer. The processing behind the results that these entities request represents a challenge for software developers and for companies that provide IT infrastructure. The challenge is how to manipulate an impressive volume of data that has to be securely delivered through the internet and reach its destination intact. This paper treats the challenges that Big Data creates.

  9. Can Pleasant Goat and Big Big Wolf Save China's Animation Industry?

    Institute of Scientific and Technical Information of China (English)

    Guo Liqin

    2009-01-01

    "My dream husband is Big Big Wolf," claimed Miss Fang, a young lady who works in the KPMG Beijing office. Big Big Wolf is a lovable cartoon wolf who appears in Pleasant Goat and Big Big Wolf, an animated series produced independently in China.

  10. Asteroids Were Born Big

    CERN Document Server

    Morbidelli, Alessandro; Nesvorny, David; Levison, Harold F

    2009-01-01

    How big were the first planetesimals? We attempt to answer this question by conducting coagulation simulations in which the planetesimals grow by mutual collisions and form larger bodies and planetary embryos. The size frequency distribution (SFD) of the initial planetesimals is considered a free parameter in these simulations, and we search for the one that produces at the end objects with a SFD that is consistent with asteroid belt constraints. We find that, if the initial planetesimals were small (e.g. km-sized), the final SFD fails to fulfill these constraints. In particular, reproducing the bump observed at diameter D~100km in the current SFD of the asteroids requires that the minimal size of the initial planetesimals was also ~100km. This supports the idea that planetesimals formed big, namely that the size of solids in the proto-planetary disk ``jumped'' from sub-meter scale to multi-kilometer scale, without passing through intermediate values. Moreover, we find evidence that the initial planetesimals ...

  11. Visual explorer facilitator's guide

    CERN Document Server

    Palus, Charles J

    2010-01-01

    Grounded in research and practice, the Visual Explorer™ Facilitator's Guide provides a method for supporting collaborative, creative conversations about complex issues through the power of images. The guide is available as a component in the Visual Explorer Facilitator's Letter-sized Set, Visual Explorer Facilitator's Post card-sized Set, Visual Explorer Playing Card-sized Set, and is also available as a stand-alone title for purchase to assist multiple tool users in an organization.

  12. Passport to the Big Bang

    CERN Multimedia

    De Melis, Cinzia

    2013-01-01

    On 2 June 2013, at a major public event, CERN launches a scientific tourist trail through the Pays de Gex and the Canton of Geneva known as the Passport to the Big Bang. Poster and programme.

  13. Fremtidens landbrug bliver big business

    DEFF Research Database (Denmark)

    Hansen, Henning Otte

    2016-01-01

    The external conditions and competitive terms facing agriculture are changing, and this will necessitate a development toward "big business", in which farms become even larger, more industrialized and more concentrated. Big business will become a dominant trend in Danish agriculture, but not the only one...

  14. Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities

    NARCIS (Netherlands)

    M.J. Falk (Marni J.); L. Shen (Lishuang); M. Gonzalez (Michael); J. Leipzig (Jeremy); M.T. Lott (Marie T.); A.P.M. Stassen (Alphons P.M.); M.A. Diroma (Maria Angela); D. Navarro-Gomez (Daniel); P. Yeske (Philip); R. Bai (Renkui); R.G. Boles (Richard G.); V. Brilhante (Virginia); D. Ralph (David); J.T. DaRe (Jeana T.); R. Shelton (Robert); S.F. Terry (Sharon); Z. Zhang (Zhe); W.C. Copeland (William C.); M. van Oven (Mannis); H. Prokisch (Holger); D.C. Wallace; M. Attimonelli (Marcella); D. Krotoski (Danuta); S. Zuchner (Stephan); X. Gai (Xiaowu); S. Bale (Sherri); J. Bedoyan (Jirair); D.M. Behar (Doron); P. Bonnen (Penelope); L. Brooks (Lisa); C. Calabrese (Claudia); S. Calvo (Sarah); P.F. Chinnery (Patrick); J. Christodoulou (John); D. Church (Deanna); R. Clima (Rosanna); B.H. Cohen (Bruce H.); R.G.H. Cotton (Richard); I.F.M. de Coo (René); O. Derbenevoa (Olga); J.T. den Dunnen (Johan); D. Dimmock (David); G. Enns (Gregory); G. Gasparre (Giuseppe); A. Goldstein (Amy); I. Gonzalez (Iris); K. Gwinn (Katrina); S. Hahn (Sihoun); R.H. Haas (Richard H.); H. Hakonarson (Hakon); M. Hirano (Michio); D. Kerr (Douglas); D. Li (Dong); M. Lvova (Maria); F. Macrae (Finley); D. Maglott (Donna); E. McCormick (Elizabeth); G. Mitchell (Grant); V.K. Mootha (Vamsi K.); Y. Okazaki (Yasushi); A. Pujol (Aurora); M. Parisi (Melissa); J.C. Perin (Juan Carlos); E.A. Pierce (Eric A.); V. Procaccio (Vincent); S. Rahman (Shamima); H. Reddi (Honey); H. Rehm (Heidi); E. Riggs (Erin); R.J.T. Rodenburg (Richard); Y. Rubinstein (Yaffa); R. Saneto (Russell); M. Santorsola (Mariangela); C. Scharfe (Curt); C. Sheldon (Claire); E.A. Shoubridge (Eric); D. Simone (Domenico); B. Smeets (Bert); J.A.M. Smeitink (Jan); C. Stanley (Christine); A. Suomalainen (Anu); M.A. Tarnopolsky (Mark); I. Thiffault (Isabelle); D.R. Thorburn (David R.); J.V. Hove (Johan Van); L. Wolfe (Lynne); L.-J. Wong (Lee-Jun)

    2015-01-01

    Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires th

  15. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  16. Genome Maps, a new generation genome browser

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-01-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. PMID:23748955

  17. The Rise of Big Data in Neurorehabilitation.

    Science.gov (United States)

    Faroqi-Shah, Yasmeen

    2016-02-01

    In some fields, Big Data has been instrumental in analyzing, predicting, and influencing human behavior. However, Big Data approaches have so far been less central in speech-language pathology. This article introduces the concept of Big Data and provides examples of Big Data initiatives pertaining to adult neurorehabilitation. It also discusses the potential theoretical and clinical contributions that Big Data can make. The article also recognizes some impediments in building and using Big Data for scientific and clinical inquiry.

  18. A Grey Theory Based Approach to Big Data Risk Management Using FMEA

    Directory of Open Access Journals (Sweden)

    Maisa Mendonça Silva

    2016-01-01

    Full Text Available Big data is the term used to denote enormous sets of data that differ from other classic databases in four main ways: (i) huge volume, (ii) high velocity, (iii) much greater variety, and (iv) big value. In general, data are stored in a distributed fashion on computing nodes, as a result of which big data may be more susceptible to attacks by hackers. This paper presents a risk model for big data, which comprises Failure Mode and Effects Analysis (FMEA) and Grey Theory, more precisely grey relational analysis. This approach has several advantages: it provides a structured approach in order to incorporate the impact of big data risk factors; it facilitates the assessment of risk by breaking down the overall risk to big data; and finally its efficient evaluation criteria can help enterprises reduce the risks associated with big data. In order to illustrate the applicability of our proposal in practice, a numerical example, with realistic data based on expert knowledge, was developed. The numerical example analyzes four dimensions, that is, managing identification and access, registering the device and application, managing the infrastructure, and data governance, and 20 failure modes concerning the vulnerabilities of big data. The results show that the most important aspect of risk to big data relates to data governance.
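
    Grey relational analysis, the technique this abstract combines with FMEA, can be sketched as follows. This is a generic textbook-style formulation and not the authors' model: the failure-mode labels and scores are invented for illustration, each mode is assumed to be scored from 1 to 10 on severity, occurrence and detection, the reference series is assumed to be the worst case (all 10s), and the distinguishing coefficient zeta = 0.5 and the equal weights are conventional defaults rather than values from the paper.

```python
def grey_relational_grades(scores, reference, zeta=0.5, weights=None):
    """Return a grey relational grade per series, measured against the reference.

    Coefficient for factor k: (d_min + zeta * d_max) / (d_k + zeta * d_max),
    where d_k = |reference[k] - series[k]| and d_min, d_max are taken over
    every series and factor. The grade is the weighted mean of the coefficients.
    """
    n_factors = len(reference)
    if weights is None:
        weights = [1.0 / n_factors] * n_factors  # assumption: equal weights

    deltas = {
        name: [abs(r - x) for r, x in zip(reference, row)]
        for name, row in scores.items()
    }
    flat = [d for row in deltas.values() for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every series coincides with the reference.
        return {name: 1.0 for name in scores}

    grades = {}
    for name, row in deltas.items():
        coeffs = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
        grades[name] = sum(w * c for w, c in zip(weights, coeffs))
    return grades


# Hypothetical failure modes scored as [severity, occurrence, detection].
failure_modes = {
    "FM1": [9, 6, 7],
    "FM2": [5, 4, 3],
    "FM3": [10, 8, 9],
}
worst_case = [10, 10, 10]  # assumption: closeness to this series means higher risk

grades = grey_relational_grades(failure_modes, worst_case)
for name in sorted(grades, key=grades.get, reverse=True):
    print(name, round(grades[name], 4))
```

    With a worst-case reference, a higher grade marks a failure mode as a higher priority. Some formulations use a best-case reference instead, in which case the ranking is read in the opposite direction.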

  19. BIG DATA AND STATISTICS

    Science.gov (United States)

    Rossell, David

    2016-01-01

    Big Data brings unprecedented power to address scientific, economic and societal issues, but also amplifies the possibility of certain pitfalls. These include using purely data-driven approaches that disregard understanding the phenomenon under study, aiming at a dynamically moving target, ignoring critical data collection issues, summarizing or preprocessing the data inadequately and mistaking noise for signal. We review some success stories and illustrate how statistical principles can help obtain more reliable information from data. We also touch upon current challenges that require active methodological research, such as strategies for efficient computation, integration of heterogeneous data, extending the underlying theory to increasingly complex questions and, perhaps most importantly, training a new generation of scientists to develop and deploy these strategies.

  20. Big Bounce Genesis

    CERN Document Server

    Li, Changhong; Cheung, Yeuk-Kwan E

    2014-01-01

    We report on the possibility to use dark matter mass and its interaction cross section as a smoking gun signal of the existence of a big bounce at the early stage in the evolution of our currently observed universe. A model independent study of dark matter production in the contraction and expansion phases of the bounce universe reveals a new venue for achieving the observed relic abundance in which a significantly smaller amount of dark matter--compared to the standard cosmology--is produced and survives until today, diluted only by the cosmic expansion since the radiation dominated era. Once DM mass and its interaction strength with ordinary matter are determined by experiments, this alternative route becomes a signature of the bounce universe scenario.

  1. Big Hero 6

    Institute of Scientific and Technical Information of China (English)

    2015-01-01

    See how Big Hero 6 turns ordinary people into superheroes who save the city! Hiro Hamada, 14, lives in the future city of San Fransokyo. He has a robot friend, Baymax. Baymax is big and soft. His job is to nurse sick people. One day, a bad man wants to take control of San Fransokyo. Hiro hopes to save the city with Baymax. But Baymax is just a nursing robot. This is not a problem for Hiro, however. He knows a lot about robots. He makes a suit of armor for Baymax and turns him into a super robot!

  2. The Last Big Bang

    Energy Technology Data Exchange (ETDEWEB)

    McGuire, Austin D. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]; Meade, Roger Allen [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]

    2016-09-13

    As one of the very few people in the world to give the “go/no go” decision to detonate a nuclear device, Austin “Mac” McGuire holds a very special place in the history of both the Los Alamos National Laboratory and the world. As Commander of Joint Task Force Unit 8.1.1, on Christmas Island in the spring and summer of 1962, Mac directed the Los Alamos data collection efforts for twelve of the last atmospheric nuclear detonations conducted by the United States. Since data collection was at the heart of nuclear weapon testing, it fell to Mac to make the ultimate decision to detonate each test device. He calls his experience THE LAST BIG BANG, since these tests, part of Operation Dominic, were characterized by the dramatic displays of the heat, light, and sounds unique to atmospheric nuclear detonations – never, perhaps, to be witnessed again.

  3. Avoiding a Big Catastrophe

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Before last October, the South China tiger had almost slipped into mythical status as it had been absent for so long from the public eye. In the previous 20-plus years, these tigers could not be found in the wild in China and those in captivity numbered only around 60. The species, a direct descendant of the earliest tigers thought to have originated in China 2 million years ago, is functionally extinct, according to experts. The big cat’s return to the media spotlight was completely unexpected. On October 12, 2007, a digital picture, showing a wild South China tiger

  4. Training facilitators and supervisors

    DEFF Research Database (Denmark)

    Kjær, Louise Binow; O Connor, Maja; Krogh, Kristian;

    At the Master’s program in Medicine at Aarhus University, Denmark, we have developed a faculty development program for facilitators and supervisors in 4 progressing student modules in communication, cooperation, and leadership. 1) A course for module 1 and 3 facilitators inspired by the apprentic...

  5. BIG: a large-scale data integration tool for renal physiology.

    Science.gov (United States)

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

    2016-10-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.

  6. The Storyboard's Big Picture

    Science.gov (United States)

    Malloy, Cheryl A.; Cooley, William

    2003-01-01

    At Science Applications International Corporation (SAIC), Cape Canaveral Office, we're using a project management tool that facilitates team communication, keeps our project team focused, streamlines work and identifies potential issues. What did it cost us to install the tool? Almost nothing.

  7. BigData as a Driver for Capacity Building in Astrophysics

    Science.gov (United States)

    Shastri, Prajval

    2015-08-01

    Exciting public interest in astrophysics acquires new significance in the era of Big Data. Since Big Data involves advanced technologies of both software and hardware, astrophysics with Big Data has the potential to inspire young minds with diverse inclinations - i.e., not just those attracted to physics but also those pursuing engineering careers. Digital technologies have become steadily cheaper, which can enable expansion of the Big Data user pool considerably, especially to communities that may not yet be in the astrophysics mainstream, but have high potential because of access to these technologies. For success, however, capacity building at the early stages becomes key. The development of on-line pedagogical resources in astrophysics, astrostatistics, data-mining and data visualisation that are designed around the big facilities of the future can be an important effort that drives such capacity building, especially if facilitated by the IAU.

  8. Big Bang of Massenergy and Negative Big Bang of Spacetime

    Science.gov (United States)

    Cao, Dayong

    2017-01-01

    There is a balance between the Big Bang of Massenergy and the Negative Big Bang of Spacetime in the universe. Some scientists have also considered that there is an anti-Big Bang that could produce antimatter. The paper supposes that there is a structural balance between the Einstein field equation and a negative Einstein field equation, a balance between massenergy structure and spacetime structure, a balance between the energy of the nucleus of stellar matter and the dark energy of the nucleus of dark matter-dark energy, and a balance between the particle and the wave, that is, a balance system between massenergy (particle) and spacetime (wave). This should explain the problems of the Big Bang. http://meetings.aps.org/Meeting/APR16/Session/M13.8

  9. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets. Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit

  10. Big Data: present and future

    Directory of Open Access Journals (Sweden)

    Mircea Raducu TRIFU

    2014-05-01

    Full Text Available The paper explains the importance of the Big Data concept, a concept that even now, after years of development, is for most companies just a cool keyword. The paper also describes the current level of big data development, the things it can do, and the things that can be done in the near future. The paper focuses on explaining to nontechnical specialists and to technical specialists outside the database field what big data basically is, and presents the three most important V's, as well as the new ones, the most important solutions used by companies like Google or Amazon, and some interesting perceptions of the subject.

  11. The challenges of big data.

    Science.gov (United States)

    Mardis, Elaine R

    2016-05-01

    The largely untapped potential of big data analytics is a feeding frenzy that has been fueled by the production of many next-generation-sequencing-based data sets that are seeking to answer long-held questions about the biology of human diseases. Although these approaches are likely to be a powerful means of revealing new biological insights, there are a number of substantial challenges that currently hamper efforts to harness the power of big data. This Editorial outlines several such challenges as a means of illustrating that the path to big data revelations is paved with perils that the scientific community must overcome to pursue this important quest.

  12. Big Data: present and future

    OpenAIRE

    Mircea Raducu TRIFU; Mihaela Laura IVAN

    2014-01-01

    The paper explains the importance of the Big Data concept, a concept that even now, after years of development, is for most companies just a cool keyword. The paper also describes the current level of big data development, the things it can do, and the things that can be done in the near future. The paper focuses on explaining to nontechnical specialists and to technical specialists outside the database field what big data basically is, and presents the three most important V's, as well as the new ...

  13. Big Data Mining: Tools & Algorithms

    Directory of Open Access Journals (Sweden)

    Adeel Shiraz Hashmi

    2016-03-01

    Full Text Available We are now in the Big Data era, and there is a growing demand for tools that can process and analyze such data. Big data analytics deals with extracting valuable information from complex data that can’t be handled by traditional data mining tools. This paper surveys the available tools that can handle large volumes of data as well as evolving data streams. The data mining tools and algorithms that can handle big data are also summarized, and one of the tools is used to mine large datasets with distributed algorithms.

  14. The challenges of big data

    Science.gov (United States)

    2016-01-01

    ABSTRACT The largely untapped potential of big data analytics is a feeding frenzy that has been fueled by the production of many next-generation-sequencing-based data sets that are seeking to answer long-held questions about the biology of human diseases. Although these approaches are likely to be a powerful means of revealing new biological insights, there are a number of substantial challenges that currently hamper efforts to harness the power of big data. This Editorial outlines several such challenges as a means of illustrating that the path to big data revelations is paved with perils that the scientific community must overcome to pursue this important quest. PMID:27147249

  15. Big Data is invading big places as CERN

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Big Data technologies are becoming more popular with the constant growth of data generation in different fields such as social networks, the Internet of Things and laboratories like CERN. How is CERN making use of such technologies? How is machine learning applied at CERN with Big Data technologies? How much data do we move, and how is it analyzed? All these questions will be answered during the talk.

  16. Boosting Big National Lab Data

    Energy Technology Data Exchange (ETDEWEB)

    Kleese van Dam, Kerstin [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2013-02-21

    Introduction: Big data. Love it or hate it, solving the world’s most intractable problems requires the ability to make sense of huge and complex sets of data and do it quickly. Speeding up the process – from hours to minutes or from weeks to days – is key to our success. One major source of such big data is physical experiments. As many will know, these physical experiments are commonly used to solve challenges in fields such as energy security, manufacturing, medicine, pharmacology, environmental protection and national security. Experiments use different instruments and sensor types to research, for example, the validity of new drugs, the base cause of diseases, more efficient energy sources, new materials for everyday goods, effective methods for environmental cleanup, and the optimal ingredient composition for chocolate, or to determine how to preserve valuable antiques. This is done by experimentally determining the structure, properties and processes that govern biological systems, chemical processes and materials. The speed and quality at which we can acquire new insights from experiments directly influence the rate of scientific progress, industrial innovation and competitiveness. And gaining new groundbreaking insights, faster, is key to the economic success of our nations. Recent years have seen incredible advances in sensor technologies, from house-sized detector systems in large experiments such as the Large Hadron Collider and the ‘Eye of Gaia’ billion pixel camera detector to high throughput genome sequencing. These developments have led to an exponential increase in data volumes, rates and variety produced by instruments used for experimental work. This increase is coinciding with a need to analyze the experimental results at the time they are collected. This speed is required to optimize the data taking and quality, and also to enable new adaptive experiments, where the sample is manipulated as it is observed, e.g. a substance is injected into a

  17. The Big Chills

    Science.gov (United States)

    Bond, G. C.; Dwyer, G. S.; Bauch, H. A.

    2002-12-01

    At the end of the last glacial, the Earth's climate system abruptly shifted into the Younger Dryas, a 1500-year long cold snap known in the popular media as the Big Chill. Following an abrupt warming ending the Younger Dryas about 11,600 years ago, the climate system has remained in an interglacial state, thought to have been relatively stable and devoid, with possibly one or two exceptions, of abrupt climate change. A growing amount of evidence suggests that this benign view of interglacial climate is incorrect. High resolution records of North Atlantic ice rafted sediment, now regarded as evidence of extreme multiyear sea ice drift, reveal abrupt shifts on centennial and millennial time scales. These have been traced from the end of the Younger Dryas to the present, revealing evidence of significant climate variability through all of the last two millennia. Correlatives of these events have been found in drift ice records from the Arctic's Laptev Sea, in the isotopic composition of North Grip ice, and in dissolved K from the GISP2 ice core, attesting to their regional extent and imprint in proxies of very different origins. Measurements of Mg/Ca ratios in planktic foraminifera over the last two millennia in the eastern North Atlantic demonstrate that increases in drifting multiyear sea ice were accompanied by abrupt decreases in sea surface temperatures, especially during the Little Ice Age. Estimated rates of temperature change are on the order of two degrees centigrade, more than thirty percent of the regional glacial to interglacial change, within a few decades. When compared at the same resolution, these interglacial variations are as abrupt as the last glacial's Dansgaard-Oeschger cycles. The interglacial abrupt changes are especially striking because they occurred within the core of the warm North Atlantic Current. The changes may have been triggered by variations in solar irradiance, but if so their large magnitude and regional extent requires amplifying

  18. Facilitating Understandings of Geometry.

    Science.gov (United States)

    Pappas, Christine C.; Bush, Sara

    1989-01-01

    Illustrates some learning encounters for facilitating first graders' understanding of geometry. Describes some of children's approaches using Cuisenaire rods and teacher's intervening. Presents six problems involving various combinations of Cuisenaire rods and cubes. (YP)

  19. Facilitating Knowledge Sharing

    DEFF Research Database (Denmark)

    Holdt Christensen, Peter

    Abstract This paper argues that knowledge sharing can be conceptualized as different situations of exchange in which individuals relate to each other in different ways, involving different rules, norms and traditions of reciprocity regulating the exchange. The main challenge for facilitating...... and the intermediaries regulating the exchange, and facilitating knowledge sharing should therefore be viewed as a continuum of practices under the influence of opportunistic behaviour, obedience or organizational citizenship behaviour. Keywords: Knowledge sharing, motivation, organizational settings, situations...

  20. Multi-OMICs and Genome Editing Perspectives on Liver Cancer Signaling Networks

    Science.gov (United States)

    Lin, Shengda; Yin, Yi A.; Jiang, Xiaoqian; Sahni, Nidhi; Yi, Song

    2016-01-01

    The advent of the human genome sequence and the resulting ~20,000 genes provide a crucial framework for a transition from traditional biology to an integrative “OMICs” arena (Lander et al., 2001; Venter et al., 2001; Kitano, 2002). This brings in a revolution for cancer research, which now enters a big data era. In the past decade, with the facilitation by next-generation sequencing, there have been a huge number of large-scale sequencing efforts, such as The Cancer Genome Atlas (TCGA), the HapMap, and the 1000 genomes project. As a result, a deluge of genomic information becomes available from patients stricken by a variety of cancer types. The list of cancer-associated genes is ever expanding. New discoveries are made on how frequent and highly penetrant mutations, such as those in the telomerase reverse transcriptase (TERT) and TP53, function in cancer initiation, progression, and metastasis. Most genes with relatively frequent but weakly penetrant cancer mutations still remain to be characterized. In addition, genes that harbor rare but highly penetrant cancer-associated mutations continue to emerge. Here, we review recent advances related to cancer genomics, proteomics, and systems biology and suggest new perspectives in targeted therapy and precision medicine. PMID:27403431

  1. Multi-OMICs and Genome Editing Perspectives on Liver Cancer Signaling Networks

    Directory of Open Access Journals (Sweden)

    Shengda Lin

    2016-01-01

    Full Text Available The advent of the human genome sequence and the resulting ~20,000 genes provide a crucial framework for a transition from traditional biology to an integrative “OMICs” arena (Lander et al., 2001; Venter et al., 2001; Kitano, 2002). This brings in a revolution for cancer research, which now enters a big data era. In the past decade, with the facilitation by next-generation sequencing, there have been a huge number of large-scale sequencing efforts, such as The Cancer Genome Atlas (TCGA), the HapMap, and the 1000 genomes project. As a result, a deluge of genomic information becomes available from patients stricken by a variety of cancer types. The list of cancer-associated genes is ever expanding. New discoveries are made on how frequent and highly penetrant mutations, such as those in the telomerase reverse transcriptase (TERT) and TP53, function in cancer initiation, progression, and metastasis. Most genes with relatively frequent but weakly penetrant cancer mutations still remain to be characterized. In addition, genes that harbor rare but highly penetrant cancer-associated mutations continue to emerge. Here, we review recent advances related to cancer genomics, proteomics, and systems biology and suggest new perspectives in targeted therapy and precision medicine.

  2. Genome evolution of Oryza

    Directory of Open Access Journals (Sweden)

    Tieyan Liu

    2014-01-01

    Full Text Available The genus Oryza is composed of approximately 24 species. Wild species of Oryza contain a largely untapped resource of agronomically important genes. As an increasing number of genomes of wild rice species have been or will be sequenced, Oryza is becoming a model system for plant comparative, functional and evolutionary genomics studies. Comparative analyses of large genomic regions and whole-genome sequences have revealed molecular mechanisms involved in genome size variation, gene movement, genome evolution of polyploids, transition of euchromatin to heterochromatin and centromere evolution in the genus Oryza. Transposon activity and removal of transposable elements by unequal recombination or illegitimate recombination are two important factors contributing to expansion or contraction of Oryza genomes. Double-strand break repair mediated gene movement, especially non-homologous end joining, is an important source of non-colinear genes. Transition of euchromatin to heterochromatin is accompanied by transposable element amplification, segmental and tandem duplication of genic segments, and acquisition of heterochromatic genes from other genomic locations. Comparative analyses of multiple genomes dramatically improve the precision and sensitivity of evolutionary inference beyond what single-genome analyses can provide. Further investigations on the impact of structural variation, lineage-specific genes and evolution of agriculturally important genes on phenotype diversity and adaptation in the genus Oryza should facilitate molecular breeding and genetic improvement of rice.

  3. Big Data and Perioperative Nursing.

    Science.gov (United States)

    Westra, Bonnie L; Peterson, Jessica J

    2016-10-01

    Big data are large volumes of digital data that can be collected from disparate sources and are challenging to analyze. These data are often described with the five "Vs": volume, velocity, variety, veracity, and value. Perioperative nurses contribute to big data through documentation in the electronic health record during routine surgical care, and these data have implications for clinical decision making, administrative decisions, quality improvement, and big data science. This article explores methods to improve the quality of perioperative nursing data and provides examples of how these data can be combined with broader nursing data for quality improvement. We also discuss a national action plan for nursing knowledge and big data science and how perioperative nurses can engage in collaborative actions to transform health care. Standardized perioperative nursing data has the potential to affect care far beyond the original patient.

  4. The BigBOSS Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schlegel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna

    2011-01-01

    BigBOSS will obtain observational constraints that will bear on three of the four 'science frontier' questions identified by the Astro2010 Cosmology and Fundamental Physics Panel of the Decadal Survey: Why is the universe accelerating; what is dark matter and what are the properties of neutrinos? Indeed, the BigBOSS project was recommended for substantial immediate R&D support in the PASAG report. The second highest ground-based priority from the Astro2010 Decadal Survey was the creation of a funding line within the NSF to support a 'Mid-Scale Innovations' program, and it used BigBOSS as a 'compelling' example for support. This choice was the result of the Decadal Survey's Program Prioritization panels reviewing 29 mid-scale projects and recommending BigBOSS 'very highly'.

  5. Big Bang Nucleosynthesis: 2015

    CERN Document Server

    Cyburt, Richard H; Olive, Keith A; Yeh, Tsung-Han

    2015-01-01

    Big-bang nucleosynthesis (BBN) describes the production of the lightest nuclides via a dynamic interplay among the four fundamental forces during the first seconds of cosmic time. We briefly overview the essentials of this physics, and present new calculations of light element abundances through 6Li and 7Li, with updated nuclear reactions and uncertainties including those in the neutron lifetime. We provide fits to these results as a function of baryon density and of the number of neutrino flavors, N_nu. We review recent developments in BBN, particularly new, precision Planck cosmic microwave background (CMB) measurements that now probe the baryon density, helium content, and the effective number of degrees of freedom, n_eff. These measurements allow for a tight test of BBN and of cosmology using CMB data alone. Our likelihood analysis convolves the 2015 Planck data chains with our BBN output and observational data. Adding astronomical measurements of light elements strengthens the power of BBN. We include a ...

  6. Big bang darkleosynthesis

    Directory of Open Access Journals (Sweden)

    Gordan Krnjaic

    2015-12-01

    Full Text Available In a popular class of models, dark matter comprises an asymmetric population of composite particles with short range interactions arising from a confined nonabelian gauge group. We show that coupling this sector to a well-motivated light mediator particle yields efficient darkleosynthesis, a dark-sector version of big-bang nucleosynthesis (BBN), in generic regions of parameter space. Dark matter self-interaction bounds typically require the confinement scale to be above ΛQCD, which generically yields large (≫MeV/dark-nucleon) binding energies. These bounds further suggest the mediator is relatively weakly coupled, so repulsive forces between dark-sector nuclei are much weaker than Coulomb repulsion between standard-model nuclei, which results in an exponential barrier-tunneling enhancement over standard BBN. Thus, darklei are easier to make and harder to break than visible species with comparable mass numbers. This process can efficiently yield a dominant population of states with masses significantly greater than the confinement scale and, in contrast to dark matter that is a fundamental particle, may allow the dominant form of dark matter to have high spin (S≫3/2), whose discovery would be smoking gun evidence for dark nuclei.

  7. Big Data Comes to School

    OpenAIRE

    Bill Cope; Mary Kalantzis

    2016-01-01

    The prospect of “big data” at once evokes optimistic views of an information-rich future and concerns about surveillance that adversely impacts our personal and private lives. This overview article explores the implications of big data in education, focusing by way of example on data generated by student writing. We have chosen writing because it presents particular complexities, highlighting the range of processes for collecting and interpreting evidence of learning in the era of computer-me...

  8. Big Data for Precision Medicine

    OpenAIRE

    Daniel Richard Leff; Guang-Zhong Yang

    2015-01-01

    This article focuses on the potential impact of big data analysis to improve health, prevent and detect disease at an earlier stage, and personalize interventions. The role that big data analytics may have in interrogating the patient electronic health record toward improved clinical decision support is discussed. We examine developments in pharmacogenetics that have increased our appreciation of the reasons why patients respond differently to chemotherapy. We also assess the expansion of onl...

  9. The role of big laboratories

    CERN Document Server

    Heuer, Rolf-Dieter

    2013-01-01

    This paper presents the role of big laboratories in their function as research infrastructures. Starting from the general definition and features of big laboratories, the paper goes on to present the key ingredients and issues, based on scientific excellence, for the successful realization of large-scale science projects at such facilities. The paper concludes by taking the example of scientific research in the field of particle physics and describing the structures and methods required to be implemented for the way forward.

  10. Challenges of Big Data Analysis.

    Science.gov (United States)

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

  11. Powering Big Data for Nursing Through Partnership.

    Science.gov (United States)

    Harper, Ellen M; Parkerson, Sara

    2015-01-01

    The Big Data Principles Workgroup (Workgroup) was established with support of the Healthcare Information and Management Systems Society. Building on the Triple Aim challenge, the Workgroup sought to identify Big Data principles, barriers, and challenges to nurse-sensitive data inclusion into Big Data sets. The product of this pioneering partnership Workgroup was the "Guiding Principles for Big Data in Nursing-Using Big Data to Improve the Quality of Care and Outcomes."

  12. Genomic signal processing

    CERN Document Server

    Shmulevich, Ilya

    2007-01-01

    Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathema

  13. Small country, big business?

    DEFF Research Database (Denmark)

    Martens, Kerstin; Starke, Peter

    2008-01-01

    of education shows that this is not necessarily the case, at least not in the medium-term: New Zealand's government rather appears to be an active facilitator of the liberalisation process in education. We review its recent move towards treating education as an international export good and present data...... the example of New Zealand as a case study for the internationalisation of education services, the study depicts the way the government is involved in this process. Commodification of sectors traditionally subject to domestic public policy is often associated with a less interventionist state, but our example...

  14. Consistent wind Facilitates Vection

    Directory of Open Access Journals (Sweden)

    Masaki Ogawa

    2011-10-01

    Full Text Available We examined whether a consistent haptic cue suggesting forward self-motion facilitated vection. We used a fan with no blades (Dyson, AM01) providing a wind of constant strength and direction (wind speed was 6.37 m/s) to the subjects' faces with the visual stimuli visible through the fan. We used an optic flow of expansion or contraction created by positioning 16,000 dots at random inside a simulated cube (length 20 m), and moving the observer's viewpoint to simulate forward or backward self-motion of 16 m/s. We tested three conditions for fan operation, which were normal operation, normal operation with the fan reversed (i.e., no wind), and no operation (no wind and no sound). Vection was facilitated by the wind (shorter latency, longer duration and larger magnitude values) with the expansion stimuli. The fan noise did not facilitate vection. The wind neither facilitated nor inhibited vection with the contraction stimuli, perhaps because a headwind is not consistent with backward self-motion. We speculate that the consistency between multiple modalities is a key factor in facilitating vection.

  15. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    Science.gov (United States)

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so.

  16. Array-based comparative genomic hybridization facilitates identification of breakpoints of a novel der(1)t(1;18)(p36.3;q23)dn in a child presenting with mental retardation.

    Science.gov (United States)

    Lennon, P A; Cooper, M L; Curtis, M A; Lim, C; Ou, Z; Patel, A; Cheung, S W; Bacino, C A

    2006-06-01

    Monosomy of distal 1p36 represents the most common terminal deletion in humans and results in one of the most frequently diagnosed mental retardation syndromes. This deletion is considered a contiguous gene deletion syndrome, and has been shown to vary in deletion sizes that contribute to the spectrum of phenotypic anomalies seen in patients with monosomy 1p36. We report on an 8-year-old female with characteristics of the monosomy 1p36 syndrome who demonstrated a novel der(1)t(1;18)(p36.3;q23). Initial G-banded karyotype analysis revealed a deleted chromosome 1, with a breakpoint within 1p36.3. Subsequent FISH and array-based comparative genomic hybridization not only confirmed and partially characterized the deletion of chromosome 1p36.3, but also uncovered distal trisomy for 18q23. In this patient, the duplicated 18q23 is translocated onto the deleted 1p36.3 region, suggesting telomere capture. Molecular characterization of this novel der(1)t(1;18)(p36.3;q23), guided by our clinical array-comparative genomic hybridization, demonstrated a 3.2 Mb terminal deletion of chromosome 1p36.3 and a 200 kb duplication of 18q23 onto the deleted 1p36.3, presumably stabilizing the deleted chromosome 1. DNA sequence analysis around the breakpoints demonstrated no homology, and therefore this telomere capture of distal 18q is apparently the result of a non-homologous recombination. Partial trisomy for 18q23 has not been previously reported. The importance of mapping the breakpoints of all balanced and unbalanced translocations found in the clinical laboratory, when phenotypic abnormalities are found, is discussed.

  17. Frontier Applications of Big Data and IoT in Urban Governance Abroad: Operationalizing the Facilitation of Public Value

    Institute of Scientific and Technical Information of China (English)

    李一男

    2015-01-01

    The continuing growth of cities brings greater complexity and conflicting value demands from different social actors, posing new challenges for urban governance and public decision-making. By drawing on big data and Internet of Things (IoT) technologies, the rise of the smart city has become an effective means of improving urban governance. This paper introduces and reviews three frontier areas of urban governance research and two application cases of big data and IoT technologies. On this basis, it offers specific suggestions for developing smart cities in China in order to raise the level of urban governance and advance a new type of urbanization. With big data and IoT technologies, governments are able to identify core public values and mediate between conflicting value demands, thereby operationalizing theoretical models and facilitating the creation of public value in concrete implementation.

  18. [Landscape and ecological genomics].

    Science.gov (United States)

    Tetushkin, E Ia

    2013-10-01

    Landscape genomics is the modern version of landscape genetics, a discipline that arose approximately 10 years ago as a combination of population genetics, landscape ecology, and spatial statistics. It studies the effects of environmental variables on gene flow and other microevolutionary processes that determine genetic connectivity and variations in populations. In contrast to population genetics, it operates at the level of individual specimens rather than at the level of population samples. Another important difference between landscape genetics and genomics and population genetics is that, in the former, the analysis of gene flow and local adaptations takes quantitative account of landforms and features of the matrix, i.e., hostile spaces that separate species habitats. Landscape genomics is a part of population ecogenomics, which, along with community genomics, is a major part of ecological genomics. One of the principal purposes of landscape genomics is the identification and differentiation of various genome-wide and locus-specific effects. The approaches and computation tools developed for combined analysis of genomic and landscape variables make it possible to detect adaptation-related genome fragments, which facilitates the planning of conservation efforts and the prediction of species' fate in response to expected changes in the environment.

  19. Between two fern genomes.

    Science.gov (United States)

    Sessa, Emily B; Banks, Jo Ann; Barker, Michael S; Der, Joshua P; Duffy, Aaron M; Graham, Sean W; Hasebe, Mitsuyasu; Langdale, Jane; Li, Fay-Wei; Marchant, D Blaine; Pryer, Kathleen M; Rothfels, Carl J; Roux, Stanley J; Salmi, Mari L; Sigel, Erin M; Soltis, Douglas E; Soltis, Pamela S; Stevenson, Dennis W; Wolf, Paul G

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves.

  20. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  1. Programs | Office of Cancer Genomics

    Science.gov (United States)

    OCG facilitates cancer genomics research through a series of highly-focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic. Below are OCG’s current, completed, and initiated programs:

  2. Big data for bipolar disorder.

    Science.gov (United States)

    Monteith, Scott; Glenn, Tasha; Geddes, John; Whybrow, Peter C; Bauer, Michael

    2016-12-01

    The delivery of psychiatric care is changing with a new emphasis on integrated care, preventative measures, population health, and the biological basis of disease. Fundamental to this transformation are big data and advances in the ability to analyze these data. The impact of big data on the routine treatment of bipolar disorder today and in the near future is discussed, with examples that relate to health policy, the discovery of new associations, and the study of rare events. The primary sources of big data today are electronic medical records (EMR), claims, and registry data from providers and payers. In the near future, data created by patients from active monitoring, passive monitoring of Internet and smartphone activities, and from sensors may be integrated with the EMR. Diverse data sources from outside of medicine, such as government financial data, will be linked for research. Over the long term, genetic and imaging data will be integrated with the EMR, and there will be more emphasis on predictive models. Many technical challenges remain when analyzing big data that relates to size, heterogeneity, complexity, and unstructured text data in the EMR. Human judgement and subject matter expertise are critical parts of big data analysis, and the active participation of psychiatrists is needed throughout the analytical process.

  3. Considerations on Geospatial Big Data

    Science.gov (United States)

    LIU, Zhen; GUO, Huadong; WANG, Changlin

    2016-11-01

    Geospatial data, as a significant portion of big data, has recently gained the full attention of researchers. However, few researchers focus on the evolution of geospatial data and its scientific research methodologies. When entering into the big data era, fully understanding the changing research paradigm associated with geospatial data will definitely benefit future research on big data. In this paper, we look deep into these issues by examining the components and features of geospatial big data, reviewing relevant scientific research methodologies, and examining the evolving pattern of geospatial data in the scope of the four ‘science paradigms’. This paper proposes that geospatial big data has significantly shifted the scientific research methodology from ‘hypothesis to data’ to ‘data to questions’ and it is important to explore the generality of growing geospatial data ‘from bottom to top’. Particularly, four research areas that mostly reflect data-driven geospatial research are proposed: spatial correlation, spatial analytics, spatial visualization, and scientific knowledge discovery. It is also pointed out that privacy and quality issues of geospatial data may require more attention in the future. Also, some challenges and thoughts are raised for future discussion.

  4. Official statistics and Big Data

    Directory of Open Access Journals (Sweden)

    Peter Struijs

    2014-07-01

    Full Text Available The rise of Big Data changes the context in which organisations producing official statistics operate. Big Data provides opportunities, but in order to make optimal use of Big Data, a number of challenges have to be addressed. This stimulates increased collaboration between National Statistical Institutes, Big Data holders, businesses and universities. In time, this may lead to a shift in the role of statistical institutes in the provision of high-quality and impartial statistical information to society. In this paper, the changes in context, the opportunities, the challenges and the way to collaborate are addressed. The collaboration between the various stakeholders will involve each partner building on and contributing different strengths. For national statistical offices, traditional strengths include, on the one hand, the ability to collect data and combine data sources with statistical products and, on the other hand, their focus on quality, transparency and sound methodology. In the Big Data era of competing and multiplying data sources, they continue to have a unique knowledge of official statistical production methods. And their impartiality and respect for privacy as enshrined in law uniquely position them as a trusted third party. Based on this, they may advise on the quality and validity of information of various sources. By thus positioning themselves, they will be able to play their role as key information providers in a changing society.

  5. Facilitation of Adult Development

    Science.gov (United States)

    Boydell, Tom

    2016-01-01

    Taking an autobiographical approach, I tell the story of my experiences facilitating adult development, in a polytechnic and as a management consultant. I relate these to a developmental framework of Modes of Being and Learning that I created and elaborated with colleagues. I connect this picture with a number of related models, theories,…

  6. From Teaching to Facilitation

    DEFF Research Database (Denmark)

    de Graaff, Erik

    2013-01-01

    A shift from teaching to learning is characteristic of the introduction of Problem Based Learning (PBL) in an existing school. As a consequence the teaching staff has to be trained in skills like facilitating group work and writing cases. Most importantly a change in thinking about teaching...

  7. Facilitation skills for trainers

    Directory of Open Access Journals (Sweden)

    F. Cilliers

    2000-06-01

    Full Text Available This research aims to develop the facilitation skills of trainers. Facilitation is defined from the Person-Centered approach as providing an opportunity for the trainee to experience personal growth and learning. A facilitation skills workshop was presented to 40 trainers, focusing on enhancing self-actualisation, its intra- and interpersonal characteristics, and attending and responding behaviour. Measurement with the Personal Orientation Inventory and Carkhuff scales indicates enhanced cognitive, affective and conative sensitivity and interpersonal skills. A post-interview indicates that the trainers experienced empowerment in providing opportunities for growth amongst trainees, in all kinds of training situations. Recommendations are made to enhance facilitation development amongst trainers.

  8. Facilitation skills for nurses

    Directory of Open Access Journals (Sweden)

    F Cilliers

    2000-09-01

    Full Text Available Using the person-centered approach, facilitation in this study was conceptualised as providing opportunities for personal growth in the patient, and operationalised in a skills workshop for 40 nurses from different hospitals in Gauteng. The first objective was to evaluate the workshop and the second to ascertain its effect on the participants' experienced performance. A combined quantitative and qualitative research design was used. The quantitative measurement (Personal Orientation Inventory, Carkhuff scales) indicated that the workshop stimulated self-actualisation in terms of intrapersonal awareness, and the interpersonal skills of respect, realness, concreteness, empathy, as well as in terms of attending and responding behaviour. The qualitative measurement (a semi-structured interview) indicated that the participants were able to empower patients to find their own answers to difficult personal questions. The alternative hypothesis was accepted, namely that this workshop in facilitation skills significantly enhanced the intra- and interpersonal characteristics associated with self-actualisation and the facilitation of growth in patients. The findings highlighted the difference between the two roles of instructor and facilitator, and recommendations to this effect were formulated.

  9. Facilitating leadership team communication

    OpenAIRE

    Hedman, Eerika

    2015-01-01

    The purpose of this study is to understand and describe how to facilitate competent communication in leadership teamwork. Grounded in the premises of social constructionism and informed by such theoretical frameworks as coordinated management of meaning theory (CMM), dialogic organization development (OD), systemic-constructionist leadership, communication competence, and reflexivity, this study seeks to produce further insights into understanding leadership team communicati...

  10. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background: Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that lets microarray researchers directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers time in searching multiple GO databases for functional information. Results: We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion: Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via the AgBase website and

  11. CLOUD COMPUTING WITH BIG DATA: A REVIEW

    OpenAIRE

    Anjali; Er. Amandeep Kaur; Mrs. Shakshi

    2016-01-01

    Big data is a collection of huge quantities of data. The term also refers to the process of examining large amounts of data. Big data and cloud computing are hot issues in information technology, and big data is one of the main problems today. Researchers are focusing on how to handle huge amounts of data with cloud computing and how to achieve strong security for big data in cloud computing. To handle the big data problem, the Hadoop framework is used, in which data is fragmented and processed in parallel....
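
    The fragment-then-process-in-parallel idea mentioned in this abstract can be illustrated with a minimal sketch. It is not Hadoop and does not use any Hadoop API: local threads stand in for cluster nodes, and the function names (`split_into_fragments`, `map_fragment`, `word_count`) are hypothetical, chosen only for this illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(records, n_fragments):
    """Split a list of records into at most n_fragments contiguous pieces."""
    size = max(1, -(-len(records) // n_fragments))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_fragment(fragment):
    """Map step: count the words in one fragment independently of the others."""
    counts = Counter()
    for line in fragment:
        counts.update(line.lower().split())
    return counts


def word_count(records, n_fragments=4):
    """Process fragments in parallel, then merge the partial counts (reduce step)."""
    fragments = split_into_fragments(records, n_fragments)
    with ThreadPoolExecutor() as pool:  # threads are a stand-in for worker nodes
        partials = list(pool.map(map_fragment, fragments))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

    Because each fragment is counted without reference to the others, the merged result does not depend on how many fragments are used; that independence is what makes this style of processing easy to distribute.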

  12. Dianfen ("Fans") Big Bang!

    Institute of Scientific and Technical Information of China (English)

    2012-01-01

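    The Big Bang, also called the "great explosion", refers to the process of continuous expansion that began, at the birth of the universe, from a primordial state of extremely high density and temperature. In other words, starting from the Big Bang, our present universe slowly took shape. OK, starting with this issue, "Shaodian" will set off a Big Bang on Weibo: a "dianfen" (fan) Big Bang! How exactly will it go off? Having seen the layout of this page, you have probably worked out most of it already.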

  13. Multiwavelength astronomy and big data

    Science.gov (United States)

    Mickaelian, A. M.

    2016-09-01

    Two major characteristics of modern astronomy are multiwavelength (MW) studies (from γ-ray to radio) and big data (data acquisition, storage and analysis). Present astronomical databases and archives contain billions of objects observed at various wavelengths, both galactic and extragalactic, and the vast amount of data on them allows new studies and discoveries. Astronomers deal with big numbers. Surveys are the main source for discovery of astronomical objects and accumulation of observational data for further analysis, interpretation, and achieving scientific results. We review the main characteristics of astronomical surveys, compare photographic and digital eras of astronomical studies (including the development of wide-field observations), describe the present state of MW surveys, and discuss Big Data in astronomy and the related topics of Virtual Observatories and Computational Astrophysics. The review includes many numbers and data that can be compared to give an overall understanding of the Universe, cosmic numbers and their relationship to modern computational facilities.

  14. Towards a big crunch dual

    Energy Technology Data Exchange (ETDEWEB)

    Hertog, Thomas; Horowitz, Gary T

    2004-07-01

    We show there exist smooth asymptotically anti-de Sitter initial data which evolve to a big crunch singularity in a low energy supergravity limit of string theory. This opens up the possibility of using the dual conformal field theory to obtain a fully quantum description of the cosmological singularity. A preliminary study of this dual theory suggests that the big crunch is an endpoint of evolution even in the full string theory. We also show that any theory with scalar solitons must have negative energy solutions. The results presented here clarify our earlier work on cosmic censorship violation in N=8 supergravity. (author)

  15. Was There A Big Bang?

    CERN Document Server

    Soberman, Robert K

    2008-01-01

    The big bang hypothesis is widely accepted despite numerous physics conflicts. It rests upon two experimental supports, galactic red shift and the cosmic microwave background. Both are produced by dark matter, shown here to be hydrogen dominated aggregates with a few percent of helium nodules. Scattering from these non-radiating intergalactic masses produce a red shift that normally correlates with distance. Warmed by our galaxy to an Eigenvalue of 2.735 K, drawn near the Earth, these bodies, kept cold by ablation, resonance radiate the Planckian microwave signal. Several tests are proposed that will distinguish between this model and the big bang.

  16. [Big Data- challenges and risks].

    Science.gov (United States)

    Krauß, Manuela; Tóth, Tamás; Hanika, Heinrich; Kozlovszky, Miklós; Dinya, Elek

    2015-12-01

    The term "Big Data" is commonly used to describe the growing mass of information being created recently. New conclusions can be drawn and new services can be developed by connecting, processing and analysing this information. This affects all aspects of life, including health and medicine. The authors review the application areas of Big Data, and present examples from health and other areas. However, effective use of these opportunities has several preconditions: proper infrastructure and a well-defined regulatory environment, with particular emphasis on data protection and privacy. These issues and the current actions toward solving them are also presented.

  17. Do big gods cause anything?

    DEFF Research Database (Denmark)

    Geertz, Armin W.

    2014-01-01

    This is a contribution to a review symposium on Ara Norenzayan's book Big Gods: How Religion Transformed Cooperation and Conflict (Princeton University Press 2013). The book is exciting but problematic with regard to causality, atheism, and stereotypes about hunter-gatherers.

  18. Big society, big data. The radicalisation of the network society

    NARCIS (Netherlands)

    Frissen, V.

    2011-01-01

    During the British election campaign of 2010, David Cameron produced the idea of the ‘Big Society’ as a cornerstone of his political agenda. At the core of the idea is a stronger civil society and local community coupled with a more withdrawn government. Although many commentators have dismissed thi

  19. Little Science to Big Science: Big Scientists to Little Scientists?

    Science.gov (United States)

    Simonton, Dean Keith

    2010-01-01

    This article presents the author's response to Hisham B. Ghassib's essay entitled "Where Does Creativity Fit into a Productivist Industrial Model of Knowledge Production?" Professor Ghassib's (2010) essay presents a provocative portrait of how the little science of the Babylonians, Greeks, and Arabs became the Big Science of the modern industrial…

  20. Facilitating Learning at Conferences

    DEFF Research Database (Denmark)

    Ravn, Ib; Elsborg, Steen

    2011-01-01

    The typical conference consists of a series of PowerPoint presentations that tend to render participants passive. Students of learning have long abandoned the transfer model that underlies such one-way communication. We propose an alternative theory of conferences that sees them as a forum for learning, mutual inspiration and human flourishing. We offer five design principles that specify how conferences may engage participants more and hence increase their learning. In the research-and-development effort reported here, our team collaborated with conference organizers in Denmark to introduce and facilitate a variety of simple learning techniques at thirty one- and two-day conferences of up to 300 participants each. We present ten of these techniques and data evaluating them. We conclude that if conference organizers allocate a fraction of the total conference time to facilitated processes...

  1. Mindfulness for group facilitation

    DEFF Research Database (Denmark)

    Adriansen, Hanne Kirstine; Krohn, Simon

    2014-01-01

    In this paper, we argue that mindfulness techniques can be used for enhancing the outcome of group performance. The word mindfulness has different connotations in the academic literature. Broadly speaking there is ‘mindfulness without meditation’ or ‘Western’ mindfulness which involves active thinking and ‘Eastern’ mindfulness which refers to an open, accepting state of mind, as intended with Buddhist-inspired techniques such as meditation. In this paper, we are interested in the latter type of mindfulness and demonstrate how Eastern mindfulness techniques can be used as a tool for facilitation. A brief introduction to the physiology and philosophy of Eastern mindfulness constitutes the basis for the arguments of the effect of mindfulness techniques. The use of mindfulness techniques for group facilitation is novel as it changes the focus from individuals’ mindfulness practice...

  2. Parkinson’s Brain Disease Prediction Using Big Data Analytics

    Directory of Open Access Journals (Sweden)

    N. Shamli

    2016-06-01

    Full Text Available In healthcare industries, the demand for maintaining large amounts of patient data is steadily growing because a rising population has increased the volume of details about clinical and laboratory tests, imaging, prescriptions and medication. These data can be called “Big Data” because of their size, complexity and diversity. Big data analytics aims at improving patient care and identifying preventive measures proactively, in order to save lives and recommend lifestyle changes for a peaceful and healthier life at low cost. The proposed predictive analytics framework is a combination of Decision Tree, Support Vector Machine and Artificial Neural Network, which is used to gain insights from patients. The Parkinson’s disease voice dataset from the UCI Machine Learning Repository is used as input. The experimental results show that early detection of the disease will facilitate clinical monitoring of elderly people and improve their chances of a longer life span and a better, more peaceful lifestyle.

  3. Program Facilitates Distributed Computing

    Science.gov (United States)

    Hui, Joseph

    1993-01-01

    KNET computer program facilitates distribution of computing between UNIX-compatible local host computer and remote host computer, which may or may not be UNIX-compatible. Capable of automatic remote log-in. User communicates interactively with remote host computer. Data output from remote host computer directed to local screen, to local file, and/or to local process. Conversely, data input from keyboard, local file, or local process directed to remote host computer. Written in ANSI standard C language.

  4. Facilitating Knowledge Sharing

    OpenAIRE

    Holdt Christensen, Peter

    2005-01-01

    Abstract This paper argues that knowledge sharing can be conceptualized as different situations of exchange in which individuals relate to each other in different ways, involving different rules, norms and traditions of reciprocity regulating the exchange. The main challenge for facilitating knowledge sharing is to ensure that the exchange is seen as equitable for the parties involved, and by viewing the problems of knowledge sharing as motivational problems situated in different organization...

  5. Development of self-compressing BLSOM for comprehensive analysis of big sequence data.

    Science.gov (United States)

    Kikuchi, Akihito; Ikemura, Toshimichi; Abe, Takashi

    2015-01-01

    With the remarkable increase in genomic sequence data from various organisms, novel tools are needed for comprehensive analyses of available big sequence data. We previously developed a Batch-Learning Self-Organizing Map (BLSOM), which can cluster genomic fragment sequences according to phylotype solely dependent on oligonucleotide composition and applied to genome and metagenomic studies. BLSOM is suitable for high-performance parallel-computing and can analyze big data simultaneously, but a large-scale BLSOM needs a large computational resource. We have developed Self-Compressing BLSOM (SC-BLSOM) for reduction of computation time, which allows us to carry out comprehensive analysis of big sequence data without the use of high-performance supercomputers. The strategy of SC-BLSOM is to hierarchically construct BLSOMs according to data class, such as phylotype. The first-layer BLSOM was constructed with each of the divided input data pieces that represents the data subclass, such as phylotype division, resulting in compression of the number of data pieces. The second BLSOM was constructed with a total of weight vectors obtained in the first-layer BLSOMs. We compared SC-BLSOM with the conventional BLSOM by analyzing bacterial genome sequences. SC-BLSOM could be constructed faster than BLSOM and cluster the sequences according to phylotype with high accuracy, showing the method's suitability for efficient knowledge discovery from big sequence data.
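
    The abstract says BLSOM clusters genomic fragments using oligonucleotide composition alone. As an illustration of that input feature only (not of BLSOM or SC-BLSOM themselves), the sketch below turns a sequence fragment into a k-mer frequency vector. The function name `oligo_composition`, the default `k = 4`, and the handling of ambiguous bases are assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter
from itertools import product


def oligo_composition(seq, k=4):
    """Return the k-mer frequency vector of seq over the alphabet A, C, G, T.

    The vector has 4**k entries in lexicographic k-mer order (AAAA, AAAC, ...).
    Windows containing any other character (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

    Each fragment becomes a fixed-length numeric vector regardless of its length, so fragments from different genomes can be compared directly; vectors of this kind are what a self-organizing map would take as training input.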

  6. Development of Self-Compressing BLSOM for Comprehensive Analysis of Big Sequence Data

    Directory of Open Access Journals (Sweden)

    Akihito Kikuchi

    2015-01-01

    Full Text Available With the remarkable increase in genomic sequence data from various organisms, novel tools are needed for comprehensive analyses of available big sequence data. We previously developed a Batch-Learning Self-Organizing Map (BLSOM), which can cluster genomic fragment sequences according to phylotype solely dependent on oligonucleotide composition and applied to genome and metagenomic studies. BLSOM is suitable for high-performance parallel-computing and can analyze big data simultaneously, but a large-scale BLSOM needs a large computational resource. We have developed Self-Compressing BLSOM (SC-BLSOM) for reduction of computation time, which allows us to carry out comprehensive analysis of big sequence data without the use of high-performance supercomputers. The strategy of SC-BLSOM is to hierarchically construct BLSOMs according to data class, such as phylotype. The first-layer BLSOM was constructed with each of the divided input data pieces that represents the data subclass, such as phylotype division, resulting in compression of the number of data pieces. The second BLSOM was constructed with a total of weight vectors obtained in the first-layer BLSOMs. We compared SC-BLSOM with the conventional BLSOM by analyzing bacterial genome sequences. SC-BLSOM could be constructed faster than BLSOM and cluster the sequences according to phylotype with high accuracy, showing the method’s suitability for efficient knowledge discovery from big sequence data.

  7. The Problem with Big Data: Operating on Smaller Datasets to Bridge the Implementation Gap

    Directory of Open Access Journals (Sweden)

    Richard Mann

    2016-12-01

    Full Text Available Big datasets have the potential to revolutionize public health. However, there is a mismatch between the political and scientific optimism surrounding big data and the public’s perception of its benefit. We suggest a systematic and concerted emphasis on developing models derived from smaller datasets to illustrate to the public how big data can produce tangible benefits in the long-term. In order to highlight the immediate value of a small data approach, we produced a proof-of-concept model predicting hospital length of stay. The results demonstrate that existing small datasets can be used to create models that generate a reasonable prediction, facilitating healthcare delivery. We propose that greater attention (and funding) needs to be directed toward the utilization of existing information resources in parallel with current efforts to create and exploit ‘big data’.

  8. A draft sequence of the rice (Oryza sativa ssp. indica) genome

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a "whole genome shotgun" approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with a cumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy of higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japonica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, RFLP markers, 94.0% of ESTs and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis of the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also undoubtedly provides a fundamental resource to the international scientific communities to facilitate genomic and genetic studies on rice biology.
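
    The figures quoted in this abstract can be related to one another with some back-of-envelope arithmetic. The sketch below assumes that "coverage" means assembled length divided by genome size and that sequencing redundancy is total trace length divided by assembled length; the abstract does not state how its estimates were made, so the derived values are illustrative only and the variable names are hypothetical.

```python
# Values quoted in the abstract.
ASSEMBLY_MB = 409.76          # non-redundant initial assembly length
COVERAGE = 0.9529             # estimated fraction of the genome covered
ASSEMBLED_TRACE_MB = 1797.4   # total length of traces used in the assembly
ASSEMBLED_TRACES = 3.30e6     # number of traces used in the assembly

# Derived quantities (assumptions: coverage = assembly / genome size,
# redundancy = total trace length / assembly length).
implied_genome_mb = ASSEMBLY_MB / COVERAGE
redundancy = ASSEMBLED_TRACE_MB / ASSEMBLY_MB
mean_trace_bp = ASSEMBLED_TRACE_MB * 1e6 / ASSEMBLED_TRACES

print(f"implied genome size: {implied_genome_mb:.1f} Mb")
print(f"sequencing redundancy: {redundancy:.2f}x")
print(f"mean trace length: {mean_trace_bp:.0f} bp")
```

    Under these assumptions the quoted numbers imply a genome of roughly 430 Mb, a redundancy of about 4.4-fold over the assembled sequence, and a mean trace length of about 545 bp.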

  9. The Uses of Big Data in Cities.

    Science.gov (United States)

    Bettencourt, Luís M A

    2014-03-01

    There is much enthusiasm currently about the possibilities created by new and more extensive sources of data to better understand and manage cities. Here, I explore how big data can be useful in urban planning by formalizing the planning process as a general computational problem. I show that, under general conditions, new sources of data coordinated with urban policy can be applied following fundamental principles of engineering to achieve new solutions to important age-old urban problems. I also show that comprehensive urban planning is computationally intractable (i.e., practically impossible) in large cities, regardless of the amounts of data available. This dilemma between the need for planning and coordination and its impossibility in detail is resolved by the recognition that cities are first and foremost self-organizing social networks embedded in space and enabled by urban infrastructure and services. As such, the primary role of big data in cities is to facilitate information flows and mechanisms of learning and coordination by heterogeneous individuals. However, processes of self-organization in cities, as well as of service improvement and expansion, must rely on general principles that enforce necessary conditions for cities to operate and evolve. Such ideas are the core of a developing scientific theory of cities, which is itself enabled by the growing availability of quantitative data on thousands of cities worldwide, across different geographies and levels of development. These three uses of data and information technologies in cities constitute then the necessary pillars for more successful urban policy and management that encourages, and does not stifle, the fundamental role of cities as engines of development and innovation in human societies.

  10. The Natural Science Underlying Big History

    Directory of Open Access Journals (Sweden)

    Eric J. Chaisson

    2014-01-01

    Full Text Available Nature’s many varied complex systems, including galaxies, stars, planets, life, and society, are islands of order within the increasingly disordered Universe. All organized systems are subject to physical, biological, or cultural evolution, which together comprise the grander interdisciplinary subject of cosmic evolution. A wealth of observational data supports the hypothesis that increasingly complex systems evolve unceasingly, uncaringly, and unpredictably from big bang to humankind. These are global history greatly extended, big history with a scientific basis, and natural history broadly portrayed across ∼14 billion years of time. Human beings and our cultural inventions are not special, unique, or apart from Nature; rather, we are an integral part of a universal evolutionary process connecting all such complex systems throughout space and time. Such evolution writ large has significant potential to unify the natural sciences into a holistic understanding of who we are and whence we came. No new science (beyond frontier, nonequilibrium thermodynamics) is needed to describe cosmic evolution’s major milestones at a deep and empirical level. Quantitative models and experimental tests imply that a remarkable simplicity underlies the emergence and growth of complexity for a wide spectrum of known and diverse systems. Energy is a principal facilitator of the rising complexity of ordered systems within the expanding Universe; energy flows are as central to life and society as they are to stars and galaxies. In particular, energy rate density, in contrast to information content or entropy production, is an objective metric suitable to gauge relative degrees of complexity among a hierarchy of widely assorted systems observed throughout the material Universe. Operationally, those systems capable of utilizing optimum amounts of energy tend to survive, and those that cannot are nonrandomly eliminated.

  11. The natural science underlying big history.

    Science.gov (United States)

    Chaisson, Eric J

    2014-01-01

    Nature's many varied complex systems, including galaxies, stars, planets, life, and society, are islands of order within the increasingly disordered Universe. All organized systems are subject to physical, biological, or cultural evolution, which together comprise the grander interdisciplinary subject of cosmic evolution. A wealth of observational data supports the hypothesis that increasingly complex systems evolve unceasingly, uncaringly, and unpredictably from big bang to humankind. These are global history greatly extended, big history with a scientific basis, and natural history broadly portrayed across ∼14 billion years of time. Human beings and our cultural inventions are not special, unique, or apart from Nature; rather, we are an integral part of a universal evolutionary process connecting all such complex systems throughout space and time. Such evolution writ large has significant potential to unify the natural sciences into a holistic understanding of who we are and whence we came. No new science (beyond frontier, nonequilibrium thermodynamics) is needed to describe cosmic evolution's major milestones at a deep and empirical level. Quantitative models and experimental tests imply that a remarkable simplicity underlies the emergence and growth of complexity for a wide spectrum of known and diverse systems. Energy is a principal facilitator of the rising complexity of ordered systems within the expanding Universe; energy flows are as central to life and society as they are to stars and galaxies. In particular, energy rate density, in contrast to information content or entropy production, is an objective metric suitable to gauge relative degrees of complexity among a hierarchy of widely assorted systems observed throughout the material Universe. Operationally, those systems capable of utilizing optimum amounts of energy tend to survive, and those that cannot are nonrandomly eliminated.

  12. Data Partitioning View of Mining Big Data

    OpenAIRE

    Zhang, Shichao

    2016-01-01

    There are two main approaches to mining big data in memory. One is to partition a big dataset into several subsets, so that each subset can be mined in memory. In this way, global patterns can be obtained by synthesizing all local patterns discovered from these subsets. The other is the statistical sampling method. This indicates that data partitioning should be an important strategy for mining big data. This paper recalls our work on mining big data with a data partitioning and shows some interesti...
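The partition-then-synthesize idea in the record above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the cited paper: the "patterns" are taken to be co-occurring item pairs, the synthesis step is a plain sum of exact local counts, and the names `partition`, `mine_local`, `synthesize`, and the toy `transactions` list are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def partition(transactions, n_parts):
    """Split the transactions into at most n_parts contiguous chunks."""
    size = max(1, -(-len(transactions) // n_parts))  # ceiling division
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def mine_local(chunk):
    """Count co-occurring item pairs in one chunk (the local patterns)."""
    counts = Counter()
    for basket in chunk:
        counts.update(combinations(sorted(set(basket)), 2))
    return counts


def synthesize(local_counts, min_support):
    """Merge local counts and keep pairs that reach min_support overall."""
    total = Counter()
    for counts in local_counts:
        total.update(counts)
    return {pair: n for pair, n in total.items() if n >= min_support}


# Hypothetical toy data: each inner list is one transaction.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["a", "c"],
    ["a", "b", "c"],
    ["b", "d"],
]

chunks = partition(transactions, 3)
global_patterns = synthesize([mine_local(c) for c in chunks], min_support=3)
```

Only one chunk needs to be held in memory at a time when `mine_local` is applied chunk by chunk. Summing exact counts keeps the global result identical to a single pass over the data, whereas schemes that prune infrequent local patterns before merging trade accuracy for memory.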

  13. 77 FR 27245 - Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN

    Science.gov (United States)

    2012-05-09

    ... Fish and Wildlife Service Big Stone National Wildlife Refuge, Big Stone and Lac Qui Parle Counties, MN... comprehensive conservation plan (CCP) and environmental assessment (EA) for Big Stone National Wildlife Refuge...: r3planning@fws.gov . Include ``Big Stone Draft CCP/ EA'' in the subject line of the message. Fax:...

  14. BIG DATA IN BUSINESS ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    Logica BANICA

    2015-06-01

    Full Text Available In recent years, dealing with large volumes of data originating from social media sites and mobile communications, alongside data from business environments and institutions, has led to the definition of a new concept known as Big Data. The economic impact of the sheer amount of data produced in the last two years has increased rapidly. It is necessary to aggregate all types of data (structured and unstructured) in order to improve current transactions, to develop new business models, to provide a real image of supply and demand and thereby generate market advantages. So, the companies that turn to Big Data have a competitive advantage over other firms. From the perspective of IT organizations, they must accommodate the storage and processing of Big Data, and provide analysis tools that are easily integrated into business processes. This paper aims to discuss aspects regarding the Big Data concept, the principles to build, organize and analyse huge datasets in the business environment, offering a three-layer architecture, based on actual software solutions. Also, the article refers to the graphical tools for exploring and representing unstructured data, Gephi and NodeXL.

  15. The International Big History Association

    Science.gov (United States)

    Duffy, Michael; Duffy, D'Neil

    2013-01-01

    IBHA, the International Big History Association, was organized in 2010 and "promotes the unified, interdisciplinary study and teaching of history of the Cosmos, Earth, Life, and Humanity." This is the vision that Montessori embraced long before the discoveries of modern science fleshed out the story of the evolving universe. "Big…

  16. The Big European Bubble Chamber

    CERN Multimedia

    1977-01-01

    The 3.70 metre Big European Bubble Chamber (BEBC), dismantled on 9 August 1984. During operation it was one of the biggest detectors in the world, producing direct visual recordings of particle tracks. 6.3 million photos of interactions were taken with the chamber in the course of its existence.

  17. YOUNG CITY,BIG PARTY

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    Shenzhen Universiade unites the world’s young people through sports. With none of the usual hoopla, no fireworks, and no grand performances by celebrities and superstars, the Shenzhen Summer Universiade lowered the curtain on a big party for youth and college students on August 23.

  18. The BigBoss Experiment

    Energy Technology Data Exchange (ETDEWEB)

    Schelgel, D.; Abdalla, F.; Abraham, T.; Ahn, C.; Allende Prieto, C.; Annis, J.; Aubourg, E.; Azzaro, M.; Bailey, S.; Baltay, C.; Baugh, C.; Bebek, C.; Becerril, S.; Blanton, M.; Bolton, A.; Bromley, B.; Cahn, R.; Carton, P.-H.; Cervanted-Cota, J.L.; Chu, Y.; Cortes, M.; /APC, Paris /Brookhaven /IRFU, Saclay /Marseille, CPPM /Marseille, CPT /Durham U. / /IEU, Seoul /Fermilab /IAA, Granada /IAC, La Laguna / /IAC, Mexico / / /Madrid, IFT /Marseille, Lab. Astrophys. / / /New York U. /Valencia U.

    2012-06-07

    BigBOSS is a Stage IV ground-based dark energy experiment to study baryon acoustic oscillations (BAO) and the growth of structure with a wide-area galaxy and quasar redshift survey over 14,000 square degrees. It has been conditionally accepted by NOAO in response to a call for major new instrumentation and a high-impact science program for the 4-m Mayall telescope at Kitt Peak. The BigBOSS instrument is a robotically-actuated, fiber-fed spectrograph capable of taking 5000 simultaneous spectra over a wavelength range from 340 nm to 1060 nm, with a resolution $R = \lambda/\Delta\lambda$ = 3000-4800. Using data from imaging surveys that are already underway, spectroscopic targets are selected that trace the underlying dark matter distribution. In particular, targets include luminous red galaxies (LRGs) up to z = 1.0, extending the BOSS LRG survey in both redshift and survey area. To probe the universe out to even higher redshift, BigBOSS will target bright [OII] emission line galaxies (ELGs) up to z = 1.7. In total, 20 million galaxy redshifts are obtained to measure the BAO feature, trace the matter power spectrum at smaller scales, and detect redshift space distortions. BigBOSS will provide additional constraints on early dark energy and on the curvature of the universe by measuring the Ly-alpha forest in the spectra of over 600,000 2.2 < z < 3.5 quasars. BigBOSS galaxy BAO measurements combined with an analysis of the broadband power, including the Ly-alpha forest in BigBOSS quasar spectra, achieves a FOM of 395 with Planck plus Stage III priors. This FOM is based on conservative assumptions for the analysis of broad band power ($k_{\max} = 0.15$), and could grow to over 600 if current work allows us to push the analysis to higher wave numbers ($k_{\max} = 0.3$). BigBOSS will also place constraints on theories of modified gravity and inflation, and will measure the sum of neutrino masses to 0.024 eV accuracy.
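To give a feel for the resolving power quoted in the BigBOSS record, the relation R = λ/Δλ can be inverted to get the smallest resolvable wavelength interval, Δλ = λ/R. The short Python sketch below is only an illustration: the function name is hypothetical, and pairing R = 3000 with the blue end (340 nm) and R = 4800 with the red end (1060 nm) is an assumption, since the abstract gives only the two ranges and not how R varies with wavelength.

```python
def resolution_element_nm(wavelength_nm, resolving_power):
    """Return the wavelength interval (nm) resolved at a given R = lambda / delta-lambda."""
    return wavelength_nm / resolving_power


# Assumed pairing of the quoted extremes; the abstract does not state it.
blue_end = resolution_element_nm(340.0, 3000)    # about 0.11 nm
red_end = resolution_element_nm(1060.0, 4800)    # about 0.22 nm
```

Under that assumed pairing, the resolution element stays within a few tenths of a nanometre across the whole 340 nm to 1060 nm range.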

  19. Big Data: Implications for Health System Pharmacy.

    Science.gov (United States)

    Stokes, Laura B; Rogers, Joseph W; Hertig, John B; Weber, Robert J

    2016-07-01

    Big Data refers to datasets that are so large and complex that traditional methods and hardware for collecting, sharing, and analyzing them are not possible. Big Data that is accurate leads to more confident decision making, improved operational efficiency, and reduced costs. The rapid growth of health care information results in Big Data around health services, treatments, and outcomes, and Big Data can be used to analyze the benefit of health system pharmacy services. The goal of this article is to provide a perspective on how Big Data can be applied to health system pharmacy. It will define Big Data, describe the impact of Big Data on population health, review specific implications of Big Data in health system pharmacy, and describe an approach for pharmacy leaders to effectively use Big Data. A few strategies involved in managing Big Data in health system pharmacy include identifying potential opportunities for Big Data, prioritizing those opportunities, protecting privacy concerns, promoting data transparency, and communicating outcomes. As health care information expands in its content and becomes more integrated, Big Data can enhance the development of patient-centered pharmacy services.

  20. A SWOT Analysis of Big Data

    Science.gov (United States)

    Ahmadi, Mohammad; Dileepan, Parthasarati; Wheatley, Kathleen K.

    2016-01-01

    This is the decade of data analytics and big data, but not everyone agrees with the definition of big data. Some researchers see it as the future of data analysis, while others consider it as hype and foresee its demise in the near future. No matter how it is defined, big data for the time being is having its glory moment. The most important…

  1. Data, Data, Data : Big, Linked & Open

    NARCIS (Netherlands)

    Folmer, E.J.A.; Krukkert, D.; Eckartz, S.M.

    2013-01-01

    The entire business and IT world is currently talking about Big Data, a trend that overtook Cloud Computing in mid-2013 (based on Google Trends). Policymakers are also actively engaged with Big Data. Neelie Kroes, Vice-President of the European Commission, speaks of the ‘Big Data Revolutio

  2. Big sagebrush seed bank densities following wildfires

    Science.gov (United States)

    Big sagebrush (Artemisia spp.) is a critical shrub to many wildlife species including sage grouse (Centrocercus urophasianus), mule deer (Odocoileus hemionus), and pygmy rabbit (Brachylagus idahoensis). Big sagebrush is killed by wildfires and big sagebrush seed is generally short-lived and does not s...

  3. A survey of big data research

    OpenAIRE

    2015-01-01

    Big data create values for business and research, but pose significant challenges in terms of networking, storage, management, analytics and ethics. Multidisciplinary collaborations from engineers, computer scientists, statisticians and social scientists are needed to tackle, discover and understand big data. This survey presents an overview of big data initiatives, technologies and research in industries and academia, and discusses challenges and potential solutions.

  4. Facilitative root interactions in intercrops

    DEFF Research Database (Denmark)

    Hauggaard-Nielsen, H.; Jensen, E.S.

    2005-01-01

    Facilitation takes place when plants ameliorate the environment of their neighbours, and increase their growth and survival. Facilitation occurs in natural ecosystems as well as in agroecosystems. We discuss examples of facilitative root interactions in intercropped agroecosystems; including...... intensified cropping systems using chemical and mechanical inputs also show that facilitative interactions definitely can be of significance. It is concluded that a better understanding of the mechanisms behind facilitative interactions may allow us to benefit more from these phenomena in agriculture...

  5. The draft genome sequence of the nematode Caenorhabditis briggsae, a companion to C. elegans.

    Science.gov (United States)

    Gupta, Bhagwati P; Sternberg, Paul W

    2003-01-01

    The publication of the draft genome sequence of Caenorhabditis briggsae improves the annotation of the genome of its close relative Caenorhabditis elegans and will facilitate comparative genomics and the study of the evolutionary changes during development.

  6. GeNemo: a search engine for web-based functional genomic data

    OpenAIRE

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-01-01

    A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of E...

  7. Essence: Facilitating Software Innovation

    DEFF Research Database (Denmark)

    Aaen, Ivan

    2008-01-01

    This paper suggests ways to facilitate creativity and innovation in software development. The paper applies four perspectives – Product, Project, Process, and People – to identify an outlook for software innovation. The paper then describes a new facility – Software Innovation Research Lab (SIRL......) – and a new method concept for software innovation – Essence – based on views, modes, and team roles. Finally, the paper reports from an early experiment using SIRL and Essence and identifies further research....

  8. Big Bang Day : The Great Big Particle Adventure - 3. Origins

    CERN Multimedia

    2008-01-01

    In this series, comedian and physicist Ben Miller asks the CERN scientists what they hope to find. If the LHC is successful, it will explain the nature of the Universe around us in terms of a few simple ingredients and a few simple rules. But the Universe now was forged in a Big Bang where conditions were very different, and the rules were very different, and those early moments were crucial to determining how things turned out later. At the LHC they can recreate conditions as they were billionths of a second after the Big Bang, before atoms and nuclei existed. They can find out why matter and antimatter didn't mutually annihilate each other to leave behind a Universe of pure, brilliant light. And they can look into the very structure of space and time - the fabric of the Universe

  9. Antigravity and the big crunch/big bang transition

    Science.gov (United States)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-08-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  10. Antigravity and the big crunch/big bang transition

    Energy Technology Data Exchange (ETDEWEB)

    Bars, Itzhak [Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089-2535 (United States)]; Chen, Shih-Hung [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada); Department of Physics and School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287-1404 (United States)]; Steinhardt, Paul J., E-mail: steinh@princeton.edu [Department of Physics and Princeton Center for Theoretical Physics, Princeton University, Princeton, NJ 08544 (United States)]; Turok, Neil [Perimeter Institute for Theoretical Physics, Waterloo, ON N2L 2Y5 (Canada)]

    2012-08-29

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  11. Perspectives on Big Data and Big Data Analytics

    Directory of Open Access Journals (Sweden)

    Elena Geanina ULARU

    2012-12-01

    Full Text Available Nowadays companies are starting to realize the importance of using more data to support decisions about their strategies. It has been said, and shown through case studies, that “More data usually beats better algorithms”. With this statement companies started to realize that they can choose to invest more in processing larger sets of data rather than investing in expensive algorithms. The large quantity of data is better used as a whole because of the possible correlations on a larger amount, correlations that can never be found if the data is analyzed on separate sets or on a smaller set. A larger amount of data gives a better output, but working with it can also become a challenge due to processing limitations. This article intends to define the concept of Big Data and stress the importance of Big Data Analytics.

  12. Antigravity and the big crunch/big bang transition

    CERN Document Server

    Bars, Itzhak; Steinhardt, Paul J; Turok, Neil

    2011-01-01

    We point out a new phenomenon which seems to be generic in 4d effective theories of scalar fields coupled to Einstein gravity, when applied to cosmology. A lift of such theories to a Weyl-invariant extension allows one to define classical evolution through cosmological singularities unambiguously, and hence construct geodesically complete background spacetimes. An attractor mechanism ensures that, at the level of the effective theory, generic solutions undergo a big crunch/big bang transition by contracting to zero size, passing through a brief antigravity phase, shrinking to zero size again, and re-emerging into an expanding normal gravity phase. The result may be useful for the construction of complete bouncing cosmologies like the cyclic model.

  13. Solution of a Braneworld Big Crunch/Big Bang Cosmology

    CERN Document Server

    McFadden, Paul; Steinhardt, Paul J.; Turok, Neil

    2005-01-01

    We solve for the cosmological perturbations in a five-dimensional background consisting of two separating or colliding boundary branes, as an expansion in the collision speed V divided by the speed of light c. Our solution permits a detailed check of the validity of four-dimensional effective theory in the vicinity of the event corresponding to the big crunch/big bang singularity. We show that the four-dimensional description fails at the first nontrivial order in (V/c)^2. At this order, there is nontrivial mixing of the two relevant four-dimensional perturbation modes (the growing and decaying modes) as the boundary branes move from the narrowly-separated limit described by Kaluza-Klein theory to the well-separated limit where gravity is confined to the positive-tension brane. We comment on the cosmological significance of the result and compute other quantities of interest in five-dimensional cosmological scenarios.

  14. The ethics of big data in big agriculture

    Directory of Open Access Journals (Sweden)

    Isabelle M. Carbonell

    2016-03-01

    Full Text Available This paper examines the ethics of big data in agriculture, focusing on the power asymmetry between farmers and large agribusinesses like Monsanto. Following the recent purchase of Climate Corp., Monsanto is currently the most prominent biotech agribusiness to buy into big data. With wireless sensors on tractors monitoring or dictating every decision a farmer makes, Monsanto can now aggregate large quantities of previously proprietary farming data, enabling a privileged position with unique insights on a field-by-field basis into a third or more of the US farmland. This power asymmetry may be rebalanced through open-sourced data, and publicly-funded data analytic tools which rival Climate Corp. in complexity and innovation for use in the public domain.

  15. The Obstacles in Big Data Process

    Directory of Open Access Journals (Sweden)

    Rasim M. Alguliyev

    2017-04-01

    Full Text Available The increasing amount of data and the need to analyze it in a timely manner for multiple purposes have created a serious barrier in the big data analysis process. This article describes the challenges that big data creates at each step of the big data analysis process. These problems include typical analytical problems as well as less common challenges that are specific to big data. The article breaks down the problems for each step of the big data analysis process and discusses them separately at each stage. It also offers some simple ways to solve these problems.

  16. ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING

    Directory of Open Access Journals (Sweden)

    Jaseena K.U.

    2014-12-01

    Full Text Available Data has become an indispensable part of every economy, industry, organization, business function and individual. Big Data is a term used to identify datasets whose size is beyond the ability of typical database software tools to store, manage and analyze. Big Data introduces unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This paper presents a literature review of Big Data mining and its issues and challenges, with emphasis on the distinguishing features of Big Data. It also discusses some methods to deal with big data.

  17. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  18. NCBI viral genomes resource.

    Science.gov (United States)

    Brister, J Rodney; Ako-Adjei, Danso; Bao, Yiming; Blinkova, Olga

    2015-01-01

    Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets.

  19. Big data is not a monolith

    CERN Document Server

    Ekbia, Hamid R; Mattioli, Michael

    2016-01-01

    Big data is ubiquitous but heterogeneous. Big data can be used to tally clicks and traffic on web pages, find patterns in stock trades, track consumer preferences, and identify linguistic correlations in large corpuses of texts. This book examines big data not as an undifferentiated whole but contextually, investigating the varied challenges posed by big data for health, science, law, commerce, and politics. Taken together, the chapters reveal a complex set of problems, practices, and policies. The advent of big data methodologies has challenged the theory-driven approach to scientific knowledge in favor of a data-driven one. Social media platforms and self-tracking tools change the way we see ourselves and others. The collection of data by corporations and government threatens privacy while promoting transparency. Meanwhile, politicians, policy makers, and ethicists are ill-prepared to deal with big data's ramifications. The contributors look at big data's effect on individuals as it exerts social control throu...

  20. Big Data access and infrastructure for modern biology: case studies in data repository utility.

    Science.gov (United States)

    Boles, Nathan C; Stone, Tyler; Bergeron, Charles; Kiehl, Thomas R

    2017-01-01

    Big Data is no longer solely the purview of big organizations with big resources. Today's routine tools and experimental methods can generate large slices of data. For example, high-throughput sequencing can quickly interrogate biological systems for the expression levels of thousands of different RNAs, examine epigenetic marks throughout the genome, and detect differences in the genomes of individuals. Multichannel electrophysiology platforms produce gigabytes of data in just a few minutes of recording. Imaging systems generate videos capturing biological behaviors over the course of days. Thus, any researcher now has access to a veritable wealth of data. However, the ability of any given researcher to utilize that data is limited by her/his own resources and skills for downloading, storing, and analyzing the data. In this paper, we examine the necessary resources required to engage Big Data, survey the state of modern data analysis pipelines, present a few data repository case studies, and touch on current institutions and programs supporting the work that relies on Big Data.

  1. The big wheels of ATLAS

    CERN Multimedia

    2006-01-01

    The ATLAS cavern is filling up at an impressive rate. The installation of the first of the big wheels of the muon spectrometer, a thin gap chamber (TGC) wheel, was completed in September. The muon spectrometer will include four big moving wheels at each end, each measuring 25 metres in diameter. Of the eight wheels in total, six will be composed of thin gap chambers for the muon trigger system and the other two will consist of monitored drift tubes (MDTs) to measure the position of the muons (see Bulletin No. 13/2006). The installation of the 688 muon chambers in the barrel is progressing well, with three-quarters of them already installed between the coils of the toroid magnet.

  2. Big Numbers in String Theory

    CERN Document Server

    Schellekens, A N

    2016-01-01

    This paper contains some personal reflections on several computational contributions to what is now known as the "String Theory Landscape". It consists of two parts. The first part concerns the origin of big numbers, and especially the number $10^{1500}$ that appeared in work on the covariant lattice construction (with W. Lerche and D. Luest). This part contains some new results. I correct a huge but inconsequential error, discuss some more accurate estimates, and compare with the counting for free fermion constructions. In particular I prove that the latter only provide an exponentially small fraction of all even self-dual lattices for large lattice dimensions. The second part of the paper concerns dealing with big numbers, and contains some lessons learned from various vacuum scanning projects.

  3. Big Data for Precision Medicine

    Directory of Open Access Journals (Sweden)

    Daniel Richard Leff

    2015-09-01

    Full Text Available This article focuses on the potential impact of big data analysis to improve health, prevent and detect disease at an earlier stage, and personalize interventions. The role that big data analytics may have in interrogating the patient electronic health record toward improved clinical decision support is discussed. We examine developments in pharmacogenetics that have increased our appreciation of the reasons why patients respond differently to chemotherapy. We also assess the expansion of online health communications and the way in which this data may be capitalized on in order to detect public health threats and control or contain epidemics. Finally, we describe how a new generation of wearable and implantable body sensors may improve wellbeing, streamline management of chronic diseases, and improve the quality of surgical implants.

  4. Big data and ophthalmic research.

    Science.gov (United States)

    Clark, Antony; Ng, Jonathon Q; Morlet, Nigel; Semmens, James B

    2016-01-01

    Large population-based health administrative databases, clinical registries, and data linkage systems are a rapidly expanding resource for health research. Ophthalmic research has benefited from the use of these databases in expanding the breadth of knowledge in areas such as disease surveillance, disease etiology, health services utilization, and health outcomes. Furthermore, the quantity of data available for research has increased exponentially in recent times, particularly as e-health initiatives come online in health systems across the globe. We review some big data concepts, the databases and data linkage systems used in eye research-including their advantages and limitations, the types of studies previously undertaken, and the future direction for big data in eye research.

  5. George and the big bang

    CERN Document Server

    Hawking, Lucy; Parsons, Gary

    2012-01-01

    George has problems. He has twin baby sisters at home who demand his parents’ attention. His beloved pig Freddy has been exiled to a farm, where he’s miserable. And worst of all, his best friend, Annie, has made a new friend whom she seems to like more than George. So George jumps at the chance to help Eric with his plans to run a big experiment in Switzerland that seeks to explore the earliest moment of the universe. But there is a conspiracy afoot, and a group of evildoers is planning to sabotage the experiment. Can George repair his friendship with Annie and piece together the clues before Eric’s experiment is destroyed forever? This engaging adventure features essays by Professor Stephen Hawking and other eminent physicists about the origins of the universe and ends with a twenty-page graphic novel that explains how the Big Bang happened—in reverse!

  6. Next-generation sequencing: big data meets high performance computing.

    Science.gov (United States)

    Schmidt, Bertil; Hildebrandt, Andreas

    2017-02-02

    The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern high performance computing systems is required.

  7. Big biomedical data as the key resource for discovery science.

    Science.gov (United States)

    Toga, Arthur W; Foster, Ian; Kesselman, Carl; Madduri, Ravi; Chard, Kyle; Deutsch, Eric W; Price, Nathan D; Glusman, Gustavo; Heavner, Benjamin D; Dinov, Ivo D; Ames, Joseph; Van Horn, John; Kramer, Roger; Hood, Leroy

    2015-11-01

    Modern biomedical data collection is generating exponentially more data in a multitude of formats. This flood of complex data poses significant opportunities to discover and understand the critical interplay among such diverse domains as genomics, proteomics, metabolomics, and phenomics, including imaging, biometrics, and clinical data. The Big Data for Discovery Science Center is taking an "-ome to home" approach to discover linkages between these disparate data sources by mining existing databases of proteomic and genomic data, brain images, and clinical assessments. In support of this work, the authors developed new technological capabilities that make it easy for researchers to manage, aggregate, manipulate, integrate, and model large amounts of distributed data. Guided by biological domain expertise, the Center's computational resources and software will reveal relationships and patterns, aiding researchers in identifying biomarkers for the most confounding conditions and diseases, such as Parkinson's and Alzheimer's.

  8. Neuroblastoma, a Paradigm for Big Data Science in Pediatric Oncology.

    Science.gov (United States)

    Salazar, Brittany M; Balczewski, Emily A; Ung, Choong Yong; Zhu, Shizhen

    2016-12-27

    Pediatric cancers rarely exhibit recurrent mutational events when compared to most adult cancers. This poses a challenge in understanding how cancers initiate, progress, and metastasize in early childhood. Also, due to limited detected driver mutations, it is difficult to benchmark key genes for drug development. In this review, we use neuroblastoma, a pediatric solid tumor of neural crest origin, as a paradigm for exploring "big data" applications in pediatric oncology. Computational strategies derived from big data science (network- and machine learning-based modeling and drug repositioning) hold the promise of shedding new light on the molecular mechanisms driving neuroblastoma pathogenesis and identifying potential therapeutics to combat this devastating disease. These strategies integrate robust data input, from genomic and transcriptomic studies, clinical data, and in vivo and in vitro experimental models specific to neuroblastoma and other types of cancers that closely mimic its biological characteristics. We discuss contexts in which "big data" and computational approaches, especially network-based modeling, may advance neuroblastoma research, describe currently available data and resources, and propose future models of strategic data collection and analyses for neuroblastoma and other related diseases.

  9. Statistical Inference: The Big Picture.

    Science.gov (United States)

    Kass, Robert E

    2011-02-01

    Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest that a philosophy compatible with statistical practice, labelled here statistical pragmatism, serves as a foundation for inference. Statistical pragmatism is inclusive and emphasizes the assumptions that connect statistical models with observed data. I argue that introductory courses often mis-characterize the process of statistical inference and I propose an alternative "big picture" depiction.

  10. Personal Spaces in Public Repositories as a Facilitator for Open Educational Resource Usage

    Science.gov (United States)

    Cohen, Anat; Reisman, Sorel; Sperling, Barbra Bied

    2015-01-01

    Learning object repositories are a shared, open and public space; however, the possibility and ability of personal expression in an open, global, public space is crucial. The aim of this study is to explore personal spaces in a big learning object repository as a facilitator for adoption of Open Educational Resources (OER) into teaching practices…

  11. Big data: the management revolution.

    Science.gov (United States)

    McAfee, Andrew; Brynjolfsson, Erik

    2012-10-01

    Big data, the authors write, is far more powerful than the analytics of the past. Executives can measure and therefore manage more precisely than ever before. They can make better predictions and smarter decisions. They can target more-effective interventions in areas that so far have been dominated by gut and intuition rather than by data and rigor. The differences between big data and analytics are a matter of volume, velocity, and variety: More data now cross the internet every second than were stored in the entire internet 20 years ago. Nearly real-time information makes it possible for a company to be much more agile than its competitors. And that information can come from social networks, images, sensors, the web, or other unstructured sources. The managerial challenges, however, are very real. Senior decision makers have to learn to ask the right questions and embrace evidence-based decision making. Organizations must hire scientists who can find patterns in very large data sets and translate them into useful business information. IT departments have to work hard to integrate all the relevant internal and external sources of data. The authors offer two success stories to illustrate how companies are using big data: PASSUR Aerospace enables airlines to match their actual and estimated arrival times. Sears Holdings directly analyzes its incoming store data to make promotions much more precise and faster.

  12. The BigBOSS Experiment

    CERN Document Server

    Schlegel, D; Abraham, T; Ahn, C; Allende Prieto, C; Annis, J; Aubourg, E; Azzaro, M; Bailey, S; Baltay, C; Baugh, C; Bebek, C; Becerril, S; Blanton, M; Bolton, A; Bromley, B; Cahn, R; Carton, P -H; Cervantes-Cota, J L; Chu, Y; Cortes, M; Dawson, K; Dey, A; Dickinson, M; Diehl, H T; Doel, P; Ealet, A; Edelstein, J; Eppelle, D; Escoffier, S; Evrard, A; Faccioli, L; Frenk, C; Geha, M; Gerdes, D; Gondolo, P; Gonzalez-Arroyo, A; Grossan, B; Heckman, T; Heetderks, H; Ho, S; Honscheid, K; Huterer, D; Ilbert, O; Ivans, I; Jelinsky, P; Jing, Y; Joyce, D; Kennedy, R; Kent, S; Kieda, D; Kim, A; Kim, C; Kneib, J -P; Kong, X; Kosowsky, A; Krishnan, K; Lahav, O; Lampton, M; LeBohec, S; Le Brun, V; Levi, M; Li, C; Liang, M; Lim, H; Lin, W; Linder, E; Lorenzon, W; de la Macorra, A; Magneville, Ch; Malina, R; Marinoni, C; Martinez, V; Majewski, S; Matheson, T; McCloskey, R; McDonald, P; McKay, T; McMahon, J; Menard, B; Miralda-Escude, J; Modjaz, M; Montero-Dorta, A; Morales, I; Mostek, N; Newman, J; Nichol, R; Nugent, P; Olsen, K; Padmanabhan, N; Palanque-Delabrouille, N; Park, I; Peacock, J; Percival, W; Perlmutter, S; Peroux, C; Petitjean, P; Prada, F; Prieto, E; Prochaska, J; Reil, K; Rockosi, C; Roe, N; Rollinde, E; Roodman, A; Ross, N; Rudnick, G; Ruhlmann-Kleider, V; Sanchez, J; Sawyer, D; Schimd, C; Schubnell, M; Scoccimaro, R; Seljak, U; Seo, H; Sheldon, E; Sholl, M; Shulte-Ladbeck, R; Slosar, A; Smith, D S; Smoot, G; Springer, W; Stril, A; Szalay, A S; Tao, C; Tarle, G; Taylor, E; Tilquin, A; Tinker, J; Valdes, F; Wang, J; Wang, T; Weaver, B A; Weinberg, D; White, M; Wood-Vasey, M; Yang, J; Yang, X; Yeche, Ch; Zakamska, N; Zentner, A; Zhai, C; Zhang, P

    2011-01-01

    BigBOSS is a Stage IV ground-based dark energy experiment to study baryon acoustic oscillations (BAO) and the growth of structure with a wide-area galaxy and quasar redshift survey over 14,000 square degrees. It has been conditionally accepted by NOAO in response to a call for major new instrumentation and a high-impact science program for the 4-m Mayall telescope at Kitt Peak. The BigBOSS instrument is a robotically-actuated, fiber-fed spectrograph capable of taking 5000 simultaneous spectra over a wavelength range from 340 nm to 1060 nm, with a resolution R = 3000-4800. Using data from imaging surveys that are already underway, spectroscopic targets are selected that trace the underlying dark matter distribution. In particular, targets include luminous red galaxies (LRGs) up to z = 1.0, extending the BOSS LRG survey in both redshift and survey area. To probe the universe out to even higher redshift, BigBOSS will target bright [OII] emission line galaxies (ELGs) up to z = 1.7. In total, 20 million galaxy red...

  13. Big Data Comes to School

    Directory of Open Access Journals (Sweden)

    Bill Cope

    2016-03-01

    Full Text Available The prospect of “big data” at once evokes optimistic views of an information-rich future and concerns about surveillance that adversely impacts our personal and private lives. This overview article explores the implications of big data in education, focusing by way of example on data generated by student writing. We have chosen writing because it presents particular complexities, highlighting the range of processes for collecting and interpreting evidence of learning in the era of computer-mediated instruction and assessment as well as the challenges. Writing is significant not only because it is central to the core subject area of literacy; it is also an ideal medium for the representation of deep disciplinary knowledge across a number of subject areas. After defining what big data entails in education, we map emerging sources of evidence of learning that separately and together have the potential to generate unprecedented amounts of data: machine assessments, structured data embedded in learning, and unstructured data collected incidental to learning activity. Our case is that these emerging sources of evidence of learning have significant implications for the traditional relationships between assessment and instruction. Moreover, for educational researchers, these data are in some senses quite different from traditional evidentiary sources, and this raises a number of methodological questions. The final part of the article discusses implications for practice in an emerging field of education data science, including publication of data, data standards, and research ethics.

  14. Big Pharma: a former insider's view.

    Science.gov (United States)

    Badcott, David

    2013-05-01

    There is no lack of criticisms frequently levelled against the international pharmaceutical industry (Big Pharma): excessive profits, dubious or even dishonest practices, exploiting the sick and selective use of research data. Neither is there a shortage of examples used to support such opinions. A recent book by Brody (Hooked: Ethics, the Medical Profession and the Pharmaceutical Industry, 2008) provides a précis of the main areas of criticism, adopting a twofold strategy: (1) An assumption that the special nature and human need for pharmaceutical medicines requires that such products should not be treated like other commodities and (2) A multilevel descriptive approach that facilitates an ethical analysis of relationships and practices. At the same time, Brody is fully aware of the nature of the fundamental dilemma: the apparent addiction to (and denial of) the widespread availability of gifts and financial support for conferences etc., but recognises that 'Remove the industry and its products, and a considerable portion of scientific medicine's power to help the patient vanishes' (Brody 2008, p. 5). The paper explores some of the relevant issues, and argues that despite the identified shortcomings and a need for rigorous and perhaps enhanced regulation, and realistic price control, the commercially competitive pharmaceutical industry remains the best option for developing safer and more effective medicinal treatments. At the same time, adoption of a broader ethical basis for the industry's activities, such as a triple bottom line policy, would register an important move in the right direction and go some way toward answering critics.

  15. [Big data, medical language and biomedical terminology systems].

    Science.gov (United States)

    Schulz, Stefan; López-García, Pablo

    2015-08-01

    A variety of rich terminology systems, such as thesauri, classifications, nomenclatures and ontologies support information and knowledge processing in health care and biomedical research. Nevertheless, human language, manifested as individually written texts, persists as the primary carrier of information, in the description of disease courses or treatment episodes in electronic medical records, and in the description of biomedical research in scientific publications. In the context of the discussion about big data in biomedicine, we hypothesize that the abstraction of the individuality of natural language utterances into structured and semantically normalized information facilitates the use of statistical data analytics to distil new knowledge out of textual data from biomedical research and clinical routine. Computerized human language technologies are constantly evolving and are increasingly ready to annotate narratives with codes from biomedical terminology. However, this depends heavily on linguistic and terminological resources. The creation and maintenance of such resources is labor-intensive. Nevertheless, it is sensible to assume that big data methods can be used to support this process. Examples include the learning of hierarchical relationships, the grouping of synonymous terms into concepts and the disambiguation of homonyms. Although clear evidence is still lacking, the combination of natural language technologies, semantic resources, and big data analytics is promising.

  16. Perspective: Materials informatics and big data: Realization of the "fourth paradigm" of science in materials science

    Science.gov (United States)

    Agrawal, Ankit; Choudhary, Alok

    2016-05-01

    Our ability to collect "big data" has greatly surpassed our capability to analyze it, underscoring the emergence of the fourth paradigm of science, which is data-driven discovery. The need for data informatics is also emphasized by the Materials Genome Initiative (MGI), further boosting the emerging field of materials informatics. In this article, we look at how data-driven techniques are playing a big role in deciphering processing-structure-property-performance relationships in materials, with illustrative examples of both forward models (property prediction) and inverse models (materials discovery). Such analytics can significantly reduce time-to-insight and accelerate cost-effective materials discovery, which is the goal of MGI.

  17. Expert and novice facilitated modelling

    DEFF Research Database (Denmark)

    Tavella, Elena; Papadopoulos, Thanos

    2015-01-01

    This paper provides an empirical study based on action research in which expert and novice facilitators in facilitated modelling workshops are compared. There is limited empirical research analysing the differences between expert and novice facilitators. Aiming to address this gap we study...

  18. Big data analysis using modern statistical and machine learning methods in medicine.

    Science.gov (United States)

    Yoo, Changwon; Ramirez, Luis; Liuzzi, Juan

    2014-06-01

    In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science become larger and more complicated. In medicine, we therefore need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical analyses aimed at such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude this article with a promising modern statistical method, Bayesian networks, that is suitable for analyzing big data sets consisting of different types of large clinical, genomic, and environmental data. Such statistical models built from big data will provide us with a more comprehensive understanding of human physiology and disease.

  19. Perspectives on making big data analytics work for oncology.

    Science.gov (United States)

    El Naqa, Issam

    2016-12-01

    Oncology, with its unique combination of clinical, physical, technological, and biological data, provides an ideal case study for applying big data analytics to improve cancer treatment safety and outcomes. An oncology treatment course such as chemoradiotherapy can generate a large pool of information carrying the 5Vs hallmarks of big data. This data comprises a heterogeneous mixture of patient demographics, radiation/chemo dosimetry, multimodality imaging features, and biological markers generated over a treatment period that can span a few days to several weeks. Efforts using commercial and in-house tools are underway to facilitate data aggregation, ontology creation, sharing, visualization and varying analytics in a secure environment. However, open questions related to proper data structure representation and effective analytics tools to support oncology decision-making need to be addressed. It is recognized that oncology data constitutes a mix of structured (tabulated) and unstructured (electronic documents) data that need to be processed to facilitate searching and subsequent knowledge discovery from relational or NoSQL databases. In this context, methods based on advanced analytics and image feature extraction for oncology applications will be discussed. On the other hand, the classical p (variables)≫n (samples) inference problem of statistical learning is challenged in the Big data realm, and this is particularly true for oncology applications, where p-omics is witnessing exponential growth while the number of cancer incidences has generally plateaued over the past 5 years, leading to a quasi-linear growth in samples per patient. Within the Big data paradigm, this kind of phenomenon may yield undesirable effects such as echo chamber anomalies, Yule-Simpson reversal paradox, or misleading ghost analytics. In this work, we will present these effects as they pertain to oncology and engage small thinking methodologies to counter these effects ranging from
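
    The Yule-Simpson reversal mentioned in this abstract can be seen in a few lines of arithmetic. The sketch below uses hypothetical response counts for two treatments in two patient strata (the stratum names and all counts are invented for illustration, not taken from the article): treatment A has the higher response rate within each stratum, yet treatment B looks better once the strata are pooled, because the two treatments were given to very different mixes of patients.

```python
from fractions import Fraction

# Hypothetical (responders, patients) per treatment within each stratum.
# The counts are invented solely to demonstrate the reversal.
counts = {
    "early_stage": {"A": (8, 10), "B": (70, 100)},
    "late_stage": {"A": (30, 100), "B": (2, 10)},
}


def rate(responders, patients):
    """Exact response rate as a fraction."""
    return Fraction(responders, patients)


def pooled_rate(treatment):
    """Response rate after ignoring the stratum and pooling all patients."""
    responders = sum(counts[s][treatment][0] for s in counts)
    patients = sum(counts[s][treatment][1] for s in counts)
    return rate(responders, patients)


# Best treatment inside each stratum, and best treatment after pooling.
per_stratum_winner = {
    stratum: max(arms, key=lambda t: rate(*arms[t]))
    for stratum, arms in counts.items()
}
pooled_winner = max(("A", "B"), key=pooled_rate)

print(per_stratum_winner)
print(pooled_winner, float(pooled_rate("A")), float(pooled_rate("B")))
```

    Stratifying by the confounding variable before comparing treatments is the usual guard against this effect.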

  20. BIG GEO DATA MANAGEMENT: AN EXPLORATION WITH SOCIAL MEDIA AND TELECOMMUNICATIONS OPEN DATA

    Directory of Open Access Journals (Sweden)

    C. Arias Munoz

    2016-06-01

    Full Text Available The term Big Data has recently been used to define big, highly varied, complex data sets, which are created and updated at a high speed and require faster processing, namely, a reduced time to filter and analyse relevant data. These data are also increasingly becoming Open Data (data that can be freely distributed) made public by governments, agencies, private enterprises, and others. There are at least two issues that can obstruct the availability and use of Open Big Datasets. Firstly, the gathering and geoprocessing of these datasets are very computationally intensive; hence, it is necessary to integrate high-performance solutions, preferably internet based, to achieve the goals. Secondly, the problems of heterogeneity and inconsistency in geospatial data are well known and affect the data integration process, and they are particularly problematic for Big Geo Data. Therefore, Big Geo Data integration will be one of the most challenging issues to solve. With these applications, we demonstrate that it is possible to provide processed Big Geo Data to common users, using open geospatial standards and technologies. NoSQL databases like MongoDB and frameworks like RASDAMAN could offer different functionalities that facilitate working with larger volumes and more heterogeneous geospatial data sources.

  1. Big Geo Data Management: An Exploration with Social Media and Telecommunications Open Data

    Science.gov (United States)

    Arias Munoz, C.; Brovelli, M. A.; Corti, S.; Zamboni, G.

    2016-06-01

    The term Big Data has recently been used to define big, highly varied, complex data sets, which are created and updated at a high speed and require faster processing, namely, a reduced time to filter and analyse relevant data. These data are also increasingly becoming Open Data (data that can be freely distributed) made public by governments, agencies, private enterprises, and others. There are at least two issues that can obstruct the availability and use of Open Big Datasets. Firstly, the gathering and geoprocessing of these datasets are very computationally intensive; hence, it is necessary to integrate high-performance solutions, preferably internet based, to achieve the goals. Secondly, the problems of heterogeneity and inconsistency in geospatial data are well known and affect the data integration process, and they are particularly problematic for Big Geo Data. Therefore, Big Geo Data integration will be one of the most challenging issues to solve. With these applications, we demonstrate that it is possible to provide processed Big Geo Data to common users, using open geospatial standards and technologies. NoSQL databases like MongoDB and frameworks like RASDAMAN could offer different functionalities that facilitate working with larger volumes and more heterogeneous geospatial data sources.

  2. Re-evaluation of the immunological Big Bang.

    Science.gov (United States)

    Flajnik, Martin F

    2014-11-03

    Classically the immunological 'Big Bang' of adaptive immunity was believed to have resulted from the insertion of a transposon into an immunoglobulin superfamily gene member, initiating antigen receptor gene rearrangement via the RAG recombinase in an ancestor of jawed vertebrates. However, the discovery of a second, convergent adaptive immune system in jawless fish, focused on the so-called variable lymphocyte receptors (VLRs), was arguably the most exciting finding of the past decade in immunology and has drastically changed the view of immune origins. The recent report of a new lymphocyte lineage in lampreys, defined by the antigen receptor VLRC, suggests that there were three lymphocyte lineages in the common ancestor of jawless and jawed vertebrates that co-opted different antigen receptor supertypes. The transcriptional control of these lineages during development is predicted to be remarkably similar in both the jawless (agnathan) and jawed (gnathostome) vertebrates, suggesting that an early 'division of labor' among lymphocytes was a driving force in the emergence of adaptive immunity. The recent cartilaginous fish genome project suggests that most effector cytokines and chemokines were also present in these fish, and further studies of the lamprey and hagfish genomes will determine just how explosive the Big Bang actually was.

  3. Program Facilitates CMMI Appraisals

    Science.gov (United States)

    Sweetser, Wesley

    2005-01-01

    A computer program has been written to facilitate appraisals according to the methodology of Capability Maturity Model Integration (CMMI). [CMMI is a government/industry standard, maintained by the Software Engineering Institute at Carnegie Mellon University, for objectively assessing the engineering capability and maturity of an organization (especially, an organization that produces software)]. The program assists in preparation for a CMMI appraisal by providing drop-down lists suggesting required artifacts or evidence. It identifies process areas for which similar evidence is required and includes a copy feature that reduces or eliminates repetitive data entry. It generates reports to show the entire framework for reference, the appraisal artifacts to determine readiness for an appraisal, and lists of interviewees and questions to ask them during the appraisal. During an appraisal, the program provides screens for entering observations and ratings, and reviewing evidence provided thus far. Findings concerning strengths and weaknesses can be exported for use in a report or a graphical presentation. The program generates a chart showing capability level ratings of the organization. A context-sensitive Windows help system enables a novice to use the program and learn about the CMMI appraisal process.

  4. Facilitating post traumatic growth

    Directory of Open Access Journals (Sweden)

    Cox Helen

    2004-07-01

    Full Text Available Background: Whilst negative responses to traumatic injury have been well documented in the literature, there is a small but growing body of work that identifies posttraumatic growth as a salient feature of this experience. We contribute to this discourse by reporting on the experiences of 13 individuals who were traumatically injured, had undergone extensive rehabilitation and were discharged from formal care. All participants were injured through involvement in a motor vehicle accident, with the exception of one, who was injured through falling off the roof of a house. Methods: In this qualitative study, we used an audio-taped in-depth interview with each participant as the means of data collection. Interviews were transcribed verbatim and analysed thematically to determine the participants' unique perspectives on the experience of recovery from traumatic injury. In reporting the findings, all participants were given a pseudonym to assure their anonymity. Results: Most participants indicated that their involvement in a traumatic occurrence was a springboard for growth that enabled them to develop new perspectives on life and living. Conclusion: There are a number of contributions that health providers may make to the recovery of individuals who have been traumatically injured to assist them to develop new views of vulnerability and strength, make changes in relationships, and facilitate philosophical, physical and spiritual growth.

  5. Facilitating post traumatic growth

    Science.gov (United States)

    Turner, de Sales; Cox, Helen

    2004-01-01

    Background: Whilst negative responses to traumatic injury have been well documented in the literature, there is a small but growing body of work that identifies posttraumatic growth as a salient feature of this experience. We contribute to this discourse by reporting on the experiences of 13 individuals who were traumatically injured, had undergone extensive rehabilitation and were discharged from formal care. All participants were injured through involvement in a motor vehicle accident, with the exception of one, who was injured through falling off the roof of a house. Methods: In this qualitative study, we used an audio-taped in-depth interview with each participant as the means of data collection. Interviews were transcribed verbatim and analysed thematically to determine the participants' unique perspectives on the experience of recovery from traumatic injury. In reporting the findings, all participants were given a pseudonym to assure their anonymity. Results: Most participants indicated that their involvement in a traumatic occurrence was a springboard for growth that enabled them to develop new perspectives on life and living. Conclusion: There are a number of contributions that health providers may make to the recovery of individuals who have been traumatically injured to assist them to develop new views of vulnerability and strength, make changes in relationships, and facilitate philosophical, physical and spiritual growth. PMID:15248894

  6. Earth Science Data Analysis in the Era of Big Data

    Science.gov (United States)

    Kuo, K.-S.; Clune, T. L.; Ramachandran, R.

    2014-01-01

    Anyone with even a cursory interest in information technology cannot help but recognize that "Big Data" is one of the most fashionable catchphrases of late. From accurate voice and facial recognition, language translation, and airfare prediction and comparison, to monitoring the real-time spread of flu, Big Data techniques have been applied to many seemingly intractable problems with spectacular successes. They appear to be a rewarding way to approach many currently unsolved problems. Few fields of research can claim a longer history with problems involving voluminous data than Earth science. The problems we are facing today with our Earth's future are more complex and carry potentially graver consequences than the examples given above. How has our climate changed? Beside natural variations, what is causing these changes? What are the processes involved and through what mechanisms are these connected? How will they impact life as we know it? In attempts to answer these questions, we have resorted to observations and numerical simulations with ever-finer resolutions, which continue to feed the "data deluge." Plausibly, many Earth scientists are wondering: How will Big Data technologies benefit Earth science research? As an example from the global water cycle, one subdomain among many in Earth science, how would these technologies accelerate the analysis of decades of global precipitation to ascertain the changes in its characteristics, to validate these changes in predictive climate models, and to infer the implications of these changes to ecosystems, economies, and public health? Earth science researchers need a viable way to harness the power of Big Data technologies to analyze large volumes and varieties of data with velocity and veracity. 
    Beyond providing speedy data analysis capabilities, Big Data technologies can also play a crucial, albeit indirect, role in boosting scientific productivity by facilitating effective collaboration within an analysis environment

  7. Development and validation of Big Four personality scales for the Schedule for Nonadaptive and Adaptive Personality--Second Edition (SNAP-2).

    Science.gov (United States)

    Calabrese, William R; Rudick, Monica M; Simms, Leonard J; Clark, Lee Anna

    2012-09-01

    Recently, integrative, hierarchical models of personality and personality disorder (PD)--such as the Big Three, Big Four, and Big Five trait models--have gained support as a unifying dimensional framework for describing PD. However, no measures to date can simultaneously represent each of these potentially interesting levels of the personality hierarchy. To unify these measurement models psychometrically, we sought to develop Big Five trait scales within the Schedule for Nonadaptive and Adaptive Personality--Second Edition (SNAP-2). Through structural and content analyses, we examined relations between the SNAP-2, the Big Five Inventory (BFI), and the NEO Five-Factor Inventory (NEO-FFI) ratings in a large data set (N = 8,690), including clinical, military, college, and community participants. Results yielded scales consistent with the Big Four model of personality (i.e., Neuroticism, Conscientiousness, Introversion, and Antagonism) and not the Big Five, as there were insufficient items related to Openness. Resulting scale scores demonstrated strong internal consistency and temporal stability. Structural validity and external validity were supported by strong convergent and discriminant validity patterns between Big Four scale scores and other personality trait scores and expectable patterns of self-peer agreement. Descriptive statistics and community-based norms are provided. The SNAP-2 Big Four Scales enable researchers and clinicians to assess personality at multiple levels of the trait hierarchy and facilitate comparisons among competing big-trait models.

  8. Recent developments in genomics, bioinformatics and drug discovery to combat emerging drug-resistant tuberculosis.

    Science.gov (United States)

    Swaminathan, Soumya; Sundaramurthi, Jagadish Chandrabose; Palaniappan, Alangudi Natarajan; Narayanan, Sujatha

    2016-12-01

    Emergence of drug-resistant tuberculosis (DR-TB) is a big challenge in TB control. The delay in diagnosis of DR-TB leads to its increased transmission, and therefore prevalence. Recent developments in genomics have enabled whole genome sequencing (WGS) of Mycobacterium tuberculosis (M. tuberculosis) from 3-day-old liquid culture and directly from uncultured sputa, while new bioinformatics tools facilitate rapid determination of DR mutations from the resulting sequences. The present drug discovery and development pipeline is filled with candidate drugs which have shown efficacy against DR-TB. Furthermore, some of the FDA-approved drugs are being evaluated for repurposing, and this approach appears promising as several drugs are reported to enhance efficacy of the standard TB drugs, reduce drug tolerance, or modulate the host immune response to control the growth of intracellular M. tuberculosis. Recent developments in genomics and bioinformatics along with new drug discovery collectively have the potential to result in synergistic impact leading to the development of a rapid protocol to determine the drug resistance profile of the infecting strain so as to provide personalized medicine. Hence, in this review, we discuss recent developments in WGS, bioinformatics and drug discovery to perceive how they would transform the management of tuberculosis in a timely manner.

  9. Domestication and plant genomes.

    Science.gov (United States)

    Tang, Haibao; Sezen, Uzay; Paterson, Andrew H

    2010-04-01

    The techniques of plant improvement have been evolving with the advancement of technology, progressing from crop domestication by Neolithic humans to scientific plant breeding, and now including DNA-based genotyping and genetic engineering. Archeological findings have shown that early human ancestors often unintentionally selected for and finally fixed a few major domestication traits over time. Recent advancement of molecular and genomic tools has enabled scientists to pinpoint changes to specific chromosomal regions and genetic loci that are responsible for dramatic morphological and other transitions that distinguish crops from their wild progenitors. Extensive studies in a multitude of additional crop species, facilitated by rapid progress in sequencing and resequencing(s) of crop genomes, will further our understanding of the genomic impact from both the unusual population history of cultivated plants and millennia of human selection.

  10. EAARL-B Topography-Big Thicket National Preserve: Big Sandy Creek Corridor Unit, Texas, 2014

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — A first-surface topography Digital Elevation Model (DEM) mosaic for the Big Sandy Creek Corridor Unit of Big Thicket National Preserve in Texas was produced from...

  11. EAARL-B Topography-Big Thicket National Preserve: Big Sandy Creek Corridor Unit, Texas, 2014

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — A bare-earth topography Digital Elevation Model (DEM) mosaic for the Big Sandy Creek Corridor Unit of Big Thicket National Preserve in Texas was produced from...

  12. EAARL-B Topography-Big Thicket National Preserve: Big Sandy Creek Unit, Texas, 2014

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — A first-surface topography digital elevation model (DEM) mosaic for the Big Sandy Creek Unit of Big Thicket National Preserve in Texas, was produced from remotely...

  13. EAARL-B Topography-Big Thicket National Preserve: Big Sandy Creek Unit, Texas, 2014

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — A bare-earth topography digital elevation model (DEM) mosaic for the Big Sandy Creek Unit of Big Thicket National Preserve in Texas, was produced from remotely...

  14. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    Directory of Open Access Journals (Sweden)

    Deborah Galpert

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.

  15. Big data optimization recent developments and challenges

    CERN Document Server

    2016-01-01

    The main objective of this book is to provide the necessary background to work with big data by introducing some novel optimization algorithms and codes capable of working in the big data setting, as well as some applications in big data optimization, for interested academics and practitioners, and to benefit society, industry, academia, and government. Presenting applications in a variety of industries, this book will be useful for researchers aiming to analyze large-scale data. Several optimization algorithms for big data, including convergent parallel algorithms, the limited memory bundle algorithm, the diagonal bundle method, network analytics, and many more, are explored in this book.

  16. Medical big data: promise and challenges.

    Science.gov (United States)

    Lee, Choong Ho; Yoon, Hyung-Jin

    2017-03-01

    The concept of big data, commonly characterized by volume, variety, velocity, and veracity, goes far beyond the data type and includes the aspects of data analysis, such as hypothesis-generating, rather than hypothesis-testing. Big data focuses on temporal stability of the association, rather than on causal relationship and underlying probability distribution assumptions are frequently not required. Medical big data as material to be analyzed has various features that are not only distinct from big data of other disciplines, but also distinct from traditional clinical epidemiology. Big data technology has many areas of application in healthcare, such as predictive modeling and clinical decision support, disease or safety surveillance, public health, and research. Big data analytics frequently exploits analytic methods developed in data mining, including classification, clustering, and regression. Medical big data analyses are complicated by many technical issues, such as missing values, curse of dimensionality, and bias control, and share the inherent limitations of observation study, namely the inability to test causality resulting from residual confounding and reverse causation. Recently, propensity score analysis and instrumental variable analysis have been introduced to overcome these limitations, and they have accomplished a great deal. Many challenges, such as the absence of evidence of practical benefits of big data, methodological issues including legal and ethical issues, and clinical integration and utility issues, must be overcome to realize the promise of medical big data as the fuel of a continuous learning healthcare system that will improve patient outcome and reduce waste in areas including nephrology.

  17. Organizational Design Challenges Resulting From Big Data

    Directory of Open Access Journals (Sweden)

    Jay R. Galbraith

    2014-04-01

    Business firms and other types of organizations are feverishly exploring ways of taking advantage of the big data phenomenon. This article discusses firms that are at the leading edge of developing a big data analytics capability. Firms that are currently enjoying the most success in this area are able to use big data not only to improve their existing businesses but to create new businesses as well. Putting a strategic emphasis on big data requires adding an analytics capability to the existing organization. This transformation process results in power shifting to analytics experts and in decisions being made in real time.

  18. The big de Rham–Witt complex

    DEFF Research Database (Denmark)

    Hesselholt, Lars

    2015-01-01

    This paper gives a new and direct construction of the multi-prime big de Rham–Witt complex, which is defined for every commutative and unital ring; the original construction by Madsen and myself relied on the adjoint functor theorem and accordingly was very indirect. The construction given here....... It is the existence of these divided Frobenius operators that makes the new construction of the big de Rham–Witt complex possible. It is further shown that the big de Rham–Witt complex behaves well with respect to étale maps, and finally, the big de Rham–Witt complex of the ring of integers is explicitly evaluated....

  19. Big data analytics with R and Hadoop

    CERN Document Server

    Prajapati, Vignesh

    2013-01-01

    Big Data Analytics with R and Hadoop is a tutorial style book that focuses on all the powerful big data tasks that can be achieved by integrating R and Hadoop. This book is ideal for R developers who are looking for a way to perform big data analytics with Hadoop. This book is also aimed at those who know Hadoop and want to build some intelligent applications over Big data with R packages. It would be helpful if readers have basic knowledge of R.

  20. Traffic information computing platform for big data

    Energy Technology Data Exchange (ETDEWEB)

    Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn; Liu, Yan, E-mail: ztduan@chd.edu.cn; Dai, Jiting, E-mail: ztduan@chd.edu.cn; Kang, Jun, E-mail: ztduan@chd.edu.cn [Chang'an University School of Information Engineering, Xi'an, China and Shaanxi Engineering and Technical Research Center for Road and Traffic Detection, Xi'an (China)

    2014-10-06

    Big data environments create the data conditions for improving the quality of traffic information services. The aim of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technical characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. In a big data environment, this type of traffic atomic information computing architecture helps to guarantee traffic safety and efficient operation, and more intelligent and personalized traffic information services can be offered to traffic information users.

  1. Big Data; A Management Revolution : The emerging role of big data in businesses

    OpenAIRE

    Blasiak, Kevin

    2014-01-01

    Big data is a term that was coined in 2012 and has since then emerged as one of the top trends in business and technology. Big data is an agglomeration of different technologies resulting in data processing capabilities that have been unreached before. Big data is generally characterized by 4 factors. Volume, velocity and variety. These three factors distinguish it from the traditional data use. The possibilities to utilize this technology are vast. Big data technology has touch points in differ...

  2. Big Brother Has Bigger Say

    Institute of Scientific and Technical Information of China (English)

    Yang Wei

    2009-01-01

    156 delegates from all walks of life in Guangdong province composed the Guangdong delegation for the NPC this year. The import and export value of Guangdong makes up one-third of national total value, and accounts for one-eighth of national economic growth. Guangdong province has maintained its top spot in import and export value among China's many provinces and cities for several years, commonly referred to as "Big Brother". At the same time, it is the region where the global financial crisis has hit China hardest.

  3. Big Data en surveillance, deel 1 : Definities en discussies omtrent Big Data

    NARCIS (Netherlands)

    Timan, Tjerk

    2016-01-01

    Following a (fairly short) lecture on surveillance and Big Data, I was asked to go somewhat deeper into the topic, the definitions and the various issues related to big data. In this first part I will try to set out a few things concerning Big Data theory and terminolo

  4. Comparative validity of brief to medium-length Big Five and Big Six personality questionnaires

    NARCIS (Netherlands)

    Thalmayer, A.G.; Saucier, G.; Eigenhuis, A.

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five

  5. Comparative Validity of Brief to Medium-Length Big Five and Big Six Personality Questionnaires

    Science.gov (United States)

    Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie

    2011-01-01

    A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are…

  6. Big Challenges and Big Opportunities: The Power of "Big Ideas" to Change Curriculum and the Culture of Teacher Planning

    Science.gov (United States)

    Hurst, Chris

    2014-01-01

    Mathematical knowledge of pre-service teachers is currently "under the microscope" and the subject of research. This paper proposes a different approach to teacher content knowledge based on the "big ideas" of mathematics and the connections that exist within and between them. It is suggested that these "big ideas"…

  7. An analysis of cross-sectional differences in big and non-big public accounting firms' audit programs

    NARCIS (Netherlands)

    Blokdijk, J.H. (Hans); Drieenhuizen, F.; Stein, M.T.; Simunic, D.A.

    2006-01-01

    A significant body of prior research has shown that audits by the Big 5 (now Big 4) public accounting firms are quality differentiated relative to non-Big 5 audits. This result can be derived analytically by assuming that Big 5 and non-Big 5 firms face different loss functions for "audit failures" a

  8. Baryon symmetric big bang cosmology

    Science.gov (United States)

    Stecker, F. W.

    1978-01-01

    Both the quantum theory and Einstein's theory of special relativity lead to the supposition that matter and antimatter were produced in equal quantities during the big bang. It is noted that local matter/antimatter asymmetries may be reconciled with universal symmetry by assuming (1) a slight imbalance of matter over antimatter in the early universe, annihilation, and a subsequent remainder of matter; (2) localized regions of excess for one or the other type of matter as an initial condition; and (3) an extremely dense, high temperature state with zero net baryon number; i.e., matter/antimatter symmetry. Attention is given to the third assumption, which is the simplest and the most in keeping with current knowledge of the cosmos, especially as pertains to the universality of 3 K background radiation. Mechanisms of galaxy formation are discussed, whereby matter and antimatter might have collided and annihilated each other, or have coexisted (and continue to coexist) at vast distances. It is pointed out that baryon symmetric big bang cosmology could probably be proved if an antinucleus could be detected in cosmic radiation.

  9. Georges et le big bang

    CERN Document Server

    Hawking, Lucy; Parsons, Gary

    2011-01-01

    Georges and Annie, his best friend, are about to witness one of the most important scientific experiments of all time: exploring the first moments of the Universe, the Big Bang! Thanks to Cosmos, their supercomputer, and the Large Hadron Collider created by Éric, Annie's father, they will at last be able to answer this essential question: why do we exist? But Georges and Annie discover that a diabolical plot is being hatched. Worse, all of scientific research is in peril! Swept up in incredible adventures, Georges will travel to the far reaches of the galaxy to save his friends... A thrilling plunge into the heart of the Big Bang. The very latest theories of Stephen Hawking and of today's greatest scientists.

  10. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  11. Research Progress of the Application of Big Data in China’s Urban Planning

    Institute of Scientific and Technical Information of China (English)

    Dang, Anrong; Xu, Jian; Tong, Biao; Li, Juan; Qian, Fang

    2015-01-01

    The arrival of the big data era facilitates the reform on related research and its application in the field of urban planning from the mode of thinking to the technical method, which provides a technical foundation and platform for supporting the demands of residents as micro subjects in the process of city development. Beginning with analyzing the changes in urban planning thoughts under the influence of big data, this paper then summarizes the major reforms in urban planning research methodology and a city’s plan formulation process caused by the application of big data, such as thinking mode, researching method, and planning process, based on which relevant issues like publicity and sharing of data, authenticity of data, and security of data that need to be further discussed and solved are analyzed, with the expectation of promoting the development of urban planning research and application in this new era.

  12. Unlocking the Power of Big Data at the National Institutes of Health

    Science.gov (United States)

    Coakley, Meghan F.; Leerkes, Maarten R.; Barnett, Jason; Gabrielian, Andrei E.; Noble, Karlynn; Weber, M. Nick

    2013-01-01

    The era of “big data” presents immense opportunities for scientific discovery and technological progress, with the potential to have enormous impact on research and development in the public sector. In order to capitalize on these benefits, there are significant challenges to overcome in data analytics. The National Institute of Allergy and Infectious Diseases held a symposium entitled “Data Science: Unlocking the Power of Big Data” to create a forum for big data experts to present and share some of the creative and innovative methods to gleaning valuable knowledge from an overwhelming flood of biological data. A significant investment in infrastructure and tool development, along with more and better-trained data scientists, may facilitate methods for assimilation of data and machine learning, to overcome obstacles such as data security, data cleaning, and data integration. PMID:27442200

  13. Unlocking the Power of Big Data at the National Institutes of Health.

    Science.gov (United States)

    Coakley, Meghan F; Leerkes, Maarten R; Barnett, Jason; Gabrielian, Andrei E; Noble, Karlynn; Weber, M Nick; Huyen, Yentram

    2013-09-01

    The era of "big data" presents immense opportunities for scientific discovery and technological progress, with the potential to have enormous impact on research and development in the public sector. In order to capitalize on these benefits, there are significant challenges to overcome in data analytics. The National Institute of Allergy and Infectious Diseases held a symposium entitled "Data Science: Unlocking the Power of Big Data" to create a forum for big data experts to present and share some of the creative and innovative methods to gleaning valuable knowledge from an overwhelming flood of biological data. A significant investment in infrastructure and tool development, along with more and better-trained data scientists, may facilitate methods for assimilation of data and machine learning, to overcome obstacles such as data security, data cleaning, and data integration.

  14. Why Big Data Is a Big Deal (Ⅱ)

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    A new group of data mining technologies promises to change forever the way we sift through our vast stores of data, making it faster and cheaper. Some of the technologies are actively being used by people on the bleeding edge who need the technology now, like those involved in creating Web-based services that are driven by social media. They're also heavily contributing to these projects. In other vertical industries, businesses are realizing that much more of their value proposition is information-based than they had previously thought, which will allow big data technologies to gain traction quickly, Olofson says. Couple that with affordable hardware and software, and enterprises find themselves in a perfect storm of business transformation opportunities.

  15. Big Data – Big Deal for Organization Design?

    Directory of Open Access Journals (Sweden)

    Janne J. Korhonen

    2014-04-01

    Analytics is an increasingly important source of competitive advantage. It has even been posited that big data will be the next strategic emphasis of organizations and that analytics capability will be manifested in organizational structure. In this article, I explore how analytics capability might be reflected in organizational structure using the notion of “requisite organization” developed by Jaques (1998). Requisite organization argues that a new strategic emphasis requires the addition of a new stratum in the organization, resulting in greater organizational complexity. Requisite organization could serve as an objective, verifiable criterion for what qualifies as a genuine new strategic emphasis. Such a criterion is necessary for research on the co-evolution of strategy and structure.

  16. Kansen voor Big data – WPA Vertrouwen

    NARCIS (Netherlands)

    Broek, T.A. van den; Roosendaal, A.P.C.; Veenstra, A.F.E. van; Nunen, A.M. van

    2014-01-01

    Big data is expected to become a driver for economic growth, but this can only be achieved when services based on (big) data are accepted by citizens and consumers. In a recent policy brief, the Cabinet Office mentions trust as one of the three pillars (the others being transparency and control) for

  17. The Death of the Big Men

    DEFF Research Database (Denmark)

    Martin, Keir

    2010-01-01

    Recently Tolai people of Papua New Guinea have adopted the term 'Big Shot' to describe an emerging post-colonial political elite. The emergence of the term is a negative moral evaluation of new social possibilities that have arisen as a consequence of the Big Shots' privileged position within a glo...

  18. The Big Sleep in the Woods

    Institute of Scientific and Technical Information of China (English)

    王玉峰

    2002-01-01

    Now it's the time of the big sleep for the bees and the bears. Even the buds of the plants whose leaves fall off share in it. But the intensity of this winter sleep, or hibernation, depends on who's doing it. The big sleep of the bears, for instance, would probably be thought of as a

  19. A New Look at Big History

    Science.gov (United States)

    Hawkey, Kate

    2014-01-01

    The article sets out a "big history" which resonates with the priorities of our own time. A globalizing world calls for new spatial scales to underpin what the history curriculum addresses, "big history" calls for new temporal scales, while concern over climate change calls for a new look at subject boundaries. The article…

  20. 革命者BIG BANG

    Institute of Scientific and Technical Information of China (English)

    刘岩

    2015-01-01

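    In the boom years of Ordos, I met one of the city's "opinion leaders": he had come back from the United States and seen the outside world, and he had a broad knowledge of luxury goods and a distinctive taste. He led the shopping trends of a small circle in that city of mysterious wealth, and they bought Big Bang watches one after another. At the time, I did not quite understand why they were so taken with this watch, until I went to the Basel watch fair again and again and came to appreciate the imagination behind the Big Bang. Yes, the Big Bang is indeed full of charm. The evolution of the Big Bang: 2005, the Big Bang collection is born; 2006, Big Bang All Black. The "all black" concept makes the Big Bang purer and simpler. From case to dial, the Big Bang All Black watch has a seamless matte texture and layered blacks that blend different materials, embodying the Zen idea of "invisible visibility".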

  1. An embedding for the big bang

    Science.gov (United States)

    Wesson, Paul S.

    1994-01-01

    A cosmological model is given that has good physical properties for the early and late universe but is a hypersurface in a flat five-dimensional manifold. The big bang can therefore be regarded as an effect of a choice of coordinates in a truncated higher-dimensional geometry. Thus the big bang is in some sense a geometrical illusion.

  2. Structuring the Curriculum around Big Ideas

    Science.gov (United States)

    Alleman, Janet; Knighton, Barbara; Brophy, Jere

    2010-01-01

    This article provides an inside look at Barbara Knighton's classroom teaching. She uses big ideas to guide her planning and instruction and gives other teachers suggestions for adopting the big idea approach and ways for making the approach easier. This article also represents a "small slice" of a dozen years of collaborative research,…

  3. Small molecules for big tasks

    Institute of Scientific and Technical Information of China (English)

    Jiarui Wu

    2011-01-01

    One of the most important achievements in the post-genome era is the discovery of microRNAs (miRNAs), which widely exist from simple-genome organisms such as viruses and bacteria to complex-genome organisms such as plants and animals. miRNAs are single-stranded non-coding RNAs of 18-25 nucleotides in length, which are generated from larger precursors that are transcribed from noncoding genes. As a new type of regulatory molecule, miRNAs present unique features in regulating genes and their products, including rapid shutdown of protein production and reversible, compartmentalized regulation of gene expression.

  4. Using Sex to Cure the Genome.

    Directory of Open Access Journals (Sweden)

    Eduardo P C Rocha

    2016-03-01

    The diversification of prokaryotes is accelerated by their ability to acquire DNA from other genomes. However, the underlying processes also facilitate genome infection by costly mobile genetic elements. The discovery that cells can take up DNA by natural transformation was instrumental to the birth of molecular biology nearly a century ago. Surprisingly, a new study shows that this mechanism could efficiently cure the genome of mobile elements acquired through previous sexual exchanges.

  5. Cloud Based Big Data Infrastructure: Architectural Components and Automated Provisioning

    OpenAIRE

    Demchenko, Yuri; Turkmen, Fatih; Blanchet, Christophe; Loomis, Charles; Laat, Caees de

    2016-01-01

    This paper describes the general architecture and functional components of the cloud based Big Data Infrastructure (BDI). The proposed BDI architecture is based on the analysis of the emerging Big Data and data intensive technologies and supported by the definition of the Big Data Architecture Framework (BDAF) that defines the following components of the Big Data technologies: Big Data definition, Data Management including data lifecycle and data structures, Big Data Infrastructure (generical...

  6. Application Research on Agricultural Big Data%农业领域大数据的应用研究

    Institute of Scientific and Technical Information of China (English)

    光峰; 姚程宽; 王维进

    2015-01-01

    With the development of Internet technology and communication technology, human society has entered the era of huge amounts of data. The term “big data” has been very popular in the fields of industry and commerce. Although China is a big agricultural country, the foundation of agriculture is weak and agricultural productivity is low. Big data technology applied in agriculture can facilitate the improvement of agricultural production conditions and the increase of the farming population’s income. In this paper, the characteristics, application fields, and current difficulties of big data applied in agriculture are described in detail.

  7. Lecture 10: The European Bioinformatics Institute - "Big data" for biomedical sciences

    CERN Document Server

    CERN. Geneva; Dana, Jose

    2013-01-01

    Part 1: Big data for biomedical sciences (Tom Hancocks). Ten years ago witnessed the completion of the first international 'Big Biology' project, which sequenced the human genome. In the years since, the biological sciences have seen a vast growth in data. In the coming years, advances will come from the integration of experimental approaches and their translation into applied technologies in the hospital, the clinic and even at home. This talk will examine the development of infrastructure, physical and virtual, that will allow millions of life scientists across Europe better access to biological data. Tom studied Human Genetics at the University of Leeds and McMaster University, before completing an MSc in Analytical Genomics at the University of Birmingham. He has worked for the UK National Health Service in diagnostic genetics and in training healthcare scientists and clinicians in bioinformatics. Tom joined the EBI in 2012 and is responsible for the scientific development and delivery of training for the BioMedBridges pr...

  8. The ethics of biomedical big data

    CERN Document Server

    Mittelstadt, Brent Daniel

    2016-01-01

    This book presents cutting edge research on the new ethical challenges posed by biomedical Big Data technologies and practices. ‘Biomedical Big Data’ refers to the analysis of aggregated, very large datasets to improve medical knowledge and clinical care. The book describes the ethical problems posed by aggregation of biomedical datasets and re-use/re-purposing of data, in areas such as privacy, consent, professionalism, power relationships, and ethical governance of Big Data platforms. Approaches and methods are discussed that can be used to address these problems to achieve the appropriate balance between the social goods of biomedical Big Data research and the safety and privacy of individuals. Seventeen original contributions analyse the ethical, social and related policy implications of the analysis and curation of biomedical Big Data, written by leading experts in the areas of biomedical research, medical and technology ethics, privacy, governance and data protection. The book advances our understan...

  9. Ethics and Epistemology in Big Data Research.

    Science.gov (United States)

    Lipworth, Wendy; Mason, Paul H; Kerridge, Ian; Ioannidis, John P A

    2017-03-20

    Biomedical innovation and translation are increasingly emphasizing research using "big data." The hope is that big data methods will both speed up research and make its results more applicable to "real-world" patients and health services. While big data research has been embraced by scientists, politicians, industry, and the public, numerous ethical, organizational, and technical/methodological concerns have also been raised. With respect to technical and methodological concerns, there is a view that these will be resolved through sophisticated information technologies, predictive algorithms, and data analysis techniques. While such advances will likely go some way towards resolving technical and methodological issues, we believe that the epistemological issues raised by big data research have important ethical implications and raise questions about the very possibility of big data research achieving its goals.

  10. 2nd INNS Conference on Big Data

    CERN Document Server

    Manolopoulos, Yannis; Iliadis, Lazaros; Roy, Asim; Vellasco, Marley

    2017-01-01

    The book offers a timely snapshot of neural network technologies as a significant component of big data analytics platforms. It promotes new advances and research directions in efficient and innovative algorithmic approaches to analyzing big data (e.g. deep networks, nature-inspired and brain-inspired algorithms); implementations on different computing platforms (e.g. neuromorphic, graphics processing units (GPUs), clouds, clusters); and big data analytics applications to solve real-world problems (e.g. weather prediction, transportation, energy management). The book, which reports on the second edition of the INNS Conference on Big Data, held on October 23–25, 2016, in Thessaloniki, Greece, depicts an interesting collaborative adventure of neural networks with big data and other learning technologies.

  11. Review Study of Mining Big Data

    Directory of Open Access Journals (Sweden)

    Mohammad Misagh Javaherian

    2016-06-01

    Full Text Available Big data is a term for the collection of extensive and complex data sets that include both structured and unstructured information. Data can come from everywhere: sensors collecting environmental data, online networking sites, computer images and recordings, and so on. This information is known as big data. Valuable data can be extracted from big data using data mining, a method for finding interesting patterns and logical models of information at large scale. This article shows the types of big data and the future problems of extensive information as a chart. Issues in the data-centered model, in addition to big data, are also analyzed.

  12. Facilitated Communication in Mainstream Schools.

    Science.gov (United States)

    Remington-Gurney, Jane; Crossley, Rosemary

    Facilitated communication is described as a method of training communication partners or facilitators to provide physical assistance to communication aid users, to help them overcome physical and emotional problems in using their aids. In Melbourne (Victoria, Australia), the DEAL (Dignity, Education and Language) Centre has identified 96 people…

  13. Big Data Initiatives for Agroecosystems

    Science.gov (United States)

    NAL has developed a workspace for research groups associated with the i5k initiative, which aims to sequence the genomes of all insect species known to be important to worldwide agriculture, food safety, medicine, and energy production; all those used as models in biology; the most abundant in worl...

  14. Using Globus GridFTP to Transfer and Share Big Data | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer, and Mark Wance, Guest Writer; photo by Richard Frederickson, Staff Photographer Transferring big data, such as the genomics data delivered to customers from the Center for Cancer Research Sequencing Facility (CCR SF), has been difficult in the past because the transfer systems have not kept pace with the size of the data. However, the situation is changing as a result of the Globus GridFTP project.

  15. Advancements in Big Data Processing

    CERN Document Server

    Vaniachine, A; The ATLAS collaboration

    2012-01-01

    The ever-increasing volumes of scientific data present new challenges for Distributed Computing and Grid-technologies. The emerging Big Data revolution drives new discoveries in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. New initiatives are transforming data-driven scientific fields by pushing Big Data limits, enabling massive data analysis in new ways. In petascale data processing scientists deal with datasets, not individual files. As a result, a task (comprised of many jobs) became a unit of petascale data processing on the Grid. Splitting of a large data processing task into jobs enabled fine-granularity checkpointing analogous to the splitting of a large file into smaller TCP/IP packets during data transfers. Transferring large data in small packets achieves reliability through automatic re-sending of the dropped TCP/IP packets. Similarly, transient job failures on the Grid can be recovered by automatic re-tries to achieve reliable Six Sigma produc...
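
The task/job split with automatic retries described in this record can be illustrated with a minimal sketch. It is not the ATLAS production system: the function names `run_with_retries` and `run_task`, the use of `RuntimeError` to signal a transient failure, and the retry limit are all assumptions made for illustration.

```python
def run_with_retries(job, max_attempts=3):
    """Run job(); on a transient failure, retry up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except RuntimeError:
            # Assumption: RuntimeError stands in for a transient failure.
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt


def run_task(inputs, process, chunk_size, max_attempts=3):
    """Split one large task into jobs and run each job with retries.

    Each job covers chunk_size inputs, so a failure only repeats
    that one chunk rather than the whole task.
    """
    jobs = [inputs[i:i + chunk_size] for i in range(0, len(inputs), chunk_size)]
    results = []
    for chunk in jobs:
        # Bind chunk as a default argument so each job keeps its own data.
        results.extend(run_with_retries(lambda c=chunk: process(c), max_attempts))
    return results
```

Because each job is small, a retry repeats only a fraction of the work, which is the same reasoning the abstract applies to re-sending dropped TCP/IP packets.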

  16. Evidence of the Big Fix

    CERN Document Server

    Hamada, Yuta; Kawana, Kiyoharu

    2014-01-01

    We give evidence of the Big Fix. The theory of wormholes and multiverse suggests that the parameters of the Standard Model are fixed in such a way that the total entropy at the late stage of the universe is maximized, which we call the maximum entropy principle. In this paper, we discuss how it can be confirmed by the experimental data, and we show that it is indeed true for the Higgs vacuum expectation value $v_{h}$. We assume that the baryon number is produced by the sphaleron process, and that the current quark masses, the gauge couplings and the Higgs self coupling are fixed when we vary $v_{h}$. It turns out that the existence of the atomic nuclei plays a crucial role to maximize the entropy. This is reminiscent of the anthropic principle; however, it is required by the fundamental law in our case.

  17. Evidence of the big fix

    Science.gov (United States)

    Hamada, Yuta; Kawai, Hikaru; Kawana, Kiyoharu

    2014-06-01

    We give evidence of the Big Fix. The theory of wormholes and multiverse suggests that the parameters of the Standard Model are fixed in such a way that the total entropy at the late stage of the universe is maximized, which we call the maximum entropy principle. In this paper, we discuss how it can be confirmed by the experimental data, and we show that it is indeed true for the Higgs vacuum expectation value vh. We assume that the baryon number is produced by the sphaleron process, and that the current quark masses, the gauge couplings and the Higgs self-coupling are fixed when we vary vh. It turns out that the existence of the atomic nuclei plays a crucial role to maximize the entropy. This is reminiscent of the anthropic principle; however, it is required by the fundamental law in our case.

  18. Big Book of Apple Hacks

    CERN Document Server

    Seibold, Chris

    2008-01-01

    Bigger in size, longer in length, broader in scope, and even more useful than our original Mac OS X Hacks, the new Big Book of Apple Hacks offers a grab bag of tips, tricks and hacks to get the most out of Mac OS X Leopard, as well as the new line of iPods, iPhone, and Apple TV. With 125 entirely new hacks presented in step-by-step fashion, this practical book is for serious Apple computer and gadget users who really want to take control of these systems. Many of the hacks take you under the hood and show you how to tweak system preferences, alter or add keyboard shortcuts, mount drives and

  19. Was the Big Bang hot?

    Science.gov (United States)

    Wright, E. L.

    1983-01-01

    Techniques for verifying the spectrum defined by Woody and Richards (WR, 1981), which serves as a base for dust-distorted models of the 3 K background, are discussed. WR detected a sharp deviation from the Planck curve in the 3 K background. The absolute intensity of the background may be determined by the frequency dependence of the dipole anisotropy of the background or the frequency dependence effect in galactic clusters. Both methods involve the Doppler shift; analytical formulae are defined for characterization of the dipole anisotropy. The measurement of the 30-300 GHz spectra of cold galactic dust may reveal the presence of significant amounts of needle-shaped grains, which would in turn support a theory of a cold Big Bang.

  20. Big ideas for psychotherapy training.

    Science.gov (United States)

    Fauth, James; Gates, Sarah; Vinca, Maria Ann; Boles, Shawna; Hayes, Jeffrey A

    2007-12-01

    Research indicates that traditional psychotherapy training practices are ineffective in durably improving the effectiveness of psychotherapists. In addition, the quantity and quality of psychotherapy training research has also been limited in several ways. Thus, based on extant scholarship and personal experience, we offer several suggestions for improving on this state of affairs. Specifically, we propose that future psychotherapy trainings focus on a few "big ideas," target psychotherapist meta-cognitive skills, and attend more closely to the organizational/treatment context in which the training takes place. In terms of future training research, we recommend that researchers include a wider range of intermediate outcomes in their studies, examine the nature of trainee skill development, and investigate the role that organizational/treatment culture plays in terms of the retention of changes elicited by psychotherapy training. (PsycINFO Database Record (c) 2010 APA, all rights reserved).

  1. Design and development of a medical big data processing system based on Hadoop.

    Science.gov (United States)

    Yao, Qin; Tian, Yu; Li, Peng-Fei; Tian, Li-Li; Qian, Yang-Ming; Li, Jing-Song

    2015-03-01

    Secondary use of medical big data is increasingly popular in healthcare services and clinical research. Understanding the logic behind medical big data demonstrates tendencies in hospital information technology and shows great significance for hospital information systems that are designing and expanding services. Big data has four characteristics--Volume, Variety, Velocity and Value (the 4 Vs)--that make traditional systems incapable of processing these data using standalones. Apache Hadoop MapReduce is a promising software framework for developing applications that process vast amounts of data in parallel with large clusters of commodity hardware in a reliable, fault-tolerant manner. With the Hadoop framework and MapReduce application program interface (API), we can more easily develop our own MapReduce applications to run on a Hadoop framework that can scale up from a single node to thousands of machines. This paper investigates a practical case of a Hadoop-based medical big data processing system. We developed this system to intelligently process medical big data and uncover some features of hospital information system user behaviors. This paper studies user behaviors regarding various data produced by different hospital information systems for daily work. In this paper, we also built a five-node Hadoop cluster to execute distributed MapReduce algorithms. Our distributed algorithms show promise in facilitating efficient data processing with medical big data in healthcare services and clinical research compared with single nodes. Additionally, with medical big data analytics, we can design our hospital information systems to be much more intelligent and easier to use by making personalized recommendations.
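
The MapReduce pattern this record relies on can be sketched in a few lines. The example below is an in-process illustration only and does not use the Hadoop API; the record layout `(user, action)`, the function names, and the sample action labels are hypothetical, chosen to mirror the abstract's interest in hospital information system user behaviors.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def count_actions_mapper(record):
    # Hypothetical log record: (user, action). Emit one count per event.
    user, action = record
    yield (user, action), 1


def sum_reducer(key, values):
    return sum(values)


def map_reduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)
```

On a real cluster the map and reduce calls would run on different nodes and the shuffle would move data across the network; here all three phases run in one process so the data flow is easy to follow.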

  2. Microsystems - The next big thing

    Energy Technology Data Exchange (ETDEWEB)

    STINNETT,REGAN W.

    2000-05-11

    Micro-Electro-Mechanical Systems (MEMS) is a big name for tiny devices that will soon make big changes in everyday life and the workplace. These and other types of Microsystems range in size from a few millimeters to a few microns, much smaller than a human hair. These Microsystems have the capability to enable new ways to solve problems in commercial applications ranging from automotive, aerospace, telecommunications, manufacturing equipment, medical diagnostics to robotics, and in national security applications such as nuclear weapons safety and security, battlefield intelligence, and protection against chemical and biological weapons. This broad range of applications of Microsystems reflects the broad capabilities of future Microsystems to provide the ability to sense, think, act, and communicate, all in a single integrated package. Microsystems have been called the next silicon revolution, but like many revolutions, they incorporate more elements than their predecessors. Microsystems do include MEMS components fabricated from polycrystalline silicon processed using techniques similar to those used in the manufacture of integrated electrical circuits. They also include optoelectronic components made from gallium arsenide and other semiconducting compounds from the III-V groups of the periodic table. Microsystems components are also being made from pure metals and metal alloys using the LIGA process, which utilizes lithography, etching, and casting at the micron scale. Generically, Microsystems are micron scale, integrated systems that have the potential to combine the ability to sense light, heat, pressure, acceleration, vibration, and chemicals with the ability to process the collected data using CMOS circuitry, execute an electrical, mechanical, or photonic response, and communicate either optically or with microwaves.

  3. Implementing Operational Analytics using Big Data Technologies to Detect and Predict Sensor Anomalies

    Science.gov (United States)

    Coughlin, J.; Mital, R.; Nittur, S.; SanNicolas, B.; Wolf, C.; Jusufi, R.

    2016-09-01

    Operational analytics when combined with Big Data technologies and predictive techniques have been shown to be valuable in detecting mission critical sensor anomalies that might be missed by conventional analytical techniques. Our approach helps analysts and leaders make informed and rapid decisions by analyzing large volumes of complex data in near real-time and presenting it in a manner that facilitates decision making. It provides cost savings by being able to alert and predict when sensor degradations pass a critical threshold and impact mission operations. Operational analytics, which uses Big Data tools and technologies, can process very large data sets containing a variety of data types to uncover hidden patterns, unknown correlations, and other relevant information. When combined with predictive techniques, it provides a mechanism to monitor and visualize these data sets and provide insight into degradations encountered in large sensor systems such as the space surveillance network. In this study, data from a notional sensor is simulated and we use big data technologies, predictive algorithms and operational analytics to process the data and predict sensor degradations. This study uses data products that would commonly be analyzed at a site. This study builds on a big data architecture that has previously been proven valuable in detecting anomalies. This paper outlines our methodology of implementing an operational analytic solution through data discovery, learning and training of data modeling and predictive techniques, and deployment. Through this methodology, we implement a functional architecture focused on exploring available big data sets and determine practical analytic, visualization, and predictive technologies.
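
As a rough illustration of predicting when a sensor degradation passes a critical threshold, the sketch below fits a straight line to a health metric and extrapolates the crossing time. The abstract does not name its predictive algorithm, so ordinary least squares is a stand-in chosen here for simplicity; `fit_line`, `predict_crossing`, and the sample values are assumptions.

```python
def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def predict_crossing(times, values, threshold):
    """Estimate when a declining metric reaches the threshold.

    Returns None when the fitted trend is flat or improving, since
    no crossing can then be extrapolated from the linear model.
    """
    slope, intercept = fit_line(times, values)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope
```

A linear trend is only adequate for slow, steady degradation; the large and varied data sets described in the abstract would likely call for richer models.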

  4. Big Data, data integrity, and the fracturing of the control zone

    Directory of Open Access Journals (Sweden)

    Carl Lagoze

    2014-11-01

    Full Text Available Despite all the attention to Big Data and the claims that it represents a “paradigm shift” in science, we lack understanding about what are the qualities of Big Data that may contribute to this revolutionary impact. In this paper, we look beyond the quantitative aspects of Big Data (i.e. lots of data) and examine it from a sociotechnical perspective. We argue that a key factor that distinguishes “Big Data” from “lots of data” lies in changes to the traditional, well-established “control zones” that facilitated clear provenance of scientific data, thereby ensuring data integrity and providing the foundation for credible science. The breakdown of these control zones is a consequence of the manner in which our network technology and culture enable and encourage open, anonymous sharing of information, participation regardless of expertise, and collaboration across geographic, disciplinary, and institutional barriers. We are left with the conundrum—how to reap the benefits of Big Data while re-creating a trust fabric and an accountable chain of responsibility that make credible science possible.

  5. Improving public transport decision making, planning and operations by using big data: Cases from Sweden and the Netherlands

    NARCIS (Netherlands)

    Van Oort, N.; Cats, O.

    2015-01-01

    New big data sources in the public transport industry make it possible to address major challenges such as improving efficiency, increasing passenger ridership and satisfaction, and facilitating the information flow between service providers and service users. This paper presents two actual cases from the Ne

  6. The evolution of the Anopheles 16 genomes project

    NARCIS (Netherlands)

    Neafsey, Daniel E.; Christophides, George K.; Collins, Frank H.; Emrich, Scott J.; Fontaine, Michael C.; Gelbart, William; Hahn, Matthew W.; Howell, Paul I.; Kafatos, Fotis C.; Lawson, Daniel; Muskavitch, Marc A. T.; Waterhouse, Robert M.; Williams, Louise J.; Besansky, Nora J.

    2013-01-01

    We report the imminent completion of a set of reference genome assemblies for 16 species of Anopheles mosquitoes. In addition to providing a generally useful resource for comparative genomic analyses, these genome sequences will greatly facilitate exploration of the capacity exhibited by some Anophe

  7. What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets

    Directory of Open Access Journals (Sweden)

    Rob Kitchin

    2016-02-01

    Full Text Available Big Data has been variously defined in the literature. In the main, definitions suggest that Big Data possess a suite of key traits: volume, velocity and variety (the 3Vs), but also exhaustivity, resolution, indexicality, relationality, extensionality and scalability. However, these definitions lack ontological clarity, with the term acting as an amorphous, catch-all label for a wide selection of data. In this paper, we consider the question ‘what makes Big Data, Big Data?’, applying Kitchin’s taxonomy of seven Big Data traits to 26 datasets drawn from seven domains, each of which is considered in the literature to constitute Big Data. The results demonstrate that only a handful of datasets possess all seven traits, and some do not possess either volume and/or variety. Instead, there are multiple forms of Big Data. Our analysis reveals that the key definitional boundary markers are the traits of velocity and exhaustivity. We contend that Big Data as an analytical category needs to be unpacked, with the genus of Big Data further delineated and its various species identified. It is only through such ontological work that we will gain conceptual clarity about what constitutes Big Data, formulate how best to make sense of it, and identify how it might be best used to make sense of the world.

  8. [Big data in medicine and healthcare].

    Science.gov (United States)

    Rüping, Stefan

    2015-08-01

    Healthcare is one of the business fields with the highest Big Data potential. According to the prevailing definition, Big Data refers to the fact that data today is often too large and heterogeneous and changes too quickly to be stored, processed, and transformed into value by previous technologies. The technological trends drive Big Data: business processes are more and more executed electronically, consumers produce more and more data themselves - e.g. in social networks - and finally ever increasing digitalization. Currently, several new trends towards new data sources and innovative data analysis appear in medicine and healthcare. From the research perspective, omics-research is one clear Big Data topic. In practice, the electronic health records, free open data and the "quantified self" offer new perspectives for data analytics. Regarding analytics, significant advances have been made in the information extraction from text data, which unlocks a lot of data from clinical documentation for analytics purposes. At the same time, medicine and healthcare is lagging behind in the adoption of Big Data approaches. This can be traced to particular problems regarding data complexity and organizational, legal, and ethical challenges. The growing uptake of Big Data in general and first best-practice examples in medicine and healthcare in particular, indicate that innovative solutions will be coming. This paper gives an overview of the potentials of Big Data in medicine and healthcare.

  9. Genome editing in cardiovascular diseases.

    Science.gov (United States)

    Strong, Alanna; Musunuru, Kiran

    2017-01-01

    Genome-editing tools, which include zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) systems, have emerged as an invaluable technology to achieve somatic and germline genomic manipulation in cells and model organisms for multiple applications, including the creation of knockout alleles, introducing desired mutations into genomic DNA, and inserting novel transgenes. Genome editing is being rapidly adopted into all fields of biomedical research, including the cardiovascular field, where it has facilitated a greater understanding of lipid metabolism, electrophysiology, cardiomyopathies, and other cardiovascular disorders, has helped to create a wider variety of cellular and animal models, and has opened the door to a new class of therapies. In this Review, we discuss the applications of genome-editing technology throughout cardiovascular disease research and the prospect of in vivo genome-editing therapies in the future. We also describe some of the existing limitations of genome-editing tools that will need to be addressed if cardiovascular genome editing is to achieve its full scientific and therapeutic potential.

  10. Big data as governmentality in international development

    DEFF Research Database (Denmark)

    Flyverbom, Mikkel; Madsen, Anders Koed; Rasche, Andreas

    2017-01-01

    Statistics have long shaped the field of visibility for the governance of development projects. The introduction of big data has altered the field of visibility. Employing Dean's “analytics of government” framework, we analyze two cases—malaria tracking in Kenya and monitoring of food prices...... in Indonesia. Our analysis shows that big data introduces a bias toward particular types of visualizations. What problems are being made visible through big data depends to some degree on how the underlying data is visualized and who is captured in the visualizations. It is also influenced by technical factors...

  11. Processing Solutions for Big Data in Astronomy

    Science.gov (United States)

    Fillatre, L.; Lepiller, D.

    2016-09-01

    This paper gives a simple introduction to processing solutions applied to massive amounts of data. It proposes a general presentation of the Big Data paradigm. The Hadoop framework, which is considered as the pioneering processing solution for Big Data, is described together with YARN, the integrated Hadoop tool for resource allocation. This paper also presents the main tools for the management of both the storage (NoSQL solutions) and computing capacities (MapReduce parallel processing schema) of a cluster of machines. Finally, more recent processing solutions like Spark are discussed. Big Data frameworks are now able to run complex applications while keeping the programming simple and greatly improving the computing speed.

  12. Big data governance an emerging imperative

    CERN Document Server

    Soares, Sunil

    2012-01-01

    Written by a leading expert in the field, this guide focuses on the convergence of two major trends in information management-big data and information governance-by taking a strategic approach oriented around business cases and industry imperatives. With the advent of new technologies, enterprises are expanding and handling very large volumes of data; this book, nontechnical in nature and geared toward business audiences, encourages the practice of establishing appropriate governance over big data initiatives and addresses how to manage and govern big data, highlighting the relevant processes,

  13. BLENDING IOT AND BIG DATA ANALYTICS

    OpenAIRE

    Tulasi.B*; Girish J Vemulkar

    2016-01-01

    Internet is continuously evolving and changing. Internet of Things (IoT) can be considered as the future of Internet applications which involves machine to machine learning (M2M). The actionable intelligence can be derived through fusion of Big Data and real time analytics with IoT. Big Data and IoT can be viewed as two sides of a coin. With the connection between Big Data and the objects on Internet benefits of IoT can be easily reaped. The applications of IoT spread across various domains l...

  14. Big data and the electronic health record.

    Science.gov (United States)

    Peters, Steve G; Buntrock, James D

    2014-01-01

    The electronic medical record has evolved from a digital representation of individual patient results and documents to information of large scale and complexity. Big Data refers to new technologies providing management and processing capabilities, targeting massive and disparate data sets. For an individual patient, techniques such as Natural Language Processing allow the integration and analysis of textual reports with structured results. For groups of patients, Big Data offers the promise of large-scale analysis of outcomes, patterns, temporal trends, and correlations. The evolution of Big Data analytics moves us from description and reporting to forecasting, predictive modeling, and decision optimization.

  15. Big Data and historical social science

    Directory of Open Access Journals (Sweden)

    Peter Bearman

    2015-11-01

    Full Text Available “Big Data” can revolutionize historical social science if it arises from substantively important contexts and is oriented towards answering substantively important questions. Such data may be especially important for answering previously largely intractable questions about the timing and sequencing of events, and of event boundaries. That said, “Big Data” makes no difference for social scientists and historians whose accounts rest on narrative sentences. Since such accounts are the norm, the effects of Big Data on the practice of historical social science may be more limited than one might wish.

  16. A Big Bang model of human colorectal tumor growth.

    Science.gov (United States)

    Sottoriva, Andrea; Kang, Haeyoun; Ma, Zhicheng; Graham, Trevor A; Salomon, Matthew P; Zhao, Junsong; Marjoram, Paul; Siegmund, Kimberly; Press, Michael F; Shibata, Darryl; Curtis, Christina

    2015-03-01

    What happens in early, still undetectable human malignancies is unknown because direct observations are impractical. Here we present and validate a 'Big Bang' model, whereby tumors grow predominantly as a single expansion producing numerous intermixed subclones that are not subject to stringent selection and where both public (clonal) and most detectable private (subclonal) alterations arise early during growth. Genomic profiling of 349 individual glands from 15 colorectal tumors showed an absence of selective sweeps, uniformly high intratumoral heterogeneity (ITH) and subclone mixing in distant regions, as postulated by our model. We also verified the prediction that most detectable ITH originates from early private alterations and not from later clonal expansions, thus exposing the profile of the primordial tumor. Moreover, some tumors appear 'born to be bad', with subclone mixing indicative of early malignant potential. This new model provides a quantitative framework to interpret tumor growth dynamics and the origins of ITH, with important clinical implications.

  17. Big and complex data analysis methodologies and applications

    CERN Document Server

    2017-01-01

    This volume conveys some of the surprises, puzzles and success stories in high-dimensional and complex data analysis and related fields. Its peer-reviewed contributions showcase recent advances in variable selection, estimation and prediction strategies for a host of useful models, as well as essential new developments in the field. The continued and rapid advancement of modern technology now allows scientists to collect data of increasingly unprecedented size and complexity. Examples include epigenomic data, genomic data, proteomic data, high-resolution image data, high-frequency financial data, functional and longitudinal data, and network data. Simultaneous variable selection and estimation is one of the key statistical problems involved in analyzing such big and complex data. The purpose of this book is to stimulate research and foster interaction between researchers in the area of high-dimensional data analysis. More concretely, its goals are to: 1) highlight and expand the breadth of existing methods in...

  18. Big questions, big science: meeting the challenges of global ecology.

    Science.gov (United States)

    Schimel, David; Keller, Michael

    2015-04-01

    Ecologists are increasingly tackling questions that require significant infrastructure, large experiments, networks of observations, and complex data and computation. Key hypotheses in ecology increasingly require more investment and larger data sets than can be collected by a single investigator's lab or a group of investigators' labs, sustained for longer than a typical grant. Large-scale projects are expensive, so their scientific return on the investment has to justify the opportunity cost: the science foregone because resources were expended on a large project rather than supporting a number of individual projects. In addition, their management must be accountable and efficient in the use of significant resources, requiring the use of formal systems engineering and project management to mitigate risk of failure. Mapping the scientific method into formal project management requires both scientists able to work in the context, and a project implementation team sensitive to the unique requirements of ecology. Sponsoring agencies, under pressure from external and internal forces, experience many pressures that push them towards counterproductive project management, but a scientific community aware and experienced in large project science can mitigate these tendencies. For big ecology to result in great science, ecologists must become informed, aware and engaged in the advocacy and governance of large ecological projects.

  19. Secure Genomic Computation through Site-Wise Encryption.

    Science.gov (United States)

    Zhao, Yongan; Wang, XiaoFeng; Tang, Haixu

    2015-01-01

    Commercial clouds provide on-demand IT services for big-data analysis, which have become an attractive option for users who have no access to comparable infrastructure. However, utilizing these services for human genome analysis is highly risky, as human genomic data contains identifiable information of human individuals and their disease susceptibility. Therefore, currently, no computation on personal human genomic data is conducted on public clouds. To address this issue, here we present a site-wise encryption approach to encrypt whole human genome sequences, which can be subject to secure searching of genomic signatures on public clouds. We implemented this method within the Hadoop framework, and tested it on the case of searching disease markers retrieved from the ClinVar database against patients' genomic sequences. The secure search runs only one order of magnitude slower than the simple search without encryption, indicating our method is ready to be used for secure genomic computation on public clouds.

  20. Soil biogeochemistry in the age of big data

    Science.gov (United States)

    Cécillon, Lauric; Barré, Pierre; Coissac, Eric; Plante, Alain; Rasse, Daniel

    2015-04-01

    Data is becoming one of the key resources of the 21st century. Soil biogeochemistry is not spared by this new movement. The conservation of soils and their services recently came into the political agenda. However, clear knowledge on the links between soil characteristics and the various processes ensuring the provision of soil services is rare at the molecular or the plot scale, and does not exist at the landscape scale. This split between society's expectations on its natural capital, and scientific knowledge on the most complex material on earth has led to an increasing number of studies on soils, using an increasing number of techniques of increasing complexity, with an increasing spatial and temporal coverage. From data scarcity with a basic data management system, soil biogeochemistry is now facing a proliferation of data, with few quality controls from data collection to publication and few skills to deal with them. Based on this observation, here we (1) address how big data could help in making sense of all these soil biogeochemical data, (2) point out several shortcomings of big data that most biogeochemists will experience in their future career. Massive storage of data is now common and recent opportunities for cloud storage enable data sharing among researchers all over the world. The need for integrative and collaborative computational databases in soil biogeochemistry is emerging through pioneering initiatives in this direction (molTERdb; earthcube), following soil microbiologists (GenBank). We expect that a series of data storage and management systems will rapidly revolutionize the way of accessing raw biogeochemical data, published or not. Data mining techniques combined with cluster or cloud computing hold significant promise for facilitating the use of complex analytical methods, and for revealing new insights previously hidden in complex data on soil mineralogy, organic matter and biodiversity. Indeed, important scientific advances have

  1. BIG SKY CARBON SEQUESTRATION PARTNERSHIP

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2004-06-01

    The Big Sky Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts during the second performance period fall into four areas: evaluation of sources and carbon sequestration sinks; development of GIS-based reporting framework; designing an integrated suite of monitoring, measuring, and verification technologies; and initiating a comprehensive education and outreach program. At the first two Partnership meetings the groundwork was put in place to provide an assessment of capture and storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other western DOE partnerships. Efforts are also being made to find funding to include Wyoming in the coverage areas for both geological and terrestrial sinks and sources. The Partnership recognizes the critical importance of measurement, monitoring, and verification technologies to support not only carbon trading but all policies and programs that DOE and other agencies may want to pursue in support of GHG mitigation. The efforts begun in developing and implementing MMV technologies for geological sequestration reflect this concern. Research is also underway to identify and validate best management practices for

  2. Distributed and Big Data Storage Management in Grid Computing

    Directory of Open Access Journals (Sweden)

    Ajay Kumar

    2012-07-01

    Full Text Available Big data storage management is one of the most challenging issues for Grid computing environments, since a large number of data-intensive applications frequently involve a high degree of data access locality. Grid applications typically deal with large amounts of data. In traditional approaches, high-performance computing consists of dedicated servers that are used for data storage and data replication. In this paper we present a new mechanism for distributed and big data storage and resource discovery services. We propose an architecture named Dynamic and Scalable Storage Management (DSSM) for grid environments. It allows grid computing to share not only computational cycles but also storage space. The storage can be transparently accessed from any grid machine, allowing easy data sharing among grid users and applications. The concept of virtual ids, which allows the creation of virtual spaces, has been introduced and used. The DSSM divides all Grid Oriented Storage devices (nodes) into multiple geographically distributed domains to facilitate locality and simplify intra-domain storage management. Grid service based storage resources are adopted to stack simple modular services piece by piece as demand grows. To this end, we propose four axes: (1) a description of the DSSM architecture and algorithms; (2) storage resources and resource discovery as Grid services; (3) evaluation of a prototype system for dynamism, scalability, and bandwidth; and (4) discussion of the results. Lower- and upper-level algorithms for standardized dynamic and scalable storage management, along with higher bandwidth, have been designed.

  3. Big Bang–Big Crunch Optimization Algorithm for Linear Phase FIR Digital Filter Design

    Directory of Open Access Journals (Sweden)

    Ms. Rashmi Singh; Dr. H. K. Verma

    2012-02-01

    Full Text Available The Big Bang–Big Crunch (BB–BC) optimization algorithm is a new optimization method that relies on the Big Bang and Big Crunch theory, one of the theories of the evolution of the universe. In this paper, a Big Bang–Big Crunch algorithm is used for the design of linear phase finite impulse response (FIR) filters. The fitness function used in the experiments is based on the mean squared error between the actual and the ideal filter response. This paper presents plots of the magnitude response of the FIR filters and an error graph. The BB–BC algorithm seems to be a promising tool for FIR filter design, especially in a dynamic environment where filter coefficients have to be adapted and fast convergence is of importance.
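
    The procedure summarized in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the low-pass target, the frequency grid, the population size, the 1/k shrinking spread, the coefficient bounds, and every function name are assumptions made for the example. Only the overall idea (scatter candidates, contract them to a fitness-weighted center of mass, and score each candidate by the mean squared error against the ideal response) comes from the text.

```python
import math
import random


def expand_symmetric(half, num_taps):
    """Mirror the free coefficients so that h[n] == h[N-1-n] (linear phase)."""
    return list(half) + list(half[: num_taps // 2][::-1])


def magnitude_response(h, omegas):
    """|H(e^{jw})| for H(e^{jw}) = sum_n h[n] * exp(-j*w*n)."""
    mags = []
    for w in omegas:
        re = sum(c * math.cos(w * n) for n, c in enumerate(h))
        im = -sum(c * math.sin(w * n) for n, c in enumerate(h))
        mags.append(math.hypot(re, im))
    return mags


def mse_fitness(half, num_taps, omegas, ideal):
    """Mean squared error between actual and ideal magnitude responses."""
    actual = magnitude_response(expand_symmetric(half, num_taps), omegas)
    return sum((a - d) ** 2 for a, d in zip(actual, ideal)) / len(omegas)


def bb_bc(num_taps=11, cutoff=0.4 * math.pi, pop_size=40, iterations=60, seed=0):
    """Return (coefficients, error) of the best candidate evaluated."""
    rng = random.Random(seed)
    dim = (num_taps + 1) // 2              # number of free coefficients
    omegas = [math.pi * k / 31 for k in range(32)]
    ideal = [1.0 if w <= cutoff else 0.0 for w in omegas]  # assumed low-pass
    lo, hi = -1.0, 1.0                     # assumed search bounds

    # Initial Big Bang: scatter candidates uniformly over the search space.
    population = [[rng.uniform(lo, hi) for _ in range(dim)]
                  for _ in range(pop_size)]
    best, best_fit = None, float("inf")

    for k in range(1, iterations + 1):
        fits = [mse_fitness(c, num_taps, omegas, ideal) for c in population]
        for cand, fit in zip(population, fits):
            if fit < best_fit:
                best, best_fit = list(cand), fit

        # Big Crunch: center of mass, each candidate weighted by 1/fitness.
        weights = [1.0 / (f + 1e-12) for f in fits]
        total = sum(weights)
        center = [sum(w * c[i] for w, c in zip(weights, population)) / total
                  for i in range(dim)]

        # Next Big Bang: scatter around the center with a spread that shrinks as 1/k.
        spread = (hi - lo) / k
        population = [[min(hi, max(lo, center[i] + spread * rng.gauss(0.0, 1.0)))
                       for i in range(dim)]
                      for _ in range(pop_size)]

    return expand_symmetric(best, num_taps), best_fit


if __name__ == "__main__":
    taps, err = bb_bc()
    print(len(taps), "taps, mean squared error", err)
```

    Searching only half of the coefficients and mirroring them enforces the symmetry that gives a linear-phase filter, so the optimizer never has to learn that constraint.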

  4. BIG SKY CARBON SEQUESTRATION PARTNERSHIP

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2005-01-31

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership in Phase I fall into four areas: evaluation of sources and carbon sequestration sinks that will be used to determine the location of pilot demonstrations in Phase II; development of GIS-based reporting framework that links with national networks; designing an integrated suite of monitoring, measuring, and verification technologies and assessment frameworks; and initiating a comprehensive education and outreach program. The groundwork is in place to provide an assessment of storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research. Efforts are underway to showcase the architecture of the GIS framework and initial results for sources and sinks. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other western DOE partnerships. The Partnership recognizes the critical importance of measurement, monitoring, and verification technologies to support not only carbon trading but all policies and programs that DOE and other agencies may want to pursue in support of GHG mitigation. The efforts in developing and implementing MMV technologies for geological sequestration reflect this concern. Research is

  5. NOAA Big Data Partnership RFI

    Science.gov (United States)

    de la Beaujardiere, J.

    2014-12-01

    In February 2014, the US National Oceanic and Atmospheric Administration (NOAA) issued a Big Data Request for Information (RFI) from industry and other organizations (e.g., non-profits, research laboratories, and universities) to assess capability and interest in establishing partnerships to position a copy of NOAA's vast data holdings in the Cloud, co-located with easy and affordable access to analytical capabilities. This RFI was motivated by a number of concerns. First, NOAA's data facilities do not necessarily have sufficient network infrastructure to transmit all available observations and numerical model outputs to all potential users, or sufficient infrastructure to support simultaneous computation by many users. Second, the available data are distributed across multiple services and data facilities, making it difficult to find and integrate data for cross-domain analysis and decision-making. Third, large datasets require users to have substantial network, storage, and computing capabilities of their own in order to fully interact with and exploit the latent value of the data. Finally, there may be commercial opportunities for value-added products and services derived from our data. Putting a working copy of data in the Cloud outside of NOAA's internal networks and infrastructures should reduce demands and risks on our systems, and should enable users to interact with multiple datasets and create new lines of business (much like the industries built on government-furnished weather or GPS data). The NOAA Big Data RFI therefore solicited information on technical and business approaches regarding possible partnership(s) that -- at no net cost to the government and minimum impact on existing data facilities -- would unleash the commercial potential of its environmental observations and model outputs. NOAA would retain the master archival copy of its data. Commercial partners would not be permitted to charge fees for access to the NOAA data they receive, but

  6. Circos: an information aesthetic for comparative genomics.

    Science.gov (United States)

    Krzywinski, Martin; Schein, Jacqueline; Birol, Inanç; Connors, Joseph; Gascoyne, Randy; Horsman, Doug; Jones, Steven J; Marra, Marco A

    2009-09-01

    We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.
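
    To illustrate the circular ideogram idea described above, here is a small sketch of how genomic positions could be mapped to angles and how a ribbon's two ends and relative orientation could be derived. It is not Circos code and does not use the Circos configuration format: the function names, the gap parameter, and the convention that an interval with end < start is reversed are all assumptions made for this example.

```python
def build_layout(chrom_sizes, gap_deg=2.0):
    """Give each chromosome an angular span proportional to its length.

    Returns {name: (start_angle_deg, span_deg, size_bp)}.
    """
    total = sum(chrom_sizes.values())
    usable = 360.0 - gap_deg * len(chrom_sizes)
    layout, start = {}, 0.0
    for name, size in chrom_sizes.items():
        span = usable * size / total
        layout[name] = (start, span, size)
        start += span + gap_deg
    return layout


def to_angle(layout, chrom, pos):
    """Angle in degrees of a base-pair position on the circle."""
    start, span, size = layout[chrom]
    return start + span * pos / size


def ribbon(layout, a, b):
    """Describe a ribbon linking two intervals given as (chrom, start, end).

    The angular extent of each end encodes position and size; 'twisted'
    is True when the two intervals have opposite orientation.
    """
    (chrom_a, start_a, end_a), (chrom_b, start_b, end_b) = a, b
    return {
        "end_a": (to_angle(layout, chrom_a, start_a), to_angle(layout, chrom_a, end_a)),
        "end_b": (to_angle(layout, chrom_b, start_b), to_angle(layout, chrom_b, end_b)),
        "twisted": (end_a < start_a) != (end_b < start_b),
    }


if __name__ == "__main__":
    layout = build_layout({"chr1": 100, "chr2": 100}, gap_deg=0.0)
    print(ribbon(layout, ("chr1", 0, 50), ("chr2", 100, 50)))
```

    Because a ribbon is fully described by two angular ranges and one orientation flag, one pairwise relationship between intervals needs only a single compact record.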

  7. 6 Top Tools for Taming Big Data

    Institute of Scientific and Technical Information of China (English)

    Jakob Björklund

    2012-01-01

    The industry now has a buzzword, "big data," for how we're going to do something with the huge amount of information piling up. "Big data" is replacing "business intelligence," which subsumed "reporting," which put a nicer gloss on "spreadsheets," which beat out the old-fashioned "printouts." Managers who long ago studied printouts are now hiring mathematicians who claim to be big data specialists to help them solve the same old problem: What's selling and why?

  8. "Big Data" : big gaps of knowledge in the field of internet science

    OpenAIRE

    Snijders, CCP Chris; Matzat, U Uwe; Reips, UD

    2012-01-01

    Research on so-called 'Big Data' has received a considerable momentum and is expected to grow in the future. One very interesting stream of research on Big Data analyzes online networks. Many online networks are known to have some typical macro-characteristics, such as 'small world' properties. Much less is known about underlying micro-processes leading to these properties. The models used by Big Data researchers usually are inspired by mathematical ease of exposition. We propose to follow in...

  9. BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

    OpenAIRE

    Ming, Zijian; Luo, Chunjie; Gao, Wanling; Han, Rui; Yang, Qiang; Wang, Lei; Zhan, Jianfeng

    2014-01-01

    Data generation is a key issue in big data benchmarking that aims to generate application-specific data sets to meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different types (Variety) under controllable generation rates (Velocity) while keeping the important characteristics of raw data (Veracity). This gives rise to various new challenges about how we design generators efficiently and successfully. To date, most existing tec...

  10. RIKEN mouse genome encyclopedia.

    Science.gov (United States)

    Hayashizaki, Yoshihide

    2003-01-01

    We have been working to establish a comprehensive mouse full-length cDNA collection and sequence database covering as many genes as we can, named the RIKEN mouse genome encyclopedia. Recently we have been constructing higher-level annotation (Functional ANnoTation Of Mouse cDNA; FANTOM) based not only on homology searches but also on expression profiles, mapping information and protein-protein databases. More than 1,000,000 clones prepared from 163 tissues were end-sequenced and classified into 159,789 clusters, and 60,770 representative clones were fully sequenced. In conclusion, the 60,770 sequences contained 33,409 unique. The next generation of life science is clearly based on all of the genome information and resources. Based on our cDNA clones we developed additional systems to explore gene function: a cDNA microarray system to print all of these cDNA clones, a protein-protein interaction screening system, a protein-DNA interaction screening system and so on. The integrated database of all the information is very useful not only for analysis of gene transcriptional networks but also for connecting genes to phenotypes to facilitate the positional candidate approach. In this talk, the prospects for applying these genome resources will be discussed. More information is available at the web page: http://genome.gsc.riken.go.jp/.

  11. Poster: the macaque genome.

    Science.gov (United States)

    2007-04-13

    The rhesus macaque (Macaca mulatta) facilitates an extraordinary range of biomedical and basic research, and the publication of the genome only makes it a more powerful model for studies of human disease; moreover, the macaque's position relative to humans and chimpanzees affords the opportunity to learn about the processes that have shaped the last 25 million years of primate evolution. To allow users to explore these themes of the macaque genome, Science has created a special interactive version of the poster published in the print edition of the 13 April 2007 issue. The interactive version includes additional text and exploration, as well as embedded video featuring seven scientists discussing the importance of the macaque and its genome sequence in studies of biomedicine and evolution. We have also created an accompanying teaching resource, including a lesson plan aimed at teachers of advanced high school life science students, for exploring what a comparison of the macaque and human genomes can tell us about human biology and evolution. These items are free to all site visitors.

  12. 76 FR 7837 - Big Rivers Electric Corporation; Notice of Filing

    Science.gov (United States)

    2011-02-11

    ... December 1, 2010, the date that Big Rivers integrated its transmission facilities with the Midwest... Energy Regulatory Commission Big Rivers Electric Corporation; Notice of Filing Take notice that on February 4, 2011, Big Rivers Electric Corporation (Big Rivers) filed a notice of cancellation of its...

  13. The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

    Science.gov (United States)

    Papanicolaou, Alexie

    2016-01-01

    Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.

  14. «Sochi Goes International» – Required Actions to be Taken to Facilitate the Stay for International Tourists

    Directory of Open Access Journals (Sweden)

    Eduard Besel

    2012-09-01

    Full Text Available As host city of many big international sport events, Sochi's publicity is increasing and people from all over the world will discover it as a new travel destination. To guarantee and enlarge the attractiveness and popularity of the city, some essential improvements, which would facilitate the stay for international tourists, are recommended.

  15. Soft computing in big data processing

    CERN Document Server

    Park, Seung-Jong; Lee, Jee-Hyong

    2014-01-01

    Big data is an essential key to building a smart world, meaning the streaming, continuous integration of large-volume, high-velocity data from all sources to final destinations. Big data ranges across data mining, data analysis and decision making, drawing statistical rules and mathematical patterns through systematic or automatic reasoning. Big data helps serve our lives better, clarify our future and deliver greater value. We can discover how to capture and analyze data. Readers will be guided through processing system integrity and implementing intelligent systems. With intelligent systems, we deal with the fundamental data management and visualization challenges in effective management of dynamic and large-scale data, and efficient processing of real-time and spatio-temporal data. Advanced intelligent systems have made it possible to manage data monitoring, data processing and decision-making in a realistic and effective way. Considering a big size of data, variety of data and frequent chan...

  16. Scaling big data with Hadoop and Solr

    CERN Document Server

    Karambelkar, Hrishikesh Vijay

    2015-01-01

    This book is aimed at developers, designers, and architects who would like to build big data enterprise search solutions for their customers or organizations. No prior knowledge of Apache Hadoop and Apache Solr/Lucene technologies is required.

  17. Neutrino oscillations and Big Bang Nucleosynthesis

    OpenAIRE

    Bell, Nicole F.

    2001-01-01

    We outline how relic neutrino asymmetries may be generated in the early universe via active-sterile neutrino oscillations. We discuss possible consequences for big bang nucleosynthesis, within the context of a particular 4-neutrino model.

  18. Tick-Borne Diseases: The Big Two

    Science.gov (United States)

    ... Feature: Ticks and Diseases. Tick-borne Diseases: The Big Two. Past Issues / Spring - ... on the skin where there has been a tick bite. Photo: CDC/James Gathany. Lyme disease Lyme ...

  19. Big Fish and Prized Trees Gain Protection

    Institute of Scientific and Technical Information of China (English)

    Fred Pearce; 吴敏

    2004-01-01

    Decisions made at a key conservation meeting are good news for big and quirky fish and commercially prized trees. Several species will enjoy extra protection against trade following rulings made at the Convention on International Trade in Endangered Species (CITES).

  20. Big Data for Business Ecosystem Players

    Directory of Open Access Journals (Sweden)

    Perko Igor

    2016-06-01

    Full Text Available In this research, some of the most promising Big Data usage domains are connected with distinct player groups found in the business ecosystem. Literature analysis is used to identify the state of the art of Big Data related research in the major domains of its use, namely individual marketing, health treatment, work opportunities, financial services, and security enforcement. System theory was used to identify the major business ecosystem player types disrupted by Big Data: individuals, small and mid-sized enterprises, large organizations, information providers, and regulators. Relationships between the domains and players were explained through new Big Data opportunities and threats and by players' responsive strategies. System dynamics was used to visualize relationships in the provided model.

  1. ARC Code TI: BigView

    Data.gov (United States)

    National Aeronautics and Space Administration — BigView allows for interactive panning and zooming of images of arbitrary size on desktop PCs running Linux. Additionally, it can work in a multi-screen environment...

  2. Big Data in food and agriculture

    Directory of Open Access Journals (Sweden)

    Kelly Bronson

    2016-06-01

    Full Text Available Farming is undergoing a digital revolution. Our existing review of current Big Data applications in the agri-food sector has revealed several collection and analytics tools that may have implications for relationships of power between players in the food system (e.g. between farmers and large corporations). For example, who retains ownership of the data generated by applications like Monsanto Corporation's Weed I.D. "app"? Are there privacy implications with the data gathered by John Deere's precision agricultural equipment? Systematically tracing the digital revolution in agriculture, and charting the affordances as well as the limitations of Big Data applied to food and agriculture, should be a broad research goal for Big Data scholarship. Such a goal brings data scholarship into conversation with food studies and it allows for a focus on the material consequences of big data in society.

  3. Quantum nature of the big bang.

    Science.gov (United States)

    Ashtekar, Abhay; Pawlowski, Tomasz; Singh, Parampreet

    2006-04-14

    Some long-standing issues concerning the quantum nature of the big bang are resolved in the context of homogeneous isotropic models with a scalar field. Specifically, the known results on the resolution of the big-bang singularity in loop quantum cosmology are significantly extended as follows: (i) the scalar field is shown to serve as an internal clock, thereby providing a detailed realization of the "emergent time" idea; (ii) the physical Hilbert space, Dirac observables, and semiclassical states are constructed rigorously; (iii) the Hamiltonian constraint is solved numerically to show that the big bang is replaced by a big bounce. Thanks to the nonperturbative, background independent methods, unlike in other approaches the quantum evolution is deterministic across the deep Planck regime.

  4. Big Data Components for Business Process Optimization

    Directory of Open Access Journals (Sweden)

    Mircea Raducu TRIFU

    2016-01-01

    Full Text Available These days, more and more people talk about Big Data, Hadoop, NoSQL and so on, but very few technical people have the necessary expertise and knowledge to work with those concepts and technologies. The present paper explains one of the concepts behind two of those keywords: the MapReduce concept. The MapReduce model is what makes Big Data and Hadoop so powerful, fast, and diverse for business process optimization. MapReduce is a programming model with an implementation built to process and generate large data sets. In addition, the benefits of integrating Hadoop in the context of Business Intelligence and Data Warehousing applications are presented. The concepts and technologies behind big data let organizations reach a variety of objectives. As with other new information technologies, the main objective of big data technology is to bring dramatic cost reduction.
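
    The MapReduce programming model mentioned in this abstract can be illustrated with a single-process word count. This is a teaching sketch, not Hadoop code: it runs in one Python process, the three function names are chosen for the example, and a real framework would distribute the map, shuffle, and reduce steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle, and reduce over an iterable of text records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(map_reduce(["big data", "Big Data Hadoop"]))
```

    Each map call looks at one record and each reduce call looks at one key, so both phases can run in parallel without shared state, which is the property that lets the model scale to large data sets.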

  5. Cosmic relics from the big bang

    Energy Technology Data Exchange (ETDEWEB)

    Hall, L.J.

    1988-12-01

    A brief introduction to the big bang picture of the early universe is given. Dark matter is discussed; particularly its implications for elementary particle physics. A classification scheme for dark matter relics is given. 21 refs., 11 figs., 1 tab.

  6. Big Data and Analytics in Healthcare.

    Science.gov (United States)

    Tan, S S-L; Gao, G; Koch, S

    2015-01-01

    This editorial is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". The amount of data being generated in the healthcare industry is growing at a rapid rate. This has generated immense interest in leveraging the availability of healthcare data (and "big data") to improve health outcomes and reduce costs. However, the nature of healthcare data, and especially big data, presents unique challenges in processing and analyzing big data in healthcare. This Focus Theme aims to disseminate some novel approaches to address these challenges. More specifically, approaches ranging from efficient methods of processing large clinical data to predictive models that could generate better predictions from healthcare data are presented.

  7. Fisicos argentinos reproduciran el Big Bang

    CERN Multimedia

    De Ambrosio, Martin

    2008-01-01

    Two groups of Argentine physicists from the universities of La Plata and Buenos Aires are working on a series of experiments that will recreate the conditions of the big explosion at the origin of the universe. (1 page)

  8. Genome engineering in human cells.

    Science.gov (United States)

    Song, Minjung; Kim, Young-Hoon; Kim, Jin-Soo; Kim, Hyongbum

    2014-01-01

    Genome editing in human cells is of great value in research, medicine, and biotechnology. Programmable nucleases including zinc-finger nucleases, transcription activator-like effector nucleases, and RNA-guided engineered nucleases recognize a specific target sequence and make a double-strand break at that site, which can result in gene disruption, gene insertion, gene correction, or chromosomal rearrangements. The target sequence complexities of these programmable nucleases are higher than 3.2 giga base pairs, the size of the haploid human genome. Here, we briefly introduce the structure of the human genome and the characteristics of each programmable nuclease, and review their applications in human cells including pluripotent stem cells. In addition, we discuss various delivery methods for nucleases, programmable nickases, and enrichment of gene-edited human cells, all of which facilitate efficient and precise genome editing in human cells.
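
    The comparison between target sequence complexity and genome size can be made concrete with a short calculation. The sketch below assumes a haploid human genome of roughly 3.2 billion base pairs and treats the complexity of an n-base target as the number of distinct sequences of that length, 4^n; the function names are hypothetical, and the result is a counting heuristic rather than a guarantee that any particular site is unique.

```python
HAPLOID_GENOME_BP = 3_200_000_000  # assumed approximate haploid human genome size


def site_complexity(length_bp):
    """Number of distinct DNA sequences of the given length (4 bases per position)."""
    return 4 ** length_bp


def min_site_length(genome_bp=HAPLOID_GENOME_BP):
    """Shortest target length whose complexity exceeds the genome size."""
    length = 1
    while site_complexity(length) <= genome_bp:
        length += 1
    return length


if __name__ == "__main__":
    n = min_site_length()
    print(n, site_complexity(n))  # 16 4294967296
```

    Under these assumptions a recognition sequence needs at least 16 bases before the number of possible sequences exceeds the number of positions in the genome.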

  9. Big bang nucleosynthesis: Present status

    Science.gov (United States)

    Cyburt, Richard H.; Fields, Brian D.; Olive, Keith A.; Yeh, Tsung-Han

    2016-01-01

    Big bang nucleosynthesis (BBN) describes the production of the lightest nuclides via a dynamic interplay among the four fundamental forces during the first seconds of cosmic time. A brief overview of the essentials of this physics is given, and new calculations presented of light-element abundances through 6Li and 7Li, with updated nuclear reactions and uncertainties including those in the neutron lifetime. Fits are provided for these results as a function of baryon density and of the number of neutrino flavors Nν. Recent developments are reviewed in BBN, particularly new, precision Planck cosmic microwave background (CMB) measurements that now probe the baryon density, helium content, and the effective number of degrees of freedom Neff. These measurements allow for a tight test of BBN and cosmology using CMB data alone. Our likelihood analysis convolves the 2015 Planck data chains with our BBN output and observational data. Adding astronomical measurements of light elements strengthens the power of BBN. A new determination of the primordial helium abundance is included in our likelihood analysis. New D/H observations are now more precise than the corresponding theoretical predictions and are consistent with the standard model and the Planck baryon density. Moreover, D/H now provides a tight measurement of Nν when combined with the CMB baryon density and provides a 2σ upper limit on Nν [...] pointing to new physics. This paper concludes with a look at future directions including key nuclear reactions, astronomical observations, and theoretical issues.

  10. "Big Science" exhibition at Balexert

    CERN Multimedia

    2008-01-01

    CERN is going out to meet those members of the general public who were unable to attend the recent Open Day. The Laboratory will be taking its "Big Science" exhibition from the Globe of Science and Innovation to the Balexert shopping centre from 19 to 31 May 2008. The exhibition, which shows the LHC and its experiments through the eyes of a photographer, features around thirty spectacular photographs measuring 4.5 metres high and 2.5 metres wide. Welcomed and guided around the exhibition by CERN volunteers, shoppers at Balexert will also have the opportunity to discover LHC components on display and watch films. "Fun with Physics" workshops will be held at certain times of the day. Main hall of the Balexert shopping centre, ground floor, from 9.00 a.m. to 7.00 p.m. Monday to Friday and from 10 a.m. to 6 p.m. on the two Saturdays. Call for volunteers: All members of the CERN personnel are invited to enrol as volunteers to help welcom...

  11. BIG SKY CARBON SEQUESTRATION PARTNERSHIP

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2004-06-01

    The Big Sky Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts during the second performance period fall into four areas: evaluation of sources and carbon sequestration sinks; development of a GIS-based reporting framework; designing an integrated suite of monitoring, measuring, and verification technologies; and initiating a comprehensive education and outreach program. At the first two Partnership meetings the groundwork was put in place to provide an assessment of capture and storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other western DOE partnerships. Efforts are also being made to find funding to include Wyoming in the coverage areas for both geological and terrestrial sinks and sources. The Partnership recognizes the critical importance of measurement, monitoring, and verification technologies to support not only carbon trading but all policies and programs that DOE and other agencies may want to pursue in support of GHG mitigation. The efforts begun in developing and implementing MMV technologies for geological sequestration reflect this concern. Research is also underway to identify and validate best management practices for

  12. Big Sky Carbon Sequestration Partnership

    Energy Technology Data Exchange (ETDEWEB)

    Susan Capalbo

    2005-12-31

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership in Phase I are organized into four areas: (1) Evaluation of sources and carbon sequestration sinks that will be used to determine the location of pilot demonstrations in Phase II; (2) Development of a GIS-based reporting framework that links with national networks; (3) Design of an integrated suite of monitoring, measuring, and verification technologies, market-based opportunities for carbon management, and an economic/risk assessment framework (referred to below as the Advanced Concepts component of the Phase I efforts); and (4) Initiation of a comprehensive education and outreach program. As a result of the Phase I activities, the groundwork is in place to provide an assessment of storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that complements the ongoing DOE research agenda in Carbon Sequestration. The geology of the Big Sky Carbon Sequestration Partnership Region is favorable for the potential sequestration of an enormous volume of CO2. The United States Geological Survey (USGS 1995) identified 10 geologic provinces and 111 plays in the region. These provinces and plays include both sedimentary rock types characteristic of oil, gas, and coal production as well as large areas of mafic volcanic rocks. Of the 10 provinces and 111 plays, 1 province and 4 plays are located within Idaho. The remaining 9 provinces and 107 plays are dominated by sedimentary rocks and located in the states of Montana and Wyoming. The potential sequestration capacity of the 9 sedimentary provinces within the region ranges from 25,000 to almost 900,000 million metric tons of CO2. Overall every sedimentary formation investigated

  13. Astronomical Surveys and Big Data

    CERN Document Server

    Mickaelian, A M

    2015-01-01

    Recent all-sky and large-area astronomical surveys and their catalogued data over the whole range of electromagnetic spectrum are reviewed, from Gamma-ray to radio, such as Fermi-GLAST and INTEGRAL in Gamma-ray, ROSAT, XMM and Chandra in X-ray, GALEX in UV, SDSS and several POSS I and II based catalogues (APM, MAPS, USNO, GSC) in optical range, 2MASS in NIR, WISE and AKARI IRC in MIR, IRAS and AKARI FIS in FIR, NVSS and FIRST in radio and many others, as well as most important surveys giving optical images (DSS I and II, SDSS, etc.), proper motions (Tycho, USNO, Gaia), variability (GCVS, NSVS, ASAS, Catalina, Pan-STARRS) and spectroscopic data (FBS, SBS, Case, HQS, HES, SDSS, CALIFA, GAMA). An overall understanding of the coverage along the whole wavelength range and comparisons between various surveys are given: galaxy redshift surveys, QSO/AGN, radio, Galactic structure, and Dark Energy surveys. Astronomy has entered the Big Data era. Astrophysical Virtual Observatories and Computational Astrophysics play a...

  14. Big-bang nucleosynthesis revisited

    Science.gov (United States)

    Olive, Keith A.; Schramm, David N.; Steigman, Gary; Walker, Terry P.

    1989-01-01

    The homogeneous big-bang nucleosynthesis yields of D, He-3, He-4, and Li-7 are computed taking into account recent measurements of the neutron mean-life as well as updates of several nuclear reaction rates which primarily affect the production of Li-7. The extraction of primordial abundances from observation and the likelihood that the primordial mass fraction of He-4, Y_p, is less than or equal to 0.24 are discussed. Using the primordial abundances of D + He-3 and Li-7 we limit the baryon-to-photon ratio (η in units of 10^-10) to 2.6 ≤ η_10 ≤ 4.3, which we use to argue that baryons contribute between 0.02 and 0.11 to the critical energy density of the universe. An upper limit to Y_p of 0.24 constrains the number of light neutrinos to N_ν ≤ 3.4, in excellent agreement with the LEP and SLC collider results. We turn this argument around to show that the collider limit of 3 neutrino species can be used to bound the primordial abundance of He-4: 0.235 ≤ Y_p ≤ 0.245.

  15. Big Data Empowered Self Organized Networks

    OpenAIRE

    Baldo, Nicola; Giupponi, Lorenza; Mangues-Bafalluy, Josep

    2014-01-01

    Mobile networks are generating a huge amount of data in the form of network measurements as well as network control and management interactions, and 5G is expected to make it even bigger. In this paper, we discuss the different approaches according to which this information could be leveraged using a Big Data approach. In particular, we focus on Big Data Empowered Self Organized Networks, discussing its most peculiar traits, its potential, and the relevant related work, as well as analysing s...

  16. Congenital malalignment of the big toe nail.

    Science.gov (United States)

    Wagner, Gunnar; Sachse, Michael Max

    2012-05-01

    Congenital malalignment of the big toe nail is based on a lateral deviation of the nail plate. This longitudinal axis shift is due to a deviation of the nail matrix, possibly caused by increased traction of the hypertrophic extensor tendon of the hallux. Congenital malalignment of the big toe nail is typically present at birth. Ingrown toenails and onychogryphosis are among the most common complications. Depending on the degree of deviation, conservative or surgical treatment may be recommended.

  17. ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING

    OpenAIRE

    Jaseena K.U.; Julie M. David

    2014-01-01

    Data has become an indispensable part of every economy, industry, organization, business function and individual. Big Data is a term used to identify datasets whose size is beyond the ability of typical database software tools to store, manage and analyze. Big Data introduces unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation and measurement errors. These challenges are distinguishe...

  18. Cincinnati Big Area Additive Manufacturing (BAAM)

    Energy Technology Data Exchange (ETDEWEB)

    Duty, Chad E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Love, Lonnie J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-03-04

    Oak Ridge National Laboratory (ORNL) worked with Cincinnati Incorporated (CI) to demonstrate Big Area Additive Manufacturing which increases the speed of the additive manufacturing (AM) process by over 1000X, increases the size of parts by over 10X and shows a cost reduction of over 100X. ORNL worked with CI to transition the Big Area Additive Manufacturing (BAAM) technology from a proof-of-principle (TRL 2-3) demonstration to a prototype product stage (TRL 7-8).

  19. Data Confidentiality Challenges in Big Data Applications

    Energy Technology Data Exchange (ETDEWEB)

    Yin, Jian; Zhao, Dongfang

    2015-12-15

    In this paper, we address the problem of data confidentiality in big data analytics. In many fields, many useful patterns can be extracted by applying machine learning techniques to big data. However, data confidentiality must be protected. In many scenarios, data confidentiality could well be a prerequisite for data to be shared. We present a scheme to provide provably secure data confidentiality and discuss various techniques to optimize the performance of such a system.

  20. COBE looks back to the Big Bang

    Science.gov (United States)

    Mather, John C.

    1993-01-01

    An overview is presented of NASA-Goddard's Cosmic Background Explorer (COBE), the first NASA satellite designed to observe the primeval explosion of the universe. The spacecraft carries three extremely sensitive IR and microwave instruments designed to measure the faint residual radiation from the Big Bang and to search for the formation of the first galaxies. COBE's far IR absolute spectrophotometer has shown that the Big Bang radiation has a blackbody spectrum, proving that there was no large energy release after the explosion.

  1. Harnessing the Heart of Big Data

    OpenAIRE

    Scruggs, Sarah B; Watson, Karol; Su, Andrew I.; Hermjakob, Henning; Yates, John R.; Lindsey, Merry L.; Ping, Peipei

    2015-01-01

    The exponential increase in Big Data generation combined with limited capitalization on the wealth of information embedded within Big Data has prompted us to revisit our scientific discovery paradigms. A successful transition into this digital era of medicine holds great promise for advancing fundamental knowledge in biology, innovating human health and driving personalized medicine; however, this will require a drastic shift of research culture in how we conceptualize science and use data. ...

  2. Figure 1 from Integrative Genomics Viewer: Visualizing Big Data | Office of Cancer Genomics

    Science.gov (United States)

    A screenshot of the IGV user interface at the chromosome view. IGV user interface showing five data types (copy number, methylation, gene expression, and loss of heterozygosity; mutations are overlaid with black boxes) from approximately 80 glioblastoma multiforme samples. Adapted from Figure S1; Robinson et al. 2011

  3. Genomic SELEX: a discovery tool for genomic aptamers.

    Science.gov (United States)

    Zimmermann, Bob; Bilusic, Ivana; Lorenz, Christina; Schroeder, Renée

    2010-10-01

    Genomic SELEX is a discovery tool for genomic aptamers, which are genomically encoded functional domains in nucleic acid molecules that recognize and bind specific ligands. When combined with genomic libraries and using RNA-binding proteins as baits, Genomic SELEX used with high-throughput sequencing enables the discovery of genomic RNA aptamers and the identification of RNA-protein interaction networks. Here we describe how to construct and analyze genomic libraries, how to choose baits for selections, how to perform the selection procedure and finally how to analyze the enriched sequences derived from deep sequencing. As a control procedure, we recommend performing a "Neutral" SELEX experiment in parallel to the selection, omitting the selection step. This control experiment provides a background signal for comparison with the positively selected pool. We also recommend deep sequencing the initial library in order to facilitate the final in silico analysis of enrichment with respect to the initial levels. Counter selection procedures, using modified or inactive baits, allow strengthening the binding specificity of the winning selected sequences.
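
    The in silico enrichment analysis mentioned above can be pictured with a minimal sketch. It is not the authors' pipeline: it assumes that read counts per sequence are already available for the selected pool and for the initial library, and it scores each sequence by the log2 ratio of its pseudocount-smoothed frequencies. The function name, the pseudocount, and the toy counts are hypothetical.

```python
import math
from collections import Counter


def log2_enrichment(selected: Counter, initial: Counter,
                    pseudocount: float = 1.0) -> dict:
    """Log2 ratio of smoothed frequencies: selected pool vs. initial library.

    A pseudocount is added to every sequence in both pools so that a
    sequence missing from one pool does not cause a division by zero.
    """
    sequences = set(selected) | set(initial)
    sel_total = sum(selected.values()) + pseudocount * len(sequences)
    ini_total = sum(initial.values()) + pseudocount * len(sequences)
    scores = {}
    for seq in sequences:
        sel_freq = (selected.get(seq, 0) + pseudocount) / sel_total
        ini_freq = (initial.get(seq, 0) + pseudocount) / ini_total
        scores[seq] = math.log2(sel_freq / ini_freq)
    return scores


if __name__ == "__main__":
    # Toy read counts (hypothetical, for illustration only).
    initial_library = Counter({"ACGUAC": 50, "GGCAUU": 50})
    selected_pool = Counter({"ACGUAC": 90, "GGCAUU": 10})
    for seq, score in sorted(log2_enrichment(selected_pool, initial_library).items()):
        print(seq, round(score, 2))
```

    The same function could be applied to the "Neutral" SELEX pool in place of the selected pool, giving a background score against which the positively selected pool can be compared.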

  4. StellaBase: The Nematostella vectensis Genomics Database

    OpenAIRE

    Sullivan, James C; Ryan, Joseph F; Watson, James A.; Webb, Jeramy; Mullikin, James C; Rokhsar, Daniel; Finnerty, John R

    2005-01-01

    StellaBase, the Nematostella vectensis Genomics Database, is a web-based resource that will facilitate desktop and bench-top studies of the starlet sea anemone. Nematostella is an emerging model organism that has already proven useful for addressing fundamental questions in developmental evolution and evolutionary genomics. StellaBase allows users to query the assembled Nematostella genome, a confirmed gene library, and a predicted genome using both keyword and homology based search functions...

  5. Molecular evolution of colorectal cancer: from multistep carcinogenesis to the big bang.

    Science.gov (United States)

    Amaro, Adriana; Chiara, Silvana; Pfeffer, Ulrich

    2016-03-01

    Colorectal cancer is characterized by exquisite genomic instability either in the form of microsatellite instability or chromosomal instability. Microsatellite instability is the result of mutation of mismatch repair genes or their silencing through promoter methylation as a consequence of the CpG island methylator phenotype. The molecular causes of chromosomal instability are less well characterized. Genomic instability and field cancerization lead to a high degree of intratumoral heterogeneity and determine the formation of cancer stem cells and epithelial-mesenchymal transition mediated by the TGF-β and APC pathways. Recent analyses using integrated genomics reveal different phases of colorectal cancer evolution. An initial phase of genomic instability that yields many clones with different mutations (big bang) is followed by an important, previously not detected phase of cancer evolution that consists in the stabilization of several clones and a relatively flat outgrowth. The big bang model can best explain the coexistence of several stable clones and is compatible with the fact that the analysis of the bulk of the primary tumor yields prognostic information.

  6. Facilitation as a governance strategy: Unravelling governments' facilitation frames

    NARCIS (Netherlands)

    Grotenbreg, S. (Sanne); M.W. van Buuren (Arwin)

    2017-01-01

    Governments increasingly choose facilitation as a strategy to entice others to produce public goods and services, including in relation to the realisation of sustainable energy innovations. An important instrument to implement this governance strategy is discursive framing. To learn how

  7. Big Data Analytics for Disaster Preparedness and Response of Mobile Communication Infrastructure during Natural Hazards

    Science.gov (United States)

    Zhong, L.; Takano, K.; Ji, Y.; Yamada, S.

    2015-12-01

    The disruption of telecommunications is one of the most critical problems during natural hazards. With the rapid expansion of mobile communications, the mobile communication infrastructure plays a fundamental role in disaster response and recovery activities. Its disruption can therefore lead to loss of life and property through delayed or erroneous information, so the disaster preparedness and response of the mobile communication infrastructure itself is important. In many past disasters, the disruption of mobile communication networks was caused by network congestion and subsequent long-term power outages. To reduce this disruption, knowledge of communication demands during disasters is necessary. Big data analytics provides a promising way to predict communication demands by analyzing the large amount of operational data from mobile users in a large-scale mobile network. Under the US-Japan collaborative project on 'Big Data and Disaster Research (BDD)' supported by the Japan Science and Technology Agency (JST) and the National Science Foundation (NSF), we are investigating the application of big data techniques to the disaster preparedness and response of mobile communication infrastructure. Specifically, in this research, we consider exploiting the large amount of operational information from mobile users to predict communication needs at different times and locations. By combining this with other data, such as the shaking distribution of an estimated major earthquake and the power outage map, we are able to provide predictions about stranded people who have difficulty confirming their safety or asking for help due to network disruption. In addition, this result could help network operators assess the vulnerability of their infrastructure and make suitable decisions for disaster preparedness and response. In this presentation, we are going to introduce the

  8. Big Sky Carbon Sequestration Partnership

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2005-11-01

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership in Phase I fall into four areas: evaluation of sources and carbon sequestration sinks that will be used to determine the location of pilot demonstrations in Phase II; development of a GIS-based reporting framework that links with national networks; designing an integrated suite of monitoring, measuring, and verification technologies and assessment frameworks; and initiating a comprehensive education and outreach program. The groundwork is in place to provide an assessment of storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research agenda in Carbon Sequestration. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other DOE regional partnerships. The Partnership recognizes the critical importance of measurement, monitoring, and verification technologies to support not only carbon trading but all policies and programs that DOE and other agencies may want to pursue in support of GHG mitigation. The efforts in developing and implementing MMV technologies for geological sequestration reflect this concern. Research is also underway to identify and validate best management practices for soil C in the

  9. BIG SKY CARBON SEQUESTRATION PARTNERSHIP

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2004-10-31

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership fall into four areas: evaluation of sources and carbon sequestration sinks; development of a GIS-based reporting framework; designing an integrated suite of monitoring, measuring, and verification technologies; and initiating a comprehensive education and outreach program. At the first two Partnership meetings the groundwork was put in place to provide an assessment of capture and storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research. During the third quarter, planning efforts are underway for the next Partnership meeting, which will showcase the architecture of the GIS framework and initial results for sources and sinks, and discuss the methods and analysis underway for assessing geological and terrestrial sequestration potentials. The meeting will conclude with an ASME workshop. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other western DOE partnerships. Efforts are also being made to find funding to include Wyoming in the coverage areas for both geological and terrestrial sinks and sources. The Partnership recognizes the critical importance of measurement, monitoring, and verification

  10. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  11. Facilitation of learning: part 1.

    Science.gov (United States)

    Warburton, Tyler; Houghton, Trish; Barry, Debbie

    2016-04-06

    This article, the fourth in a series of 11, discusses the context for the facilitation of learning. It outlines the main principles and theories for understanding the process of learning, including examples which link these concepts to practice. The practical aspects of using these theories in a practice setting will be discussed in the fifth article of this series. Together, these two articles will provide mentors and practice teachers with knowledge of the learning process, which will enable them to meet the second domain of the Nursing and Midwifery Council's Standards to Support Learning and Assessment in Practice on facilitation of learning.

  12. Small government or big government?

    Directory of Open Access Journals (Sweden)

    MATEO SPAHO

    2015-03-01

    Since the beginning of the twentieth century, economists and philosophers have been polarized over the role that the government should have in the economy. On one hand, John Maynard Keynes represented, within the optics of the market economy, a position where the state should intervene in the economy to maintain aggregate demand and employment in the country, without hesitating to create budget deficits and expand public debt. This approach applies especially at moments when the domestic economy and global economic trends show weak growth or a recession. It means heavy interference in the economy, with higher income but also high expenditure relative to GDP. On the other side, Liberals and Neoliberals led by Friedrich Hayek advocated a withdrawal of the government from economic activity not just in moments of economic growth but also during crises, believing that the market has self-regulating mechanisms within itself. The government, as a result, will have a smaller dimension, with lower income and also low expenditure compared to the GDP of the country. We took the South-Eastern European countries, distinguishing those with a "Big Government" from those with a "Small Government", and analyzed their economic performance during the global crisis (2007-2014). In which countries did the public debt grow less? Which country managed to attract more investment, and which countries preserved the purchasing power of their consumers? We shall see whether, during the economic crisis in Eastern Europe, the "Big Government" or the Liberal, "Small" one has been the more successful model.

  13. The genome of Chenopodium quinoa.

    Science.gov (United States)

    Jarvis, David E; Ho, Yung Shwen; Lightfoot, Damien J; Schmöckel, Sandra M; Li, Bo; Borm, Theo J A; Ohyanagi, Hajime; Mineta, Katsuhiko; Michell, Craig T; Saber, Noha; Kharbatia, Najeh M; Rupper, Ryan R; Sharp, Aaron R; Dally, Nadine; Boughton, Berin A; Woo, Yong H; Gao, Ge; Schijlen, Elio G W M; Guo, Xiujie; Momin, Afaque A; Negrão, Sónia; Al-Babili, Salim; Gehring, Christoph; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T; Gojobori, Takashi; Linden, C Gerard van der; van Loo, Eibertus N; Jellen, Eric N; Maughan, Peter J; Tester, Mark

    2017-02-16

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  14. The genome of Chenopodium quinoa

    KAUST Repository

    Jarvis, David E.

    2017-02-08

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  15. The YH database: the first Asian diploid genome database

    DEFF Research Database (Denmark)

    Li, Guoqing; Ma, Lijia; Song, Chao;

    2009-01-01

    The YH database is a server that allows the user to easily browse and download data from the first Asian diploid genome. The aim of this platform is to facilitate the study of this Asian genome and to enable improved organization and presentation of large-scale personal genome data. Powered by GBrowse... genome consensus. The YH database is currently one of the three personal genome databases, organizing the original data and analysis results in a user-friendly interface, which is an endeavor to achieve fundamental goals for establishing personal medicine. The database is available at http://yh.genomics.org.cn.

  16. Big Sky Carbon Sequestration Partnership

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2005-11-01

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership in Phase I fall into four areas: evaluation of sources and carbon sequestration sinks that will be used to determine the location of pilot demonstrations in Phase II; development of a GIS-based reporting framework that links with national networks; designing an integrated suite of monitoring, measuring, and verification technologies and assessment frameworks; and initiating a comprehensive education and outreach program. The groundwork is in place to provide an assessment of storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research agenda in Carbon Sequestration. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other DOE regional partnerships. The Partnership recognizes the critical importance of measurement, monitoring, and verification technologies to support not only carbon trading but all policies and programs that DOE and other agencies may want to pursue in support of GHG mitigation. The efforts in developing and implementing MMV technologies for geological sequestration reflect this concern. Research is also underway to identify and validate best management practices for soil C in the

  17. Big Sky Carbon Sequestration Partnership

    Energy Technology Data Exchange (ETDEWEB)

    Susan Capalbo

    2005-12-31

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership in Phase I are organized into four areas: (1) Evaluation of sources and carbon sequestration sinks that will be used to determine the location of pilot demonstrations in Phase II; (2) Development of a GIS-based reporting framework that links with national networks; (3) Design of an integrated suite of monitoring, measuring, and verification technologies, market-based opportunities for carbon management, and an economic/risk assessment framework (referred to below as the Advanced Concepts component of the Phase I efforts); and (4) Initiation of a comprehensive education and outreach program. As a result of the Phase I activities, the groundwork is in place to provide an assessment of storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that complements the ongoing DOE research agenda in Carbon Sequestration. The geology of the Big Sky Carbon Sequestration Partnership Region is favorable for the potential sequestration of an enormous volume of CO2. The United States Geological Survey (USGS 1995) identified 10 geologic provinces and 111 plays in the region. These provinces and plays include both sedimentary rock types characteristic of oil, gas, and coal production as well as large areas of mafic volcanic rocks. Of the 10 provinces and 111 plays, 1 province and 4 plays are located within Idaho. The remaining 9 provinces and 107 plays are dominated by sedimentary rocks and located in the states of Montana and Wyoming. The potential sequestration capacity of the 9 sedimentary provinces within the region ranges from 25,000 to almost 900,000 million metric tons of CO2. Overall every sedimentary formation investigated

  18. Big Data, Big Changes

    Institute of Scientific and Technical Information of China (English)

    梁爽

    2014-01-01

    Big data is all around us, and the era of big data has arrived. By describing the characteristics of big data, this paper analyzes the current state of big data research in China and abroad and its future directions of application. Only by rethinking big data, changing how we think about it, adapting business models to the changes it brings, innovating big data management models, strengthening institutional development, raising legal awareness, and safeguarding personal and national security can the healthy development of big data be continuously promoted.

  19. Benchmarking Big Data Systems and the BigData Top100 List.

    Science.gov (United States)

    Baru, Chaitanya; Bhandarkar, Milind; Nambiar, Raghunath; Poess, Meikel; Rabl, Tilmann

    2013-03-01

    "Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TPC), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.

  20. Five Big, Big Five Issues : Rationale, Content, Structure, Status, and Crosscultural Assessment

    NARCIS (Netherlands)

    De Raad, Boele

    1998-01-01

    This article discusses the rationale, content, structure, status, and crosscultural assessment of the Big Five trait factors, focusing on topics of dispute and misunderstanding. Taxonomic restrictions of the original Big Five forerunner, the "Norman Five," are discussed, and criticisms regarding the

  1. From Darwin to the census of marine life: marine biology as big science.

    Directory of Open Access Journals (Sweden)

    Niki Vermeulen

    Full Text Available With the development of the Human Genome Project, a heated debate emerged on biology becoming 'big science'. However, biology already has a long tradition of collaboration, as natural historians were part of the first collective scientific efforts: exploring the variety of life on earth. Such mappings of life still continue today, and if field biology is gradually becoming an important subject of studies into big science, research into life in the world's oceans is not taken into account yet. This paper therefore explores marine biology as big science, presenting the historical development of marine research towards the international 'Census of Marine Life' (CoML) making an inventory of life in the world's oceans. Discussing various aspects of collaboration--including size, internationalisation, research practice, technological developments, application, and public communication--I will ask if CoML still resembles traditional collaborations to collect life. While showing both continuity and change, I will argue that marine biology is a form of natural history: a specific way of working together in biology that has transformed substantially in interaction with recent developments in the life sciences and society. As a result, the paper does not only give an overview of transformations towards large scale research in marine biology, but also shines a new light on big biology, suggesting new ways to deepen the understanding of collaboration in the life sciences by distinguishing between different 'collective ways of knowing'.

  2. From Darwin to the census of marine life: marine biology as big science.

    Science.gov (United States)

    Vermeulen, Niki

    2013-01-01

    With the development of the Human Genome Project, a heated debate emerged on biology becoming 'big science'. However, biology already has a long tradition of collaboration, as natural historians were part of the first collective scientific efforts: exploring the variety of life on earth. Such mappings of life still continue today, and if field biology is gradually becoming an important subject of studies into big science, research into life in the world's oceans is not taken into account yet. This paper therefore explores marine biology as big science, presenting the historical development of marine research towards the international 'Census of Marine Life' (CoML) making an inventory of life in the world's oceans. Discussing various aspects of collaboration--including size, internationalisation, research practice, technological developments, application, and public communication--I will ask if CoML still resembles traditional collaborations to collect life. While showing both continuity and change, I will argue that marine biology is a form of natural history: a specific way of working together in biology that has transformed substantially in interaction with recent developments in the life sciences and society. As a result, the paper does not only give an overview of transformations towards large scale research in marine biology, but also shines a new light on big biology, suggesting new ways to deepen the understanding of collaboration in the life sciences by distinguishing between different 'collective ways of knowing'.

  3. GenColors-based comparative genome databases for small eukaryotic genomes.

    Science.gov (United States)

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  4. Corpus Linguistics Facilitates English Teaching

    Institute of Scientific and Technical Information of China (English)

    朱思亲

    2014-01-01

    Corpus linguistics has been widely applied in English teaching. Corpus linguistics has changed the way to teach English. The essay discusses two approaches in English teaching based on corpus, corpus-driven approach and corpus-based approach. It finds out that both corpus-driven approach and corpus-based approach facilitate English teaching in their own ways.

  5. Facilitating Conditions for School Motivation.

    Science.gov (United States)

    Yeung, Alexander Seeshing; McInerney, Dennis M.

    Primary and high school students (277 in grades 5-6; 615 in grades 7-12) in the United States (47 percent boys) responded to 26 items of the Facilitating Conditions Questionnaire (FCQ). Results indicate 7 distinct FCQ factors: perceived value of schooling; affect toward schooling; peer positive academic climate (Peer Positive); encouragement from…

  6. Learning to Facilitate (Online) Meetings

    DEFF Research Database (Denmark)

    Reimann, Peter; Bull, Susan; Vatrapu, Ravi

    2013-01-01

    , etc.. We argue that facilitating meetings is a competence worth developing in students and describe the main knowledge and skill components that pertain to this competence. We then describe some implemented software tools that can be used in schools and colleges to provide opportunities for practicing...

  7. Facilitating Creativity in Adult Learners

    Science.gov (United States)

    Tsai, Kuan Chen

    2013-01-01

    Creativity in education research has received increasing attention, although the major focus of this research has been on children. Despite pleas by several adult educators for promoting creativity, very few studies have focused on adult learners, leaving it to be explored what approaches are useful for adult educators to facilitate creativity…

  8. Using mindfulness for facilitation

    DEFF Research Database (Denmark)

    Adriansen, Hanne Kirstine; Krohn, Simon

    2011-01-01

    In recent years, mindfulness has gone from being exclusively an existential practice to also being a form of treatment and, most recently, a practical tool in business. This article shows that mindfulness can also be used in connection with facilitation. Facilitation...... is a tool used in working life, for example at meetings and conferences, where a group of people is gathered to learn or accomplish something together. What is new about combining mindfulness with facilitation is that the focus shifts from the individual, who is the center of existential contemplation or the...... therapeutic process, to the group, which is the starting point in facilitation. The article shows how mindfulness can be used concretely at group level and also discusses the problems that may be associated with this. Based on our own experience, we discuss how mindfulness can affect a group's...

  9. The challenge of big data in public health: an opportunity for visual analytics.

    Science.gov (United States)

    Ola, Oluwakemi; Sedig, Kamran

    2014-01-01

    Public health (PH) data can generally be characterized as big data. The efficient and effective use of this data determines the extent to which PH stakeholders can sufficiently address societal health concerns as they engage in a variety of work activities. As stakeholders interact with data, they engage in various cognitive activities such as analytical reasoning, decision-making, interpreting, and problem solving. Performing these activities with big data is a challenge for the unaided mind as stakeholders encounter obstacles relating to the data's volume, variety, velocity, and veracity. Such being the case, computer-based information tools are needed to support PH stakeholders. Unfortunately, while existing computational tools are beneficial in addressing certain work activities, they fall short in supporting cognitive activities that involve working with large, heterogeneous, and complex bodies of data. This paper presents visual analytics (VA) tools, a nascent category of computational tools that integrate data analytics with interactive visualizations, to facilitate the performance of cognitive activities involving big data. Historically, PH has lagged behind other sectors in embracing new computational technology. In this paper, we discuss the role that VA tools can play in addressing the challenges presented by big data. In doing so, we demonstrate the potential benefit of incorporating VA tools into PH practice, in addition to highlighting the need for further systematic and focused research.

  10. Transcriptome marker diagnostics using big data.

    Science.gov (United States)

    Han, Henry; Liu, Ying

    2016-02-01

    Big omics data are challenging translational bioinformatics in an unprecedented way because of their complexity and volume. How to employ big omics data to achieve a rivalling-clinical, reproducible disease diagnosis from a systems approach is an urgent problem to be solved in translational bioinformatics and machine learning. In this study, the authors propose a novel transcriptome marker diagnosis to tackle this problem using big RNA-seq data by viewing the whole transcriptome as a profile marker systematically. The systems diagnosis not only avoids the reproducibility issue of the existing gene-/network-marker-based diagnostic methods, but also achieves rivalling-clinical diagnostic results by extracting true signals from big RNA-seq data. Their method demonstrates a better fit for personalised diagnostics than competing methods by attaining exceptional diagnostic performance using systems information, and positions itself as a good candidate for clinical usage. To the best of their knowledge, it is the first study on this topic and will inspire more investigations in big omics data diagnostics.

  11. Conceptualization and theorization of the Big Data

    Directory of Open Access Journals (Sweden)

    Marcos Mazzieri

    2016-06-01

    Full Text Available The term Big Data is widely used by companies and researchers who consider its functionalities or applications relevant to creating value and business innovation. However, some questions arise about what this phenomenon is and, more precisely, how it occurs and under what conditions it can create value and innovation in business. In our view, the lack of depth regarding the principles involved in Big Data and the very absence of a conceptual definition have made it difficult to answer these questions, which have been the basis for our research. To answer these questions we carried out a bibliometric study and an extensive literature review. The bibliometric studies were based on articles and citations from the Web of Knowledge database. The main result of our research is a conceptual definition of the term Big Data. We also propose which of the principles discovered can contribute to other research that aims at value creation through Big Data. Finally, we propose viewing value creation through Big Data using the Resource-Based View as the main theory for discussing that theme.

  12. Volume and Value of Big Healthcare Data

    Science.gov (United States)

    Dinov, Ivo D.

    2016-01-01

    Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require collection, processing and interpretation of vast amounts of complex data. Moore's and Kryder's laws of exponential increase of computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions like: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions. PMID:26998309

  13. Big data analytics: a management perspective

    CERN Document Server

    Corea, Francesco

    2016-01-01

    This book is about innovation, big data, and data science seen from a business perspective. Big data is a buzzword nowadays, and there is a growing need among practitioners to better understand the phenomenon, starting from a clearly stated definition. This book aims to be a starting point for executives who want (and need) to keep pace with the technological breakthroughs introduced by new analytical techniques and piles of data. Common myths about big data will be explained, and a series of different strategic approaches will be provided. By browsing the book, it will be possible to learn how to implement a big data strategy and how to use a maturity framework to monitor the progress of the data science team, as well as how to move forward from one stage to the next. Crucial challenges related to big data will be discussed, some of which are more general (such as ethics, privacy, and ownership) while others concern more specific business situations (e.g., initial public offering, growth st...

  14. Big Data Analytics in Immunology: A Knowledge-Based Approach

    Directory of Open Access Journals (Sweden)

    Guang Lan Zhang

    2014-01-01

    Full Text Available With the vast amount of immunological data available, immunology research is entering the big data era. These data vary in granularity, quality, and complexity and are stored in various formats, including publications, technical reports, and databases. The challenge is to make the transition from data to actionable knowledge and wisdom and bridge the knowledge gap and application gap. We report a knowledge-based approach based on a framework called KB-builder that facilitates data mining by enabling fast development and deployment of web-accessible immunological data knowledge warehouses. Immunological knowledge discovery relies heavily on both the availability of accurate, up-to-date, and well-organized data and the proper analytics tools. We propose the use of knowledge-based approaches by developing knowledgebases combining well-annotated data with specialized analytical tools and integrating them into analytical workflow. A set of well-defined workflow types with rich summarization and visualization capacity facilitates the transformation from data to critical information and knowledge. By using KB-builder, we enabled streamlining of normally time-consuming processes of database development. The knowledgebases built using KB-builder will speed up rational vaccine design by providing accurate and well-annotated data coupled with tailored computational analysis tools and workflow.

  15. Big data analytics in immunology: a knowledge-based approach.

    Science.gov (United States)

    Zhang, Guang Lan; Sun, Jing; Chitkushev, Lou; Brusic, Vladimir

    2014-01-01

    With the vast amount of immunological data available, immunology research is entering the big data era. These data vary in granularity, quality, and complexity and are stored in various formats, including publications, technical reports, and databases. The challenge is to make the transition from data to actionable knowledge and wisdom and bridge the knowledge gap and application gap. We report a knowledge-based approach based on a framework called KB-builder that facilitates data mining by enabling fast development and deployment of web-accessible immunological data knowledge warehouses. Immunological knowledge discovery relies heavily on both the availability of accurate, up-to-date, and well-organized data and the proper analytics tools. We propose the use of knowledge-based approaches by developing knowledgebases combining well-annotated data with specialized analytical tools and integrating them into analytical workflow. A set of well-defined workflow types with rich summarization and visualization capacity facilitates the transformation from data to critical information and knowledge. By using KB-builder, we enabled streamlining of normally time-consuming processes of database development. The knowledgebases built using KB-builder will speed up rational vaccine design by providing accurate and well-annotated data coupled with tailored computational analysis tools and workflow.

  16. GeNemo: a search engine for web-based functional genomic data.

    Science.gov (United States)

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-08

    A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org.
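
    To make the idea of matching genomic regions (rather than text or DNA sequence) concrete, the sketch below scores how similar two peak/BED-style region sets are by base-pair overlap. It is an illustration only and not GeNemo's method: the abstract describes a Markov Chain Monte Carlo-based maximization, whereas this substitutes a plain Jaccard overlap score. The function names, the half-open interval convention, and the example coordinates are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) intervals on a single
# chromosome, as in BED files. Not taken from GeNemo itself.
Region = Tuple[int, int]


def merge(regions: List[Region]) -> List[Region]:
    """Sort regions and merge any that overlap or touch."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(regions: List[Region]) -> int:
    """Number of bases covered by a region set, counting overlaps once."""
    return sum(end - start for start, end in merge(regions))


def overlap_bases(a: List[Region], b: List[Region]) -> int:
    """Number of bases covered by both region sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a: List[Region], b: List[Region]) -> float:
    """Shared bases divided by bases covered by either set."""
    inter = overlap_bases(a, b)
    union = covered(a) + covered(b) - inter
    return inter / union if union else 0.0


# Made-up example coordinates: a user's peaks versus one stored dataset.
query = [(100, 200), (300, 400)]
dataset = [(150, 250), (300, 350)]
score = jaccard(query, dataset)
```

    A search in this simplified setting would compute such a score between the user's regions and each stored dataset and rank the datasets by it; a real system would also need to handle multiple chromosomes and signal intensities.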

  17. About Big Data and its Challenges and Benefits in Manufacturing

    OpenAIRE

    Bogdan NEDELCU

    2013-01-01

    The aim of this article is to show the importance of Big Data and its growing influence on companies. It also shows what kind of big data is currently generated and how much is estimated to be generated. We can also see how much companies are willing to invest in big data and how much they are currently gaining from it. Also shown are some major influences that big data has on one major industry segment (manufacturing) and the challenges that appear.

  18. SoBigData - VRE specification and software 1

    OpenAIRE

    Assante, Massimiliano; Candela, Leonardo; Frosini, Luca; Lelii, Lucio; Mangiacrapa, Francesco; Pagano, Pasquale

    2016-01-01

    This deliverable complements "D10.5 SoBigData e-Infrastructure software release 1" by describing how this software has been deployed to serve the current needs of the SoBigData community. In particular, it describes how the software has been exploited to make available the components envisaged in "D10.2 SoBigData e-Infrastructure release plan 1", i.e. the SoBigData portal (and the underlying Virtual Organisation), the SoBigData Catalogue, and the SoBigData Virtual Research Environments.

  19. Big Data Mining: Challenges, Technologies, Tools and Applications

    OpenAIRE

    Asha M. PAWAR

    2016-01-01

    Big data is data of large size, meaning it has large volume, velocity and variety. Nowadays big data is expanding in various science and engineering fields, and so there are many challenges in managing and analysing big data using various tools. This paper introduces big data and its characteristic concepts, and the next section elaborates on the challenges in big data. In particular, we discuss the technologies used in big data analysis and which tools are mainly used to analyse t...

  20. BIG SKY CARBON SEQUESTRATION PARTNERSHIP

    Energy Technology Data Exchange (ETDEWEB)

    Susan M. Capalbo

    2004-10-31

    The Big Sky Carbon Sequestration Partnership, led by Montana State University, is comprised of research institutions, public entities and private sector organizations, and the Confederated Salish and Kootenai Tribes and the Nez Perce Tribe. Efforts under this Partnership fall into four areas: evaluation of sources and carbon sequestration sinks; development of a GIS-based reporting framework; designing an integrated suite of monitoring, measuring, and verification technologies; and initiating a comprehensive education and outreach program. At the first two Partnership meetings the groundwork was put in place to provide an assessment of capture and storage capabilities for CO2 utilizing the resources found in the Partnership region (both geological and terrestrial sinks), that would complement the ongoing DOE research. During the third quarter, planning efforts are underway for the next Partnership meeting, which will showcase the architecture of the GIS framework and initial results for sources and sinks, and discuss the methods and analysis underway for assessing geological and terrestrial sequestration potentials. The meeting will conclude with an ASME workshop. The region has a diverse array of geological formations that could provide storage options for carbon in one or more of its three states. Likewise, initial estimates of terrestrial sinks indicate a vast potential for increasing and maintaining soil C on forested, agricultural, and reclaimed lands. Both options include the potential for offsetting economic benefits to industry and society. Steps have been taken to assure that the GIS-based framework is consistent among types of sinks within the Big Sky Partnership area and with the efforts of other western DOE partnerships. Efforts are also being made to find funding to include Wyoming in the coverage areas for both geological and terrestrial sinks and sources. The Partnership recognizes the critical importance of measurement, monitoring, and verification

  1. Patient advocacy: barriers and facilitators

    Directory of Open Access Journals (Sweden)

    Nikravesh Mansoure

    2006-03-01

    Full Text Available Background: During the past two decades, advocacy has been a topic of much debate in the nursing profession. Although advocacy has embraced a crucial role for nurses, its extent is often limited in practice. While a variety of studies have been generated all over the world, barriers and facilitators in patient advocacy have not been completely identified. This article presents the findings of a study exploring the barriers and facilitators influencing the role of advocacy among Iranian nurses. Method: This study was conducted using the grounded theory method. Participants were 24 Iranian registered nurses working in a large university hospital in Tehran, Iran. Semi-structured interviews were used for data collection. All interviews were transcribed verbatim and analyzed concurrently using constant comparative analysis according to the Strauss and Corbin method. Results: Through data analysis, several main themes emerged to describe the factors that hindered or facilitated patient advocacy. Nurses in this study identified powerlessness, lack of support, law, code of ethics and motivation, limited communication, physicians leading, risk of advocacy, loyalty to peers, and insufficient time to interact with patients and families as barriers to advocacy. As for factors that facilitated nurses to act as a patient advocate, it was found that the nature of the nurse-patient relationship, recognizing patients' needs, nurses' responsibility, physician as a colleague, and nurses' knowledge and skills could be influential in adopting the advocacy role. Conclusion: Participants believed that in this context taking an advocacy role is difficult for nurses due to the barriers mentioned. Therefore, they make decisions and act as a patient's advocate in any situation concerning patient needs and status of barriers and facilitators. In most cases, they cannot act at an optimal level; instead they accept only what they can do, which we called 'limited advocacy' in

  2. Mapping the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Cantor, Charles R.

    1989-06-01

    The following pages aim to lay a foundation for understanding the excitement surrounding the "human genome project," as well as to convey a flavor of the ongoing efforts and plans at the Human Genome Center at the Lawrence Berkeley Laboratory. Our own work, of course, is only part of a broad international effort that will dramatically enhance our understanding of human molecular genetics before the end of this century. In this country, the bulk of the effort will be carried out under the auspices of the Department of Energy and the National Institutes of Health, but significant contributions have already been made both by nonprofit private foundations and by private corporations. The respective roles of the DOE and the NIH are being coordinated by an inter-agency committee, the aims of which are to emphasize the strengths of each agency, to facilitate cooperation, and to avoid unnecessary duplication of effort. The NIH, for example, will continue its crucial work in medical genetics and in mapping the genomes of nonhuman species. The DOE, on the other hand, has unique experience in managing large projects, and its national laboratories are repositories of expertise in physics, engineering, and computer science, as well as the life sciences. The tools and techniques the project will ultimately rely on are thus likely to be developed in multidisciplinary efforts at laboratories like LBL. Accordingly, we at LBL take great pride in this enterprise -- an enterprise that will eventually transform our understanding of ourselves.

  3. Big data in food safety; an overview.

    Science.gov (United States)

    Marvin, Hans J P; Janssen, Esmée M; Bouzembrak, Yamine; Hendriksen, Peter J M; Staats, Martijn

    2016-11-07

    Technology is now being developed that is able to handle vast amounts of structured and unstructured data from diverse sources and origins. These technologies are often referred to as big data, and open new areas of research and applications that will have an increasing impact in all sectors of our society. In this paper we assessed to what extent big data is being applied in the food safety domain and identified several promising trends. In several parts of the world, governments stimulate the publication on the internet of all data generated in publicly funded research projects. This policy opens new opportunities for stakeholders dealing with food safety to address issues which were not possible before. Application of mobile phones as detection devices for food safety and the use of social media as early warning of food safety problems are a few examples of the new developments that are possible due to big data.

  4. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  5. Unsupervised Tensor Mining for Big Data Practitioners.

    Science.gov (United States)

    Papalexakis, Evangelos E; Faloutsos, Christos

    2016-09-01

    Multiaspect data are ubiquitous in modern Big Data applications. For instance, different aspects of a social network are the different types of communication between people, the time stamp of each interaction, and the location associated to each individual. How can we jointly model all those aspects and leverage the additional information that they introduce to our analysis? Tensors, which are multidimensional extensions of matrices, are a principled and mathematically sound way of modeling such multiaspect data. In this article, our goal is to popularize tensors and tensor decompositions to Big Data practitioners by demonstrating their effectiveness, outlining challenges that pertain to their application in Big Data scenarios, and presenting our recent work that tackles those challenges. We view this work as a step toward a fully automated, unsupervised tensor mining tool that can be easily and broadly adopted by practitioners in academia and industry.
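
    As a rough illustration of the multiaspect idea in the abstract above, the sketch below stores social-network interactions as a sparse three-mode tensor indexed by (sender, receiver, day). It is not the authors' method and it performs no tensor decomposition; the records, the names `build_tensor` and `collapse_mode`, and the choice of a dictionary as storage are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical interaction records: (sender, receiver, day, message_count).
records = [
    ("ann", "bob", 0, 2.0),
    ("ann", "bob", 1, 1.0),
    ("bob", "cat", 1, 4.0),
    ("ann", "bob", 0, 3.0),
]


def build_tensor(rows):
    """Accumulate records into a sparse 3-mode tensor keyed by (sender, receiver, day)."""
    tensor = defaultdict(float)
    for sender, receiver, day, count in rows:
        tensor[(sender, receiver, day)] += count
    return dict(tensor)


def collapse_mode(tensor, mode):
    """Sum the tensor over one mode, leaving a sparse matrix over the other two."""
    matrix = defaultdict(float)
    for key, value in tensor.items():
        remaining = key[:mode] + key[mode + 1:]
        matrix[remaining] += value
    return dict(matrix)


tensor = build_tensor(records)
# Collapsing the time mode (index 2) recovers an ordinary sender-by-receiver matrix.
who_talks_to_whom = collapse_mode(tensor, 2)
```

    Collapsing a mode discards exactly the information that a plain matrix cannot hold, which is one way to see why the abstract argues for modeling all aspects jointly.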

  6. Big Data Issues: Performance, Scalability, Availability

    Directory of Open Access Journals (Sweden)

    Laura Matei

    2014-03-01

    Full Text Available Nowadays, Big Data is probably one of the most discussed topics not only in the area of data analysis, but, I believe, in the whole realm of information technology. Simply typing the words "big data" into an online search engine like Google will retrieve approximately 1,660,000,000 results. With such a buzz gathered around this term, I could not help but wonder what this phenomenon means. The ever greater part that the combination of Internet, Cloud Computing and mobile devices occupies in our lives has led to an ever increasing amount of data that must be captured, communicated, aggregated, stored, and analyzed. These sets of data that we are generating are called Big Data.

  7. One Second After the Big Bang

    CERN Document Server

    CERN. Geneva

    2014-01-01

    A new experiment called PTOLEMY (Princeton Tritium Observatory for Light, Early-Universe, Massive-Neutrino Yield) is under development at the Princeton Plasma Physics Laboratory with the goal of challenging one of the most fundamental predictions of the Big Bang – the present-day existence of relic neutrinos produced less than one second after the Big Bang. Using a gigantic graphene surface to hold 100 grams of a single-atomic layer of tritium, low noise antennas that sense the radio waves of individual electrons undergoing cyclotron motion, and a massive array of cryogenic sensors that sit at the transition between normal and superconducting states, the PTOLEMY project has the potential to challenge one of the most fundamental predictions of the Big Bang, to potentially uncover new interactions and properties of the neutrinos, and to search for the existence of a species of light dark matter known as sterile neutrinos.

  8. Big Bang riddles and their revelations

    CERN Document Server

    Magueijo, J; Magueijo, Joao; Baskerville, Kim

    1999-01-01

    We describe how cosmology has converged towards a beautiful model of the Universe: the Big Bang Universe. We praise this model, but show there is a dark side to it. This dark side is usually called "the cosmological problems": a set of coincidences and fine tuning features required for the Big Bang Universe to be possible. After reviewing these "riddles" we show how they have acted as windows into the very early Universe, revealing new physics and new cosmology just as the Universe came into being. We describe inflation, pre Big Bang, and varying speed of light theories. At the end of the millennium, these proposals are seen respectively as a paradigm, a tentative idea, and outright speculation.

  9. [Evolution of genomic imprinting in mammals: what a zoo!].

    Science.gov (United States)

    Proudhon, Charlotte; Bourc'his, Déborah

    2010-05-01

    Genomic imprinting imposes an obligate mode of biparental reproduction in mammals. This phenomenon results from the monoparental expression of a subset of genes. This specific gene regulation mechanism affects viviparous mammals, especially eutherians, but also marsupials to a lesser extent. Oviparous mammals, or monotremes, do not seem to demonstrate monoparental allele expression. This phylogenic confinement suggests that the evolution of the placenta imposed a selective pressure for the emergence of genomic imprinting. This physiological argument is now complemented by recent genomic evidence facilitated by the sequencing of the platypus genome, a rare modern day case of a monotreme. Analysis of the platypus genome in comparison to eutherian genomes shows a chronological and functional coincidence between the appearance of genomic imprinting and transposable element accumulation. The systematic comparative analyses of genomic sequences in different species is essential for the further understanding of genomic imprinting emergence and divergent evolution along mammalian speciation.

  10. State Responsibility And Accountability In Managing Big Data In Biobank Research

    DEFF Research Database (Denmark)

    Tupasela, Aaro Mikael; Liede, Sandra

    2016-01-01

    , research results and incidental findings in biobanks is becoming, however, an increasingly significant challenge for all biobanks and the countries which are in the process of drafting policy and regulatory frameworks for the management and governance of big data, public health genomics and personalised...... medicine. The Finnish case highlights the challenges that many states are increasingly facing across Europe and elsewhere in terms of how to govern and coordinate the management of biomedical big data.......Within the European context the Data Protection Directive (Directive 95/46/EC) maintains an important role in current legal debates on the rights and obligations different stakeholders have in the processing of personal data. Biobanking and data sharing infrastructures pose new ethical and legal...

  11. On novice facilitators doing research

    DEFF Research Database (Denmark)

    Tavella, Elena

    2016-01-01

    Opportunities for novices to facilitate Problem Structuring Methods (PSMs) workshops are limited, especially because of a lack of access to real-world interventions and confidence in their capabilities. Novices are usually young academics building their careers through publishing. Publishing...... is challenging if facilitation and opportunities for data collection are limited. To address this challenge, this paper suggests autoethnography as a framework for addressing difficulties that novices face in conducting research and publishing on PSMs. This suggestion grows out of a literature study...... on autoethnography and PSMs combined with reflections on the author’s experience as a PSM novice and young academic. Autoethnography is presented as a means to enable access to real-world interventions, enhance novices’ confidence, and identify research and publishing opportunities. The author outlines strengths...

  12. A Critical Axiology for Big Data Studies

    Directory of Open Access Journals (Sweden)

    Saif Shahin

    2016-01-01

    Full Text Available Big Data has had a great impact on journalism and communication studies, while also generating a large number of social concerns ranging from mass surveillance to the legitimation of prejudices such as racism. This article develops an agenda for critical Big Data research and discusses what the purpose of such research should be, what pitfalls to guard against, and the possibility of adapting Big Data methods to conduct empirical research from a critical standpoint. Such a research program will not only allow critical scholarship to meaningfully challenge Big Data as a hegemonic tool, but will also allow scholars to use Big Data resources to address a range of social problems in previously impossible ways. The article calls for methodological innovation to combine emerging Big Data techniques with critical and qualitative research methods, such as ethnography and discourse analysis, in such a way that they can complement each other.

  13. How do we identify big rivers? And how big is big?

    Science.gov (United States)

    Miall, Andrew D.

    2006-04-01

    "Big rivers" are the trunk rivers that carry the water and sediment load from major orogens, or that drain large areas of a continent. Identifying such rivers in the ancient record is a challenge. Some guidance may be provided by tectonic setting and sedimentological evidence, including the scale of architectural elements, and clues from provenance studies, but such data are not infallible guides to river magnitude. The scale of depositional elements is the most obvious clue to channel size, but evidence is typically sparse and inadequate, and may be misleading. For example, thick fining-upward successions may be tectonic cyclothems. Two examples of the analysis of large ancient river systems are discussed here in order to highlight problems of methodology and interpretation. The Hawkesbury Sandstone (Triassic) of the Sydney Basin, Australia, is commonly cited as the deposit of a large river, on the basis of abundant very large-scale crossbedding. An examination of very large outcrops of this unit, including a coastal cliff section 6 km long near Sydney, showed that even with 100% exposure there are ambiguities in the determination of channel scale. It was concluded in this case that the channel dimensions of the Hawkesbury rivers were about half the size of the modern Brahmaputra River. The tectonic setting of a major ancient fluvial system is commonly not a useful clue to river scale. The Hawkesbury Sandstone is a system draining transversely from a cratonic source into a foreland basin, whereas most large rivers in foreland basins flow axially and are derived mainly from the orogenic uplifts (e.g., the large tidally influenced rivers of the Athabasca Oil Sands, Alberta). Epeirogenic tilting of a continent by the dynamic topography process may generate drainages in unexpected directions. For example, analyses of detrital zircons in Upper Paleozoic-Mesozoic nonmarine successions in the SW United States suggest significant derivation from the Appalachian orogen

  14. From big data to smart data

    CERN Document Server

    Iafrate, Fernando

    2015-01-01

    A pragmatic approach to Big Data by taking the reader on a journey between Big Data (what it is) and the Smart Data (what it is for). Today's decision making can be reached via information (related to the data), knowledge (related to people and processes), and timing (the capacity to decide, act and react at the right time). The huge increase in volume of data traffic, and its format (unstructured data such as blogs, logs, and video) generated by the "digitalization" of our world modifies radically our relationship to the space (in motion) and time, dimension and by capillarity, the enterpr

  15. Probing Big Bounce with Dark Matter

    CERN Document Server

    Li, Changhong

    2014-01-01

    We investigate the production of dark matter in a generic bouncing universe framework. Our result shows that, if the future-experimentally-measured cross section and mass of dark matter particle satisfy the cosmological constraint, $\langle \sigma v\rangle m_\chi^2 < 1.82\times 10^{-26}$, it becomes a strong indication that our universe went through a Big Bounce---instead of the inflationary phase as postulated in Standard Big Bang Cosmology---at the early stage of the cosmological evolution.

  16. Cognitive computing and big data analytics

    CERN Document Server

    Hurwitz, Judith; Bowles, Adrian

    2015-01-01

    MASTER THE ABILITY TO APPLY BIG DATA ANALYTICS TO MASSIVE AMOUNTS OF STRUCTURED AND UNSTRUCTURED DATA Cognitive computing is a technique that allows humans and computers to collaborate in order to gain insights and knowledge from data by uncovering patterns and anomalies. This comprehensive guide explains the underlying technologies, such as artificial intelligence, machine learning, natural language processing, and big data analytics. It then demonstrates how you can use these technologies to transform your organization. You will explore how different vendors and different industries are a

  17. Some notes on the big trip

    Energy Technology Data Exchange (ETDEWEB)

    Gonzalez-Diaz, Pedro F. [Colina de los Chopos, Centro de Fisica ' Miguel A. Catalan' , Instituto de Matematicas y Fisica Fundamental, Consejo Superior de Investigaciones Cientificas, Serrano 121, 28006 Madrid (Spain)]. E-mail: pedrogonzalez@mi.madritel.es

    2006-03-30

    The big trip is a cosmological process thought to occur in the future by which the entire universe would be engulfed inside a gigantic wormhole and might travel through it along space and time. In this Letter we discuss different arguments that have been raised against the viability of that process, reaching the conclusions that the process can actually occur by accretion of phantom energy onto the wormholes and that it is stable and might occur in the global context of a multiverse model. We finally argue that the big trip does not contradict any holographic bounds on entropy and information.

  18. SETI as a part of Big History

    Science.gov (United States)

    Maccone, Claudio

    2014-08-01

    Big History is an emerging academic discipline which examines history scientifically from the Big Bang to the present. It uses a multidisciplinary approach based on combining numerous disciplines from science and the humanities, and explores human existence in the context of this bigger picture. It is taught at some universities. In a series of recent papers ([11] through [15] and [17] through [18]) and in a book [16], we developed a new mathematical model embracing Darwinian Evolution (RNA to Humans, see, in particular, [17]) and Human History (Aztecs to USA, see [16]) and then we extrapolated even that into the future up to ten million years (see [18]), the minimum time requested for a civilization to expand to the whole Milky Way (Fermi paradox). In this paper, we further extend that model in the past so as to let it start at the Big Bang (13.8 billion years ago) thus merging Big History, Evolution on Earth and SETI (the modern Search for ExtraTerrestrial Intelligence) into a single body of knowledge of a statistical type. Our idea is that the Geometric Brownian Motion (GBM), so far used as the key stochastic process of financial mathematics (Black-Scholes models and related 1997 Nobel Prize in Economics!) may be successfully applied to the whole of Big History. In particular, in this paper we derive book about GBMs will be written by the author. Mass Extinctions of the geological past also are one more topic that may be cast in the language of a decreasing GBM over a short time lapse, since Mass Extinctions are sudden all-lows in the number of living species. In this paper, we give formulae for the decreasing GBMs of Mass Extinctions, like the K-Pg one of 64 million years ago. Finally, we note that the Big History Equation is just the extension of the Drake Equation to 13.8 billion years of cosmic evolution.
So, the relevant GBM starting at the Big Bang epoch (13.8 billion years ago) and growing up to nowadays in a stochastically increasing fashion becomes the GBM
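
    The abstract above relies on Geometric Brownian Motion without showing it. The sketch below is a generic textbook simulation of a GBM, not the author's Big History model: it samples a path from the exact log-normal update and gives the standard mean value N0 * exp(mu * t). The function names and every parameter value (`mu`, `sigma`, the time spans) are assumptions chosen only for illustration.

```python
import math
import random


def gbm_path(n0, mu, sigma, t_end, steps, rng):
    """Sample one GBM path on an even time grid using the exact log-normal update."""
    dt = t_end / steps
    path = [n0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock for this step
        log_growth = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_growth))
    return path


def gbm_mean(n0, mu, t):
    """Expected value of a GBM at time t: E[N(t)] = n0 * exp(mu * t)."""
    return n0 * math.exp(mu * t)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# A stochastically increasing path (positive drift), in arbitrary units.
growing = gbm_path(n0=1.0, mu=0.5, sigma=0.2, t_end=10.0, steps=100, rng=rng)
# A decreasing path over a short time lapse (negative drift), as a stand-in for a sudden decline.
declining = gbm_path(n0=1.0, mu=-2.0, sigma=0.2, t_end=1.0, steps=100, rng=rng)
```

    With `sigma` set to zero the path reduces to the deterministic exponential `gbm_mean`, which is a quick sanity check on the update rule; a negative `mu` gives the decreasing case the abstract associates with mass extinctions.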

  19. The Big Idea. Dynamic Stakeholder Management

    Science.gov (United States)

    2014-12-01

    Defense AT&L: November–December 2014 8 The Big IDEA Dynamic Stakeholder Management Lt. Col. Franklin D. Gaillard II, USAF Frank Gaillard, Ph.D...

  20. How quantum is the big bang?

    Science.gov (United States)

    Bojowald, Martin

    2008-06-06

    When quantum gravity is used to discuss the big bang singularity, the most important, though rarely addressed, question is what role genuine quantum degrees of freedom play. Here, complete effective equations are derived for isotropic models with an interacting scalar to all orders in the expansions involved. The resulting coupling terms show that quantum fluctuations do not affect the bounce much. Quantum correlations, however, do have an important role and could even eliminate the bounce. How quantum gravity regularizes the big bang depends crucially on properties of the quantum state.