WorldWideScience

Sample records for bioinformatics database warehouse

  1. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  2. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.

  3. Database Vs Data Warehouse

    Directory of Open Access Journals (Sweden)

    2007-01-01

    Full Text Available Data warehouse technology includes a set of concepts and methods that offer the users useful information for decision making. The necessity to build a data warehouse arises from the necessity to improve the quality of information in the organization. The date proceeding from different sources, having a variety of forms - both structured and unstructured, are filtered according to business rules and are integrated in a single large data collection. Using informatics solutions, managers have understood that data stored in operational systems - including databases, are an informational gold mine that must be exploited. Data warehouses have been developed to answer the increasing demands for complex analysis, which could not be properly achieved with operational databases. The present paper emphasizes some of the criteria that information application developers can use in order to choose between a database solution or a data warehouse one.

  4. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  5. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and are ranged in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  6. Geminivirus data warehouse: a database enriched with machine learning approaches.

    Science.gov (United States)

    Silva, Jose Cleydson F; Carvalho, Thales F M; Basso, Marcos F; Deguchi, Michihito; Pereira, Welison A; Sobrinho, Roberto R; Vidigal, Pedro M P; Brustolini, Otávio J B; Silva, Fabyano F; Dal-Bianco, Maximiller; Fontes, Renildes L F; Santos, Anésia A; Zerbini, Francisco Murilo; Cerqueira, Fabio R; Fontes, Elizabeth P B

    2017-05-05

    The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.

  7. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  8. Optimized Database of Higher Education Management Using Data Warehouse

    Directory of Open Access Journals (Sweden)

    Spits Warnars

    2010-04-01

    Full Text Available The emergence of new higher education institutions has created the competition in higher education market, and data warehouse can be used as an effective technology tools for increasing competitiveness in the higher education market. Data warehouse produce reliable reports for the institution’s high-level management in short time for faster and better decision making, not only on increasing the admission number of students, but also on the possibility to find extraordinary, unconventional funds for the institution. Efficiency comparison was based on length and amount of processed records, total processed byte, amount of processed tables, time to run query and produced record on OLTP database and data warehouse. Efficiency percentages was measured by the formula for percentage increasing and the average efficiency percentage of 461.801,04% shows that using data warehouse is more powerful and efficient rather than using OLTP database. Data warehouse was modeled based on hypercube which is created by limited high demand reports which usually used by high level management. In every table of fact and dimension fields will be inserted which represent the loading constructive merge where the ETL (Extraction, Transformation and Loading process is run based on the old and new files.

  9. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  10. Architectural design of a data warehouse to support operational and analytical queries across disparate clinical databases.

    Science.gov (United States)

    Chelico, John D; Wilcox, Adam; Wajngurt, David

    2007-10-11

    As the clinical data warehouse of the New York Presbyterian Hospital has evolved innovative methods of integrating new data sources and providing more effective and efficient data reporting and analysis need to be explored. We designed and implemented a new clinical data warehouse architecture to handle the integration of disparate clinical databases in the institution. By examining the way downstream systems are populated and streamlining the way data is stored we create a virtual clinical data warehouse that is adaptable to future needs of the organization.

  11. Database and Bioinformatics Studies of Probiotics.

    Science.gov (United States)

    Tao, Lin; Wang, Bohua; Zhong, Yafen; Pow, Siok Hoon; Zeng, Xian; Qin, Chu; Zhang, Peng; Chen, Shangying; He, Weidong; Tan, Ying; Liu, Hongxia; Jiang, Yuyang; Chen, Weiping; Chen, Yu Zong

    2017-09-06

    Probiotics have been widely explored for health benefits, animal cares, and agricultural applications. Recent advances in microbiome, microbiota, and microbial dark matter research have fueled greater interests in and paved ways for the study of the mechanisms of probiotics and the discovery of new probiotics from uncharacterized microbial sources. A probiotics database named PROBIO was developed to facilitate these efforts and the need for the information on the known probiotics, which provides the comprehensive information about the probiotic functions of 448 marketed, 167 clinical trial/field trial, and 382 research probiotics for use or being studied for use in humans, animals, and plants. The potential applications of the probiotics data are illustrated by several literature-reported investigations, which have used the relevant information for probing the function and mechanism of the probiotics and for discovering new probiotics. PROBIO can be accessed free of charge at http://bidd2.nus.edu.sg/probio/homepage.htm .

  12. Microsoft Enterprise Consortium: A Resource for Teaching Data Warehouse, Business Intelligence and Database Management Systems

    Science.gov (United States)

    Kreie, Jennifer; Hashemi, Shohreh

    2012-01-01

    Data is a vital resource for businesses; therefore, it is important for businesses to manage and use their data effectively. Because of this, businesses value college graduates with an understanding of and hands-on experience working with databases, data warehouses and data analysis theories and tools. Faculty in many business disciplines try to…

  13. Design of a Multi Dimensional Database for the Archimed DataWarehouse.

    Science.gov (United States)

    Bréant, Claudine; Thurler, Gérald; Borst, François; Geissbuhler, Antoine

    2005-01-01

    The Archimed data warehouse project started in 1993 at the Geneva University Hospital. It has progressively integrated seven data marts (or domains of activity) archiving medical data such as Admission/Discharge/Transfer (ADT) data, laboratory results, radiology exams, diagnoses, and procedure codes. The objective of the Archimed data warehouse is to facilitate the access to an integrated and coherent view of patient medical in order to support analytical activities such as medical statistics, clinical studies, retrieval of similar cases and data mining processes. This paper discusses three principal design aspects relative to the conception of the database of the data warehouse: 1) the granularity of the database, which refers to the level of detail or summarization of data, 2) the database model and architecture, describing how data will be presented to end users and how new data is integrated, 3) the life cycle of the database, in order to ensure long term scalability of the environment. Both, the organization of patient medical data using a standardized elementary fact representation and the use of the multi dimensional model have proved to be powerful design tools to integrate data coming from the multiple heterogeneous database systems part of the transactional Hospital Information System (HIS). Concurrently, the building of the data warehouse in an incremental way has helped to control the evolution of the data content. These three design aspects bring clarity and performance regarding data access. They also provide long term scalability to the system and resilience to further changes that may occur in source systems feeding the data warehouse.

  14. Design database for quantitative trait loci (QTL) data warehouse, data mining, and meta-analysis.

    Science.gov (United States)

    Hu, Zhi-Liang; Reecy, James M; Wu, Xiao-Lin

    2012-01-01

    A database can be used to warehouse quantitative trait loci (QTL) data from multiple sources for comparison, genomic data mining, and meta-analysis. A robust database design involves sound data structure logistics, meaningful data transformations, normalization, and proper user interface designs. This chapter starts with a brief review of relational database basics and concentrates on issues associated with curation of QTL data into a relational database, with emphasis on the principles of data normalization and structure optimization. In addition, some simple examples of QTL data mining and meta-analysis are included. These examples are provided to help readers better understand the potential and importance of sound database design.

  15. Shared Bioinformatics Databases within the Unipro UGENE Platform

    Directory of Open Access Journals (Sweden)

    Protsyuk Ivan V.

    2015-03-01

    Full Text Available Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, make use of a local data model while processing different types of data. Such an approach causes an inconvenience for scientists working cooperatively and relying on the same data. This refers to the need of making multiple copies of certain files for every workplace and maintaining synchronization between them in case of modifications. Therefore, we focused on delivering a collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type, supported by UGENE such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html.

  16. Bioinformatics Database Tools in Analysis of Genetics of Neurodevelopmental Disorders

    Directory of Open Access Journals (Sweden)

    Dibyashree Mallik

    2017-10-01

    Full Text Available Bioinformatics tools are recently used in various sectors of biology. Many questions regarding Neurodevelopmental disorder which arises as a major health issue recently can be solved by using various bioinformatics databases. Schizophrenia is such a mental disorder which is now arises as a major threat in young age people because it is mostly seen in case of people during their late adolescence or early adulthood period. Databases like DISGENET, GWAS, PHARMGKB, and DRUGBANK have huge repository of genes associated with schizophrenia. We found a lot of genes are being associated with schizophrenia, but approximately 200 genes are found to be present in any of these databases. After further screening out process 20 genes are found to be highly associated with each other and are also a common genes in many other diseases also. It is also found that they all are serves as a common targeting gene in many antipsychotic drugs. After analysis of various biological properties, molecular function it is found that these 20 genes are mostly involved in biological regulation process and are having receptor activity. They are belonging mainly to receptor protein class. Among these 20 genes CYP2C9, CYP3A4, DRD2, HTR1A, HTR2A are shown to be a main targeting genes of most of the antipsychotic drugs and are associated with  more than 40% diseases. The basic findings of the present study enumerated that a suitable combined drug can be design by targeting these genes which can be used for the better treatment of schizophrenia.

  17. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  18. PATRIC, the bacterial bioinformatics database and analysis resource

    Science.gov (United States)

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  19. PATRIC, the bacterial bioinformatics database and analysis resource.

    Science.gov (United States)

    Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.

  20. FaceWarehouse: a 3D facial expression database for visual computing.

    Science.gov (United States)

    Cao, Chen; Weng, Yanlin; Zhou, Shun; Tong, Yiying; Zhou, Kun

    2014-03-01

    We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7-80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour, and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-3 tensor to build a bilinear face model with two attributes: identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions. We demonstrate the potential of FaceWarehouse for visual computing with four applications: facial image manipulation, face component transfer, real-time performance-based facial image animation, and facial animation retargeting from video to image.

  1. Influenza research database: an integrated bioinformatics resource for influenza virus research

    Science.gov (United States)

    The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...

  2. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  3. The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases

    OpenAIRE

    Bultet, Lisandra Aguilar; Aguilar Rodriguez, Jose; Ahrens, Christian H; Ahrne, Erik Lennart; Ai, Ni; Aimo, Lucila; Akalin, Altuna; Aleksiev, Tyanko; Alocci, Davide; Altenhoff, Adrian; Alves, Isabel; Ambrosini, Giovanna; Pedone, Pascale Anderle; Angelina, Paolo; Anisimova, Maria

    2016-01-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB'...

  4. Databases and Associated Bioinformatic Tools in Studies of Food Allergens, Epitopes and Haptens – a Review

    Directory of Open Access Journals (Sweden)

    Bucholska Justyna

    2018-06-01

    Full Text Available Allergies and/or food intolerances are a growing problem of the modern world. Diffi culties associated with the correct diagnosis of food allergies result in the need to classify the factors causing allergies and allergens themselves. Therefore, internet databases and other bioinformatic tools play a special role in deepening knowledge of biologically-important compounds. Internet repositories, as a source of information on different chemical compounds, including those related to allergy and intolerance, are increasingly being used by scientists. Bioinformatic methods play a signifi cant role in biological and medical sciences, and their importance in food science is increasing. This study aimed at presenting selected databases and tools of bioinformatic analysis useful in research on food allergies, allergens (11 databases, epitopes (7 databases, and haptens (2 databases. It also presents examples of the application of computer methods in studies related to allergies.

  5. E-MSD: the European Bioinformatics Institute Macromolecular Structure Database.

    Science.gov (United States)

    Boutselakis, H; Dimitropoulos, D; Fillon, J; Golovin, A; Henrick, K; Hussain, A; Ionides, J; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Oldfield, T; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, J; Tagari, M; Tate, J; Tromm, S; Velankar, S; Vranken, W

    2003-01-01

    The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.

  6. Envirofacts Data Warehouse

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Envirofacts Data Warehouse contains information from select EPA Environmental program office databases and provides access about environmental activities that...

  7. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Full Text Available Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT, though some are instead seeing rapid growth (e.g., the GO, R. We find a striking imbalance in resource usage with the top 5% of resource names (133 names accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.

  8. Database Are Not Toasters: A Framework for Comparing Data Warehouse Appliances

    Science.gov (United States)

    Trajman, Omer; Crolotte, Alain; Steinhoff, David; Nambiar, Raghunath Othayoth; Poess, Meikel

    The success of Business Intelligence (BI) applications depends on two factors, the ability to analyze data ever more quickly and the ability to handle ever increasing volumes of data. Data Warehouse (DW) and Data Mart (DM) installations that support BI applications have historically been built using traditional architectures either designed from the ground up or based on customized reference system designs. The advent of Data Warehouse Appliances (DA) brings packaged software and hardware solutions that address performance and scalability requirements for certain market segments. The differences between DAs and custom installations make direct comparisons between them impractical and suggest the need for a targeted DA benchmark. In this paper we review data warehouse appliances by surveying thirteen products offered today. We assess the common characteristics among them and propose a classification for DA offerings. We hope our results will help define a useful benchmark for DAs.

  9. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio Database

    Directory of Open Access Journals (Sweden)

    Jeongseok Choi

    2016-03-01

    Full Text Available Internet addiction (IA has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  10. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  11. Envirofacts Data Warehouse

    Science.gov (United States)

    The Envirofacts Data Warehouse contains information from select EPA Environmental program office databases and provides access about environmental activities that may affect air, water, and land anywhere in the United States. The Envirofacts Warehouse supports its own web enabled tools as well as a host of other EPA applications.

  12. Application of bioinformatics tools and databases in microbial dehalogenation research (a review).

    Science.gov (United States)

    Satpathy, R; Konkimalla, V B; Ratha, J

    2015-01-01

    Microbial dehalogenation is a biochemical process in which the halogenated substances are catalyzed enzymatically in to their non-halogenated form. The microorganisms have a wide range of organohalogen degradation ability both explicit and non-specific in nature. Most of these halogenated organic compounds being pollutants need to be remediated; therefore, the current approaches are to explore the potential of microbes at a molecular level for effective biodegradation of these substances. Several microorganisms with dehalogenation activity have been identified and characterized. In this aspect, the bioinformatics plays a key role to gain deeper knowledge in this field of dehalogenation. To facilitate the data mining, many tools have been developed to annotate these data from databases. Therefore, with the discovery of a microorganism one can predict a gene/protein, sequence analysis, can perform structural modelling, metabolic pathway analysis, biodegradation study and so on. This review highlights various methods of bioinformatics approach that describes the application of various databases and specific tools in the microbial dehalogenation fields with special focus on dehalogenase enzymes. Attempts have also been made to decipher some recent applications of in silico modeling methods that comprise of gene finding, protein modelling, Quantitative Structure Biodegradibility Relationship (QSBR) study and reconstruction of metabolic pathways employed in dehalogenation research area.

  13. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of millions of nucleic acids sequences, which are currently deposited in international databases (public or private); these databases contain information of genes, RNA, ORF, proteins, intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs; which were renewed in the use of new mathematical methods, and the introduction of the use of artificial intelligence. In addition to the constant creation of supercomputing units trained to withstand the heavy workload of sequence analysis. However, it is still necessary the innovation on platforms that allow genomic analyses, faster and more effectively, with a technological understanding of all biological processes.

  14. YPED: an integrated bioinformatics suite and database for mass spectrometry-based proteomics research.

    Science.gov (United States)

    Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R

    2015-02-01

    We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  15. Development of prostate cancer research database with the clinical data warehouse technology for direct linkage with electronic medical record system.

    Science.gov (United States)

    Choi, In Young; Park, Seungho; Park, Bumjoon; Chung, Byung Ha; Kim, Choung-Soo; Lee, Hyun Moo; Byun, Seok-Soo; Lee, Ji Youl

    2013-01-01

    In spite of increased prostate cancer patients, little is known about the impact of treatments for prostate cancer patients and outcome of different treatments based on nationwide data. In order to obtain more comprehensive information for Korean prostate cancer patients, many professionals urged to have national system to monitor the quality of prostate cancer care. To gain its objective, the prostate cancer database system was planned and cautiously accommodated different views from various professions. This prostate cancer research database system incorporates information about a prostate cancer research including demographics, medical history, operation information, laboratory, and quality of life surveys. And, this system includes three different ways of clinical data collection to produce a comprehensive data base; direct data extraction from electronic medical record (EMR) system, manual data entry after linking EMR documents like magnetic resonance imaging findings and paper-based data collection for survey from patients. We implemented clinical data warehouse technology to test direct EMR link method with St. Mary's Hospital system. Using this method, total number of eligible patients were 2,300 from 1997 until 2012. Among them, 538 patients conducted surgery and others have different treatments. Our database system could provide the infrastructure for collecting error free data to support various retrospective and prospective studies.

  16. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available Abstract Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  17. Warehouse Logistics

    OpenAIRE

    Panibratetc, Anastasiia

    2015-01-01

    This research is a review of warehouse logistics on the example of Kannustalo Oy, located in Kannus, Western region of Finland. Kannustalo is an international company of designing, manufacturing and assembling block and turn-key houses. The research subject is logistics process in warehouse system of industrial company. In my work I discussed about theoretical aspect of logistics, logistic functions and processes. Later I considered warehouse as a part of logistics system and provided inf...

  18. Metadata to Support Data Warehouse Evolution

    Science.gov (United States)

    Solodovnikova, Darja

    The focus of this chapter is metadata necessary to support data warehouse evolution. We present the data warehouse framework that is able to track evolution process and adapt data warehouse schemata and data extraction, transformation, and loading (ETL) processes. We discuss the significant part of the framework, the metadata repository that stores information about the data warehouse, logical and physical schemata and their versions. We propose the physical implementation of multiversion data warehouse in a relational DBMS. For each modification of a data warehouse schema, we outline the changes that need to be made to the repository metadata and in the database.

  19. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  20. CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Ussery, David

    2004-01-01

    , these results counts to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present a dynamic web...... and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently...... content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues....

  1. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  2. miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal

    Science.gov (United States)

    Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily

    2018-01-01

    Abstract Hundreds of bioinformatics tools have been developed for MicroRNA (miRNA) investigations including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and rankable. The ranking feature is vital to quickly identify and prioritize the more useful from the obscure tools. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355

  3. Bioinformatics tools and database resources for systems genetics analysis in mice-a short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, Caroline; Swertz, Morris A.; Alberts, Rudi; Arends, Danny; Moeller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K. Joeri; Jansen, Ritsert C.; Schughart, Klaus

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  4. Warehouse Sanitation Workshop Handbook.

    Science.gov (United States)

    Food and Drug Administration (DHHS/PHS), Washington, DC.

    This workshop handbook contains information and reference materials on proper food warehouse sanitation. The materials have been used at Food and Drug Administration (FDA) food warehouse sanitation workshops, and are selected by the FDA for use by food warehouse operators and for training warehouse sanitation employees. The handbook is divided…

  5. Conceptual Data Warehouse Structures

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    changing information needs. We show how the event-entity-relationship model (EVER) can be used for schema design and query formulation in data warehouses. Our work is based on a layered data warehouse architecture in which a global data warehouse is used for flexible long-term organization and storage...... of all warehouse data whereas local data warehouses are used for efficient query formulation and answering. In order to support flexible modeling of global warehouses we use a flexible version of EVER for global schema design. In order to support efficient query formulation in local data warehouses we...

  6. Combining information from a clinical data warehouse and a pharmaceutical database to generate a framework to detect comorbidities in electronic health records.

    Science.gov (United States)

    Sylvestre, Emmanuelle; Bouzillé, Guillaume; Chazard, Emmanuel; His-Mahier, Cécil; Riou, Christine; Cuggia, Marc

    2018-01-24

    Medical coding is used for a variety of activities, from observational studies to hospital billing. However, comorbidities tend to be under-reported by medical coders. The aim of this study was to develop an algorithm to detect comorbidities in electronic health records (EHR) by using a clinical data warehouse (CDW) and a knowledge database. We enriched the Theriaque pharmaceutical database with the French national Comorbidities List to identify drugs associated with at least one major comorbid condition and diagnoses associated with a drug indication. Then, we compared the drug indications in the Theriaque database with the ICD-10 billing codes in EHR to detect potentially missing comorbidities based on drug prescriptions. Finally, we improved comorbidity detection by matching drug prescriptions and laboratory test results. We tested the obtained algorithm by using two retrospective datasets extracted from the Rennes University Hospital (RUH) CDW. The first dataset included all adult patients hospitalized in the ear, nose, throat (ENT) surgical ward between October and December 2014 (ENT dataset). The second included all adult patients hospitalized at RUH between January and February 2015 (general dataset). We reviewed medical records to find written evidence of the suggested comorbidities in current or past stays. Among the 22,132 Common Units of Dispensation (CUD) codes present in the Theriaque database, 19,970 drugs (90.2%) were associated with one or several ICD-10 diagnoses, based on their indication, and 11,162 (50.4%) with at least one of the 4878 comorbidities from the comorbidity list. Among the 122 patients of the ENT dataset, 75.4% had at least one drug prescription without corresponding ICD-10 code. The comorbidity diagnoses suggested by the algorithm were confirmed in 44.6% of the cases. Among the 4312 patients of the general dataset, 68.4% had at least one drug prescription without corresponding ICD-10 code. The comorbidity diagnoses suggested by the

  7. Automation in Warehouse Development

    CERN Document Server

    Verriet, Jacques

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and supports the quality of picking processes. Secondly, the development of models to simulate and analyse warehouse designs and their components facilitates the challenging task of developing warehouses that take into account each customer’s individual requirements and logistic processes. Automation in Warehouse Development addresses both types of automation from the innovative perspective of applied science. In particular, it describes the outcomes of the Falcon project, a joint endeavour by a consortium of industrial and academic partners. The results include a model-based approach to automate warehouse control design, analysis models for warehouse design, concepts for robotic item handling and computer vision, and auton...

  8. Automation in Warehouse Development

    NARCIS (Netherlands)

    Hamberg, R.; Verriet, J.

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and

  9. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. XWeB: The XML Warehouse Benchmark

    Science.gov (United States)

    Mahboubi, Hadj; Darmont, Jérôme

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

  11. Development of a medical informatics data warehouse.

    Science.gov (United States)

    Wu, Cai

    2006-01-01

    This project built a medical informatics data warehouse (MedInfo DDW) in an Oracle database to analyze medical information which has been collected through Baylor Family Medicine Clinic (FCM) Logician application. The MedInfo DDW used Star Schema with dimensional model, FCM database as operational data store (ODS); the data from on-line transaction processing (OLTP) were extracted and transferred to a knowledge based data warehouse through SQLLoad, and the patient information was analyzed by using on-line analytic processing (OLAP) in Crystal Report.

  12. Building a Data Warehouse.

    Science.gov (United States)

    Levine, Elliott

    2002-01-01

    Describes how to build a data warehouse, using the Schools Interoperability Framework (www.sifinfo.org), that supports data-driven decision making and complies with the Freedom of Information Act. Provides several suggestions for building and maintaining a data warehouse. (PKP)

  13. Intelligent environmental data warehouse

    International Nuclear Information System (INIS)

    Ekechukwu, B.

    1998-01-01

    Making quick and effective decisions in environment management are based on multiple and complex parameters, a data warehouse is a powerful tool for the over all management of massive environmental information. Selecting the right data from a warehouse is an important factor consideration for end-users. This paper proposed an intelligent environmental data warehouse system. It consists of data warehouse to feed an environmental researchers and managers with desire environmental information needs to their research studies and decision in form of geometric and attribute data for study area, and a metadata for the other sources of environmental information. In addition, the proposed intelligent search engine works according to a set of rule, which enables the system to be aware of the environmental data wanted by the end-user. The system development process passes through four stages. These are data preparation, warehouse development, intelligent engine development and internet platform system development. (author)

  14. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  15. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  16. RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.

    Science.gov (United States)

    Field, Helen I; Fenyö, David; Beavis, Ronald C

    2002-01-01

    RADARS, a rapid, automated, data archiving and retrieval software system for high-throughput proteomic mass spectral data processing and storage, is described. The majority of mass spectrometer data files are compatible with RADARS, for consistent processing. The system automatically takes unprocessed data files, identifies proteins via in silico database searching, then stores the processed data and search results in a relational database suitable for customized reporting. The system is robust, used in 24/7 operation, accessible to multiple users of an intranet through a web browser, may be monitored by Virtual Private Network, and is secure. RADARS is scalable for use on one or many computers, and is suited to multiple processor systems. It can incorporate any local database in FASTA format, and can search protein and DNA databases online. A key feature is a suite of visualisation tools (many available gratis), allowing facile manipulation of spectra, by hand annotation, reanalysis, and access to all procedures. We also described the use of Sonar MS/MS, a novel, rapid search engine requiring 40 MB RAM per process for searches against a genomic or EST database translated in all six reading frames. RADARS reduces the cost of analysis by its efficient algorithms: Sonar MS/MS can identifiy proteins without accurate knowledge of the parent ion mass and without protein tags. Statistical scoring methods provide close-to-expert accuracy and brings robust data analysis to the non-expert user.

  17. Chronic Condition Data Warehouse

    Data.gov (United States)

    U.S. Department of Health & Human Services — The CMS Chronic Condition Data Warehouse (CCW) provides researchers with Medicare and Medicaid beneficiary, claims, and assessment data linked by beneficiary across...

  18. HRSA Data Warehouse

    Data.gov (United States)

    U.S. Department of Health & Human Services — The HRSA Data Warehouse is the go-to source for data, maps, reports, locators, and dashboards on HRSA's public health programs. This website provides a wide variety...

  19. Public Refrigerated Warehouses

    Data.gov (United States)

    Department of Homeland Security — The International Association of Refrigerated Warehouses (IARW) came into existence in 1891 when a number of conventional warehousemen took on the demands of storing...

  20. Developing a standardized healthcare cost data warehouse.

    Science.gov (United States)

    Visscher, Sue L; Naessens, James M; Yawn, Barbara P; Reinalda, Megan S; Anderson, Stephanie S; Borah, Bijan J

    2017-06-12

    Research addressing value in healthcare requires a measure of cost. While there are many sources and types of cost data, each has strengths and weaknesses. Many researchers appear to create study-specific cost datasets, but the explanations of their costing methodologies are not always clear, causing their results to be difficult to interpret. Our solution, described in this paper, was to use widely accepted costing methodologies to create a service-level, standardized healthcare cost data warehouse from an institutional perspective that includes all professional and hospital-billed services for our patients. The warehouse is based on a National Institutes of Research-funded research infrastructure containing the linked health records and medical care administrative data of two healthcare providers and their affiliated hospitals. Since all patients are identified in the data warehouse, their costs can be linked to other systems and databases, such as electronic health records, tumor registries, and disease or treatment registries. We describe the two institutions' administrative source data; the reference files, which include Medicare fee schedules and cost reports; the process of creating standardized costs; and the warehouse structure. The costing algorithm can create inflation-adjusted standardized costs at the service line level for defined study cohorts on request. The resulting standardized costs contained in the data warehouse can be used to create detailed, bottom-up analyses of professional and facility costs of procedures, medical conditions, and patient care cycles without revealing business-sensitive information. After its creation, a standardized cost data warehouse is relatively easy to maintain and can be expanded to include data from other providers. Individual investigators who may not have sufficient knowledge about administrative data do not have to try to create their own standardized costs on a project-by-project basis because our data

  1. DICOM Data Warehouse: Part 2.

    Science.gov (United States)

    Langer, Steve G

    2016-06-01

    In 2010, the DICOM Data Warehouse (DDW) was launched as a data warehouse for DICOM meta-data. Its chief design goals were to have a flexible database schema that enabled it to index standard patient and study information, modality specific tags (public and private), and create a framework to derive computable information (derived tags) from the former items. Furthermore, it was to map the above information to an internally standard lexicon that enables a non-DICOM savvy programmer to write standard SQL queries and retrieve the equivalent data from a cohort of scanners, regardless of what tag that data element was found in over the changing epochs of DICOM and ensuing migration of elements from private to public tags. After 5 years, the original design has scaled astonishingly well. Very little has changed in the database schema. The knowledge base is now fluent in over 90 device types. Also, additional stored procedures have been written to compute data that is derivable from standard or mapped tags. Finally, an early concern is that the system would not be able to address the variability DICOM-SR objects has been addressed. As of this writing the system is indexing 300 MR, 600 CT, and 2000 other (XA, DR, CR, MG) imaging studies per day. The only remaining issue to be solved is the case for tags that were not prospectively indexed-and indeed, this final challenge may lead to a noSQL, big data, approach in a subsequent version.

  2. Security Data Warehouse Application

    Science.gov (United States)

    Vernon, Lynn R.; Hennan, Robert; Ortiz, Chris; Gonzalez, Steve; Roane, John

    2012-01-01

    The Security Data Warehouse (SDW) is used to aggregate and correlate all JSC IT security data. This includes IT asset inventory such as operating systems and patch levels, users, user logins, remote access dial-in and VPN, and vulnerability tracking and reporting. The correlation of this data allows for an integrated understanding of current security issues and systems by providing this data in a format that associates it to an individual host. The cornerstone of the SDW is its unique host-mapping algorithm that has undergone extensive field tests, and provides a high degree of accuracy. The algorithm comprises two parts. The first part employs fuzzy logic to derive a best-guess host assignment using incomplete sensor data. The second part is logic to identify and correct errors in the database, based on subsequent, more complete data. Host records are automatically split or merged, as appropriate. The process had to be refined and thoroughly tested before the SDW deployment was feasible. Complexity was increased by adding the dimension of time. The SDW correlates all data with its relationship to time. This lends support to forensic investigations, audits, and overall situational awareness. Another important feature of the SDW architecture is that all of the underlying complexities of the data model and host-mapping algorithm are encapsulated in an easy-to-use and understandable Perl language Application Programming Interface (API). This allows the SDW to be quickly augmented with additional sensors using minimal coding and testing. It also supports rapid generation of ad hoc reports and integration with other information systems.

  3. A Relevance-Extended Multi-dimensional Model for a Data Warehouse Contextualized with Documents

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Pedersen, Torben Bach; Berlanga, Rafael

    2005-01-01

    Current data warehouse and OLAP technologies can be applied to analyze the structured data that companies store in their databases. The circumstances that describe the context associated with these data can be found in other internal and external sources of documents. In this paper we propose...... to combine the traditional corporate data warehouse with a document warehouse, resulting in a contextualized warehouse. Thus, contextualized warehouses keep a historical record of the facts and their contexts as described by the documents. In this framework, the user selects an analysis context which...

  4. Data Linkage from Clinical to Study Databases via an R Data Warehouse User Interface. Experiences from a Large Clinical Follow-up Study.

    Science.gov (United States)

    Kaspar, Mathias; Ertl, Maximilian; Fette, Georg; Dietrich, Georg; Toepfer, Martin; Angermann, Christiane; Störk, Stefan; Puppe, Frank

    2016-08-05

    Data that needs to be documented for clinical studies has often been acquired and documented in clinical routine. Usually this data is manually transferred to Case Report Forms (CRF) and/or directly into an electronic data capture (EDC) system. To enhance the documentation process of a large clinical follow-up study targeting patients admitted for acutely decompensated heart failure by accessing the data created during routine and study visits from a hospital information system (HIS) and by transferring it via a data warehouse (DWH) into the study's EDC system. This project is based on the clinical DWH developed at the University of Würzburg. The DWH was extended by several new data domains including data created by the study team itself. An R user interface was developed for the DWH that allows to access its source data in all its detail, to transform data as comprehensively as possible by R into study-specific variables and to support the creation of data and catalog tables. A data flow was established that starts with labeling patients as study patients within the HIS and proceeds with updating the DWH with this label and further data domains at a daily rate. Several study-specific variables were defined using the implemented R user interface of the DWH. This system was then used to export these variables as data tables ready for import into our EDC system. The data tables were then used to initialize the first 296 patients within the EDC system by pseudonym, visit and data values. Afterwards, these records were filled with clinical data on heart failure, vital parameters and time spent on selected wards. This solution focuses on the comprehensive access and transformation of data for a DWH-EDC system linkage. Using this system in a large clinical study has demonstrated the feasibility of this approach for a study with a complex visit schedule.

  5. Building the Readiness Data Warehouse

    National Research Council Canada - National Science Library

    Tysor, Sue

    2000-01-01

    .... This is the role of the data warehouse. The data warehouse will deliver business intelligence based on operational data, decision support data and external data to all business units in the organization...

  6. Contextualizing Data Warehouses with Documents

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    2008-01-01

    warehouse with a document warehouse, resulting in a contextualized warehouse. Thus, the user first selects an analysis context by supplying some keywords. Then, the analysis is performed on a novel type of OLAP cube, called an R-cube, which is materialized by retrieving and ranking the documents...

  7. Usage of data warehouse for analysing software's bugs

    Science.gov (United States)

    Živanov, Danijel; Krstićev, Danijela Boberić; Mirković, Duško

    2017-07-01

    We analysed the database schema of Bugzilla system and taking into account user's requirements for reporting, we presented a dimensional model for the data warehouse which will be used for reporting software defects. The idea proposed in this paper is not to throw away Bugzilla system because it certainly has many strengths, but to make integration of Bugzilla and the proposed data warehouse. Bugzilla would continue to be used for recording bugs that occur during the development and maintenance of software while the data warehouse would be used for storing data on bugs in an appropriate form, which is more suitable for analysis.

  8. TRUNCATULIX--a data warehouse for the legume community.

    Science.gov (United States)

    Henckel, Kolja; Runte, Kai J; Bekel, Thomas; Dondrup, Michael; Jakobi, Tobias; Küster, Helge; Goesmann, Alexander

    2009-02-11

    Databases for either sequence, annotation, or microarray experiments data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly adresses this problem and solves it by integrating all required data into one single database - hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data. The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage. The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.

  9. InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.

    Science.gov (United States)

    Smith, Richard N; Aleksic, Jelena; Butano, Daniela; Carr, Adrian; Contrino, Sergio; Hu, Fengyuan; Lyne, Mike; Lyne, Rachel; Kalderimis, Alex; Rutherford, Kim; Stepan, Radek; Sullivan, Julie; Wakeling, Matthew; Watkins, Xavier; Micklem, Gos

    2012-12-01

    InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of 'widgets' performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages. Freely available from http://www.intermine.org under the LGPL license. g.micklem@gen.cam.ac.uk Supplementary data are available at Bioinformatics online.

  10. DATA WAREHOUSES SECURITY IMPLEMENTATION

    Directory of Open Access Journals (Sweden)

    Burtescu Emil

    2009-05-01

    Full Text Available Data warehouses were initially implemented and developed by the big firms and they were used for working out the managerial problems and for making decisions. Later on, because of the economic tendencies and of the technological progress, the data warehou

  11. A clinician friendly data warehouse oriented toward narrative reports: Dr. Warehouse.

    Science.gov (United States)

    Garcelon, Nicolas; Neuraz, Antoine; Salomon, Rémi; Faour, Hassan; Benoit, Vincent; Delapalme, Arthur; Munnich, Arnold; Burgun, Anita; Rance, Bastien

    2018-04-01

    Clinical data warehouses are often oriented toward integration and exploration of coded data. However narrative reports are of crucial importance for translational research. This paper describes Dr. Warehouse®, an open source data warehouse oriented toward clinical narrative reports and designed to support clinicians' day-to-day use. Dr. Warehouse relies on an original database model to focus on documents in addition to facts. Besides classical querying functionalities, the system provides an advanced search engine and Graphical User Interfaces adapted to the exploration of text. Dr. Warehouse is dedicated to translational research with cohort recruitment capabilities, high throughput phenotyping and patient centric views (including similarity metrics among patients). These features leverage Natural Language Processing based on the extraction of UMLS® concepts, as well as negation and family history detection. A survey conducted after 6 months of use at the Necker Children's Hospital shows a high rate of satisfaction among the users (96.6%). During this period, 122 users performed 2837 queries, accessed 4,267 patients' records and included 36,632 patients in 131 cohorts. The source code is available at this github link https://github.com/imagine-bdd/DRWH. A demonstration based on PubMed abstracts is available at https://imagine-plateforme-bdd.fr/dwh_pubmed/. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  12. The Data Warehouse Lifecycle Toolkit

    CERN Document Server

    Kimball, Ralph; Thornthwaite, Warren; Mundy, Joy; Becker, Bob

    2011-01-01

    A thorough update to the industry standard for designing, developing, and deploying data warehouse and business intelligence systemsThe world of data warehousing has changed remarkably since the first edition of The Data Warehouse Lifecycle Toolkit was published in 1998. In that time, the data warehouse industry has reached full maturity and acceptance, hardware and software have made staggering advances, and the techniques promoted in the premiere edition of this book have been adopted by nearly all data warehouse vendors and practitioners. In addition, the term "business intelligence" emerge

  13. Health Claims Data Warehouse (HCDW)

    Data.gov (United States)

    Office of Personnel Management — The Health Claims Data Warehouse (HCDW) will receive and analyze health claims data to support management and administrative purposes. The Federal Employee Health...

  14. Fire detection in warehouse facilities

    CERN Document Server

    Dinaburg, Joshua

    2013-01-01

    Automatic sprinklers systems are the primary fire protection system in warehouse and storage facilities. The effectiveness of this strategy has come into question due to the challenges presented by modern warehouse facilities, including increased storage heights and areas, automated storage retrieval systems (ASRS), limitations on water supplies, and changes in firefighting strategies. The application of fire detection devices used to provide early warning and notification of incipient warehouse fire events is being considered as a component of modern warehouse fire protection.Fire Detection i

  15. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  16. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  17. Warehouses information system design and development

    Science.gov (United States)

    Darajatun, R. A.; Sukanta

    2017-12-01

    Materials/goods handling industry is fundamental for companies to ensure the smooth running of their warehouses. Efficiency and organization within every aspect of the business is essential in order to gain a competitive advantage. The purpose of this research is design and development of Kanban of inventory storage and delivery system. Application aims to facilitate inventory stock checks to be more efficient and effective. Users easily input finished goods from production department, warehouse, customer, and also suppliers. Master data designed as complete as possible to be prepared applications used in a variety of process logistic warehouse variations. The author uses Java programming language to develop the application, which is used for building Java Web applications, while the database used is MySQL. System development methodology that I use is the Waterfall methodology. Waterfall methodology has several stages of the Analysis, System Design, Implementation, Integration, Operation and Maintenance. In the process of collecting data the author uses the method of observation, interviews, and literature.

  18. TRUNCATULIX – a data warehouse for the legume community

    Directory of Open Access Journals (Sweden)

    Runte Kai J

    2009-02-01

    Full Text Available Abstract Background Databases for either sequence, annotation, or microarray experiments data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly adresses this problem and solves it by integrating all required data into one single database – hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data. Results The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage. Conclusion The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.

  19. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  20. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio eHernández

    2015-06-01

    Full Text Available Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful that Bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: a remote homology searches using Psi-Blast, b detection of functional motifs and domains, c analysis of data from protein-protein interaction databases (PPIs, d match the query protein sequence to 3D databases (i.e., algorithms as PISITE, e mutation correlation analysis between amino acids by algorithms as MISTIC. Programs designed to identify functional motif/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs have the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations –it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/, previously published by our group, has been used as a benchmark for the all of the analyses.

  1. Model Data Warehouse dan Business Intelligence untuk Meningkatkan Penjualan pada PT. S

    Directory of Open Access Journals (Sweden)

    Rudy Rudy

    2011-06-01

    Full Text Available Today a lot of companies use information system in every business activity. Every transaction is stored electronically in the database transaction. The transactional database does not help much to assist the executives in making strategic decisions to improve the company competitiveness. The objective of this research is to analyze the operational database system and the information needed by the management to design a data warehouse model which fits the executive information needs in PT. S. The research method uses the Nine-Step Methodology data warehouse design by Ralph Kimball. The result is a data warehouse featuring business intelligence applications to display information of historical data in tables, graphs, pivot tables, and dashboards and has several points of view for the management. This research concludes that a data warehouse which combines multiple database transactions with business intelligence application can help executives to understand the reports in order to accelerate decision-making processes. 

  2. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    Science.gov (United States)

    The EPA DSSTox website (http://www/epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  3. University Accreditation using Data Warehouse

    Science.gov (United States)

    Sinaga, A. S.; Girsang, A. S.

    2017-01-01

    The accreditation aims assuring the quality the quality of the institution education. The institution needs the comprehensive documents for giving the information accurately before reviewed by assessor. Therefore, academic documents should be stored effectively to ease fulfilling the requirement of accreditation. However, the data are generally derived from various sources, various types, not structured and dispersed. This paper proposes designing a data warehouse to integrate all various data to prepare a good academic document for accreditation in a university. The data warehouse is built using nine steps that was introduced by Kimball. This method is applied to produce a data warehouse based on the accreditation assessment focusing in academic part. The data warehouse shows that it can analyse the data to prepare the accreditation assessment documents.

  4. Securing Document Warehouses against Brute Force Query Attacks

    Directory of Open Access Journals (Sweden)

    Sergey Vladimirovich Zapechnikov

    2017-04-01

    Full Text Available The paper presents the scheme of data management and protocols for securing document collection against adversary users who try to abuse their access rights to find out the full content of confidential documents. The configuration of secure document retrieval system is described and a suite of protocols among the clients, warehouse server, audit server and database management server is specified. The scheme makes it infeasible for clients to establish correspondence between the documents relevant to different search queries until a moderator won’t give access to these documents. The proposed solution allows ensuring higher security level for document warehouses.

  5. Databases

    Digital Repository Service at National Institute of Oceanography (India)

    Kunte, P.D.

    Information on bibliographic as well as numeric/textual databases relevant to coastal geomorphology has been included in a tabular form. Databases cover a broad spectrum of related subjects like coastal environment and population aspects, coastline...

  6. Data Warehouse Emissieregistratie. A new tool to sustainability; Data Warehouse Emissieregistratie. Een nieuw instrument op weg naar duurzaamheid

    Energy Technology Data Exchange (ETDEWEB)

    Van Grootveld, G. [VROM-Inpsectie, Den Haag (Netherlands); Op den Kamp, A. [OpdenKamp Adviesgroep, Den Haag (Netherlands)

    2002-12-01

    An overview is given of the possibilities to use and search the title database which contains data on emission of pollution sources in different sectors in the Netherlands. [Dutch] De voorliggende publicatie illustreert de kracht van het Data Warehouse aan de hand van zeven voorbeelden in de hoofdstukken 3 tot en met 9. Daarbij wordt telkens ook een doorkijk naar duurzame ontwikkeling gegeven.In hoofdstuk 10 worden twee cases met een korte handleiding behandeld. In hoofdstuk 1 staat achtergrondinformatie over de milieubeleidketen en de plaats die monitoring daarin neemt. In hoofdstuk 2 worden kort de drie dimensies van het Data Warehouse en de mogelijkheden die het Data Warehouse biedt beschreven (www.emissieregistratie.nl)

  7. The design and application of data warehouse during modern enterprises environment

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Wang, Chunying

    2006-04-01

    The interest in analyzing data has grown tremendously in recent years. To analyze data, a multitude of technologies is need, namely technologies from the fields of Data Warehouse, Data Mining, On-line Analytical Processing (OLAP). This paper proposes the system structure model of the data warehouse during modern enterprises environment according to the information demand for enterprises and the actual demand of user's, and also analyses the benefit of this kind of model in practical application, and provides the setting-up course of the data warehouse model. At the same time it has proposes the total design plans of the data warehouses of modern enterprises. The data warehouse that we build in practical application can be offered: high performance of queries; efficiency of the data; independent characteristic of logical and physical data. In addition, A Data Warehouse contains lots of materialized views over the data provided by the distributed heterogeneous databases for the purpose of efficiently implementing decision-support, OLAP queries or data mining. One of the most important decisions in designing a data warehouse is selection of right views to be materialized. In this paper, we also have designed algorithms for selecting a set of views to be materialized in a data warehouse.First, we give the algorithms for selecting materialized views. Then we use experiments do demonstrate the power of our approach. The results show the proposed algorithm delivers an optimal solution. Finally, we discuss the advantage and shortcoming of our approach and future work.

  8. Databases

    Directory of Open Access Journals (Sweden)

    Nick Ryan

    2004-01-01

    Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.

  9. PEMAHAMAN TEORI DATA WAREHOUSE BAGI MAHASISWA TAHUN AWAL JENJANG STRATA SATU BIDANG ILMU KOMPUTER

    Directory of Open Access Journals (Sweden)

    Harco Leslie Hendric Spits Warnars

    2015-01-01

    Full Text Available As a Computer scientist, a computer science students should have understanding about database theory as a concept of data maintenance. Database will be needed in every single human real life computer implementation such as information systems, information technology, internet, games, artificial intelligence, robot and so on. Inevitably, the right data handling and managament will produce excellent technology implementation. Data warehouse as one of the specialization subject which is offered in computer science study program final semester, provide challenge for computer science students.A survey was conducted on 18 students of early year of computer science study program at Surya university and giving hypothesis that for those students who ever heard of a data warehouse would be interested to learn data warehouse and on other hand, students who had never heard of the data warehouse will not be interested to learn data warehouse. Therefore, it is important that delivery of the Data warehouse subject material should be understood by lecturers, so that students can well understoodwith the data warehouse.

  10. Structuring warehouse management : Exploring the fit between warehouse characteristics and warehouse planning and control structure, and its effect on warehouse performance

    NARCIS (Netherlands)

    N. Faber (Nynke)

    2015-01-01

    markdownabstractThis dissertation studies the management processes that plan, control, and optimize warehouse operations. The inventory in warehouses decouples supply from demand. As such, economies of scale can be achieved in production, purchasing, and transport. As warehouses become more and more

  11. Electronic warehouse receipts registry as a step from paper to electronic warehouse receipts

    Directory of Open Access Journals (Sweden)

    Kovačević Vlado

    2016-01-01

    Full Text Available The aim of this paper is to determine the economic viability of the electronic warehouse receipt registry introduction, as a step toward electronic warehouse receipts. Both forms of warehouse receipt paper and electronic exist in practice, but paper warehouse receipts are more widespread. In this paper, the dematerialization process is analyzed in two steps. The first step is the dematerialization of warehouse receipt registry, with warehouse receipts still in paper form. The second step is the introduction of electronic warehouse receipts themselves. Dematerialization of warehouse receipts is more complex than that for financial securities, because of the individual characteristics of each warehouse receipt. As a consequence, electronic warehouse receipts are in place for only to a handful of commodities, namely cotton and a few grains. Nevertheless, the movement towards the electronic warehouse receipt, which began several decades ago with financial securities, is now taking hold in the agricultural sector. In this paper is analyzed Serbian electronic registry, since the Serbia is first country in EU with electronic warehouse receipts registry donated by FAO. Performed analysis shows the considerable impact of electronic warehouse receipts registry establishment on enhancing the security of the system of public warehouses, and on advancing the trade with warehouse receipt.

  12. Bioinformatics tools and database resources for systems genetics analysis in miceça short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, M.C.; Swertz, M.A.; Alberts, R.; Arends, D.; Möller, S.; Mott, R.; Prins, J.C.P.; Velde, van der K.J.; Jansen, R.C.; Schughart, K.

    2012-01-01

    During a meeting of the SYSGENET working group ‘Bioinformatics’, currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  13. Energy Finance Data Warehouse Manual

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Chinthavali, Supriya [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shankar, Mallikarjun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Zeng, Claire [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Hendrickson, Stephen [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-11-30

    The Office of Energy Policy and Systems Analysis s finance team (EPSA-50) requires a suite of automated applications that can extract specific data from a flexible data warehouse (where datasets characterizing energy-related finance, economics and markets are maintained and integrated), perform relevant operations and creatively visualize them to provide a better understanding of what policy options affect various operators/sectors of the electricity system. In addition, the underlying data warehouse should be structured in the most effective and efficient way so that it can become increasingly valuable over time. This report describes the Energy Finance Data Warehouse (EFDW) framework that has been developed to accomplish the defined requirement above. We also specifically dive into the Sankey generator use-case scenario to explain the components of the EFDW framework and their roles. An excel-based data warehouse was used in the creation of the energy finance Sankey diagram and other detailed data finance visualizations to support energy policy analysis. The framework also captures the methodology, calculations and estimations analysts used for the calculation as well as relevant sources so newer analysts can build on work done previously.

  14. Selecting materialized views in a data warehouse

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Liu, Daxin

    2003-01-01

    A Data Warehouse contains lots of materialized views over the data provided by the distributed heterogeneous databases for the purpose of efficiently implementing decision-support or OLAP queries. It is important to select the right view to materialize that answer a given set of queries. In this paper, we have addressed and designed algorithm to select a set of views to materialize in order to answer the most queries under the constraint of a given space. The algorithm presented in this paper aim at making out a minimum set of views, by which we can directly respond to as many as possible user"s query requests. We use experiments to demonstrate our approach. The results show that our algorithm works better. We implemented our algorithms and a performance study of the algorithm shows that the proposed algorithm gives a less complexity and higher speeds and feasible expandability.

  15. A Framework for Designing a Healthcare Outcome Data Warehouse

    Science.gov (United States)

    Parmanto, Bambang; Scotch, Matthew; Ahmad, Sjarif

    2005-01-01

    Many healthcare processes involve a series of patient visits or a series of outcomes. The modeling of outcomes associated with these types of healthcare processes is different from and not as well understood as the modeling of standard industry environments. For this reason, the typical multidimensional data warehouse designs that are frequently seen in other industries are often not a good match for data obtained from healthcare processes. Dimensional modeling is a data warehouse design technique that uses a data structure similar to the easily understood entity-relationship (ER) model but is sophisticated in that it supports high-performance data access. In the context of rehabilitation services, we implemented a slight variation of the dimensional modeling technique to make a data warehouse more appropriate for healthcare. One of the key aspects of designing a healthcare data warehouse is finding the right grain (scope) for different levels of analysis. We propose three levels of grain that enable the analysis of healthcare outcomes from highly summarized reports on episodes of care to fine-grained studies of progress from one treatment visit to the next. These grains allow the database to support multiple levels of analysis, which is imperative for healthcare decision making. PMID:18066371

  16. A framework for designing a healthcare outcome data warehouse.

    Science.gov (United States)

    Parmanto, Bambang; Scotch, Matthew; Ahmad, Sjarif

    2005-09-06

    Many healthcare processes involve a series of patient visits or a series of outcomes. The modeling of outcomes associated with these types of healthcare processes is different from and not as well understood as the modeling of standard industry environments. For this reason, the typical multidimensional data warehouse designs that are frequently seen in other industries are often not a good match for data obtained from healthcare processes. Dimensional modeling is a data warehouse design technique that uses a data structure similar to the easily understood entity-relationship (ER) model but is sophisticated in that it supports high-performance data access. In the context of rehabilitation services, we implemented a slight variation of the dimensional modeling technique to make a data warehouse more appropriate for healthcare. One of the key aspects of designing a healthcare data warehouse is finding the right grain (scope) for different levels of analysis. We propose three levels of grain that enable the analysis of healthcare outcomes from highly summarized reports on episodes of care to fine-grained studies of progress from one treatment visit to the next. These grains allow the database to support multiple levels of analysis, which is imperative for healthcare decision making.

  17. Perancangan Data Warehouse Nilai Mahasiswa Dengan Kimball Nine-Step Methodology

    Directory of Open Access Journals (Sweden)

    Ganda Wijaya

    2017-04-01

    Abstract Student grades has many components that can be analyzed to support decision making. Based on this, the authors conducted a study of student grades. The study was conducted on a database that is in the Bureau of Academic and Student Affairs Administration Bina Sarana Informatika (BAAK BSI. The focus of this research is "How to model a data warehouse that can meet the management needs of the data value of students as supporters of evaluation, planning and decision making?". Data warehouse grades students need to be made in order to obtain the information, reports, and can perform multi-dimensional analysis, which in turn can assist management in making policy. Development of the system is done by using System Development Life Cycle (SDLC with Waterfall approach. While the design of the data warehouse using a nine-step methodology kimball. Results obtained in the form of a star schema and data warehouse value. Data warehouses can provide a summary of information that is fast, accurate and continuous so as to assist management in making policies for the future. In general, the benefits of this research are as additional reference in building a data warehouse using a nine-step methodology kimball.   Keywords: Data Warehouse, Kimball Nine-Step Methodology.

  18. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  19. A Million Cancer Genome Warehouse

    Science.gov (United States)

    2012-11-20

    of a national program for Cancer Information Donors, the American Society for Clinical Oncology (ASCO) has proposed a rapid learning system for...or Scala and Spark; “scrum” organization of small programming teams; calculating “velocity” to predict time to develop new features; and Agile...2012 to 00-00-2012 4. TITLE AND SUBTITLE A Million Cancer Genome Warehouse 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6

  20. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  1. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  2. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  3. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  4. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  5. [Peranesthesic Anaphylactic Shocks: Contribution of a Clinical Data Warehouse].

    Science.gov (United States)

    Osmont, Marie-Noëlle; Campillo-Gimenez, Boris; Metayer, Lucie; Jantzem, Hélène; Rochefort-Morel, Cécile; Cuggia, Marc; Polard, Elisabeth

    2015-10-16

    To evaluate the performance of the collection of cases of anaphylactic shock during anesthesia in the Regional Pharmacovigilance Center of Rennes and the contribution of a query in the biomedical data warehouse of the French University Hospital of Rennes in 2009. Different sources were evaluated: the French pharmacovigilance database (including spontaneous reports and reports from a query in the database of the programme de médicalisation des systèmes d'information [PMSI]), records of patients seen in allergo-anesthesia (source considered as comprehensive as possible) and a query in the data warehouse. Analysis of allergo-anesthesia records detected all cases identified by other methods, as well as two other cases (nine cases in total). The query in the data warehouse enabled detection of seven cases out of the nine. Querying full-text reports and structured data extracted from the hospital information system improves the detection of anaphylaxis during anesthesia and facilitates access to data. © 2015 Société Française de Pharmacologie et de Thérapeutique.

  6. Creation of Warehouse Models for Different Layout Designs

    OpenAIRE

    Köhler, Mirko; Lukić, Ivica; Nenadić, Krešimir

    2014-01-01

    Warehouse is one of the most important components in logistics of the supply chain network. Efficiency of warehouse operations is influenced by many different factors. One of the key factors is the racks layout configuration. A warehouse with good racks layout may significantly reduce the cost of warehouse servicing. The objective of this paper is to give a scheme for building warehouses models with one-block and two-block layout for future research in warehouse optimization. An algorithm ...

  7. ¿Why Data warehouse & Business Intelligence at Universidad Simon Bolivar?

    Directory of Open Access Journals (Sweden)

    Kamagate Azoumana

    2013-01-01

    Full Text Available Abstract The data warehouse is supposed to provide storage, functionality and responsiveness to queries beyond the capabilities of today’s transaction databases. Also Data warehouse is built to improve the data access performance of databases.   Resumen Los almacenes de datos se supone que proporcionan almacenamiento, funcionalidad y capacidad de repuesta a las consultas y análisis más eficiente que las bases de datos transaccionales. También el almacén de datos se construiye para mejorar el rendimiento de acceso a los datos.

  8. Análisis de rendimiento académico estudiantil usando data warehouse y redes neuronales Analysis of students' academic performance using data warehouse and neural networks

    Directory of Open Access Journals (Sweden)

    Carolina Zambrano Matamala

    2011-12-01

    Full Text Available Cada día las organizaciones tienen más información porque sus sistemas producen una gran cantidad de operaciones diarias que se almacenan en bases de datos transaccionales. Con el fin de analizar esta información histórica, una alternativa interesante es implementar un Data Warehouse. Por otro lado, los Data Warehouse no son capaces de realizar un análisis predictivo por sí mismos, pero las técnicas de inteligencia de máquinas se pueden utilizar para clasificar, agrupar y predecir en base a información histórica con el fin de mejorar la calidad del análisis. En este trabajo se describe una arquitectura de Data Warehouse con el fin de realizar un análisis del desempeño académico de los estudiantes. El Data Warehouse es utilizado como entrada de una arquitectura de red neuronal con tal de analizar la información histórica y de tendencia en el tiempo. Los resultados muestran la viabilidad de utilizar un Data Warehouse para el análisis de rendimiento académico y la posibilidad de predecir el número de asignaturas aprobadas por los estudiantes usando solamente su propia información histórica.Every day organizations have more information because their systems produce a large amount of daily operations which are stored in transactional databases. In order to analyze this historical information, an interesting alternative is to implement a Data Warehouse. In the other hand, Data Warehouses are not able to perform predictive analysis for themselves, but machine learning techniques can be used to classify, grouping and predict historical information in order to improve the quality of analysis. This paper depicts architecture of a Data Warehouse useful to perform an analysis of students' academic performance. The Data Warehouse is used as input of a Neural Network in order to analyze historical information and forecast. The results show the viability of using Data Warehouse for academic performance analysis and the feasibility of

  9. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  10. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  11. Worldwide Warehouse: A Customer Perspective

    Science.gov (United States)

    1994-09-01

    Management Office (PMO) and the customers (returnees and buyers) 23 will be developed or adapted from existing software programs. The hardware could be... customer requirements and desires is the first aspect to be approached. Sections 4.7 to 4.11 were dedicated to inivestigate those relationships and...R x NTIS CRA&I DTIC TAB WORLDWIDE WAREHOUSE: Ju’a-noj1c0[ed 0 A CUSTOMER PERSPECTIVE J-f-c-.tion .......... THESIS By D i s ib , tio

  12. Information Architecture: The Data Warehouse Foundation.

    Science.gov (United States)

    Thomas, Charles R.

    1997-01-01

    Colleges and universities are initiating data warehouse projects to provide integrated information for planning and reporting purposes. A survey of 40 institutions with active data warehouse projects reveals the kinds of tools, contents, data cycles, and access currently used. Essential elements of an integrated information architecture are…

  13. Integrating Data Warehouses with Web Data

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query and retrieve web data, and their application to data warehouses. The paper addresses the problem of integrating...

  14. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  15. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. Data warehouse for assessing animal health, welfare, risk management and -communication.

    Science.gov (United States)

    Nielsen, Annette Cleveland

    2011-01-01

    The objective of this paper is to give an overview of existing databases in Denmark and describe some of the most important of these in relation to establishment of the Danish Veterinary and Food Administrations' veterinary data warehouse. The purpose of the data warehouse and possible use of the data are described. Finally, sharing of data and validity of data is discussed. There are databases in other countries describing animal husbandry and veterinary antimicrobial consumption, but Denmark will be the first country relating all data concerning animal husbandry, -health and -welfare in Danish production animals to each other in a data warehouse. Moreover, creating access to these data for researchers and authorities will hopefully result in easier and more substantial risk based control, risk management and risk communication by the authorities and access to data for researchers for epidemiological studies in animal health and welfare.

  17. PERANCANGAN DATA WAREHOUSE UNTUK MENDUKUNG PERENCANAAN PEMASARAN PERGURUAN TINGGI

    Directory of Open Access Journals (Sweden)

    agung prasetyo

    2017-02-01

    Full Text Available Salah satu indikasi perguruan tinggi yang besar adalah dilihat dari jumlah mahasiswa di perguruan tinggi tersebut. Karenanya, mahasiswa baru merupakan salah satu sumber daya yang menentukan berjalannya sebuah perguruan tinggi. Setiap tahunnya STMIK AMIKOM Purwokerto selalu melakukan penerimaan calon mahasiswa. Data mahasiswa baru tersebut sangat berguna bagi bagian pemasaran sebagai informasi untuk evaluasi kegiatan pemasaran berikutnya. Dengan dibangunnya data warehouse dan aplikasi OLAP dengan menggunakan aplikasi Pentaho Data Integration/Kettle sebagai perangkat ETL dan Pentaho Workbench yang merupakan Online Analytical Processing (OLAP sebagai pengolah database, manajemen di STMIK AMIKOM Purwokerto bisa mengambil beberapa informasi misalnya; banyak jumlah pendaftar per-periode/gelombang, per/asal sekolahnya, per/asal sumber informasi yang diperoleh calon mahasiswa baru, serta tren minat terhadap jurusan yang dipilih oleh calon mahasiswa baru. Data warehouse mampu menganalisis data transaksi, mampu memberikan laporan yang dinamis dan mampu memberikan informasi dalam berbagai dimensi tentang penerimaan mahasiswa baru di STMIK AMIKOM Purwokerto.Kata Kunci: Data Warehouse, OLAP, Pentaho, Penerimaan Calon Mahasiswa.

  18. Data Warehouse on the Web for Accelerator Fabrication And Maintenance

    International Nuclear Information System (INIS)

    Chan, A.; Crane, G.; Macgregor, I.; Meyer, S.

    2011-01-01

    A data warehouse grew out of the needs for a view of accelerator information from a lab-wide or project-wide standpoint (often needing off-site data access for the multi-lab PEP-II collaborators). A World Wide Web interface is used to link legacy database systems of the various labs and departments related to the PEP-II Accelerator. In this paper, we describe how links are made via the 'Formal Device Name' field(s) in the disparate databases. We also describe the functionality of a data warehouse in an accelerator environment. One can pick devices from the PEP-II Component List and find the actual components filling the functional slots, any calibration measurements, fabrication history, associated cables and modules, and operational maintenance records for the components. Information on inventory, drawings, publications, and purchasing history are also part of the PEP-II Database. A strategy of relying on a small team, and of linking existing databases rather than rebuilding systems is outlined.

  19. PERANCANGAN DAN IMPLEMENTASI DATA WAREHOUSE METEOROLOGI, KLIMATOLOGI, GEOFISIKA DAN BENCANA ALAM

    Directory of Open Access Journals (Sweden)

    Agus Safril

    2014-05-01

    Full Text Available BMKG telah memiliki data berasal dari beberapa sistem basis data historis (legacy system baik yang telah tersimpan dalam sistem informasi database maupun data dalam bentuk lembar kerja (worksheet. Data lama ini sering tidak digunakan ketika  sistem database baru dikembangkan. Agar data lama tetap dapat digunakan, diperlukan integrasi data lama dan baru. Data warehouse adalah konsep yang digunakan untuk mengintegrasikan data dalam penyimpanan sistem database terpadu BMKG. Integrasi data dilakukan dengan melakukan ekstraksi dari sumber data dengan mengambil item data yang diperlukan. Sumber data diperoleh dari sistem informasi yang ada di kelompok meteorologi, klimatologi dan geofisika. Proses integrasi data dimulai dengan ekstraksi (extraction kemudian dilakukan penyeragaman (transformation sehingga sesuai dengan format yang digunakan untuk kepentingan analisis. Selanjutnya dilakukan proses penyimpanan dalam data warehouse (loading. Prototipe data warehouse yang dibangun mencakup proses input data melalui ekstraksi data lama maupun data baru menggunakan media perangkat lunak akuisisi data. Hasil keluaran (output berupa laporan data dengan perioda data sesuai dengan kebutuhan.   The data collections of BMKG is captured from the legacy systems that is stored in the information systems or data worksheet. Sometimes the legacy system is not used when the new DBMS has been developed. In order the legacy system usefull for DBMS of BMKG, the data is integrated from the legacy systems to the new database systems. Data warehouse is the concept to integrate data to the BMKG Data Base Management System (DMBS. To integrate data, data is integrated the data sources from legacy systems that has been stored in the meteorology, climatology and geophysic information system. The next steps is transformed to data that has the format accordance with the weather analysis requirement. Finally, data must be loaded into the data warehouse.  The data warehouse

  20. Study and application of data mining and data warehouse in CIMS

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Liu, Daxin

    2003-03-01

    The interest in analyzing data has grown tremendously in recent years. To analyze data, a multitude of technologies is need, namely technologies from the fields of Data Warehouse, Data Mining, On-line Analytical Processing (OLAP). This paper gives a new architecture of data warehouse in CIMS according to CRGC-CIMS application engineering. The data source of this architecture comes from database of CRGC-CIMS system. The data is put in global data set by extracting, filtrating and integrating, and then the data is translated to data warehouse according information request. We have addressed two advantages of the new model in CRGC-CIMS application. In addition, a Data Warehouse contains lots of materialized views over the data provided by the distributed heterogeneous databases for the purpose of efficiently implementing decision-support, OLAP queries or data mining. It is important to select the right view to materialize that answer a given set of queries. In this paper, we also have designed algorithms for selecting a set of views to be materialized in a data warehouse in order to answer the most queries under the constraint of given space. First, we give a cost model for selecting materialized views. Then we give the algorithms that adopt gradually recursive method from bottom to top. We give description and realization of algorithms. Finally, we discuss the advantage and shortcoming of our approach and future work.

  1. Web-enabled Data Warehouse and Data Webhouse

    Directory of Open Access Journals (Sweden)

    Cerasela PIRVU

    2008-01-01

    Full Text Available In this paper, our objectives are to understanding what data warehouse means examine the reasons for doing so, appreciate the implications of the convergence of Web technologies and those of the data warehouse and examine the steps for building a Web-enabled data warehouse. The web revolution has propelled the data warehouse out onto the main stage, because in many situations the data warehouse must be the engine that controls or analysis the web experience. In order to step up to this new responsibility, the data warehouse must adjust. The nature of the data warehouse needs to be somewhat different. As a result, our data warehouses are becoming data webhouses. The data warehouse is becoming the infrastructure that supports customer relationship management (CRM. And the data warehouse is being asked to make the customer clickstream available for analysis. This rebirth of data warehousing architecture is called the data webhouse.

  2. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Event-Entity-Relationship Modeling in Data Warehouse Environments

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    We use the event-entity-relationship model (EVER) to illustrate the use of entity-based modeling languages for conceptual schema design in data warehouse environments. EVER is a general-purpose information modeling language that supports the specification of both general schema structures and multi......-dimensional schemes that are customized to serve specific information needs. EVER is based on an event concept that is very well suited for multi-dimensional modeling because measurement data often represent events in multi-dimensional databases...

  4. Subcritical calculation of the nuclear material warehouse

    International Nuclear Information System (INIS)

    Garcia M, T.; Mazon R, R.

    2009-01-01

    In this work the subcritical calculation of the nuclear material warehouse of the Reactor TRIGA Mark III labyrinth in the Mexico Nuclear Center is presented. During the adaptation of the nuclear warehouse (vault I), the fuel was temporarily changed to the warehouse (vault II) and it was also carried out the subcritical calculation for this temporary arrangement. The code used for the calculation of the effective multiplication factor, it was the Monte Carlo N-Particle Extended code known as MCNPX, developed by the National Laboratory of Los Alamos, for the particles transport. (Author)

  5. Building a Data Warehouse step by step

    Directory of Open Access Journals (Sweden)

    2007-01-01

    Full Text Available Data warehouses have been developed to answer the increasing demands of quality information required by the top managers and economic analysts of organizations. Their importance in now a day business area is unanimous recognized, being the foundation for developing business intelligence systems. Data warehouses offer support for decision-making process, allowing complex analyses which cannot be properly achieved from operational systems. This paper presents the ways in which a data warehouse may be developed and the stages of building it.

  6. INNOVATIVE DEVELOPMENT OF WAREHOUSE TECHNOLOGY

    Directory of Open Access Journals (Sweden)

    Judit OLÁH

    2017-12-01

    Full Text Available The smooth operation of stocking and the warehouse play a very important role in all manufacturing companies; therefore ongoing monitoring and application of new techniques is essential to increase efficiency. The aim of our research is twofold: the utilization of the pallet shuttle racking system, and the introduction of a development opportunity by the merging of storage and order picking operations in the pallet shuttle system. It can be concluded that it is beneficial for the company to purchase two mobile cars in order to increase the utilization of the pallet shuttle racking system from 60% to 72% and that of the storage from 74% to 76%. We established that after the merging of the storage and order picking activities within the pallet shuttle system, the forklift driver can also complete the selection activities immediately after storage. By merging the two operations and saving time the number of forklift drivers can be reduced from 4 to 3 per shift.

  7. Efficient data management tools for the heterogeneous big data warehouse

    Science.gov (United States)

    Alekseev, A. A.; Osipova, V. V.; Ivanov, M. A.; Klimentov, A.; Grigorieva, N. V.; Nalamwar, H. S.

    2016-09-01

    The traditional RDBMS has been consistent for the normalized data structures. RDBMS served well for decades, but the technology is not optimal for data processing and analysis in data intensive fields like social networks, oil-gas industry, experiments at the Large Hadron Collider, etc. Several challenges have been raised recently on the scalability of data warehouse like workload against the transactional schema, in particular for the analysis of archived data or the aggregation of data for summary and accounting purposes. The paper evaluates new database technologies like HBase, Cassandra, and MongoDB commonly referred as NoSQL databases for handling messy, varied and large amount of data. The evaluation depends upon the performance, throughput and scalability of the above technologies for several scientific and industrial use-cases. This paper outlines the technologies and architectures needed for processing Big Data, as well as the description of the back-end application that implements data migration from RDBMS to NoSQL data warehouse, NoSQL database organization and how it could be useful for further data analytics.

  8. [Establishment of data warehouse of needling and moxibustion literature based on data mining].

    Science.gov (United States)

    Wang, Jian-Ling; Li, Ren-Ling; Jia, Chun-Sheng

    2012-02-01

    In order to explore the efficacy specificity and valuable rules of clinical application of needling and moxibustion methods in a large quantity of information from literature, a data warehouse needs being established. On the basis of the original databases of red-hot needle therapy and hydro-acupuncture therapy, and the newly-established databases of acupoint catgut embedding therapy, acupoint application therapy, etc., and in accordance with the characteristics of different types of needling-moxibustion literature information, databases on different subjects were established first. These subject databases constitute a general "literature data warehouse on needling moxibustion methods" composing of multi-subjects and multiple dimensions so as to discover useful regularities about clinical treatment and trials collected in the literature by using data mining techniques. In the present paper, the authors introduce the design of the data warehouse, determination of subjects, establishment of subject relations, application of the administration platform, and application of data. This data warehouse will provide a standard data representation mode, enlarge data attributes and create extensive data links among literature information in the network, and may bring us with considerable convenience and profits in clinical application decision making and scientific research about needling-moxibustion techniques.

  9. Warehouse location and freight attraction in the greater El Paso region.

    Science.gov (United States)

    2013-12-01

    This project analyzes the current and future warehouse and distribution center locations along the El Paso-Juarez regions in the U.S.-Mexico border. This research seeks has developed a comprehensive database to aid in decision support process for ide...

  10. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA of the Object Management Group (OMG, we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  11. Evaluating a healthcare data warehouse for cancer diseases

    OpenAIRE

    Sheta, Dr. Osama E.; Eldeen, Ahmed Nour

    2013-01-01

    This paper presents the evaluation of the architecture of healthcare data warehouse specific to cancer diseases. This data warehouse containing relevant cancer medical information and patient data. The data warehouse provides the source for all current and historical health data to help executive manager and doctors to improve the decision making process for cancer patients. The evaluation model based on Bill Inmon's definition of data warehouse is proposed to evaluate the Cancer data warehouse.

  12. Konsolidasi Data Warehouse untuk Aplikasi Business Intelligence

    Directory of Open Access Journals (Sweden)

    Rudy Rudy

    2012-12-01

    Full Text Available As the business competition is getting strong, corporate leaders need complete data that as a basis for determining future business strategies. Similarly with management of company "A", a pharmaceutical company which has three distribution companies. Each distribution company already has a data warehouse to generate reports for each of them. For business operational and corporate strategies, chairman PT "A" requires an integrated report, so analysis of data owned by the three distribution companies can be done in a full reportto answer the problems faced by the managemet. Thus, data warehouse consilidation can be used as a solution for company "A". Methodology starts with analysis of information needs to be displayed on the application ofbusiness intelligence, data warehouse consolidation, ETL (extract, transform and load, data warehousing, OLAP and Dashboard. Using data warehouse consolidation, information access by management of company "A" can be done in a single presentation, which can display data comparison between the three distribution companies.

  13. Protecting privacy in a clinical data warehouse.

    Science.gov (United States)

    Kong, Guilan; Xiao, Zhichun

    2015-06-01

    Peking University has several prestigious teaching hospitals in China. To make secondary use of massive medical data for research purposes, construction of a clinical data warehouse is imperative in Peking University. However, a big concern for clinical data warehouse construction is how to protect patient privacy. In this project, we propose to use a combination of symmetric block ciphers, asymmetric ciphers, and cryptographic hashing algorithms to protect patient privacy information. The novelty of our privacy protection approach lies in message-level data encryption, the key caching system, and the cryptographic key management system. The proposed privacy protection approach is scalable to clinical data warehouse construction with any size of medical data. With the composite privacy protection approach, the clinical data warehouse can be secure enough to keep the confidential data from leaking to the outside world. © The Author(s) 2014.

  14. Data Warehouse Architecture for Army Installations

    National Research Council Canada - National Science Library

    Reddy, Prameela

    1999-01-01

    .... A data warehouse is a single store of information to answer complex queries from management using cross-functional data to perform advanced data analysis methods and to compare with historical data...

  15. Development of Auto-Stacking Warehouse Truck

    Directory of Open Access Journals (Sweden)

    Kuo-Hsien Hsia

    2018-03-01

    Full Text Available Warehouse automation is a very important issue for the promotion of traditional industries. For the production of larger and stackable products, it is usually necessary to operate a fork-lifter for the stacking and storage of the products by a skilled person. The general autonomous warehouse-truck does not have the ability of stacking objects. In this paper, we develop a prototype of auto-stacking warehouse-truck that can work without direct operation by a skill person. With command made by an RFID card, the stacker truck can take the packaged product to the warehouse on the prior-planned route and store it in a stacking way in the designated storage area, or deliver the product to the shipping area or into the container from the storage area. It can significantly reduce the manpower requirements of the skilled-person of forklift technician and improve the safety of the warehousing area.

  16. Designing a Data Warehouse for Cyber Crimes

    Directory of Open Access Journals (Sweden)

    Il-Yeol Song

    2006-09-01

    Full Text Available One of the greatest challenges facing modern society is the rising tide of cyber crimes. These crimes, since they rarely fit the model of conventional crimes, are difficult to investigate, hard to analyze, and difficult to prosecute. Collecting data in a unified framework is a mandatory step that will assist the investigator in sorting through the mountains of data. In this paper, we explore designing a dimensional model for a data warehouse that can be used in analyzing cyber crime data. We also present some interesting queries and the types of cyber crime analyses that can be performed based on the data warehouse. We discuss several ways of utilizing the data warehouse using OLAP and data mining technologies. We finally discuss legal issues and data population issues for the data warehouse.

  17. Data Warehouse Discovery Framework: The Foundation

    Science.gov (United States)

    Apanowicz, Cas

    The cost of building an Enterprise Data Warehouse Environment runs usually in millions of dollars and takes years to complete. The cost, as big as it is, is not the primary problem for a given corporation. The risk that all money allocated for planning, design and implementation of the Data Warehouse and Business Intelligence Environment may not bring the result expected, fare out way the cost of entire effort [2,10]. The combination of the two above factors is the main reason that Data Warehouse/Business Intelligence is often single most expensive and most risky IT endeavor for companies [13]. That situation was the main author's inspiration behind founding of Infobright Corp and later on the concept of Data Warehouse Discovery Framework.

  18. Warehouse Performance Improvement at Linfox Logistics Indonesia

    OpenAIRE

    Pratama, Riyan Galuh; Simatupang, Togar M

    2013-01-01

    The objective of this research is to provide alternative solutions for Linfox Logistics Indonesia (LLI) in facing warehouse performance issues. The main warehouse performance indicators called Customer Case Filling on Time (CCFOT) and Case Picking Productivity failed to achieve the target. Several analyses were carried out regarding current dispatch process, value stream mapping, and root causes identification. The results find that much waste occurred in dispatch process. Proposed improvemen...

  19. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...... analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work......Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly...

  20. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  1. Perancangan Model Data Warehouse dan Perangkat Analitik untuk Memaksimalkan Proses Pemasaran Hotel: Studi Kasus pada Hotel Abc

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2013-06-01

    Full Text Available The increasing competition in hotel business forces every hotel to be equiped with analysis tools that can maximize its marketing performance. This paper discusses the development of a data warehouse model and analytic tools to enhance the company's competitive advantage through the utilization of a variety of data, information and knowledge held by the company as a raw material in the decision making process. A study is done at ABC Hotel which uses a database to save the transactional record. However, the database cannot be directly used to support analysis and decision making process. Based on this issue, the company needs a data warehouse model and analytic tools that can be used to store large amounts of data and also potentially to gain a new perspective of data distribution which allows to provide reporting and answers of ad hoc users questions and assist managers in making decisions. Further data warehouse model and analytic tools can be used to help manager to formulate planning and marketing strategies. Data are collected through interviews and literature study, followed by data analysis to analyze business processes, to identify the problems and the information to support analysis process. Furthermore, data warehouse is designed using analysis of records related to the activities in hotel's marketing area and data warehouse model. The result of this paper is data warehouse model and analytic tools to analyze the external and transactional data and to support decision making process in marketing area.

  2. Implementasi Data Warehouse dan Data Mining: Studi Kasus Analisis Peminatan Studi Siswa

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2011-06-01

    Full Text Available This paper discusses the implementation of data mining and their role in helping decision-making related to students’ specialization program selection. Currently, the university uses a database to store records of transactions which can not directly be used to assist analysis and decision making. Based on these issues then made the data warehouse design used to store large amounts of data and also has the potential to gain new data distribution perspectives and allows to answer the ad hoc question as well as to perform data analysis. The method used consists of: record analysis related to students’ academic achievement, designing data warehouse and data mining. The paper’s results are in a form of data warehouse and data mining design and its implementation with the classification techniques and association rules. From these results can be seen the students’ tendency and pattern background in choosing the specialization, to help them make decisions. 

  3. System and method for integrating and accessing multiple data sources within a data warehouse architecture

    Science.gov (United States)

    Musick, Charles R [Castro Valley, CA; Critchlow, Terence [Livermore, CA; Ganesh, Madhaven [San Jose, CA; Slezak, Tom [Livermore, CA; Fidelis, Krzysztof [Brentwood, CA

    2006-12-19

    A system and method is disclosed for integrating and accessing multiple data sources within a data warehouse architecture. The metadata formed by the present method provide a way to declaratively present domain specific knowledge, obtained by analyzing data sources, in a consistent and useable way. Four types of information are represented by the metadata: abstract concepts, databases, transformations and mappings. A mediator generator automatically generates data management computer code based on the metadata. The resulting code defines a translation library and a mediator class. The translation library provides a data representation for domain specific knowledge represented in a data warehouse, including "get" and "set" methods for attributes that call transformation methods and derive a value of an attribute if it is missing. The mediator class defines methods that take "distinguished" high-level objects as input and traverse their data structures and enter information into the data warehouse.

  4. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  5. The Implementation of Data Warehouse and OLAP for Rehabilitation Outcome Evaluation: ReDWinE System

    Science.gov (United States)

    Guo, Fei-Ran; Parmanto, Bambang; Irrgang, James J.; Wang, Jiunjie; Fang, Huaijin

    2000-01-01

    We created a data warehouse and OLAP system on the web for outcome evaluation of rehabilitation services. Thirteen outcome indicators were use in this research. Efficiency of therapists and clinics, expected utility of treatments and graphic patterns were generated for data exploration, data mining and decision support. Users can retrieve plenty of graphs and statistical tables without knowing database structure or attributes. Our experiences showed that multi-dimensional database and OLAP could serve as a decision support system.

  6. Work prioritization by using data warehouse solution; Priorizacao de obras usando solucao de data warehouse

    Energy Technology Data Exchange (ETDEWEB)

    Grupelli Junior, Fernando Antonio; Azoni, Edivar Garcia [Companhia Paranaense de Energia (COPEL), Curitiba, PR (Brazil)

    2000-07-01

    This work proposes the utilization of data warehouse technology for helping of gathering adequate and reliable information, and allows the calculation of cost-benefits ratios of work in the distribution primary network. The paper also intends to suggest a better integration and the utilization of the possibility of a data warehouse and his future integration with a geo processing system.

  7. Benchmarking distributed data warehouse solutions for storing genomic variant information

    Science.gov (United States)

    Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

    2017-01-01

    Abstract Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patientss sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently far explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of distributed back-ends offer a good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu on the other hand, is the only solution that guarantees a sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries. In summary, research and clinical applications that require

  8. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  9. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  10. Quality controls in integrative approaches to detect errors and inconsistencies in biological databases

    Directory of Open Access Journals (Sweden)

    Ghisalberti Giorgio

    2010-12-01

    Full Text Available Numerous biomolecular data are available, but they are scattered in many databases and only some of them are curated by experts. Most available data are computationally derived and include errors and inconsistencies. Effective use of available data in order to derive new knowledge hence requires data integration and quality improvement. Many approaches for data integration have been proposed. Data warehousing seams to be the most adequate when comprehensive analysis of integrated data is required. This makes it the most suitable also to implement comprehensive quality controls on integrated data. We previously developed GFINDer (http://www.bioinformatics.polimi.it/GFINDer/, a web system that supports scientists in effectively using available information. It allows comprehensive statistical analysis and mining of functional and phenotypic annotations of gene lists, such as those identified by high-throughput biomolecular experiments. GFINDer backend is composed of a multi-organism genomic and proteomic data warehouse (GPDW. Within the GPDW, several controlled terminologies and ontologies, which describe gene and gene product related biomolecular processes, functions and phenotypes, are imported and integrated, together with their associations with genes and proteins of several organisms. In order to ease maintaining updated the GPDW and to ensure the best possible quality of data integrated in subsequent updating of the data warehouse, we developed several automatic procedures. Within them, we implemented numerous data quality control techniques to test the integrated data for a variety of possible errors and inconsistencies. Among other features, the implemented controls check data structure and completeness, ontological data consistency, ID format and evolution, unexpected data quantification values, and consistency of data from single and multiple sources. We use the implemented controls to analyze the quality of data available from several

  11. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  12. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  13. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  14. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  15. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biolog- ical databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not

  16. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene

  17. Data warehouse based decision support system in nuclear power plants

    International Nuclear Information System (INIS)

    Nadinic, B.

    2004-01-01

    Safety is an important element in business decision making processes in nuclear power plants. Information about component reliability, structures and systems, data recorded during the nuclear power plant's operation and outage periods, as well as experiences from other power plants are located in different database systems throughout the power plant. It would be possible to create a decision support system which would collect data, transform it into a standardized form and store it in a single location in a format more suitable for analyses and knowledge discovery. This single location where the data would be stored would be a data warehouse. Such data warehouse based decision support system could help make decision making processes more efficient by providing more information about business processes and predicting possible consequences of different decisions. Two main functionalities in this decision support system would be an OLAP (On Line Analytical Processing) and a data mining system. An OLAP system would enable the users to perform fast, simple and efficient multidimensional analysis of existing data and identify trends. Data mining techniques and algorithms would help discover new, previously unknown information from the data as well as hidden dependencies between various parameters. Data mining would also enable analysts to create relevant prediction models that could predict behaviour of different systems during operation and inspection results during outages. The basic characteristics and theoretical foundations of such decision support system are described and the reasons for choosing a data warehouse as the underlying structure are explained. The article analyzes obvious business benefits of such system as well as potential uses of OLAP and data mining technologies. Possible implementation methodologies and problems that may arise, especially in the field of data integration, are discussed and analyzed.(author)

  18. Establishment of the Integrated Plant Data Warehouse

    International Nuclear Information System (INIS)

    Oota, Yoshimi; Yoshinaga, Toshiaki

    1999-01-01

    This paper presents 'The Establishment of the Integrated Plant Data Warehouse and Verification Tests on Inter-corporate Electronic Commerce based on the Data Warehouse (PDWH)', one of the 'Shared Infrastructure for the Electronic Commerce Consolidation Project', promoted by the Ministry of International Trade and Industry (MITI) through the Information-Technology Promotion Agency (IPA), Japan. A study group called Japan Plant EC (PlantEC) was organized to perform relevant activities. One of the main activities of plantEC involves the construction of the Integrated (including manufacturers, engineering companies, plant construction companies, and machinery and parts manufacturers, etc.) Data Warehouse which is an essential part of the infrastructure necessary for a system to share information on industrial life cycle ranging from planning/designing to operation/maintenance. Another activity is the utilization of this warehouse for the purpose of conducting verification tests to prove its usefulness. Through these verification tests, PlantEC will endeavor to establish a warehouse with standardized data which can be used for the infrastructure of EC in the process plant industry. (author)

  19. Establishment of the Integrated Plant Data Warehouse

    Energy Technology Data Exchange (ETDEWEB)

    Oota, Yoshimi; Yoshinaga, Toshiaki [Hitachi Works, Hitachi Ltd., hitachi, Ibaraki (Japan)

    1999-07-01

    This paper presents 'The Establishment of the Integrated Plant Data Warehouse and Verification Tests on Inter-corporate Electronic Commerce based on the Data Warehouse (PDWH)', one of the 'Shared Infrastructure for the Electronic Commerce Consolidation Project', promoted by the Ministry of International Trade and Industry (MITI) through the Information-Technology Promotion Agency (IPA), Japan. A study group called Japan Plant EC (PlantEC) was organized to perform relevant activities. One of the main activities of plantEC involves the construction of the Integrated (including manufacturers, engineering companies, plant construction companies, and machinery and parts manufacturers, etc.) Data Warehouse which is an essential part of the infrastructure necessary for a system to share information on industrial life cycle ranging from planning/designing to operation/maintenance. Another activity is the utilization of this warehouse for the purpose of conducting verification tests to prove its usefulness. Through these verification tests, PlantEC will endeavor to establish a warehouse with standardized data which can be used for the infrastructure of EC in the process plant industry. (author)

  20. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  1. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation. Provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems Uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems Contains 45 bioinformatics problems that have been investigated in recent research.

  2. PERANCANGAN SISTEM METADATA UNTUK DATA WAREHOUSE DENGAN STUDI KASUS REVENUE TRACKING PADA PT. TELKOM DIVRE V JAWA TIMUR

    Directory of Open Access Journals (Sweden)

    Yudhi Purwananto

    2004-07-01

    Full Text Available Normal 0 false false false IN X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Data warehouse merupakan media penyimpanan data dalam perusahaan yang diambil dari berbagai sistem dan dapat digunakan untuk berbagai keperluan seperti analisis dan pelaporan. Di PT Telkom Divre V Jawa Timur telah dibangun sebuah data warehouse yang disebut dengan Regional Database. Di Regional Database memerlukan sebuah komponen penting dalam data warehouse yaitu metadata. Definisi metadata secara sederhana adalah "data tentang data". Dalam penelitian ini dirancang sistem metadata dengan studi kasus Revenue Tracking sebagai komponen analisis dan pelaporan pada Regional Database. Metadata sangat perlu digunakan dalam pengelolaan dan memberikan informasi tentang data warehouse. Proses - proses di dalam data warehouse serta komponen - komponen yang berkaitan dengan data warehouse harus saling terintegrasi untuk mewujudkan karakteristik data warehouse yang subject-oriented, integrated, time-variant, dan non-volatile. Karena itu metadata juga harus memiliki kemampuan mempertukarkan informasi (exchange antar komponen dalam data warehouse tersebut. Web service digunakan sebagai mekanisme pertukaran ini. Web service menggunakan teknologi XML dan protokol HTTP dalam berkomunikasi. Dengan web service, setiap komponen

  3. Harvesting Information from a Library Data Warehouse

    Directory of Open Access Journals (Sweden)

    Siew-Phek T. Su

    2017-09-01

    Full Text Available Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1 This concept h s been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2 These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3 Little has been done, however, on the application of this cutting edge technology in the library environment using library data.

  4. Data warehouse til elbilers opladning og elpriser

    DEFF Research Database (Denmark)

    Andersen, Ove; Krogh, Benjamin Bjerre; Torp, Kristian

    Denne rapport præsenterer, hvordan GPS og CAN bus målinger fra opladning af elbilerne er renset for typiske fejl og gemt i et data warehouse. GPS og CAN bus målingerne er i data warehouset integreret med priserne fra det Nordeuropæiske el spotmarked Nord Pool Spot. Denne integration muliggør...... målinger om opladningen af elbiler er sammen med priserne fra el spotmarkedet indlæst i et data warehouse, som er fuldt ud implementeret. Den logiske data model for dette data warehouse præsenteres i detaljer. Håndteringen af GPS og CAN bus målingerne er generisk og kan udvides til nye data kilder...

  5. What Academia Can Gain from Building a Data Warehouse.

    Science.gov (United States)

    Wierschem, David; McMillen, Jeremy; McBroom, Randy

    2003-01-01

    Describes how, when used effectively, data warehouses can be a significant component of strategic decision making on campus. Discusses what a data warehouse is and what its informational contents may include, environmental drivers and obstacles, and strategies to justify developing a data warehouse for an academic institution. (EV)

  6. 7 CFR 735.302 - Paper warehouse receipts.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 7 2010-01-01 2010-01-01 false Paper warehouse receipts. 735.302 Section 735.302... § 735.302 Paper warehouse receipts. Paper warehouse receipts must be issued as follows: (a) On distinctive paper specified by DACO; (b) Printed by a printer authorized by DACO; and (c) Issued, identified...

  7. Nigerian Concept Of Bonded Warehouses And Dry Ports | Ndikom ...

    African Journals Online (AJOL)

    The bonded warehouse in Nigeria is a strategic expansion of ordinary warehouses that are usually developed in the ports and related cities, strictly meant for safe-keeping of cargoes for owners before final take-over by consignees after payment of some customs duties. A bonded warehouse and ICDs are seen as ...

  8. 27 CFR 24.141 - Bonded wine warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Bonded wine warehouse. 24..., DEPARTMENT OF THE TREASURY LIQUORS WINE Establishment and Operations Permanent Discontinuance of Operations § 24.141 Bonded wine warehouse. Where all operations at a bonded wine warehouse are to be permanently...

  9. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Taylor, Ronald C.

    2010-01-01

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.

  10. Bioinformatics-Aided Venomics

    Directory of Open Access Journals (Sweden)

    Quentin Kaas

    2015-06-01

    Full Text Available Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future.

  11. Application of XML in real-time data warehouse

    Science.gov (United States)

    Zhao, Yanhong; Wang, Beizhan; Liu, Lizhao; Ye, Su

    2009-07-01

    At present, XML is one of the most widely-used technologies of data-describing and data-exchanging, and the needs for real-time data make real-time data warehouse a popular area in the research of data warehouse. What effects can we have if we apply XML technology to the research of real-time data warehouse? XML technology solves many technologic problems which are impossible to be addressed in traditional real-time data warehouse, and realize the integration of OLAP (On-line Analytical Processing) and OLTP (Online transaction processing) environment. Then real-time data warehouse can truly be called "real time".

  12. Warehouse receipts functioning to reduce market risk

    Directory of Open Access Journals (Sweden)

    Jovičić Daliborka

    2014-01-01

    Full Text Available Cereal production underlies the market risk to a great extent due to its elastic demand. Prices of grain have cyclic movements and significant decline in the harvest periods as a result of insufficient supply and high demand. The very specificity of agricultural production leads to the fact that agricultures are forced to sell their products at unfavorable conditions in order to resume production. The Public Warehouses System allows the agriculturers, who were previously unable to use the bank loans to finance the continuation of their production, to efficiently acquire the necessary funds, by the support of the warehouse receipts which serve as collaterals. Based on the results obtained by applying statistical methods (variance and standard deviation, as a measure of market risk under the assumption that warehouse receipts' prices will approximately follow the overall consumer price index, it can be concluded that the warehouse receipts trade will have a significant impact on risk reduction in cereal production. Positive effects can be manifested through the stabilization of prices, reduction of cyclic movements in the production of basic grains and, in the final stage, on the country's food security.

  13. IR and OLAP in XML document warehouses

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Pedersen, Torben Bach; Berlanga, Rafael

    2005-01-01

    In this paper we propose to combine IR and OLAP (On-Line Analytical Processing) technologies to exploit a warehouse of text-rich XML documents. In the system we plan to develop, a multidimensional implementation of a relevance modeling document model will be used for interactively querying...

  14. Information Support of Processes in Warehouse Logistics

    Directory of Open Access Journals (Sweden)

    Gordei Kirill

    2013-11-01

    Full Text Available In the conditions of globalization and the world economic communications, the role of information support of business processes increases in various branches and fields of activity. There is not an exception for the warehouse activity. Such information support is realized in warehouse logistic systems. In relation to territorial administratively education, the warehouse logistic system gets a format of difficult social and economic structure which controls the economic streams covering the intermediary, trade and transport organizations and the enterprises of other branches and spheres. Spatial movement of inventory items makes new demands to participants of merchandising. Warehousing (in the meaning – storage – is one of the operations entering into logistic activity, on the organization of a material stream, as a requirement. Therefore, warehousing as "management of spatial movement of stocks" – is justified. Warehousing, in such understanding, tries to get rid of the perception as to containing stocks – a business expensive. This aspiration finds reflection in the logistic systems working by the principle: "just in time", "economical production" and others. Therefore, the role of warehouses as places of storage is transformed to understanding of warehousing as an innovative logistic system.

  15. WAREHOUSE PERFORMANCE MEASUREMENT - A CASE STUDY

    Directory of Open Access Journals (Sweden)

    Crisan Emil

    2009-05-01

    Full Text Available Companies could gain cost advantage using their logistics area of the business. Warehouse management is a possible source of cost improvements from logistics that companies could use during this economic crisis. The goal of this article is to expose a few

  16. Computer modeling of commercial refrigerated warehouse facilities

    International Nuclear Information System (INIS)

    Nicoulin, C.V.; Jacobs, P.C.; Tory, S.

    1997-01-01

    The use of computer models to simulate the energy performance of large commercial refrigeration systems typically found in food processing facilities is an area of engineering practice that has seen little development to date. Current techniques employed in predicting energy consumption by such systems have focused on temperature bin methods of analysis. Existing simulation tools such as DOE2 are designed to model commercial buildings and grocery store refrigeration systems. The HVAC and Refrigeration system performance models in these simulations tools model equipment common to commercial buildings and groceries, and respond to energy-efficiency measures likely to be applied to these building types. The applicability of traditional building energy simulation tools to model refrigerated warehouse performance and analyze energy-saving options is limited. The paper will present the results of modeling work undertaken to evaluate energy savings resulting from incentives offered by a California utility to its Refrigerated Warehouse Program participants. The TRNSYS general-purpose transient simulation model was used to predict facility performance and estimate program savings. Custom TRNSYS components were developed to address modeling issues specific to refrigerated warehouse systems, including warehouse loading door infiltration calculations, an evaporator model, single-state and multi-stage compressor models, evaporative condenser models, and defrost energy requirements. The main focus of the paper will be on the modeling approach. The results from the computer simulations, along with overall program impact evaluation results, will also be presented

  17. Warehouse operations planning model for Bausch & Lomb

    NARCIS (Netherlands)

    Atilgan, Ceren

    2009-01-01

    Operations planning is a major part of the Sales& Operations Planning (S&OP) process. It provides an overview on the operations capacity requirements by considering the supply and demand plan. However, Bausch& Lomb does not have a structured operations planning process for their warehouse

  18. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  19. Managing dual warehouses with an incentive policy for deteriorating items

    Science.gov (United States)

    Yu, Jonas C. P.; Wang, Kung-Jeng; Lin, Yu-Siang

    2016-02-01

    Distributors in a supply chain usually limit their own warehouse in finite capacity for cost reduction and excess stock is held in a rent warehouse. In this study, we examine inventory control for deteriorating items in a two-warehouse setting. Assuming that there is an incentive offered by a rent warehouse that allows the rental fee to decrease over time, the objective of this study is to maximise the joint profit of the manufacturer and the distributor. An optimisation procedure is developed to derive the optimal joint economic lot size policy. Several criteria are identified to select the most appropriate warehouse configuration and inventory policy on the basis of storage duration of materials in a rent warehouse. Sensitivity analysis is done to examine the results of model robustness. The proposed model enables a manufacturer with a channel distributor to coordinate the use of alternative warehouses, and to maximise the joint profit of the manufacturer and the distributor.

  20. MouseMine: a new data warehouse for MGI.

    Science.gov (United States)

    Motenko, H; Neuhauser, S B; O'Keefe, M; Richardson, J E

    2015-08-01

    MouseMine (www.mousemine.org) is a new data warehouse for accessing mouse data from Mouse Genome Informatics (MGI). Based on the InterMine software framework, MouseMine supports powerful query, reporting, and analysis capabilities, the ability to save and combine results from different queries, easy integration into larger workflows, and a comprehensive Web Services layer. Through MouseMine, users can access a significant portion of MGI data in new and useful ways. Importantly, MouseMine is also a member of a growing community of online data resources based on InterMine, including those established by other model organism databases. Adopting common interfaces and collaborating on data representation standards are critical to fostering cross-species data analysis. This paper presents a general introduction to MouseMine, presents examples of its use, and discusses the potential for further integration into the MGI interface.

  1. Evolutionary Multiobjective Query Workload Optimization of Cloud Data Warehouses

    Science.gov (United States)

    Dokeroglu, Tansel; Sert, Seyyit Alper; Cinar, Muhammet Serkan

    2014-01-01

    With the advent of Cloud databases, query optimizers need to find paretooptimal solutions in terms of response time and monetary cost. Our novel approach minimizes both objectives by deploying alternative virtual resources and query plans making use of the virtual resource elasticity of the Cloud. We propose an exact multiobjective branch-and-bound and a robust multiobjective genetic algorithm for the optimization of distributed data warehouse query workloads on the Cloud. In order to investigate the effectiveness of our approach, we incorporate the devised algorithms into a prototype system. Finally, through several experiments that we have conducted with different workloads and virtual resource configurations, we conclude remarkable findings of alternative deployments as well as the advantages and disadvantages of the multiobjective algorithms we propose. PMID:24892048

  2. Modelling of Data Warehouse on Food Distribution Center and Reserves in the Ministry of Agriculture

    Directory of Open Access Journals (Sweden)

    Edi Purnomo Putra

    2015-09-01

    Full Text Available The purpose of this study is to perform database’s planning that supports Prototype Modeling Data Warehouse in the Ministry of Agriculture, especially in the Distribution Center and Reserves in the field of distribution, reserve and price. With the prototype of Data Warehouse, the process of data analysis anddecision-making process by the top management will be easier and more accurate. Research’s method used was data collection and design method. Data warehouse’s design method was done by using Kimball’s nine stepsmethodology. Database design was done by using the ERD (Entity Relationship Diagram and activity diagram. The data used for the analysis was obtained from an interview with the head of Distribution, Reserve and Food Price. The results obtained through the analysis incorporated into the Data Warehouse Prototype have been designed to support decision-making. To conclude, Prototype Data Warehouse facilitates the analysis of data, the searching of history data and decision-making by the top management.

  3. Pengembangan Data Warehouse Menggunakan Pendekatan Data-Driven untuk Membantu Pengelolaan SDM

    Directory of Open Access Journals (Sweden)

    Mujiono Mujiono

    2016-01-01

    Full Text Available The basis of bureaucratic reform is the reform of human resources management. One supporting factor is the development of an employee database. To support the management of human resources required including data warehouse and business intelligent tools. The data warehouse is an integrated concept of reliable data storage to provide support to all the needs of the data analysis. In this study developed a data warehouse using the data-driven approach to the source data comes from SIMPEG, SAPK and electronic presence. Data warehouses are designed using the nine steps methodology and unified modeling language (UML notation. Extract transform load (ETL is done by using Pentaho Data Integration by applying transformation maps. Furthermore, to help human resource management, the system is built to perform online analytical processing (OLAP to facilitate web-based information. In this study generated BI application development framework with Model-View-Controller (MVC architecture and OLAP operations are built using the dynamic query generation, PivotTable, and HighChart to present information about PNS, CPNS, Retirement, Kenpa and Presence

  4. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  5. Automated Data Aggregation for Time-Series Analysis: Study Case on Anaesthesia Data Warehouse.

    Science.gov (United States)

    Lamer, Antoine; Jeanne, Mathieu; Ficheur, Grégoire; Marcilly, Romaric

    2016-01-01

    Data stored in operational databases are not reusable directly. Aggregation modules are necessary to facilitate secondary use. They decrease volume of data while increasing the number of available information. In this paper, we present four automated engines of aggregation, integrated into an anaesthesia data warehouse. Four instances of clinical questions illustrate the use of those engines for various improvements of quality of care: duration of procedure, drug administration, assessment of hypotension and its related treatment.

  6. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  7. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  8. Bioinformatics on the Cloud Computing Platform Azure

    Science.gov (United States)

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  9. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  10. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  11. Handling Imprecision in Qualitative Data Warehouse: Urban Building Sites Annoyance Analysis Use Case

    Science.gov (United States)

    Amanzougarene, F.; Chachoua, M.; Zeitouni, K.

    2013-05-01

    Data warehouse means a decision support database allowing integration, organization, historisation, and management of data from heterogeneous sources, with the aim of exploiting them for decision-making. Data warehouses are essentially based on multidimensional model. This model organizes data into facts (subjects of analysis) and dimensions (axes of analysis). In classical data warehouses, facts are composed of numerical measures and dimensions which characterize it. Dimensions are organized into hierarchical levels of detail. Based on the navigation and aggregation mechanisms offered by OLAP (On-Line Analytical Processing) tools, facts can be analyzed according to the desired level of detail. In real world applications, facts are not always numerical, and can be of qualitative nature. In addition, sometimes a human expert or learned model such as a decision tree provides a qualitative evaluation of phenomenon based on its different parameters i.e. dimensions. Conventional data warehouses are thus not adapted to qualitative reasoning and have not the ability to deal with qualitative data. In previous work, we have proposed an original approach of qualitative data warehouse modeling, which permits integrating qualitative measures. Based on computing with words methodology, we have extended classical multidimensional data model to allow the aggregation and analysis of qualitative data in OLAP environment. We have implemented this model in a Spatial Decision Support System to help managers of public spaces to reduce annoyances and improve the quality of life of the citizens. In this paper, we will focus our study on the representation and management of imprecision in annoyance analysis process. The main objective of this process consists in determining the least harmful scenario of urban building sites, particularly in dense urban environments.

  12. Clinical Use of an Enterprise Data Warehouse

    Science.gov (United States)

    Evans, R. Scott; Lloyd, James F.; Pierce, Lee A.

    2012-01-01

    The enormous amount of data being collected by electronic medical records (EMR) has found additional value when integrated and stored in data warehouses. The enterprise data warehouse (EDW) allows all data from an organization with numerous inpatient and outpatient facilities to be integrated and analyzed. We have found the EDW at Intermountain Healthcare to not only be an essential tool for management and strategic decision making, but also for patient specific clinical decision support. This paper presents the structure and two case studies of a framework that has provided us the ability to create a number of decision support applications that are dependent on the integration of previous enterprise-wide data in addition to a patient’s current information in the EMR. PMID:23304288

  13. Warehouse site selection in an international environment

    Directory of Open Access Journals (Sweden)

    Sebastjan ŠKERLIČ

    2013-01-01

    Full Text Available The changed conditions in the automotive industry as the market and the production are moving from west to east, both at global and at European level, require constant adjustment from Slovenian companies. The companies strive to remain close to their customers and suppliers, as only by maintaining a high quality and streamlined supply chain, their existence within the demanding automotive industry is guaranteed in the long term. Choosing the right location for a warehouse in an international environment is therefore one of the most important strategic decisions that takes into account a number of interrelated factors such as transport networks, transport infrastructure, trade flows and the total cost. This paper aims to explore the important aspects of selecting a location for a warehouse and to identify potential international strategic locations, which could have a significant impact on the future operations of Slovenian companies in the global automotive industry.

  14. Configurable Web Warehouses construction through BPM Systems

    Directory of Open Access Journals (Sweden)

    Andrea Delgado

    2016-08-01

    Full Text Available The process of building Data Warehouses (DW is well known with well defined stages but at the same time, mostly carried out manually by IT people in conjunction with business people. Web Warehouses (WW are DW whose data sources are taken from the web. We define a flexible WW, which can be configured accordingly to different domains, through the selection of the web sources and the definition of data processing characteristics. A Business Process Management (BPM System allows modeling and executing Business Processes (BPs providing support for the automation of processes. To support the process of building flexible WW we propose a two BPs level: a configuration process to support the selection of web sources and the definition of schemas and mappings, and a feeding process which takes the defined configuration and loads the data into the WW. In this paper we present a proof of concept of both processes, with focus on the configuration process and the defined data.

  15. A framework for information warehouse development processes

    OpenAIRE

    Holten, Roland

    1999-01-01

    Since the terms Data Warehouse and On-Line Analytical Processing were proposed by Inmon and Codd, Codd, Sally respectively the traditional ideas of creating information systems in support of management¿s decision became interesting again in theory and practice. Today information warehousing is a strategic market for any data base systems vendor. Nevertheless the theoretical discussions of this topic go back to the early years of the 20th century as far as management science and accounting the...

  16. Warehouse Order-Picking Process. Review

    Directory of Open Access Journals (Sweden)

    E. V. Korobkov

    2015-01-01

    Full Text Available This article describes basic warehousing activities, namely: movement, information storage and transfer, as well as connections between typical warehouse operations (reception, transfer, assigning storage position and put-away, order-picking, hoarding and sorting, cross-docking, shipping. It presents a classification of the warehouse order-picking systems in terms of manual labor on offer as well as external (marketing channels, consumer’s demand structure, supplier’s replenishment structure and inventory level, total production demand, economic situation and internal (mechanization level, information accessibility, warehouse dimensionality, method of dispatch for shipping, zoning, batching, storage assignment method, routing method factors affecting the designing systems complexity. Basic optimization considerations are described. There is a literature review on the following sub-problems of planning and control of orderpicking processes.A layout design problem has been taken in account at two levels — external (facility layout problem and internal (aisle configuration problem. For a problem of distributing goods or stock keeping units the following methods are emphasized: random, nearest open storage position, and dedicated (COI-based, frequency-based distribution, as well as class-based and familygrouped (complimentary- and contact-based one. Batching problem can be solved by two main methods, i.e. proximity order batching (seed and saving algorithms and time-window order batching. There are two strategies for a zoning problem: progressive and synchronized, and also a special case of zoning — bucket brigades method. Hoarding/sorting problem is briefly reviewed. Order-picking routing problem will be thoroughly described in the next article of the cycle “Warehouse order-picking process”.

  17. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  18. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  19. Importance of public warehouse system for financing agribusiness sector

    Directory of Open Access Journals (Sweden)

    Zakić Vladimir

    2014-01-01

    Full Text Available The aim of this study was to determine the economic viability of the use of warehouse receipts for the storage of wheat and corn, based on the analysis of trends in product prices, storage costs in public warehouses and interest rate of loans against warehouse receipts. Agricultural producers are urged to sell grain at the harvest time when the price of agricultural products is usually lowest, mostly because of their needs for financial sources. Instead of selling products, farmers can store them in the public warehouses and use short-time financing by lending against warehouse receipt with usually lowest interest rate. In following months, farmers can sell products at higher price and repay short-term loan. This study showed that strategy of using public warehouses and postponing the sale of grains after harvest is profitable strategy for agricultural producers.

  20. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  1. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  2. Minimizing Warehouse Space through Inventory Reduction at Reckitt Benckiser

    OpenAIRE

    KILINC, IZGI SELEN

    2009-01-01

    This dissertation represents a ten week internship at pharmaceutical plant of Reckitt Benckiser for the Warehouse Stock Reduction Project. Due to foreseeable growth by the factory, there is increasing pressure to utilise existing warehouse space by reducing the existing stock level by 50 %. Therefore, this study aims to identify the opportunities to reduce the physical stock held in raw/pack materials in the warehouse and save space for additional manufacturing resources. The analysis demo...

  3. Deciphering psoriasis. A bioinformatic approach.

    Science.gov (United States)

    Melero, Juan L; Andrades, Sergi; Arola, Lluís; Romeu, Antoni

    2018-02-01

    Psoriasis is an immune-mediated, inflammatory and hyperproliferative disease of the skin and joints. The cause of psoriasis is still unknown. The fundamental feature of the disease is the hyperproliferation of keratinocytes and the recruitment of cells from the immune system in the region of the affected skin, which leads to deregulation of many well-known gene expressions. Based on data mining and bioinformatic scripting, here we show a new dimension of the effect of psoriasis at the genomic level. Using our own pipeline of scripts in Perl and MySql and based on the freely available NCBI Gene Expression Omnibus (GEO) database: DataSet Record GDS4602 (Series GSE13355), we explore the extent of the effect of psoriasis on gene expression in the affected tissue. We give greater insight into the effects of psoriasis on the up-regulation of some genes in the cell cycle (CCNB1, CCNA2, CCNE2, CDK1) or the dynamin system (GBPs, MXs, MFN1), as well as the down-regulation of typical antioxidant genes (catalase, CAT; superoxide dismutases, SOD1-3; and glutathione reductase, GSR). We also provide a complete list of the human genes and how they respond in a state of psoriasis. Our results show that psoriasis affects all chromosomes and many biological functions. If we further consider the stable and mitotically inheritable character of the psoriasis phenotype, and the influence of environmental factors, then it seems that psoriasis has an epigenetic origin. This fit well with the strong hereditary character of the disease as well as its complex genetic background. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.

  4. PEMODELAN INTEGRASI NEARLY REAL TIME DATA WAREHOUSE DENGAN SERVICE ORIENTED ARCHITECTURE UNTUK MENUNJANG SISTEM INFORMASI RETAIL

    Directory of Open Access Journals (Sweden)

    I Made Dwi Jendra Sulastra

    2015-12-01

    Full Text Available Updates the data in the data warehouse is not traditionally done every transaction. Retail information systems require the latest data and can be accessed from anywhere for business analysis needs. Therefore, in this study will be made data warehouse model that is able to produce the information near real time, and can be accessed from anywhere by end users application. Modeling design integration of nearly real time data warehouse (NRTDWH with a service oriented architecture (SOA to support the retail information system is done in two stages. In the first stage will be designed modeling NRTDWH using Change Data Capture (CDC based Transaction Log. In the second stage will be designed modeling NRTDWH integration with SOA-based web service. Tests conducted by a simulation test applications. Test applications used retail information systems, web-based web service client, desktop, and mobile. Results of this study were (1 ETL-based CDC captures changes to the source table and then store it in the database NRTDWH with the help of a scheduler; (2 Middleware web service makes 6 service based on data contained in the database NRTDWH, and each of these services accessible and implemented by the web service client.

  5. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  6. The impact of e-commerce on warehouse operations

    Directory of Open Access Journals (Sweden)

    Wiktor Żuchowski

    2016-03-01

    Full Text Available Background: We often encounter opinions concerning the unusual nature of warehouses used for the purposes of e-commerce, most often spread by providers of modern technological equipment and designers of such solutions. Of course, in the case of newly built facilities, it is advisable to consider innovative technologies, especially in terms of order picking. However, in many cases, the differences between "standard" warehouses, serving, for example, the vehicle spare parts market, and warehouses that are ready to handle retail orders placed electronically (defined as e-commerce are negligible. The scale of the differences between the existing "standard" warehouses and those adapted to handle e-commerce is dependent on the industry and supported of customers' structure. Methods: On the basis of experiences and on examples of enterprises two cases of the impact of a hypothetical e-commerce implementation for the warehouse organization and technology have been analysed. Results: The introduction of e-commerce into warehouses entails respective changes to previously handled orders. Warehouses serving the retail market are in principle prepared to process electronic orders. In this case, the introduction of (direct electronic sales is justified and feasible with relatively little effort. Conclusions: It cannot be said with certainty that the introduction of e-commerce in the warehouse is a revolution for its employees and managers. It depends on the markets in which the company operates, and on customers served by the warehouse prior to the introduction of e-commerce.

  7. Designing a Clinical Data Warehouse Architecture to Support Quality Improvement Initiatives.

    Science.gov (United States)

    Chelico, John D; Wilcox, Adam B; Vawdrey, David K; Kuperman, Gilad J

    2016-01-01

    Clinical data warehouses, initially directed towards clinical research or financial analyses, are evolving to support quality improvement efforts, and must now address the quality improvement life cycle. In addition, data that are needed for quality improvement often do not reside in a single database, requiring easier methods to query data across multiple disparate sources. We created a virtual data warehouse at NewYork Presbyterian Hospital that allowed us to bring together data from several source systems throughout the organization. We also created a framework to match the maturity of a data request in the quality improvement life cycle to proper tools needed for each request. As projects progress in the Define, Measure, Analyze, Improve, Control stages of quality improvement, there is a proper matching of resources the data needs at each step. We describe the analysis and design creating a robust model for applying clinical data warehousing to quality improvement.

  8. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  9. Refrigerated Warehouse Demand Response Strategy Guide

    Energy Technology Data Exchange (ETDEWEB)

    Scott, Doug [VaCom Technologies, San Luis Obispo, CA (United States); Castillo, Rafael [VaCom Technologies, San Luis Obispo, CA (United States); Larson, Kyle [VaCom Technologies, San Luis Obispo, CA (United States); Dobbs, Brian [VaCom Technologies, San Luis Obispo, CA (United States); Olsen, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-11-01

    This guide summarizes demand response measures that can be implemented in refrigerated warehouses. In an appendix, it also addresses related energy efficiency opportunities. Reducing overall grid demand during peak periods and energy consumption has benefits for facility operators, grid operators, utility companies, and society. State wide demand response potential for the refrigerated warehouse sector in California is estimated to be over 22.1 Megawatts. Two categories of demand response strategies are described in this guide: load shifting and load shedding. Load shifting can be accomplished via pre-cooling, capacity limiting, and battery charger load management. Load shedding can be achieved by lighting reduction, demand defrost and defrost termination, infiltration reduction, and shutting down miscellaneous equipment. Estimation of the costs and benefits of demand response participation yields simple payback periods of 2-4 years. To improve demand response performance, it’s suggested to install air curtains and another form of infiltration barrier, such as a rollup door, for the passageways. Further modifications to increase efficiency of the refrigeration unit are also analyzed. A larger condenser can maintain the minimum saturated condensing temperature (SCT) for more hours of the day. Lowering the SCT reduces the compressor lift, which results in an overall increase in refrigeration system capacity and energy efficiency. Another way of saving energy in refrigerated warehouses is eliminating the use of under-floor resistance heaters. A more energy efficient alternative to resistance heaters is to utilize the heat that is being rejected from the condenser through a heat exchanger. These energy efficiency measures improve efficiency either by reducing the required electric energy input for the refrigeration system, by helping to curtail the refrigeration load on the system, or by reducing both the load and required energy input.

  10. Harnessing the Power of Scientific Data Warehouses

    Directory of Open Access Journals (Sweden)

    Kevin Deeb

    2005-04-01

    Full Text Available Data warehousing architecture should generally protect the confidentiality of data before it can be published, provide sufficient granularity to enable scientists to variously manipulate data, support robust metadata services, and define standardized spatial components. Data can then be transformed into information that would make them readily available in a common format that is easily accessible, fast, and bridges the islands of dispersed information. The benefits of the warehouse can be further enhanced by adding a spatial component so that the data can be brought to life, overlapping layers of information in a format that is easily grasped by management, enabling them to tease out trends in their areas of expertise.

  11. Design and Control of Warehouse Order Picking: a literature review

    NARCIS (Netherlands)

    M.B.M. de Koster (René); T. Le-Duc (Tho); K.J. Roodbergen (Kees-Jan)

    2006-01-01

    textabstractOrder picking has long been identified as the most labour-intensive and costly activity for almost every warehouse; the cost of order picking is estimated to be as much as 55% of the total warehouse operating expense. Any underperformance in order picking can lead to unsatisfactory

  12. Building a data warehouse with examples in SQL server

    CERN Document Server

    Rainardi, Vincent

    2008-01-01

    ""Building a Data Warehouse: With Examples in SQL Server"" describes how to build a data warehouse completely from scratch and shows practical examples on how to do it. Author Rainardi also describes some practical issues that developers are likely to encounter in their first data warehousing project, along with solutions and advice.

  13. 27 CFR 46.236 - Articles in a warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 2 2010-04-01 2010-04-01 false Articles in a warehouse... Tubes Held for Sale on April 1, 2009 Filing Requirements § 46.236 Articles in a warehouse. (a) Articles... articles will be offered for sale. (b) Articles offered for sale at several locations must be reported on a...

  14. Multidimensi Pada Data Warehouse Dengan Menggunakan Rumus Kombinasi

    OpenAIRE

    Hendric, Spits Warnars Harco Leslie

    2006-01-01

    Multidimensional in data warehouse is a compulsion and become the most important for information delivery, without multidimensional data warehouse is incomplete. Multidimensional give the able to analyze business measurement in many different ways. Multidimensional is also synonymous with online analytical processing (OLAP).

  15. Optimal time policy for deteriorating items of two-warehouse

    Indian Academy of Sciences (India)

    ... goods in which the first is rented warehouse and the second is own warehouse that deteriorates with two different rates. The aim of this study is to determine the optimal order quantity to maximize the profit of the projected model. Finally, some numerical examples and sensitivity analysis of parameters are made to validate ...

  16. 27 CFR 28.286 - Receipt in customs bonded warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Receipt in customs bonded... in Customs Bonded Warehouse § 28.286 Receipt in customs bonded warehouse. On receipt of the distilled spirits or wine and the related TTB Form 5100.11 or 5110.30 as the case may be, the customs officer in...

  17. 19 CFR 19.1 - Classes of customs warehouses.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Classes of customs warehouses. 19.1 Section 19.1 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY CUSTOMS WAREHOUSES, CONTAINER STATIONS AND CONTROL OF MERCHANDISE THEREIN § 19.1 Classes of...

  18. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  19. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly-used accuracy measures (e.g. sensitivity, PPV, MCC and F-score) as well as it can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. Not only the concept presented in this paper gives a useful framework to design MEA-based estimators but also it is highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017

  20. Congestion-Aware Warehouse Flow Analysis and Optimization

    KAUST Repository

    AlHalawani, Sawsan

    2015-12-18

    Generating realistic configurations of urban models is a vital part of the modeling process, especially if these models are used for evaluation and analysis. In this work, we address the problem of assigning objects to their storage locations inside a warehouse which has a great impact on the quality of operations within a warehouse. Existing storage policies aim to improve the efficiency by minimizing travel time or by classifying the items based on some features. We go beyond existing methods as we analyze warehouse layout network in an attempt to understand the factors that affect traffic within the warehouse. We use simulated annealing based sampling to assign items to their storage locations while reducing traffic congestion and enhancing the speed of order picking processes. The proposed method enables a range of applications including efficient storage assignment, warehouse reliability evaluation and traffic congestion estimation.

  1. Congestion-Aware Warehouse Flow Analysis and Optimization

    KAUST Repository

    AlHalawani, Sawsan; Mitra, Niloy J.

    2015-01-01

    Generating realistic configurations of urban models is a vital part of the modeling process, especially if these models are used for evaluation and analysis. In this work, we address the problem of assigning objects to their storage locations inside a warehouse which has a great impact on the quality of operations within a warehouse. Existing storage policies aim to improve the efficiency by minimizing travel time or by classifying the items based on some features. We go beyond existing methods as we analyze warehouse layout network in an attempt to understand the factors that affect traffic within the warehouse. We use simulated annealing based sampling to assign items to their storage locations while reducing traffic congestion and enhancing the speed of order picking processes. The proposed method enables a range of applications including efficient storage assignment, warehouse reliability evaluation and traffic congestion estimation.

  2. Analisis Dan Perancangan Data Warehouse Pada PT Gajah Tunggal Prakarsa

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2010-12-01

    Full Text Available The purpose of this helpful in making decisions more quickly and precisely. Research methodology includes analysis study was to analyze the data base support in helping decisions making, identifying needs and designing a data warehouse. With the support of data warehouse, company leaders can be more of current systems, library research, designing a data warehouse using star schema. The result of this research is the availability of a data warehouse that can generate information quickly and precisely, thus helping the company in making decisions. The conclusion of this research is the application of data warehouse can be a media aide related parties on PT. Gajah Tunggal initiative in decision making. 

  3. Automatic generation of warehouse mediators using an ontology engine

    Energy Technology Data Exchange (ETDEWEB)

    Critchlow, T., LLNL

    1998-04-01

    Data warehouses created for dynamic scientific environments, such as genetics, face significant challenges to their long-term feasibility One of the most significant of these is the high frequency of schema evolution resulting from both technological advances and scientific insight Failure to quickly incorporate these modifications will quickly render the warehouse obsolete, yet each evolution requires significant effort to ensure the changes are correctly propagated DataFoundry utilizes a mediated warehouse architecture with an ontology infrastructure to reduce the maintenance acquirements of a warehouse. Among the things, the ontology is used as an information source for automatically generating mediators, the methods that transfer data between the data sources and the warehouse The identification, definition and representation of the metadata required to perform this task is a primary contribution of this work.

  4. Protection of warehouses and plants under capacity constraint

    International Nuclear Information System (INIS)

    Bricha, Naji; Nourelfath, Mustapha

    2015-01-01

    While warehouses may be subjected to less protection effort than plants, their unavailability may have substantial impact on the supply chain performance. This paper presents a method for protection of plants and warehouses against intentional attacks in the context of the capacitated plant and warehouses location and capacity acquisition problem. A non-cooperative two-period game is developed to find the equilibrium solution and the optimal defender strategy under capacity constraints. The defender invests in the first period to minimize the expected damage and the attacker moves in the second period to maximize the expected damage. Extra-capacity of neighboring functional plants and warehouses is used after attacks, to satisfy all customers demand and to avoid the backorders. The contest success function is used to evaluate success probability of an attack of plants and warehouses. A numerical example is presented to illustrate an application of the model. The defender strategy obtained by our model is compared to the case where warehouses are subjected to less protection effort than the plants. This comparison allows us to measure how much our method is better, and illustrates the effect of direct investments in protection and indirect protection by warehouse extra-capacities to reduce the expected damage. - Highlights: • Protection of warehouses and plants against intentional attacks. • Capacitated plant and warehouse location and capacity acquisition problem. • A non-cooperative two-period game between the defender and the attacker. • A method to evaluate the utilities and determine the optimal defender strategy. • Using warehouse extra-capacities to reduce the expected damage

  5. The Data Warehouse: Keeping It Simple. MIT Shares Valuable Lessons Learned from a Successful Data Warehouse Implementation.

    Science.gov (United States)

    Thorne, Scott

    2000-01-01

    Explains why the data warehouse is important to the Massachusetts Institute of Technology community, describing its basic functions and technical design points; sharing some non-technical aspects of the school's data warehouse implementation that have proved to be important; examining the importance of proper training in a successful warehouse…

  6. WATCHMAN: A Data Warehouse Intelligent Cache Manager

    Science.gov (United States)

    Scheuermann, Peter; Shim, Junho; Vingralek, Radek

    1996-01-01

    Data warehouses store large volumes of data which are used frequently by decision support applications. Such applications involve complex queries. Query performance in such an environment is critical because decision support applications often require interactive query response time. Because data warehouses are updated infrequently, it becomes possible to improve query performance by caching sets retrieved by queries in addition to query execution plans. In this paper we report on the design of an intelligent cache manager for sets retrieved by queries called WATCHMAN, which is particularly well suited for data warehousing environment. Our cache manager employs two novel, complementary algorithms for cache replacement and for cache admission. WATCHMAN aims at minimizing query response time and its cache replacement policy swaps out entire retrieved sets of queries instead of individual pages. The cache replacement and admission algorithms make use of a profit metric, which considers for each retrieved set its average rate of reference, its size, and execution cost of the associated query. We report on a performance evaluation based on the TPC-D and Set Query benchmarks. These experiments show that WATCHMAN achieves a substantial performance improvement in a decision support environment when compared to a traditional LRU replacement algorithm.

  7. Storage of hazardous substances in bonded warehouses

    International Nuclear Information System (INIS)

    Villalobos Artavia, Beatriz

    2008-01-01

    A variety of special regulations exist in Costa Rica for registration and transport of hazardous substances; these set the requirements for entry into the country and the security of transport units. However, the regulations mentioned no specific rules for storing hazardous substances. Tax deposits have been the initial place where are stored the substances that enter the country.The creation of basic rules that would be regulating the storage of hazardous substances has taken place through the analysis of regulations and national and international laws governing hazardous substances. The regulatory domain that currently exists will be established with a field research in fiscal deposits in the metropolitan area. The storage and security measures that have been used by the personnel handling the substances will be identified to be putting the reality with that the hazardous substances have been handled in tax deposits. A rule base for the storage of hazardous substances in tax deposits can be made, protecting the safety of the environment in which are manipulated and avoiding a possible accident causing a mess around. The rule will have the characteristics of the storage warehouses hazardous substances, such as safety standards, labeling standards, infrastructure features, common storage and transitional measures that must possess and meet all bonded warehouses to store hazardous substances. (author) [es

  8. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  9. Integrating Brazilian health information systems in order to support the building of data warehouses

    Directory of Open Access Journals (Sweden)

    Sergio Miranda Freire

    Full Text Available AbstractIntroductionThis paper's aim is to develop a data warehouse from the integration of the files of three Brazilian health information systems concerned with the production of ambulatory and hospital procedures for cancer care, and cancer mortality. These systems do not have a unique patient identification, which makes their integration difficult even within a single system.MethodsData from the Brazilian Public Hospital Information System (SIH-SUS, the Oncology Module for the Outpatient Information System (APAC-ONCO and the Mortality Information System (SIM for the State of Rio de Janeiro, in the period from January 2000 to December 2004 were used. Each of the systems has the monthly data production compiled in dbase files (dbf. All the files pertaining to the same system were then read into a corresponding table in a MySQL Server 5.1. The SIH-SUS and APAC-ONCO tables were linked internally and with one another through record linkage methods. The APAC-ONCO table was linked to the SIM table. Afterwards a data warehouse was built using Pentaho and the MySQL database management system.ResultsThe sensitivities and specificities of the linkage processes were above 95% and close to 100% respectively. The data warehouse provided several analytical views that are accessed through the Pentaho Schema Workbench.ConclusionThis study presented a proposal for the integration of Brazilian Health Systems to support the building of data warehouses and provide information beyond those currently available with the individual systems.

  10. Peer Mentoring for Bioinformatics presentation

    OpenAIRE

    Budd, Aidan

    2014-01-01

    A handout used in a HUB (Heidelberg Unseminars in Bioinformatics) meeting focused on career development for bioinformaticians. It describes an activity for use to help introduce the idea of peer mentoring, potnetially acting as an opportunity to create peer-mentoring groups.

  11. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  12. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  13. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  14. Warehouse order-picking process. Order-picker routing problem

    Directory of Open Access Journals (Sweden)

    E. V. Korobkov

    2015-01-01

    Full Text Available This article continues “Warehouse order-picking process” cycle and describes order-picker routing sub-problem of a warehouse order-picking process. It draws analogies between the orderpickers’ routing problem and traveling salesman’s problem, shows differences between the standard problem statement of a traveling salesman and routing problem of warehouse orderpickers, and gives the particular Steiner’s problem statement of a traveling salesman.Warehouse layout with a typical order is represented by a graph, with some its vertices corresponding to mandatory order-picker’s visits and some other ones being noncompulsory. The paper describes an optimal Ratliff-Rosenthal algorithm to solve order-picker’s routing problem for the single-block warehouses, i.e. warehouses with only two crossing aisles, defines seven equivalent classes of partial routing sub-graphs and five transitions used to have an optimal routing sub-graph of a order-picker. An extension of optimal Ratliff-Rosenthal order-picker routing algorithm for multi-block warehouses is presented and also reasons for using the routing heuristics instead of exact optimal algorithms are given. The paper offers algorithmic description of the following seven routing heuristics: S-shaped, return, midpoint, largest gap, aisle-by-aisle, composite, and combined as well as modification of combined heuristics. The comparison of orderpicker routing heuristics for one- and two-block warehouses is to be described in the next article of the “Warehouse order-picking process” cycle.

  15. Database Description - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available base Description General information of database Database name SSBD Alternative nam...ss 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan, RIKEN Quantitative Biology Center Shuichi Onami E-mail: Database... classification Other Molecular Biology Databases Database classification Dynamic databa...elegans Taxonomy ID: 6239 Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database description Systems Scie...i Onami Journal: Bioinformatics/April, 2015/Volume 31, Issue 7 External Links: Original website information Database

  16. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  17. Criticality calculation of the nuclear material warehouse of the ININ

    International Nuclear Information System (INIS)

    Garcia, T.; Angeles, A.; Flores C, J.

    2013-10-01

    In this work the conditions of nuclear safety were determined as much in normal conditions as in the accident event of the nuclear fuel warehouse of the reactor TRIGA Mark III of the Instituto Nacional de Investigaciones Nucleares (ININ). The warehouse contains standard fuel elements Leu - 8.5/20, a control rod with follower of standard fuel type Leu - 8.5/20, fuel elements Leu - 30/20, and the reactor fuel Sur-100. To check the subcritical state of the warehouse the effective multiplication factor (keff) was calculated. The keff calculation was carried out with the code MCNPX. (Author)

  18. A PROPOSAL OF DATA QUALITY FOR DATA WAREHOUSES ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    Leo Willyanto Santoso

    2006-01-01

    Full Text Available The quality of the data provided is critical to the success of data warehousing initiatives. There is strong evidence that many organisations have significant data quality problems, and that these have substantial social and economic impacts. This paper describes a study which explores modeling of the dynamic parts of the data warehouse. This metamodel enables data warehouse management, design and evolution based on a high level conceptual perspective, which can be linked to the actual structural and physical aspects of the data warehouse architecture. Moreover, this metamodel is capable of modeling complex activities, their interrelationships, the relationship of activities with data sources and execution details.

  19. The importance of data warehouses for physician executives.

    Science.gov (United States)

    Ruffin, M

    1994-11-01

    Soon, most physicians will begin to learn about data warehouses and clinical and financial data about their patients stored in them. What is a data warehouse? Why are we seeing their emergence in health care only now? How does a hospital, or group practice, or health plan acquire or create a data warehouse? Who should be responsible for it, and what sort of training is needed by those in charge of using it for the edification of the sponsoring organization? I'll try to answer these questions in this article.

  20. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  1. Database Resources of the BIG Data Center in 2018.

    Science.gov (United States)

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. CHOmine: an integrated data warehouse for CHO systems biology and modeling.

    Science.gov (United States)

    Gerstl, Matthias P; Hanscho, Michael; Ruckerbauer, David E; Zanghellini, Jürgen; Borth, Nicole

    2017-01-01

    The last decade has seen a surge in published genome-scale information for Chinese hamster ovary (CHO) cells, which are the main production vehicles for therapeutic proteins. While a single access point is available at www.CHOgenome.org, the primary data is distributed over several databases at different institutions. Currently research is frequently hampered by a plethora of gene names and IDs that vary between published draft genomes and databases making systems biology analyses cumbersome and elaborate. Here we present CHOmine, an integrative data warehouse connecting data from various databases and links to other ones. Furthermore, we introduce CHOmodel, a web based resource that provides access to recently published CHO cell line specific metabolic reconstructions. Both resources allow to query CHO relevant data, find interconnections between different types of data and thus provides a simple, standardized entry point to the world of CHO systems biology. http://www.chogenome.org. © The Author(s) 2017. Published by Oxford University Press.

  3. A Multidimensional Data Warehouse for Community Health Centers.

    Science.gov (United States)

    Kunjan, Kislaya; Toscos, Tammy; Turkcan, Ayten; Doebbeling, Brad N

    2015-01-01

    Community health centers (CHCs) play a pivotal role in healthcare delivery to vulnerable populations, but have not yet benefited from a data warehouse that can support improvements in clinical and financial outcomes across the practice. We have developed a multidimensional clinic data warehouse (CDW) by working with 7 CHCs across the state of Indiana and integrating their operational, financial and electronic patient records to support ongoing delivery of care. We describe in detail the rationale for the project, the data architecture employed, the content of the data warehouse, along with a description of the challenges experienced and strategies used in the development of this repository that may help other researchers, managers and leaders in health informatics. The resulting multidimensional data warehouse is highly practical and is designed to provide a foundation for wide-ranging healthcare data analytics over time and across the community health research enterprise.

  4. Statewide Transportation Engineering Warehouse for Archived Regional Data (STEWARD).

    Science.gov (United States)

    2009-12-01

    This report documents Phase III of the development and operation of a prototype for the Statewide Transportation : Engineering Warehouse for Archived Regional Data (STEWARD). It reflects the progress on the development and : operation of STEWARD sinc...

  5. Implemetasi Data Warehouse pada Bagian Pemasaran Perguruan Tinggi

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2012-06-01

    Full Text Available Transactional data are widely owned by higher education institutes, but the utilization of the data to support decision making has not functioned maximally. Therefore, higher education institutes need analysis tools to maximize decision making processes. Based on the issue, then data warehouse design was created to: (1 store large-amount data; (2 potentially gain new perspectives of distributed data; (3 provide reports and answers to users’ ad hoc questions; (4 perform data analysis of external conditions and transactional data from the marketing activities of universities, since marketing is one supporting field as well as the cutting edge of higher education institutes. The methods used to design and implement data warehouse are analysis of records related to the marketing activities of higher education institutes and data warehouse design. This study results in a data warehouse design and its implementation to analyze the external data and transactional data from the marketing activities of universities to support decision making.

  6. Contextual snowflake modelling for pattern warehouse logical design

    Indian Academy of Sciences (India)

    being managed by the pattern warehouse management system (PWMS) ... The authors pointed out that the necessity to find out the relationship between patterns .... (i) Some customer queries can only be satisfied by specific DM technique.

  7. Capturing Complex Multidimensional Data in Location-Based Data Warehouses

    DEFF Research Database (Denmark)

    Timko, Igor; Pedersen, Torben Bach

    2004-01-01

    Motivated by the increasing need to handle complex multidimensional data inlocation-based data warehouses, this paper proposes apowerful data model that is able to capture the complexities of such data. The model provides a foundation for handling complex transportationinfrastructures...

  8. Outpatient health care statistics data warehouse--implementation.

    Science.gov (United States)

    Zilli, D

    1999-01-01

    Data warehouse implementation is assumed to be a very knowledge-demanding, expensive and long-lasting process. As such it requires senior management sponsorship, involvement of experts, a big budget and probably years of development time. Presented Outpatient Health Care Statistics Data Warehouse implementation research provides ample evidence against the infallibility of the above statements. New, inexpensive, but powerful technology, which provides outstanding platform for On-Line Analytical Processing (OLAP), has emerged recently. Presumably, it will be the basis for the estimated future growth of data warehouse market, both in the medical and in other business fields. Methods and tools for building, maintaining and exploiting data warehouses are also briefly discussed in the paper.

  9. Minimizing Warehouse Space with a Dedicated Storage Policy

    Directory of Open Access Journals (Sweden)

    Andrea Fumi

    2013-07-01

    inevitably be supported by warehouse management system software. On the contrary, the proposed methodology relies upon a dedicated storage policy, which is easily implementable by companies of all sizes without the need for investing in expensive IT tools.

  10. Operational management system for warehouse logistics of metal trading companies

    Directory of Open Access Journals (Sweden)

    Khayrullin Rustam Zinnatullovich

    2014-07-01

    Full Text Available Logistics is an effective tool in business management. Metal trading business is a part of metal promotion chain from producer to consumer. It's designed to serve as a link connecting the interests of steel producers and end users. We should account for the specifics warehousing trading. The specificity of warehouse metal trading consists primarily in the fact that the purchase is made in large lots, and the sale - in medium and small parties. Loading and unloading of cars and trucks is produced by overhead cranes. Some part of the purchased goods are shipped in relatively large lots without presales preparation. Another part of the goods undergoes presale preparation. Indoor and outdoor warehouses are used with the address storage system. In the process of prolonged storage the metal rusts. Some part of the goods is subjected to final completion (cutting, welding, coloration in service centers and small factories, usually located at the warehouse. The quantity of simultaneously shipped cars, and the quantity of the loader workers brigade can reach few dozens. So it is necessary to control the loading workers, to coordinate and monitor the performance of loading and unloading operations, to make the daily analysis of their work, to evaluate the warehouse operations as a whole. There is a need to manage and control movement of cars and trucks on the warehouse territory to reduce storage and transport costs and improve customer service. ERP-systems and WMS-systems, which are widely used, do not cover fully the functions and processes of the warehouse trading, and do not effectively manage all logistics processes. In this paper the specialized software is proposed. The software is intended for operational logistics management in warehouse metal products trading. The basic functions and processes of metal warehouse trading are described. The effectiveness indices for logistics processes and key effective indicators of warehouse trading are proposed

  11. A study on building data warehouse of hospital information system.

    Science.gov (United States)

    Li, Ping; Wu, Tao; Chen, Mu; Zhou, Bin; Xu, Wei-guo

    2011-08-01

    Existing hospital information systems with simple statistical functions cannot meet current management needs. It is well known that hospital resources are distributed with private property rights among hospitals, such as in the case of the regional coordination of medical services. In this study, to integrate and make full use of medical data effectively, we propose a data warehouse modeling method for the hospital information system. The method can also be employed for a distributed-hospital medical service system. To ensure that hospital information supports the diverse needs of health care, the framework of the hospital information system has three layers: datacenter layer, system-function layer, and user-interface layer. This paper discusses the role of a data warehouse management system in handling hospital information from the establishment of the data theme to the design of a data model to the establishment of a data warehouse. Online analytical processing tools assist user-friendly multidimensional analysis from a number of different angles to extract the required data and information. Use of the data warehouse improves online analytical processing and mitigates deficiencies in the decision support system. The hospital information system based on a data warehouse effectively employs statistical analysis and data mining technology to handle massive quantities of historical data, and summarizes from clinical and hospital information for decision making. This paper proposes the use of a data warehouse for a hospital information system, specifically a data warehouse for the theme of hospital information to determine latitude, modeling and so on. The processing of patient information is given as an example that demonstrates the usefulness of this method in the case of hospital information management. Data warehouse technology is an evolving technology, and more and more decision support information extracted by data mining and with decision-making technology is

  12. A Performance Evaluation of Online Warehouse Update Algorithms

    Science.gov (United States)

    1998-01-01

    able to present a fully consistent ver- sion of the warehouse to the queries while the warehouse is being updated. Multiversioning has been used...LST97]). Special- ized multiversion access structures have also been proposed ([LS89, LS90, dBS96, BC97, VV97, MOPW98]) In the context of OLTP systems...collection processes. 2.1 Multiversioning MVNL supports multiple versions by using Time Travel ([Sto87]). Each row has two extra at- tributes, Tmin

  13. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  14. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  15. An efficiency improvement in warehouse operation using simulation analysis

    Science.gov (United States)

    Samattapapong, N.

    2017-11-01

    In general, industry requires an efficient system for warehouse operation. There are many important factors that must be considered when designing an efficient warehouse system. The most important is an effective warehouse operation system that can help transfer raw material, reduce costs and support transportation. By all these factors, researchers are interested in studying about work systems and warehouse distribution. We start by collecting the important data for storage, such as the information on products, information on size and location, information on data collection and information on production, and all this information to build simulation model in Flexsim® simulation software. The result for simulation analysis found that the conveyor belt was a bottleneck in the warehouse operation. Therefore, many scenarios to improve that problem were generated and testing through simulation analysis process. The result showed that an average queuing time was reduced from 89.8% to 48.7% and the ability in transporting the product increased from 10.2% to 50.9%. Thus, it can be stated that this is the best method for increasing efficiency in the warehouse operation.

  16. Design and Applications of a Multimodality Image Data Warehouse Framework

    Science.gov (United States)

    Wong, Stephen T.C.; Hoo, Kent Soo; Knowlton, Robert C.; Laxer, Kenneth D.; Cao, Xinhau; Hawkins, Randall A.; Dillon, William P.; Arenson, Ronald L.

    2002-01-01

    A comprehensive data warehouse framework is needed, which encompasses imaging and non-imaging information in supporting disease management and research. The authors propose such a framework, describe general design principles and system architecture, and illustrate a multimodality neuroimaging data warehouse system implemented for clinical epilepsy research. The data warehouse system is built on top of a picture archiving and communication system (PACS) environment and applies an iterative object-oriented analysis and design (OOAD) approach and recognized data interface and design standards. The implementation is based on a Java CORBA (Common Object Request Broker Architecture) and Web-based architecture that separates the graphical user interface presentation, data warehouse business services, data staging area, and backend source systems into distinct software layers. To illustrate the practicality of the data warehouse system, the authors describe two distinct biomedical applications—namely, clinical diagnostic workup of multimodality neuroimaging cases and research data analysis and decision threshold on seizure foci lateralization. The image data warehouse framework can be modified and generalized for new application domains. PMID:11971885

  17. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However,

  18. Decision method for optimal selection of warehouse material handling strategies by production companies

    Science.gov (United States)

    Dobos, P.; Tamás, P.; Illés, B.

    2016-11-01

    Adequate establishment and operation of warehouse logistics determines the companies’ competitiveness significantly because it effects greatly the quality and the selling price of the goods that the production companies produce. In order to implement and manage an adequate warehouse system, adequate warehouse position, stock management model, warehouse technology, motivated work force committed to process improvement and material handling strategy are necessary. In practical life, companies have paid small attantion to select the warehouse strategy properly. Although it has a major influence on the production in the case of material warehouse and on smooth costumer service in the case of finished goods warehouse because this can happen with a huge loss in material handling. Due to the dynamically changing production structure, frequent reorganization of warehouse activities is needed, on what the majority of the companies react basically with no reactions. This work presents a simulation test system frames for eligible warehouse material handling strategy selection and also the decision method for selection.

  19. Implementation of Lean Warehouse to Minimize Wastes in Finished Goods Warehouse of PT Charoen Pokphand Indonesia Semarang

    Directory of Open Access Journals (Sweden)

    Nia Budi Puspitasari

    2016-03-01

    Full Text Available PT. Charoen Pokphand Indonesia Semarang is one of the largest poultry feed companies in Indonesia. To store the finished products that are ready to be distributed, it needs a finished goods warehouse. To minimize the wastes that occur in the process of warehousing the finished goods, the implementation of lean warehouse is required. The core process of finished goods warehouse is the process of putting bag that has been through the process of pallets packing, and then transporting the pallets contained bags of feed at finished goods warehouses and the process of unloading food from the finished goods warehouse to the distribution truck. With the implementation of the lean warehouse, we can know whether the activities are value added or not, to be identified later which type of waste happened. Opinions of stakeholders regarding the waste that must be eliminated first need to be determined by questionnaires. Based on the results of the questionnaires, three top wastes are selected to be identified the cause by using fishbone diagram. They can be repaired by using the implementation of 5S, namely Seiri, Seiton, Seiso, Seiketsu, and Shitsuke. Defect waste can be minimized by selecting pallet, putting sack correctly, forklift line clearance, applying working procedures, and creating cleaning schedule. Next, overprocessing waste is minimized by removing unnecessary items, putting based on the date of manufacture, and manufacture of feed plan. Inventory waste is minimized by removing junks, putting feed based on the expired date, and cleaning the barn

  20. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  1. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  2. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Science.gov (United States)

    Fristensky, Brian

    2007-01-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere. PMID:17291351

  3. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  4. Study on resources and environmental data integration towards data warehouse construction covering trans-boundary area of China, Russia and Mongolia

    Science.gov (United States)

    Wang, J.; Song, J.; Gao, M.; Zhu, L.

    2014-02-01

    The trans-boundary area between Northern China, Mongolia and eastern Siberia of Russia is a continuous geographical area located in north eastern Asia. Many common issues in this region need to be addressed based on a uniform resources and environmental data warehouse. Based on the practice of joint scientific expedition, the paper presented a data integration solution including 3 steps, i.e., data collection standards and specifications making, data reorganization and process, data warehouse design and development. A series of data collection standards and specifications were drawn up firstly covering more than 10 domains. According to the uniform standard, 20 resources and environmental survey databases in regional scale, and 11 in-situ observation databases were reorganized and integrated. North East Asia Resources and Environmental Data Warehouse was designed, which included 4 layers, i.e., resources layer, core business logic layer, internet interoperation layer, and web portal layer. The data warehouse prototype was developed and deployed initially. All the integrated data in this area can be accessed online.

  5. Study on resources and environmental data integration towards data warehouse construction covering trans-boundary area of China, Russia and Mongolia

    International Nuclear Information System (INIS)

    Wang, J; Song, J; Gao, M; Zhu, L

    2014-01-01

    The trans-boundary area between Northern China, Mongolia and eastern Siberia of Russia is a continuous geographical area located in north eastern Asia. Many common issues in this region need to be addressed based on a uniform resources and environmental data warehouse. Based on the practice of joint scientific expedition, the paper presented a data integration solution including 3 steps, i.e., data collection standards and specifications making, data reorganization and process, data warehouse design and development. A series of data collection standards and specifications were drawn up firstly covering more than 10 domains. According to the uniform standard, 20 resources and environmental survey databases in regional scale, and 11 in-situ observation databases were reorganized and integrated. North East Asia Resources and Environmental Data Warehouse was designed, which included 4 layers, i.e., resources layer, core business logic layer, internet interoperation layer, and web portal layer. The data warehouse prototype was developed and deployed initially. All the integrated data in this area can be accessed online

  6. The Data Warehouse in Service Oriented Architectures and Network Centric Warfare

    National Research Council Canada - National Science Library

    Lenahan, Jack

    2005-01-01

    ... at all policy and command levels to support superior decision making? Analyzing the anticipated massive amount of GIG data will almost certainly require data warehouses and federated data warehouses...

  7. Analisis Dan Perancangan Data Warehouse Pada PT Pelita Tatamas Jaya

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2010-12-01

    Full Text Available The purpose of this research is to assist in providing information to support decision-making processes in sales, purchasing and inventory control at PT Tatamas Pelita Jaya. With the support of data warehouse, business leaders can be more helpful in making decisions more quickly and precisely. Research methodology includes analysis of current systems, library research, designing a data warehousing schema using bintang. The result of this research is the availability of a data warehouse that can generate information quickly and precisely, thus helping the company in making decisions. The conclusion of this research is the application of data warehouse can be a media aide related parties on PT Tatamas Pelita Jaya in decision making. 

  8. Development of a Clinical Data Warehouse for Hospital Infection Control

    Science.gov (United States)

    Wisniewski, Mary F.; Kieszkowski, Piotr; Zagorski, Brandon M.; Trick, William E.; Sommers, Michael; Weinstein, Robert A.

    2003-01-01

    Existing data stored in a hospital's transactional servers have enormous potential to improve performance measurement and health care quality. Accessing, organizing, and using these data to support research and quality improvement projects are evolving challenges for hospital systems. The authors report development of a clinical data warehouse that they created by importing data from the information systems of three affiliated public hospitals. They describe their methodology; difficulties encountered; responses from administrators, computer specialists, and clinicians; and the steps taken to capture and store patient-level data. The authors provide examples of their use of the clinical data warehouse to monitor antimicrobial resistance, to measure antimicrobial use, to detect hospital-acquired bloodstream infections, to measure the cost of infections, and to detect antimicrobial prescribing errors. In addition, they estimate the amount of time and money saved and the increased precision achieved through the practical application of the data warehouse. PMID:12807807

  9. Development of a public health reporting data warehouse: lessons learned.

    Science.gov (United States)

    Rizi, Seyed Ali Mussavi; Roudsari, Abdul

    2013-01-01

    Data warehouse projects are perceived to be risky and prone to failure due to many organizational and technical challenges. However, often iterative and lengthy processes of implementation of data warehouses at an enterprise level provide an opportunity for formative evaluation of these solutions. This paper describes lessons learned from successful development and implementation of the first phase of an enterprise data warehouse to support public health surveillance at British Columbia Centre for Disease Control. Iterative and prototyping approach to development, overcoming technical challenges of extraction and integration of data from large scale clinical and ancillary systems, a novel approach to record linkage, flexible and reusable modeling of clinical data, and securing senior management support at the right time were the main factors that contributed to the success of the data warehousing project.

  10. Aspects of Data Warehouse Technologies for Complex Web Data

    DEFF Research Database (Denmark)

    Thomsen, Christian

    This thesis is about aspects of specification and development of data warehouse technologies for complex web data. Today, large amounts of data exist in different web resources and in different formats. But it is often hard to analyze and query the often big and complex data or data about the data...... (i.e., metadata). It is therefore interesting to apply Data Warehouse (DW) technology to the data. But to apply DW technology to complex web data is not straightforward and the DW community faces new and exciting challenges. This thesis considers some of these challenges. The work leading...... to this thesis has primarily been done in relation to the project European Internet Accessibility Observatory (EIAO) where a data warehouse for accessibility data (roughly data about how usable web resources are for disabled users) has been specified and developed. But the results of the thesis can also...

  11. Development of global data warehouse for beam diagnostics at SSRF

    International Nuclear Information System (INIS)

    Lai Longwei; Leng Yongbin; Yan Yingbing; Chen Zhichu

    2015-01-01

    The beam diagnostic system is adequate during the daily operation and machine study at the Shanghai Synchrotron Radiation Facility (SSRF). Without the effective event detecting mechanism, it is difficult to dump and analyze abnormal phenomena such as the global orbital disturbance, the malfunction of the BPM and the noise of the DCCT. The global beam diagnostic data warehouse was built in order to monitor the status of the accelerator and the beam instruments. The data warehouse was designed as a Soft IOC hosted on an independent server. Once abnormal phenomena happen it will be triggered and will store the relevant data for further analysis. The results show that the data warehouse can detect abnormal phenomena of the machine and the beam diagnostic system effectively, and can be used for calculating confidential indicators of the beam instruments. It provides an efficient tool for the improvement of the beam diagnostic system and accelerator. (authors)

  12. Development of a clinical data warehouse for hospital infection control.

    Science.gov (United States)

    Wisniewski, Mary F; Kieszkowski, Piotr; Zagorski, Brandon M; Trick, William E; Sommers, Michael; Weinstein, Robert A

    2003-01-01

    Existing data stored in a hospital's transactional servers have enormous potential to improve performance measurement and health care quality. Accessing, organizing, and using these data to support research and quality improvement projects are evolving challenges for hospital systems. The authors report development of a clinical data warehouse that they created by importing data from the information systems of three affiliated public hospitals. They describe their methodology; difficulties encountered; responses from administrators, computer specialists, and clinicians; and the steps taken to capture and store patient-level data. The authors provide examples of their use of the clinical data warehouse to monitor antimicrobial resistance, to measure antimicrobial use, to detect hospital-acquired bloodstream infections, to measure the cost of infections, and to detect antimicrobial prescribing errors. In addition, they estimate the amount of time and money saved and the increased precision achieved through the practical application of the data warehouse.

  13. A simulated annealing approach for redesigning a warehouse network problem

    Science.gov (United States)

    Khairuddin, Rozieana; Marlizawati Zainuddin, Zaitul; Jiun, Gan Jia

    2017-09-01

    Now a day, several companies consider downsizing their distribution networks in ways that involve consolidation or phase-out of some of their current warehousing facilities due to the increasing competition, mounting cost pressure and taking advantage on the economies of scale. Consequently, the changes on economic situation after a certain period of time require an adjustment on the network model in order to get the optimal cost under the current economic conditions. This paper aimed to develop a mixed-integer linear programming model for a two-echelon warehouse network redesign problem with capacitated plant and uncapacitated warehouses. The main contribution of this study is considering capacity constraint for existing warehouses. A Simulated Annealing algorithm is proposed to tackle with the proposed model. The numerical solution showed the model and method of solution proposed was practical.

  14. E-MSD: an integrated data resource for bioinformatics.

    Science.gov (United States)

    Velankar, S; McNeil, P; Mittard-Runte, V; Suarez, A; Barrell, D; Apweiler, R; Henrick, K

    2005-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.

  15. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  16. Demand Response Opportunities in Industrial Refrigerated Warehouses in California

    Energy Technology Data Exchange (ETDEWEB)

    Goli, Sasank; McKane, Aimee; Olsen, Daniel

    2011-06-14

    Industrial refrigerated warehouses that implemented energy efficiency measures and have centralized control systems can be excellent candidates for Automated Demand Response (Auto-DR) due to equipment synergies, and receptivity of facility managers to strategies that control energy costs without disrupting facility operations. Auto-DR utilizes OpenADR protocol for continuous and open communication signals over internet, allowing facilities to automate their Demand Response (DR). Refrigerated warehouses were selected for research because: They have significant power demand especially during utility peak periods; most processes are not sensitive to short-term (2-4 hours) lower power and DR activities are often not disruptive to facility operations; the number of processes is limited and well understood; and past experience with some DR strategies successful in commercial buildings may apply to refrigerated warehouses. This paper presents an overview of the potential for load sheds and shifts from baseline electricity use in response to DR events, along with physical configurations and operating characteristics of refrigerated warehouses. Analysis of data from two case studies and nine facilities in Pacific Gas and Electric territory, confirmed the DR abilities inherent to refrigerated warehouses but showed significant variation across facilities. Further, while load from California's refrigerated warehouses in 2008 was 360 MW with estimated DR potential of 45-90 MW, actual achieved was much less due to low participation. Efforts to overcome barriers to increased participation may include, improved marketing and recruitment of potential DR sites, better alignment and emphasis on financial benefits of participation, and use of Auto-DR to increase consistency of participation.

  17. A study of multidimensional modeling approaches for data warehouse

    Science.gov (United States)

    Yusof, Sharmila Mat; Sidi, Fatimah; Ibrahim, Hamidah; Affendey, Lilly Suriani

    2016-08-01

    Data warehouse system is used to support the process of organizational decision making. Hence, the system must extract and integrate information from heterogeneous data sources in order to uncover relevant knowledge suitable for decision making process. However, the development of data warehouse is a difficult and complex process especially in its conceptual design (multidimensional modeling). Thus, there have been various approaches proposed to overcome the difficulty. This study surveys and compares the approaches of multidimensional modeling and highlights the issues, trend and solution proposed to date. The contribution is on the state of the art of the multidimensional modeling design.

  18. Mastering data warehouse design relational and dimensional techniques

    CERN Document Server

    Imhoff, Claudia; Geiger, Jonathan G

    2003-01-01

    A cutting-edge response to Ralph Kimball''s challenge to the data warehouse community that answers some tough questions about the effectiveness of the relational approach to data warehousingWritten by one of the best-known exponents of the Bill Inmon approach to data warehousingAddresses head-on the tough issues raised by Kimball and explains how to choose the best modeling technique for solving common data warehouse design problemsWeighs the pros and cons of relational vs. dimensional modeling techniquesFocuses on tough modeling problems, including creating and maintaining keys and modeling c

  19. Reliability in Warehouse-Scale Computing: Why Low Latency Matters

    DEFF Research Database (Denmark)

    Nannarelli, Alberto

    2015-01-01

    , the limiting factor of these warehouse-scale data centers is the power dissipation. Power is dissipated not only in the computation itself, but also in heat removal (fans, air conditioning, etc.) to keep the temperature of the devices within the operating ranges. The need to keep the temperature low within......Warehouse sized buildings are nowadays hosting several types of large computing systems: from supercomputers to large clusters of servers to provide the infrastructure to the cloud. Although the main target, especially for high-performance computing, is still to achieve high throughput...

  20. 7 CFR 1421.106 - Warehouse-stored marketing assistance loan collateral.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Warehouse-stored marketing assistance loan collateral... Marketing Assistance Loans § 1421.106 Warehouse-stored marketing assistance loan collateral. (a) A commodity may be pledged as collateral for a warehouse-stored marketing assistance loan in the quantity...

  1. Expanding Post-Harvest Finance Through Warehouse Receipts and Related Instruments

    OpenAIRE

    Baldwin, Marisa; Bryla, Erin; Langenbucher, Anja

    2006-01-01

    Warehouse receipt financing and similar types of collateralized lending provide an alternative to traditional lending requirements of banks and other financiers and could provide opportunities to expand this lending in emerging economies for agricultural trade. The main contents include: what is warehouse receipt financing; what is the value of warehouse receipt financing; other collater...

  2. 27 CFR 28.28 - Withdrawal of wine and distilled spirits from customs bonded warehouses.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Withdrawal of wine and... Miscellaneous Provisions Customs Bonded Warehouses § 28.28 Withdrawal of wine and distilled spirits from customs bonded warehouses. Wine and bottled distilled spirits entered into customs bonded warehouses as provided...

  3. The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    Science.gov (United States)

    Post, Andrew R; Kurc, Tahsin; Cholleti, Sharath; Gao, Jingjing; Lin, Xia; Bornstein, William; Cantrell, Dedra; Levine, David; Hohmann, Sam; Saltz, Joel H

    2013-06-01

    To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in 5years of data from our institution's clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. The Analytic Information Warehouse (AIW): a Platform for Analytics using Electronic Health Record Data

    Science.gov (United States)

    Post, Andrew R.; Kurc, Tahsin; Cholleti, Sharath; Gao, Jingjing; Lin, Xia; Bornstein, William; Cantrell, Dedra; Levine, David; Hohmann, Sam; Saltz, Joel H.

    2013-01-01

    Objective To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. Materials and Methods We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. Results We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30 days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in five years of data from our institution’s clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. Discussion and Conclusion A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies. PMID:23402960

  5. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  6. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  7. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  8. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  9. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    Full Text Available The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions and in bioinformatics (comparison of genomes.

  10. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  11. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  12. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  13. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  14. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  15. Data Warehouse for support to the electric energy commercialization; Data Warehouse para apoio a comercializacao de energia eletrica

    Energy Technology Data Exchange (ETDEWEB)

    Lanzotti, Carla R.; Correia, Paulo B. [Universidade Estadual de Campinas (UNICAMP), SP (Brazil). Faculdade de Engenharia Mecanica]. E-mails: clanzotti@yahoo.com; pcorreia@fem.unicamp.br

    2006-07-01

    This paper specifies data base using a data warehouse containing the energy market, the electric system, and the economy information allowing the visualization and analysis of the data through tables and dynamic charts. This data warehouse corresponds to the module 'Information base from Platform helping the electric power commercialization'. The platform is a computer program viewing to help the interested agents in commercializing energy and is formed by three modules as follows: Information Base, Contracting Strategies and Contracting Process. It is expected that the use os these data base, joint to Platform establishes positive conditions to agents from the interested in electric energy commercialization.

  16. 19 CFR 19.13 - Requirements for establishment of warehouse.

    Science.gov (United States)

    2010-04-01

    ...; DEPARTMENT OF THE TREASURY CUSTOMS WAREHOUSES, CONTAINER STATIONS AND CONTROL OF MERCHANDISE THEREIN... secured area separated from the remainder of the premises to be used exclusively for the storage of imported merchandise, domestic spirits, and merchandise subject to internal-revenue tax transferred into...

  17. Risk control for staff planning in e-commerce warehouses

    NARCIS (Netherlands)

    Wruck, Susanne; Vis, Iris F A; Boter, Jaap

    2016-01-01

    Internet sale supply chains often need to fulfil quickly small orders for many customers. The resulting high demand and planning uncertainties pose new challenges for e-commerce warehouse operations. Here, we develop a decision support tool to assist managers in selecting appropriate risk policies

  18. How Can My State Benefit from an Educational Data Warehouse?

    Science.gov (United States)

    Bergner, Terry; Smith, Nancy J.

    2007-01-01

    Imagine if, at the start of the school year, a teacher could have detailed information about the academic history of every student in her or his classroom. This is possible if the teacher can log on to a Web site that provides access to an educational data warehouse. The teacher would see not only several years of state assessment results, but…

  19. Developing and Marketing a Client/Server-Based Data Warehouse.

    Science.gov (United States)

    Singleton, Michele; And Others

    1993-01-01

    To provide better access to information, the University of Arizona information technology center has designed a data warehouse accessible from the desktop computer. A team approach has proved successful in introducing and demonstrating a prototype to the campus community. (Author/MSE)

  20. Data-driven warehouse optimization : Deploying skills of order pickers

    NARCIS (Netherlands)

    M. Matusiak (Marek); M.B.M. de Koster (René); J. Saarinen (Jari)

    2015-01-01

    textabstractBatching orders and routing order pickers is a commonly studied problem in many picker-to-parts warehouses. The impact of individual differences in picking skills on performance has received little attention. In this paper, we show that taking into account differences in the skills of

  1. A novel approach for intelligent distribution of data warehouses

    Directory of Open Access Journals (Sweden)

    Abhay Kumar Agarwal

    2016-07-01

    Full Text Available With the continuous growth in the amount of data, data storage systems have come a long way from flat files systems to RDBMS, Data Warehousing (DW and Distributed Data Warehousing systems. This paper proposes a new distributed data warehouse model. The model is built on a novel approach, for the intelligent distribution of data warehouse. Overall the model is named as Intelligent and Distributed Data Warehouse (IDDW. The proposed model has N-levels and is based on top-down hierarchical design approach of building distributed data warehouse. The building process of IDDW starts with the identification of various locations where DW may be built. Initially, a single location is considered at top-most level of IDDW where DW is built. Thereafter, DW at any other location of any level may be built. A method, to transfer concerned data from any upper level DW to concerned lower level DW, is also presented in the paper. The paper also presents IDDW modeling, its architecture based on modeling, the internal organization of IDDW via which all the operations within IDDW are performed.

  2. Event-Entity-Relationship Modeling in Data Warehouse Environments

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    We use the event-entity-relationship model (EVER) to illustrate the use of entity-based modeling languages for conceptual schema design in data warehouse environments. EVER is a general-purpose information modeling language that supports the specification of both general schema structures and multi...

  3. A Foundation for Spatial Data Warehouses on the Semantic Web


    DEFF Research Database (Denmark)

    Gur, Nurefsan; Pedersen, Torben Bach; Zimaányi, Esteban

    2017-01-01

    Large volumes of geospatial data are being published on the Semantic Web (SW), yielding a need for advanced analysis of such data. However, existing SW technologies only support advanced analytical concepts such as multidimensional (MD) data warehouses and Online Analytical Processing (OLAP) over...

  4. On sustainable operation of warehouse order picking systems

    NARCIS (Netherlands)

    Andriansyah, R.; Etman, L.F.P.; Rooda, J.E.

    2009-01-01

    Sustainable development calls for an efficient utilization of natural and human resources. This issue also arises for warehouse systems, where typically extensive capital investment and labor intensive work are involved. It is therefore important to assess and continuously monitor the performance of

  5. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  6. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  7. Creating a Data Warehouse using SQL Server

    DEFF Research Database (Denmark)

    Sørensen, Jens Otto; Alnor, Karl

    1999-01-01

    In this paper we construct a Star Join Schema and show how this schema can be created using the basic tools delivered with SQL Server 7.0. Major objectives are to keep the operational database unchanged so that data loading can be done with out disturbing the business logic of the operational...

  8. Quality assessment of digital annotated ECG data from clinical trials by the FDA ECG Warehouse.

    Science.gov (United States)

    Sarapa, Nenad

    2007-09-01

    The FDA mandates that digital electrocardiograms (ECGs) from 'thorough' QTc trials be submitted into the ECG Warehouse in Health Level 7 extended markup language format with annotated onset and offset points of waveforms. The FDA did not disclose the exact Warehouse metrics and minimal acceptable quality standards. The author describes the Warehouse scoring algorithms and metrics used by FDA, points out ways to improve FDA review and suggests Warehouse benefits for pharmaceutical sponsors. The Warehouse ranks individual ECGs according to their score for each quality metric and produces histogram distributions with Warehouse-specific thresholds that identify ECGs of questionable quality. Automatic Warehouse algorithms assess the quality of QT annotation and duration of manual QT measurement by the central ECG laboratory.

  9. Semantic integration of medication data into the EHOP Clinical Data Warehouse.

    Science.gov (United States)

    Delamarre, Denis; Bouzille, Guillaume; Dalleau, Kevin; Courtel, Denis; Cuggia, Marc

    2015-01-01

    Reusing medication data is crucial for many medical research domains. Semantic integration of such data in clinical data warehouse (CDW) is quite challenging. Our objective was to develop a reliable and scalable method for integrating prescription data into EHOP (a French CDW). PN13/PHAST was used as the semantic interoperability standard during the ETL process, and to store the prescriptions as documents in the CDW. Theriaque was used as a drug knowledge database (DKDB), to annotate the prescription dataset with the finest granularity, and to provide semantic capabilities to the EHOP query workbench. the system was evaluated on a clinical data set. Depending on the use case, the precision ranged from 52% to 100%, Recall was always 100%. interoperability standards and DKDB, document approach, and the finest granularity approach are the key factors for successful drug data integration in CDW.

  10. Selection of Forklift Unit for Warehouse Operation by Applying Multi-Criteria Analysis

    Directory of Open Access Journals (Sweden)

    Predrag Atanasković

    2013-07-01

    Full Text Available This paper presents research related to the choice of the criteria that can be used to perform an optimal selection of the forklift unit for warehouse operation. The analysis has been done with the aim of exploring the requirements and defining relevant criteria that are important when investment decision is made for forklift procurement, and based on the conducted research by applying multi-criteria analysis, to determine the appropriate parameters and their relative weights that form the input data and database for selection of the optimal handling unit. This paper presents an example of choosing the optimal forklift based on the selected criteria for the purpose of making the relevant investment decision.

  11. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  12. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    Faced with the development of bioinformatics, high-throughput genomic technology have enabled biology to enter the era of big data. [1] Bioinformatics is an interdisciplinary, including the acquisition, management, analysis, interpretation and application of biological information, etc. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets.[2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.

  13. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.

  14. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  15. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  16. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  17. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein...

  18. Multidimensional Databases and Data Warehousing

    CERN Document Server

    Jensen, Christian

    2010-01-01

    The present book's subject is multidimensional data models and data modeling concepts as they are applied in real data warehouses. The book aims to present the most important concepts within this subject in a precise and understandable manner. The book's coverage of fundamental concepts includes data cubes and their elements, such as dimensions, facts, and measures and their representation in a relational setting; it includes architecture-related concepts; and it includes the querying of multidimensional databases.The book also covers advanced multidimensional concepts that are considered to b

  19. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    Bioinformatics is an emerging scientific discipline that uses information ... complex biological questions. ... and computer programs for various purposes of primer ..... polymerase chain reaction: Human Immunodeficiency Virus 1 model studies.

  20. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  1. Data warehouse governance programs in healthcare settings: a literature review and a call to action.

    Science.gov (United States)

    Elliott, Thomas E; Holmes, John H; Davidson, Arthur J; La Chance, Pierre-Andre; Nelson, Andrew F; Steiner, John F

    2013-01-01

    Given the extensive data stored in healthcare data warehouses, data warehouse governance policies are needed to ensure data integrity and privacy. This review examines the current state of the data warehouse governance literature as it applies to healthcare data warehouses, identifies knowledge gaps, provides recommendations, and suggests approaches for further research. A comprehensive literature search using five data bases, journal article title-search, and citation searches was conducted between 1997 and 2012. Data warehouse governance documents from two healthcare systems in the USA were also reviewed. A modified version of nine components from the Data Governance Institute Framework for data warehouse governance guided the qualitative analysis. Fifteen articles were retrieved. Only three were related to healthcare settings, each of which addressed only one of the nine framework components. Of the remaining 12 articles, 10 addressed between one and seven framework components and the remainder addressed none. Each of the two data warehouse governance plans obtained from healthcare systems in the USA addressed a subset of the framework components, and between them they covered all nine. While published data warehouse governance policies are rare, the 15 articles and two healthcare organizational documents reviewed in this study may provide guidance to creating such policies. Additional research is needed in this area to ensure that data warehouse governance polices are feasible and effective. The gap between the development of data warehouses in healthcare settings and formal governance policies is substantial, as evidenced by the sparse literature in this domain.

  2. Applications and Methods Utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for Bioinformatics Resource Discovery and Disparate Data and Service Integration

    Science.gov (United States)

    Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...

  3. Routing Optimization of Intelligent Vehicle in Automated Warehouse

    Directory of Open Access Journals (Sweden)

    Yan-cong Zhou

    2014-01-01

    Full Text Available Routing optimization is a key technology in the intelligent warehouse logistics. In order to get an optimal route for warehouse intelligent vehicle, routing optimization in complex global dynamic environment is studied. A new evolutionary ant colony algorithm based on RFID and knowledge-refinement is proposed. The new algorithm gets environmental information timely through the RFID technology and updates the environment map at the same time. It adopts elite ant kept, fallback, and pheromones limitation adjustment strategy. The current optimal route in population space is optimized based on experiential knowledge. The experimental results show that the new algorithm has higher convergence speed and can jump out the U-type or V-type obstacle traps easily. It can also find the global optimal route or approximate optimal one with higher probability in the complex dynamic environment. The new algorithm is proved feasible and effective by simulation results.

  4. Data Warehouse Design from HL7 Clinical Document Architecture Schema.

    Science.gov (United States)

    Pecoraro, Fabrizio; Luzi, Daniela; Ricci, Fabrizio L

    2015-01-01

    This paper proposes a semi-automatic approach to extract clinical information structured in a HL7 Clinical Document Architecture (CDA) and transform it in a data warehouse dimensional model schema. It is based on a conceptual framework published in a previous work that maps the dimensional model primitives with CDA elements. Its feasibility is demonstrated providing a case study based on the analysis of vital signs gathered during laboratory tests.

  5. Aspects of Data Warehouse Technologies for Complex Web Data

    OpenAIRE

    Thomsen, Christian

    2008-01-01

    This thesis is about aspects of specification and development of datawarehouse technologies for complex web data. Today, large amounts of dataexist in different web resources and in different formats. But it is oftenhard to analyze and query the often big and complex data or data about thedata (i.e., metadata). It is therefore interesting to apply Data Warehouse(DW) technology to the data. But to apply DW technology to complex web datais not straightforward and the DW community faces new and ...

  6. Unexpected levels and movement of radon in a large warehouse

    International Nuclear Information System (INIS)

    Gammage, R.B.; Espinosa, G.

    2004-01-01

    Alpha-track detectors, used in screening for radon, identified a large warehouse with levels of radon as high as 20 p Ci/l. This circumstance was unexpected because large bay doors were left open for much of the day to admit 1 8-wheeler trucks, and exhaust fans in the roof produced good ventilation. More detailed temporal and spatial investigations of radon and air-flow patterns were made with electret chambers, Lucas-cell flow chambers, tracer gas, smoke pencils and pressure sensing micrometers. An oval-dome shaped zone of radon (>4 p Ci/L) persisted in the central region of each of four separate bays composing the warehouse. Detailed studies of air movement in the bay with the highest levels of radon showed clockwise rotation of air near the outer walls with a central dead zone. Sub slab, radon-laden air ingresses the building through expansion joints between the floor slabs to produce the measured radon. The likely source of radon is air within porous, karst bedrock that underlies much of north-central Tennessee where the warehouse is situated

  7. Warehouse hazardous and toxic waste design in Karingau Balikpapan

    Science.gov (United States)

    Pratama, Bayu Rendy; Kencanawati, Martheana

    2017-11-01

    PT. Balikpapan Environmental Services (PT. BES) is company that having core business in Hazardous and Toxic Waste Management Services which consisting storage and transporter at Balikpapan. This research starting with data collection such as type of waste, quantity of waste, dimension area of existing building, waste packaging (Drum, IBC tank, Wooden Box, & Bulk Bag). Processing data that will be done are redesign for warehouse dimension and layout of position waste, specify of capacity, specify of quantity, type and detector placement, specify of quantity, type and fire extinguishers position which refers to Bapedal Regulation No. 01 In 1995, SNI 03-3985-2000, Employee Minister Regulation RI No. Per-04/Men/1980. Based on research that already done, founded the design for warehouse dimension of waste is 23 m × 22 m × 5 m with waste layout position appropriate with type of waste. The necessary of quantity for detector on this waste warehouse design are 56 each. The type of fire extinguisher that appropriate with this design is dry powder which containing natrium carbonate, alkali salts, with having each weight of 12 Kg about 18 units.

  8. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  9. THE DEVELOPMENT OF THE APPLICATION OF A DATA WAREHOUSE AT PT JKL

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2012-05-01

    Full Text Available One rapidly evolving technology today is information technology, which can help decision-making in an organization or a company. The data warehouse is one form of information technology that supports those needs, as one of the right solutions for companies in decision-making. The objective of this research is the development of a data warehouse at PT JKL in order to support executives in analyzing the organization and support the decision-making process. Methodology of this research is conducting interview with related units, literature study and document examination. This research also used the Nine Step Methodology developed by Kimball to design the data warehouse. The results obtained is an application that can summarize the data warehouse, integrating and presenting historical data in multidimensional. The conclusion from this research is the data warehouse can help companies to analyze data in a flexible, fast, and effective data access.Keywords: Data Warehouse; Inventory; Contract Approval; Inventory; Dashboard

  10. toxoMine: an integrated omics data warehouse for Toxoplasma gondii systems biology research.

    Science.gov (United States)

    Rhee, David B; Croken, Matthew McKnight; Shieh, Kevin R; Sullivan, Julie; Micklem, Gos; Kim, Kami; Golden, Aaron

    2015-01-01

    Toxoplasma gondii (T. gondii) is an obligate intracellular parasite that must monitor for changes in the host environment and respond accordingly; however, it is still not fully known which genetic or epigenetic factors are involved in regulating virulence traits of T. gondii. There are on-going efforts to elucidate the mechanisms regulating the stage transition process via the application of high-throughput epigenomics, genomics and proteomics techniques. Given the range of experimental conditions and the typical yield from such high-throughput techniques, a new challenge arises: how to effectively collect, organize and disseminate the generated data for subsequent data analysis. Here, we describe toxoMine, which provides a powerful interface to support sophisticated integrative exploration of high-throughput experimental data and metadata, providing researchers with a more tractable means toward understanding how genetic and/or epigenetic factors play a coordinated role in determining pathogenicity of T. gondii. As a data warehouse, toxoMine allows integration of high-throughput data sets with public T. gondii data. toxoMine is also able to execute complex queries involving multiple data sets with straightforward user interaction. Furthermore, toxoMine allows users to define their own parameters during the search process that gives users near-limitless search and query capabilities. The interoperability feature also allows users to query and examine data available in other InterMine systems, which would effectively augment the search scope beyond what is available to toxoMine. toxoMine complements the major community database ToxoDB by providing a data warehouse that enables more extensive integrative studies for T. gondii. Given all these factors, we believe it will become an indispensable resource to the greater infectious disease research community. © The Author(s) 2015. Published by Oxford University Press.

  11. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Full Text Available Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools.  This activity is designed to provide secondary and undergraduate biology students to a hands-on activity meant to explore and understand gene structure with the use of basic bioinformatic tools.  Students are provided an “unknown” sequence from which they are asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  12. A database for coconut crop improvement.

    Science.gov (United States)

    Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam

    2005-12-08

    Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running in Linux operating system. The database interface is developed in PHP, HTML and JAVA. http://www.bioinfcpcri.org.

  13. 7 CFR 735.401 - Electronic warehouse receipt and USWA electronic document providers.

    Science.gov (United States)

    2010-01-01

    ... audit level financial statement prepared according to generally accepted accounting standards as defined... warehouse receipt requirements; (3) Liability; (4) Transfer of records protocol; (5) Records; (6) Conflict...

  14. Two warehouse inventory model for deteriorating item with exponential demand rate and permissible delay in payment

    Directory of Open Access Journals (Sweden)

    Kaliraman Naresh Kumar

    2017-01-01

    Full Text Available A two warehouse inventory model for deteriorating items is considered with exponential demand rate and permissible delay in payment. Shortage is not allowed and deterioration rate is constant. In the model, one warehouse is rented and the other is owned. The rented warehouse is provided with better facility for the stock than the owned warehouse, but is charged more. The objective of this model is to find the best replenishment policies for minimizing the total appropriate inventory cost. A numerical illustration and sensitivity analysis is provided.

  15. Deteksi Outlier Transaksi Menggunakan Visualisasi-Olap Pada Data Warehouse Perguruan Tinggi Swasta

    Directory of Open Access Journals (Sweden)

    Gusti Ngurah Mega Nata

    2016-07-01

    Full Text Available Mendeteksi outlier pada data warehouse merupakan hal penting. Data pada data warehouse sudah diagregasi dan memiliki model multidimensional. Agregasi pada data warehouse dilakukan karena data warehouse digunakan untuk menganalisis data secara cepat pada top level manajemen. Sedangkan, model data multidimensional digunakan untuk melihat data dari berbagai dimensi objek bisnis. Jadi, Mendeteksi outlier pada data warehouse membutuhkan teknik yang dapat melihat outlier pada data yang sudah diagregasi dan dapat melihat dari berbagai dimensi objek bisnis. Mendeteksi outlier pada data warehouse akan menjadi tantangan baru.        Di lain hal, Visualisasi On-line Analytic process (OLAP merupakan tugas penting dalam menyajikan informasi trend (report pada data warehouse dalam bentuk visualisasi data. Pada penelitian ini, visualisasi OLAP digunakan untuk deteksi outlier transaksi. Maka, dalam penelitian ini melakukan analisis untuk mendeteksi outlier menggunakan visualisasi-OLAP. Operasi OLAP yang digunakan yaitu operasi drill-down. Jenis visualisasi yang akan digunakan yaitu visualisasi satu dimensi, dua dimensi dan multi dimensi menggunakan tool weave desktop. Pembangunan data warehouse dilakukan secara button-up. Studi kasus dilakukan pada perguruan tinggi swasta. Kasus yang diselesaikan yaitu mendeteksi outlier transaki pembayaran mahasiswa pada setiap semester. Deteksi outlier pada visualisasi data menggunakan satu tabel dimensional lebih mudah dianalisis dari pada deteksi outlier pada visualisasi data menggunakan dua atau multi tabel dimensional. Dengan kata lain semakin banyak tabel dimensi yang terlibat semakin sulit analisis deteksi outlier yang dilakukan. Kata kunci — Deteksi Outlier,  Visualisasi OLAP, Data warehouse

  16. PENGEMBANGAN DATA WAREHOUSE DAN ON-LINE ANALYTICAL PROCESSING (OLAP UNTUK PENEMUAN INFORMASI DAN ANALISIS DATA (Studi Kasus : Sistem Informasi Penerimaan Mahasiswa Baru STMIK AMIKOM PURWOKERTO

    Directory of Open Access Journals (Sweden)

    Giat Karyono

    2011-08-01

    databases (OLTP SIPMB process of creating a data warehouse is done on different machines by performing replicate of the database used SIPMB. For the needs of the application is made data analysis OLAP PMB. These applications could present multidimensional data in grid view. Analysis of these data may include analysis of specialization in new student enrollment based on the period of new admissions, enrollment surge, home school, home province and district, Pengembangan Data Warehouse dan On-line Analytical Processing (OLAP Untuk Penemuan Informasi Dan Analisis DataJurnal Telematika Vol. 4 No.2 Agustus 2011 14registration information, day, month, or year. So that the results of data analysis can be the management in determining the appropriate marketing strategies to increase the number of freshmen applicants either currently running or for admission in the coming year.

  17. What Information Does Your EHR Contain? Automatic Generation of a Clinical Metadata Warehouse (CMDW) to Support Identification and Data Access Within Distributed Clinical Research Networks.

    Science.gov (United States)

    Bruland, Philipp; Doods, Justin; Storck, Michael; Dugas, Martin

    2017-01-01

    Data dictionaries provide structural meta-information about data definitions in health information technology (HIT) systems. In this regard, reusing healthcare data for secondary purposes offers several advantages (e.g. reduce documentation times or increased data quality). Prerequisites for data reuse are its quality, availability and identical meaning of data. In diverse projects, research data warehouses serve as core components between heterogeneous clinical databases and various research applications. Given the complexity (high number of data elements) and dynamics (regular updates) of electronic health record (EHR) data structures, we propose a clinical metadata warehouse (CMDW) based on a metadata registry standard. Metadata of two large hospitals were automatically inserted into two CMDWs containing 16,230 forms and 310,519 data elements. Automatic updates of metadata are possible as well as semantic annotations. A CMDW allows metadata discovery, data quality assessment and similarity analyses. Common data models for distributed research networks can be established based on similarity analyses.

  18. Expiration of Historical Databases

    DEFF Research Database (Denmark)

    Toman, David

    2001-01-01

    We present a technique for automatic expiration of data in a historical data warehouse that preserves answers to a known and fixed set of first-order queries. In addition, we show that for queries with output size bounded by a function of the active data domain size (the number of values that have...... ever appeared in the warehouse), the size of the portion of the data warehouse history needed to answer the queries is also bounded by a function of the active data do-main size and therefore does not depend on the age of the warehouse (the length of the history)....

  19. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  20. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  1. The visit-data warehouse: enabling novel secondary use of health information exchange data.

    Science.gov (United States)

    Fleischman, William; Lowry, Tina; Shapiro, Jason

    2014-01-01

    Health Information Exchange (HIE) efforts face challenges with data quality and performance, and this becomes especially problematic when data is leveraged for uses beyond primary clinical use. We describe a secondary data infrastructure focusing on patient-encounter, nonclinical data that was built on top of a functioning HIE platform to support novel secondary data uses and prevent potentially negative impacts these uses might have otherwise had on HIE system performance. HIE efforts have generally formed for the primary clinical use of individual clinical providers searching for data on individual patients under their care, but many secondary uses have been proposed and are being piloted to support care management, quality improvement, and public health. This infrastructure review describes a module built into the Healthix HIE. Healthix, based in the New York metropolitan region, comprises 107 participating organizations with 29,946 acute-care beds in 383 facilities, and includes more than 9.2 million unique patients. The primary infrastructure is based on the InterSystems proprietary Caché data model distributed across servers in multiple locations, and uses a master patient index to link individual patients' records across multiple sites. We built a parallel platform, the "visit data warehouse," of patient encounter data (demographics, date, time, and type of visit) using a relational database model to allow accessibility using standard database tools and flexibility for developing secondary data use cases. These four secondary use cases include the following: (1) tracking encounter-based metrics in a newly established geriatric emergency department (ED), (2) creating a dashboard to provide a visual display as well as a tabular output of near-real-time de-identified encounter data from the data warehouse, (3) tracking frequent ED users as part of a regional-approach to case management intervention, and (4) improving an existing quality improvement program

  2. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  3. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    DEFF Research Database (Denmark)

    Babbitt, Patricia C.; Bagos, Pantelis G.; Bairoch, Amos

    2015-01-01

    During 11–12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from...... protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication...

  4. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need...... access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools...... into their research.We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  5. Benefits of a clinical data warehouse with data mining tools to collect data for a radiotherapy trial.

    Science.gov (United States)

    Roelofs, Erik; Persoon, Lucas; Nijsten, Sebastiaan; Wiessler, Wolfgang; Dekker, André; Lambin, Philippe

    2013-07-01

    Collecting trial data in a medical environment is at present mostly performed manually and therefore time-consuming, prone to errors and often incomplete with the complex data considered. Faster and more accurate methods are needed to improve the data quality and to shorten data collection times where information is often scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field. In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method. The average time per case to collect the NSCLC data manually was 10.4 ± 2.1 min and 4.3 ± 1.1 min when using the automated method (pdata collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method. Aggregating multiple data sources in a data warehouse combined with tools for extraction of relevant parameters is beneficial for data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be compensated due to the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  6. Benefits of a clinical data warehouse with data mining tools to collect data for a radiotherapy trial

    International Nuclear Information System (INIS)

    Roelofs, Erik; Persoon, Lucas; Nijsten, Sebastiaan; Wiessler, Wolfgang; Dekker, André; Lambin, Philippe

    2013-01-01

    Introduction: Collecting trial data in a medical environment is at present mostly performed manually and therefore time-consuming, prone to errors and often incomplete with the complex data considered. Faster and more accurate methods are needed to improve the data quality and to shorten data collection times where information is often scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field. Material and methods: In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method. Results: The average time per case to collect the NSCLC data manually was 10.4 ± 2.1 min and 4.3 ± 1.1 min when using the automated method (p < 0.001). For rectal cancer, these times were 13.5 ± 4.1 and 6.8 ± 2.4 min, respectively (p < 0.001). In 3.2% of the data collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method. Conclusions: Aggregating multiple data sources in a data warehouse combined with tools for extraction of relevant parameters is beneficial for data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be compensated due to the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases

  7. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  8. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics – LITBIO.

  9. The Androgen Receptor Gene Mutations Database.

    Science.gov (United States)

    Gottlieb, B; Lehvaslaiho, H; Beitel, L K; Lumbroso, R; Pinsky, L; Trifiro, M

    1998-01-01

    The current version of the androgen receptor (AR) gene mutations database is described. The total number of reported mutations has risen from 272 to 309 in the past year. We have expanded the database: (i) by giving each entry an accession number; (ii) by adding information on the length of polymorphic polyglutamine (polyGln) and polyglycine (polyGly) tracts in exon 1; (iii) by adding information on large gene deletions; (iv) by providing a direct link with a completely searchable database (courtesy EMBL-European Bioinformatics Institute). The addition of the exon 1 polymorphisms is discussed in light of their possible relevance as markers for predisposition to prostate or breast cancer. The database is also available on the internet (http://www.mcgill. ca/androgendb/ ), from EMBL-European Bioinformatics Institute (ftp. ebi.ac.uk/pub/databases/androgen ), or as a Macintosh FilemakerPro or Word file (MC33@musica.mcgill.ca).

  10. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology.High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demands of developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of American on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics. The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  11. BioRuby: bioinformatics software for the Ruby programming language.

    Science.gov (United States)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-10-15

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. katayama@bioruby.org

  12. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora programs for PCR primer design reflects the.

  13. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  14. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  15. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  16. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  17. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  18. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is the need for appropriate strategies of introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool an implementation of ...

  19. ON PROBLEM OF REGIONAL WAREHOUSE AND TRANSPORT INFRASTRUCTURE OPTIMIZATION

    Directory of Open Access Journals (Sweden)

    I. Yu. Miretskiy

    2017-01-01

    Full Text Available The article suggests an approach of solving the problem of warehouse and transport infrastructure optimization in a region. The task is to determine the optimal capacity and location of the support network of warehouses in the region, as well as power, composition and location of motor fleets. Optimization is carried out using mathematical models of a regional warehouse network and a network of motor fleets. These models are presented as mathematical programming problems with separable functions. The process of finding the optimal solution of problems is complicated due to high dimensionality, non-linearity of functions, and the fact that a part of variables are constrained to integer, and some variables can take values only from a discrete set. Given the mentioned above complications search for an exact solution was rejected. The article suggests an approximate approach to solving problems. This approach employs effective computational schemes for solving multidimensional optimization problems. We use the continuous relaxation of the original problem to obtain its approximate solution. An approximately optimal solution of continuous relaxation is taken as an approximate solution of the original problem. The suggested solution method implies linearization of the obtained continuous relaxation and use of the separable programming scheme and the scheme of branches and bounds. We describe the use of the simplex method for solving the linearized continuous relaxation of the original problem and the specific moments of the branches and bounds method implementation. The paper shows the finiteness of the algorithm and recommends how to accelerate process of finding a solution.

  20. A Framework for a Clinical Reasoning Knowledge Warehouse

    DEFF Research Database (Denmark)

    Vilstrup Pedersen, Klaus; Boye, Niels

    2004-01-01

    In many areas of the medical domain, the decision process i.e. reasoning, involving health care professionals is distributed, cooperative and complex. This paper presents a framework for a Clinical Reasoning Knowledge Warehouse that combines theories and models from Artificial Intelligence...... is stored and made accessible when relevant to the reasoning context and the specific patient case. Furthermore, the information structure supports the creation of new generalized knowledge using data mining tools. The patient case is divided into an observation level and an opinion level. At the opinion...

  1. Data warehouse de soporte a datos de GSA

    OpenAIRE

    Arribas López, Iván

    2008-01-01

    El presente documento describe los procesos de extracción, transformación y carga de logs de GSA en un data warehouse. GSA (Google Search Appliance) es una aplicacion de Google que utiliza su gestor de consultas para buscar información en la información indexada de un determinado sitio web. Esta aplicación consecuentemente guarda un log de consultas de usuario a ese sitio web en formato estándar CLF modificado. Analizar este log le permitiría conocer al promotor del sitio en cuestión la infor...

  2. Warehouse stocking optimization based on dynamic ant colony genetic algorithm

    Science.gov (United States)

    Xiao, Xiaoxu

    2018-04-01

    In view of the various orders of FAW (First Automotive Works) International Logistics Co., Ltd., the SLP method is used to optimize the layout of the warehousing units in the enterprise, thus the warehouse logistics is optimized and the external processing speed of the order is improved. In addition, the relevant intelligent algorithms for optimizing the stocking route problem are analyzed. The ant colony algorithm and genetic algorithm which have good applicability are emphatically studied. The parameters of ant colony algorithm are optimized by genetic algorithm, which improves the performance of ant colony algorithm. A typical path optimization problem model is taken as an example to prove the effectiveness of parameter optimization.

  3. Improving warehouse responsiveness by job priority management : A European distribution centre field study

    NARCIS (Netherlands)

    T.Y. Kim (Thai Young)

    2018-01-01

    textabstractWarehouses employ order cut-off times to ensure sufficient time for fulfilment. To satisfy higher consumer expectations, these cut-off times are gradually postponed to improve order responsiveness. Warehouses therefore have to allocate jobs more efficiently to meet compressed response

  4. Population dynamics of stored maize insect pests in warehouses in two districts of Ghana

    Science.gov (United States)

    Understanding what insect species are present and their temporal and spatial patterns of distribution is important for developing a successful integrated pest management strategy for food storage in warehouses. Maize in many countries in Africa is stored in bags in warehouses, but little monitoring ...

  5. 7 CFR 1427.16 - Movement and protection of warehouse-stored cotton.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Movement and protection of warehouse-stored cotton. 1427.16 Section 1427.16 Agriculture Regulations of the Department of Agriculture (Continued) COMMODITY... Cotton Loan and Loan Deficiency Payments § 1427.16 Movement and protection of warehouse-stored cotton. (a...

  6. Protocol for a national blood transfusion data warehouse from donor to recipient

    NARCIS (Netherlands)

    van Hoeven, Loan R; Hooftman, Babette H; Janssen, Mart P; de Bruijne, Martine C; de Vooght, Karen M K; Kemper, Peter; Koopman, Maria M W

    2016-01-01

    INTRODUCTION: Blood transfusion has health-related, economical and safety implications. In order to optimise the transfusion chain, comprehensive research data are needed. The Dutch Transfusion Data warehouse (DTD) project aims to establish a data warehouse where data from donors and transfusion

  7. 19 CFR 19.14 - Materials for use in manufacturing warehouse.

    Science.gov (United States)

    2010-04-01

    ... warehouse is located under an immediate transportation without appraisement entry or warehouse withdrawal for transportation, whichever is applicable. (b) Bond required. Before the transfer of the merchandise... the manufacture of articles as authorized by law. Port Director (d) Domestic spirits and wines. For...

  8. 27 CFR 28.244a - Shipment to a customs bonded warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Shipment to a customs... Export Consignment § 28.244a Shipment to a customs bonded warehouse. Distilled spirits and wine withdrawn for shipment to a customs bonded warehouse shall be consigned in care of the customs officer in charge...

  9. 27 CFR 28.27 - Entry of wine into customs bonded warehouses.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Entry of wine into customs... TRADE BUREAU, DEPARTMENT OF THE TREASURY LIQUORS EXPORTATION OF ALCOHOL Miscellaneous Provisions Customs Bonded Warehouses § 28.27 Entry of wine into customs bonded warehouses. Upon filing of the application or...

  10. 78 FR 65300 - Notice of Availability (NOA) for General Purpose Warehouse and Information Technology Center...

    Science.gov (United States)

    2013-10-31

    ... (NOA) for General Purpose Warehouse and Information Technology Center Construction (GPW/IT)--Tracy Site... proposed action to construct a General Purpose Warehouse and Information Technology Center at Defense..., Suite 02G09, Alexandria, VA 22350- 3100. FOR FURTHER INFORMATION CONTACT: Ann Engelberger at (703) 767...

  11. 27 CFR 24.126 - Change in proprietorship involving a bonded wine warehouse.

    Science.gov (United States)

    2010-04-01

    ... involving a bonded wine warehouse. 24.126 Section 24.126 Alcohol, Tobacco Products and Firearms ALCOHOL AND TOBACCO TAX AND TRADE BUREAU, DEPARTMENT OF THE TREASURY LIQUORS WINE Establishment and Operations Changes Subsequent to Original Establishment § 24.126 Change in proprietorship involving a bonded wine warehouse...

  12. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  13. Development of a clinical data warehouse from an intensive care clinical information system.

    Science.gov (United States)

    de Mul, Marleen; Alons, Peter; van der Velde, Peter; Konings, Ilse; Bakker, Jan; Hazelzet, Jan

    2012-01-01

    There are relatively few institutions that have developed clinical data warehouses, containing patient data from the point of care. Because of the various care practices, data types and definitions, and the perceived incompleteness of clinical information systems, the development of a clinical data warehouse is a challenge. In order to deal with managerial and clinical information needs, as well as educational and research aims that are important in the setting of a university hospital, Erasmus Medical Center Rotterdam, The Netherlands, developed a data warehouse incrementally. In this paper we report on the in-house development of an integral part of the data warehouse specifically for the intensive care units (ICU-DWH). It was modeled using Atos Origin Metadata Frame method. The paper describes the methodology, the development process and the content of the ICU-DWH, and discusses the need for (clinical) data warehouses in intensive care. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  14. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  15. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  16. Characteristics desired in clinical data warehouse for biomedical research.

    Science.gov (United States)

    Shin, Soo-Yong; Kim, Woo Sung; Lee, Jae-Ho

    2014-04-01

    Due to the unique characteristics of clinical data, clinical data warehouses (CDWs) have not been successful so far. Specifically, the use of CDWs for biomedical research has been relatively unsuccessful thus far. The characteristics necessary for the successful implementation and operation of a CDW for biomedical research have not clearly defined yet. THREE EXAMPLES OF CDWS WERE REVIEWED: a multipurpose CDW in a hospital, a CDW for independent multi-institutional research, and a CDW for research use in an institution. After reviewing the three CDW examples, we propose some key characteristics needed in a CDW for biomedical research. A CDW for research should include an honest broker system and an Institutional Review Board approval interface to comply with governmental regulations. It should also include a simple query interface, an anonymized data review tool, and a data extraction tool. Also, it should be a biomedical research platform for data repository use as well as data analysis. The proposed characteristics desired in a CDW may have limited transfer value to organizations in other countries. However, these analysis results are still valid in Korea, and we have developed clinical research data warehouse based on these desiderata.

  17. MitoMiner: a data warehouse for mitochondrial proteomics data.

    Science.gov (United States)

    Smith, Anthony C; Blackshaw, James A; Robinson, Alan J

    2012-01-01

    MitoMiner (http://mitominer.mrc-mbu.cam.ac.uk/) is a data warehouse for the storage and analysis of mitochondrial proteomics data gathered from publications of mass spectrometry and green fluorescent protein tagging studies. In MitoMiner, these data are integrated with data from UniProt, Gene Ontology, Online Mendelian Inheritance in Man, HomoloGene, Kyoto Encyclopaedia of Genes and Genomes and PubMed. The latest release of MitoMiner stores proteomics data sets from 46 studies covering 11 different species from eumetazoa, viridiplantae, fungi and protista. MitoMiner is implemented by using the open source InterMine data warehouse system, which provides a user interface allowing users to upload data for analysis, personal accounts to store queries and results and enables queries of any data in the data model. MitoMiner also provides lists of proteins for use in analyses, including the new MitoMiner mitochondrial proteome reference sets that specify proteins with substantial experimental evidence for mitochondrial localization. As further mitochondrial proteomics data sets from normal and diseased tissue are published, MitoMiner can be used to characterize the variability of the mitochondrial proteome between tissues and investigate how changes in the proteome may contribute to mitochondrial dysfunction and mitochondrial-associated diseases such as cancer, neurodegenerative diseases, obesity, diabetes, heart failure and the ageing process.

  18. Biopython: freely available Python tools for computational molecular biology and bioinformatics

    DEFF Research Database (Denmark)

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T

    2009-01-01

    SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments......, dealing with 3D macro molecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. AVAILABILITY: Biopython is freely available, with documentation and source code at (www...

  19. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzen, China, September 20-22, 2017.

  20. The eBioKit, a stand-alone educational platform for bioinformatics.

    Science.gov (United States)

    Hernández-de-Diego, Rafael; de Villiers, Etienne P; Klingström, Tomas; Gourlé, Hadrien; Conesa, Ana; Bongcam-Rudloff, Erik

    2017-09-01

    Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

  1. The ATLAS Distributed Data Management System & Databases

    CERN Document Server

    Garonne, V; The ATLAS collaboration; Barisits, M; Beermann, T; Vigne, R; Serfon, C

    2013-01-01

    The ATLAS Distributed Data Management (DDM) System is responsible for the global management of petabytes of high energy physics data. The current system, DQ2, has a critical dependency on Relational Database Management Systems (RDBMS), like Oracle. RDBMS are well-suited to enforcing data integrity in online transaction processing applications, however, concerns have been raised about the scalability of its data warehouse-like workload. In particular, analysis of archived data or aggregation of transactional data for summary purposes is problematic. Therefore, we have evaluated new approaches to handle vast amounts of data. We have investigated a class of database technologies commonly referred to as NoSQL databases. This includes distributed filesystems, like HDFS, that support parallel execution of computational tasks on distributed data, as well as schema-less approaches via key-value stores, like HBase. In this talk we will describe our use cases in ATLAS, share our experiences with various databases used ...

  2. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed.

  3. FCDD: A Database for Fruit Crops Diseases.

    Science.gov (United States)

    Chauhan, Rupal; Jasrai, Yogesh; Pandya, Himanshu; Chaudhari, Suman; Samota, Chand Mal

    2014-01-01

    Fruit Crops Diseases Database (FCDD) requires a number of biotechnology and bioinformatics tools. The FCDD is a unique bioinformatics resource that compiles information about 162 details on fruit crops diseases, diseases type, its causal organism, images, symptoms and their control. The FCDD contains 171 phytochemicals from 25 fruits, their 2D images and their 20 possible sequences. This information has been manually extracted and manually verified from numerous sources, including other electronic databases, textbooks and scientific journals. FCDD is fully searchable and supports extensive text search. The main focus of the FCDD is on providing possible information of fruit crops diseases, which will help in discovery of potential drugs from one of the common bioresource-fruits. The database was developed using MySQL. The database interface is developed in PHP, HTML and JAVA. FCDD is freely available. http://www.fruitcropsdd.com/

  4. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Full Text Available Objective: Within the information technology (IT industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise′s overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework

  5. The growing need for microservices in bioinformatics.

    Science.gov (United States)

    Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J

    2016-01-01

    Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on nimble IT framework which can adapt to changing requirements. To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and

  6. The growing need for microservices in bioinformatics

    Science.gov (United States)

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework is an effective

  7. Bioinformatics of cardiovascular miRNA biology.

    Science.gov (United States)

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  9. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  10. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...

  11. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  12. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.

  13. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  14. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  15. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  16. Saccharomyces genome database informs human biology

    OpenAIRE

    Skrzypek, Marek S; Nash, Robert S; Wong, Edith D; MacPherson, Kevin A; Hellerstedt, Sage T; Engel, Stacia R; Karra, Kalpana; Weng, Shuai; Sheppard, Travis K; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Cherry, J Michael

    2017-01-01

    Abstract The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is an expertly curated database of literature-derived functional information for the model organism budding yeast, Saccharomyces cerevisiae. SGD constantly strives to synergize new types of experimental data and bioinformatics predictions with existing data, and to organize them into a comprehensive and up-to-date information resource. The primary mission of SGD is to facilitate research into the biology of yeast and...

  17. The ESID Online Database network.

    Science.gov (United States)

    Guzman, D; Veit, D; Knerr, V; Kindle, G; Gathmann, B; Eades-Perner, A M; Grimbacher, B

    2007-03-01

    Primary immunodeficiencies (PIDs) belong to the group of rare diseases. The European Society for Immunodeficiencies (ESID), is establishing an innovative European patient and research database network for continuous long-term documentation of patients, in order to improve the diagnosis, classification, prognosis and therapy of PIDs. The ESID Online Database is a web-based system aimed at data storage, data entry, reporting and the import of pre-existing data sources in an enterprise business-to-business integration (B2B). The online database is based on Java 2 Enterprise System (J2EE) with high-standard security features, which comply with data protection laws and the demands of a modern research platform. The ESID Online Database is accessible via the official website (http://www.esid.org/). Supplementary data are available at Bioinformatics online.

  18. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology

    Directory of Open Access Journals (Sweden)

    Fernando González-Nilo

    2011-01-01

    Full Text Available After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  19. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  20. Functional proteomics with new mass spectrometric and bioinformatics tools

    International Nuclear Information System (INIS)

    Kesners, P.W.A.

    2001-01-01

    A comprehensive range of mass spectrometric tools is required to investigate todays life science applications and a strong focus is on addressing the needs of functional proteomics. Application examples are given showing the streamlined process of protein identification from low femtomole amounts of digests. Sample preparation is achieved with a convertible robot for automated 2D gel picking, and MALDI target dispensing. MALDI-TOF or ESI-MS subsequent to enzymatic digestion. A choice of mass spectrometers including Q-q-TOF with multipass capability, MALDI-MS/MS with unsegmented PSD, Ion Trap and FT-MS are discussed for their respective strengths and applications. Bioinformatics software that allows both database work and novel peptide mass spectra interpretation is reviewed. The automated database searching uses either entire digest LC-MS n ESI Ion Trap data or MALDI MS and MS/MS spectra. It is shown how post translational modifications are interactively uncovered and de-novo sequencing of peptides is facilitated

  1. Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods.

    Science.gov (United States)

    Wang, Liming; Zhu, L; Luan, R; Wang, L; Fu, J; Wang, X; Sui, L

    2016-10-10

    Dilated cardiomyopathy (DCM) is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using methods of bioinformatics. The gene expression profiles of GSE3586 were downloaded from Gene Expression Omnibus database, including 15 normal samples and 13 DCM samples. The differentially expressed genes (DEGs) were identified between normal and DCM samples using Limma package in R language. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs) and microRNAs (miRNAs) of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find the potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT) were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family). Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1), potential TFs, as well as potential miRNAs, might be involved in DCM.

  2. Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods

    Directory of Open Access Journals (Sweden)

    Liming Wang

    Full Text Available Dilated cardiomyopathy (DCM is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using methods of bioinformatics. The gene expression profiles of GSE3586 were downloaded from Gene Expression Omnibus database, including 15 normal samples and 13 DCM samples. The differentially expressed genes (DEGs were identified between normal and DCM samples using Limma package in R language. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs and microRNAs (miRNAs of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find the potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family. Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1, potential TFs, as well as potential miRNAs, might be involved in DCM.

  3. Column-Oriented Databases, an Alternative for Analytical Environment

    Directory of Open Access Journals (Sweden)

    Gheorghe MATEI

    2010-12-01

    Full Text Available It is widely accepted that a data warehouse is the central place of a Business Intelligence system. It stores all data that is relevant for the company, data that is acquired both from internal and external sources. Such a repository stores data from more years than a transactional system can do, and offer valuable information to its users to make the best decisions, based on accurate and reliable data. As the volume of data stored in an enterprise data warehouse becomes larger and larger, new approaches are needed to make the analytical system more efficient. This paper presents column-oriented databases, which are considered an element of the new generation of DBMS technology. The paper emphasizes the need and the advantages of these databases for an analytical environment and make a short presentation of two of the DBMS built in a columnar approach.

  4. A Simulation Modeling Approach Method Focused on the Refrigerated Warehouses Using Design of Experiment

    Science.gov (United States)

    Cho, G. S.

    2017-09-01

    For performance optimization of Refrigerated Warehouses, design parameters are selected based on the physical parameters such as number of equipment and aisles, speeds of forklift for ease of modification. This paper provides a comprehensive framework approach for the system design of Refrigerated Warehouses. We propose a modeling approach which aims at the simulation optimization so as to meet required design specifications using the Design of Experiment (DOE) and analyze a simulation model using integrated aspect-oriented modeling approach (i-AOMA). As a result, this suggested method can evaluate the performance of a variety of Refrigerated Warehouses operations.

  5. Development of National Health Data Warehouse for Data Mining

    Directory of Open Access Journals (Sweden)

    Shahidul Islam Khan

    2015-07-01

    Full Text Available Health informatics is currently one of the top focuses of computer science researchers. Availability of timely and accurate data is essential for medical decision making. Health care organizations face a common problem with the large amount of data they have in numerous systems. Researchers, health care providers and patients will not be able to utilize the knowledge stored in different repositories unless amalgamate the information from disparate sources is done. This problem can be solved by Data warehousing. Data warehousing techniques share a common set of tasks, include requirements analysis, data design, architectural design, implementation and deployment. Developing health data warehouse is complex and time consuming but is also essential to deliver quality health services. This paper depicts prospects and complexities of health data warehousing and mining and illustrate a data-warehousing model suitable for integrating data from different health care sources to discover effective knowledge.

  6. Flow shop scheduling algorithm to optimize warehouse activities

    Directory of Open Access Journals (Sweden)

    P. Centobelli

    2016-01-01

    Full Text Available Successful flow-shop scheduling outlines a more rapid and efficient process of order fulfilment in warehouse activities. Indeed the way and the speed of order processing and, in particular, the operations concerning materials handling between the upper stocking area and a lower forward picking one must be optimized. The two activities, drops and pickings, have considerable impact on important performance parameters for Supply Chain wholesaler companies. In this paper, a new flow shop scheduling algorithm is formulated in order to process a greater number of orders by replacing the FIFO logic for the drops activities of a wholesaler company on a daily basis. The System Dynamics modelling and simulation have been used to simulate the actual scenario and the output solutions. Finally, a t-Student test validates the modelled algorithm, granting that it can be used for all wholesalers based on drop and picking activities.

  7. Presence survival spores of Bacillus thuringiensis varieties in grain warehouse

    Directory of Open Access Journals (Sweden)

    Sánchez-Yáñez Juan Manuel

    2016-08-01

    Full Text Available Genus Bacillus thuringiensis (Bt synthesized spores and crystals toxic to pest-insects in agriculture. Bt is comospolitan then possible to isolate some subspecies or varieties from warehouse. The aims of study were: i to isolate Bt varieties from grain at werehouse ii to evaluate Bt toxicity on Spodoptera frugiperda and Shit-ophilus zeamaisese iii to analyze Bt spores persistence in Zea mays grains at werehouse compared to same Bt on grains exposed to sun radiation. Results showed that at werehouse were recovered more than one variety of Bt spores. According to each isolate Bt1 o Bt2 were toxic to S. frugiperda or S. zeamaisese. One those Bt belong to var morrisoni. At werehouse these spores on Z. mays grains surviving more time, while the same spores exposed to boicide sun radiation they died.

  8. Urban freight distribution: council warehouses & freight by rail

    Directory of Open Access Journals (Sweden)

    Aleksander SŁADKOWSKI

    2014-10-01

    Full Text Available Rail is the one of the highly underused form of freight transportation in the European Union. Majority of the freightage are distributed by trucks and HGVs. With new regulations and socio-environmental concerns urban logistics is facing a new challenge which can be tackled using innovative transport mechanisms and streamline operations. This article sheds light on a system which integrates freight distribution via metro lines in the closest vicinity of the customer, use of council warehouses and further innovative transport mechanisms for final delivery. This system uses existing infrastructure effectively without impacting its surroundings and triggers the reduction of polluting carriers. This system offers the option of immediate implementation which will enable EU to compete with a global freight distribution market.

  9. Towards a Modernization Process for Secure Data Warehouses

    Science.gov (United States)

    Blanco, Carlos; Pérez-Castillo, Ricardo; Hernández, Arnulfo; Fernández-Medina, Eduardo; Trujillo, Juan

    Data Warehouses (DW) manage crucial enterprise information used for the decision making process which has to be protected from unauthorized accesses. However, security constraints are not properly integrated in the complete DWs’ development process, being traditionally considered in the last stages. Furthermore, legacy systems need a reverse engineering process in order to accomplish re-documentation for detecting new security requirements as well as system’s design recovery to enable migration and reuse. Thus, we have proposed a model driven architecture (MDA) for secure DWs which takes into account security issues from the early stages of development and provides automatic transformations between models. This paper fulfills this architecture providing an architecture-driven modernization (ADM) process focused on obtaining conceptual security models from legacy OLAP systems.

  10. Data warehouse for detection of occupational diseases in OHS data.

    Science.gov (United States)

    Godderis, L; Mylle, G; Coene, M; Verbeek, C; Viaene, B; Bulterys, S; Schouteden, M

    2015-11-01

    Occupational health and safety (OHS) services collect a wide range of data during health surveillance. To build a 'data warehouse' to make OHS data available for research and to investigate sector-specific health problems. Medical data were extracted, transformed and loaded into the data warehouse. After validation, data on lifestyle, categorized medication use, ICD-9-CM encoded sickness absences and health complaints, collected between 2010 and 2014, were analysed with logistic regression to compare proportions between employment sectors, taking into account age, gender, body mass index (BMI) and year of examination. The data set comprised 585000 employees. Average age and employment seniority were 39 ± 12 and 8 ± 9 years, respectively. BMI was 26 ± 5 kg/m(2). Health complaints, medication use and sickness absence significantly increased with BMI and age. The proportion of employees with health problems was highest in health care (64%), government (61%) and manufacturing (60%) and lowest in the service sector. In all sectors, 10% of workers reported locomotor health problems, apart from the service sector (8%) with similar results for medication consumption. Neuropsychological drugs were more frequently used by health care workers (8%). The transport sector contained the highest proportion of cardiological medication users (12%). Finally, 30-59% of employees reported at least one sickness absence episode. Sickness absence due to locomotor issues was highest in manufacturing (11%) and health care (10%), followed by government (9%) and construction (9%). Significant differences in indices of workers' health were observed between sectors. This information is now being used in the implementation of a sector-oriented health surveillance programme. © The Author 2015. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. Implementation of Advanced Warehouses in a Hospital Environment - Case study

    Science.gov (United States)

    Costa, J.; Sameiro Carvalho, M.; Nobre, A.

    2015-05-01

    In Portugal, there is an increase of costs in the healthcare sector due to several factors such as the aging of the population, the increased demand for health care services and the increasing investment in new technologies. Thus, there is a need to reduce costs, by presenting the effective and efficient management of logistics supply systems with enormous potential to achieve savings in health care organizations without compromising the quality of the provided service, which is a critical factor, in this type of sector. In this research project the implementation of Advanced Warehouses has been studied, in the Hospital de Braga patient care units, based in a mix of replenishment systems approaches: the par level system, the two bin system and the consignment model. The logistics supply process is supported by information technology (IT), allowing a proactive replacement of products, based on the hospital services consumption records. The case study was developed in two patient care units, in order to study the impact of the operation of the three replenishment systems. Results showed that an important inventory holding costs reduction can be achieved in the patient care unit warehouses while increasing the service level and increasing control of incoming and stored materials with less human resources. The main conclusion of this work illustrates the possibility of operating multiple replenishment models, according to the types of materials that healthcare organizations deal with, so that they are able to provide quality health care services at a reduced cost and economically sustainable. The adoption of adequate IT has been shown critical for the success of the project.

  12. REDfly: a Regulatory Element Database for Drosophila.

    Science.gov (United States)

    Gallo, Steven M; Li, Long; Hu, Zihua; Halfon, Marc S

    2006-02-01

    Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution.

  13. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Baumbach Jan

    2007-11-01

    Full Text Available Abstract Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows the inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto that networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user can be analyzed in the context of known

  14. Disease-based mortality after percutaneous endoscopic gastrostomy: utility of the enterprise data warehouse.

    Science.gov (United States)

    Poulose, Benjamin K; Kaiser, Joan; Beck, William C; Jackson, Pearlie; Nealon, William H; Sharp, Kenneth W; Holzman, Michael D

    2013-11-01

    Percutaneous endoscopic gastrostomy (PEG) remains a mainstay of enteral access. Thirty-day mortality for PEG has ranged from 16 to 43 %. This study aims to discern patient groups that demonstrate limited survival after PEG placement. The Enterprise Data Warehouse (EDW) concept allows an efficient means of integrating administrative, clinical, and quality-of-life data. On the basis of this concept, we developed the Vanderbilt Procedural Outcomes Database (VPOD) and analyzed these data for evaluation of post-PEG mortality over time. Patients were identified using the VPOD from 2008 to 2010 and followed for 1 year after the procedure. Patients were categorized according to common clinical groups for PEG placement: stroke/CNS tumors, neuromuscular disorders, head and neck cancers, other malignancies, trauma, cerebral palsy, gastroparesis, or other indications for PEG. All-cause mortality at 30, 60, 90, 180, and 360 days was determined by linking VPOD information with the Social Security Death Index. Chi-square analysis was used to determine significance across groups. Nine hundred fifty-three patients underwent PEG placement during the study period. Mortality over time (30-, 60-, 90-, 180-, and 360-day mortality) was greatest for patients with malignancies other than head and neck cancer (29, 45, 57, 66, and 72 %) and least for cerebral palsy or patients with gastroparesis (7 % at all time points). Patients with neuromuscular disorders had a similar mortality curve as head and neck cancer patients. Stroke/CNS tumor patients and patients with other indications had the second highest mortality, while trauma patients had low mortality. PEG mortality was much higher in patients with malignancies other than head and neck cancer compared to previously published rates. PEG should be used with great caution in this and other high-risk patient groups. This study demonstrates the power of an EDW-based database to evaluate large numbers of patients with clinically meaningful

  15. Using Electronic Health Records to Build an Ophthalmologic Data Warehouse and Visualize Patients' Data.

    Science.gov (United States)

    Kortüm, Karsten U; Müller, Michael; Kern, Christoph; Babenko, Alexander; Mayer, Wolfgang J; Kampik, Anselm; Kreutzer, Thomas C; Priglinger, Siegfried; Hirneiss, Christoph

    2017-06-01

    To develop a near-real-time data warehouse (DW) in an academic ophthalmologic center to gain scientific use of increasing digital data from electronic medical records (EMR) and diagnostic devices. Database development. Specific macular clinic user interfaces within the institutional hospital information system were created. Orders for imaging modalities were sent by an EMR-linked picture-archiving and communications system to the respective devices. All data of 325 767 patients since 2002 were gathered in a DW running on an SQL database. A data discovery tool was developed. An exemplary search for patients with age-related macular degeneration, performed cataract surgery, and at least 10 intravitreal (excluding bevacizumab) injections was conducted. Data related to those patients (3 142 204 diagnoses [including diagnoses from other fields of medicine], 720 721 procedures [eg, surgery], and 45 416 intravitreal injections) were stored, including 81 274 optical coherence tomography measurements. A web-based browsing tool was successfully developed for data visualization and filtering data by several linked criteria, for example, minimum number of intravitreal injections of a specific drug and visual acuity interval. The exemplary search identified 450 patients with 516 eyes meeting all criteria. A DW was successfully implemented in an ophthalmologic academic environment to support and facilitate research by using increasing EMR and measurement data. The identification of eligible patients for studies was simplified. In future, software for decision support can be developed based on the DW and its structured data. The improved classification of diseases and semiautomatic validation of data via machine learning are warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Modern bioinformatics meets traditional Chinese medicine.

    Science.gov (United States)

    Gu, Peiqin; Chen, Huajun

    2014-11-01

    Traditional Chinese medicine (TCM) is gaining increasing attention with the emergence of integrative medicine and personalized medicine, characterized by pattern differentiation on individual variance and treatments based on natural herbal synergism. Investigating the effectiveness and safety of the potential mechanisms of TCM and the combination principles of drug therapies will bridge the cultural gap with Western medicine and improve the development of integrative medicine. Dealing with rapidly growing amounts of biomedical data and their heterogeneous nature are two important tasks among modern biomedical communities. Bioinformatics, as an emerging interdisciplinary field of computer science and biology, has become a useful tool for easing the data deluge pressure by automating the computation processes with informatics methods. Using these methods to retrieve, store and analyze the biomedical data can effectively reveal the associated knowledge hidden in the data, and thus promote the discovery of integrated information. Recently, these techniques of bioinformatics have been used for facilitating the interactional effects of both Western medicine and TCM. The analysis of TCM data using computational technologies provides biological evidence for the basic understanding of TCM mechanisms, safety and efficacy of TCM treatments. At the same time, the carrier and targets associated with TCM remedies can inspire the rethinking of modern drug development. This review summarizes the significant achievements of applying bioinformatics techniques to many aspects of the research in TCM, such as analysis of TCM-related '-omics' data and techniques for analyzing biological processes and pharmaceutical mechanisms of TCM, which have shown certain potential of bringing new thoughts to both sides. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  17. SysBioCube: A Data Warehouse and Integrative Data Analysis Platform Facilitating Systems Biology Studies of Disorders of Military Relevance.

    Science.gov (United States)

    Chowbina, Sudhir; Hammamieh, Rasha; Kumar, Raina; Chakraborty, Nabarun; Yang, Ruoting; Mudunuri, Uma; Jett, Marti; Palma, Joseph M; Stephens, Robert

    2013-01-01

    SysBioCube is an integrated data warehouse and analysis platform for experimental data relating to diseases of military relevance developed for the US Army Medical Research and Materiel Command Systems Biology Enterprise (SBE). It brings together, under a single database environment, pathophysio-, psychological, molecular and biochemical data from mouse models of post-traumatic stress disorder and (pre-) clinical data from human PTSD patients.. SysBioCube will organize, centralize and normalize this data and provide an access portal for subsequent analysis to the SBE. It provides new or expanded browsing, querying and visualization to provide better understanding of the systems biology of PTSD, all brought about through the integrated environment. We employ Oracle database technology to store the data using an integrated hierarchical database schema design. The web interface provides researchers with systematic information and option to interrogate the profiles of pan-omics component across different data types, experimental designs and other covariates.

  18. Multiobjective optimization in bioinformatics and computational biology.

    Science.gov (United States)

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.

  19. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  20. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  1. 77 FR 20353 - United States Warehouse Act; Export Food Aid Commodities Licensing Agreement

    Science.gov (United States)

    2012-04-04

    ... licensing agreement include, but are not limited to, corn soy blend, vegetable oil, and pulses such as peas, beans, and lentils. USWA licensing is a voluntary program. Warehouse operators that apply for USWA...

  2. Promotion bureau warehouse system design. Case study in University of AA

    Science.gov (United States)

    Parwati, N.; Qibtiyah, M.

    2017-12-01

    The warehouse becomes one of the important parts in an industry. By having a good warehousing system, an industry can improve the effectiveness of its performance, so that profits for the company can continue to increase. Meanwhile, if it has a poorly organized warehouse system, it is feared there will be a decrease in the level of effectiveness of the industry itself. In this research, the object was warehousing system in promotion bureau of University AA. To improve the effectiveness of warehousing system, warehouse layout design is done by specifying categories of goods based on the flow of goods in and out of warehouse with ABC analysis method. In addition, the design of information systems to assist in controlling the system to support all the demand for every burreau and department in the university.

  3. 76 FR 13972 - United States Warehouse Act; Export Food Aid Commodities Licensing Agreement

    Science.gov (United States)

    2011-03-15

    ..., nuts, cottonseed, and dry beans. Warehouse operators that apply voluntarily agree to be licensed... program for port and transload facility operators storing EFAC. This proposal is in response to the...

  4. Simple re-instantiation of small databases using cloud computing.

    Science.gov (United States)

    Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

    2013-01-01

    Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

  5. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV, HAdV–C2 and HAdV–C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910 within the hexon gene, of which HAdV–C2 and HAdV–C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  6. Analyzing the field of bioinformatics with the multi-faceted topic modeling technique.

    Science.gov (United States)

    Heo, Go Eun; Kang, Keun Young; Song, Min; Lee, Jeong-Hoon

    2017-05-31

    Bioinformatics is an interdisciplinary field at the intersection of molecular biology and computing technology. To characterize the field as convergent domain, researchers have used bibliometrics, augmented with text-mining techniques for content analysis. In previous studies, Latent Dirichlet Allocation (LDA) was the most representative topic modeling technique for identifying topic structure of subject areas. However, as opposed to revealing the topic structure in relation to metadata such as authors, publication date, and journals, LDA only displays the simple topic structure. In this paper, we adopt the Tang et al.'s Author-Conference-Topic (ACT) model to study the field of bioinformatics from the perspective of keyphrases, authors, and journals. The ACT model is capable of incorporating the paper, author, and conference into the topic distribution simultaneously. To obtain more meaningful results, we use journals and keyphrases instead of conferences and bag-of-words.. For analysis, we use PubMed to collected forty-six bioinformatics journals from the MEDLINE database. We conducted time series topic analysis over four periods from 1996 to 2015 to further examine the interdisciplinary nature of bioinformatics. We analyze the ACT Model results in each period. Additionally, for further integrated analysis, we conduct a time series analysis among the top-ranked keyphrases, journals, and authors according to their frequency. We also examine the patterns in the top journals by simultaneously identifying the topical probability in each period, as well as the top authors and keyphrases. The results indicate that in recent years diversified topics have become more prevalent and convergent topics have become more clearly represented. The results of our analysis implies that overtime the field of bioinformatics becomes more interdisciplinary where there is a steady increase in peripheral fields such as conceptual, mathematical, and system biology. These results are

  7. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    Full Text Available Abstract Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1 the platforms on which the applications run are heterogeneous, 2 their web interface is not machine-friendly, 3 they use a non-standard format for data input and output, 4 they do not exploit standards to define application interface and message exchange, and 5 existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates

  8. Warehouse design and product assignment and allocation: A mathematical programming model

    OpenAIRE

    Geraldes, Carla A. S.; Carvalho, Maria Sameiro; Pereira, Guilherme

    2012-01-01

    Warehouses can be considered one of the most important nodes in supply chains. The dynamic nature of today's markets compels organizations to an incessant reassessment in an effort to respond to continuous challenges. Therefore warehouses must be continually re-evaluated to ensure that they are consistent with both market's demands and management's strategies. In this paper we discuss a mathematical programming model aiming to support product assignment and allocation to the functional areas ...

  9. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  10. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat.

    Science.gov (United States)

    Babbitt, Patricia C; Bagos, Pantelis G; Bairoch, Amos; Bateman, Alex; Chatonnet, Arnaud; Chen, Mark Jinan; Craik, David J; Finn, Robert D; Gloriam, David; Haft, Daniel H; Henrissat, Bernard; Holliday, Gemma L; Isberg, Vignir; Kaas, Quentin; Landsman, David; Lenfant, Nicolas; Manning, Gerard; Nagano, Nozomi; Srinivasan, Narayanaswamy; O'Donovan, Claire; Pruitt, Kim D; Sowdhamini, Ramanathan; Rawlings, Neil D; Saier, Milton H; Sharman, Joanna L; Spedding, Michael; Tsirigos, Konstantinos D; Vastermark, Ake; Vriend, Gerrit

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.

  11. Relational databases

    CERN Document Server

    Bell, D A

    1986-01-01

    Relational Databases explores the major advances in relational databases and provides a balanced analysis of the state of the art in relational databases. Topics covered include capture and analysis of data placement requirements; distributed relational database systems; data dependency manipulation in database schemata; and relational database support for computer graphics and computer aided design. This book is divided into three sections and begins with an overview of the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration. The

  12. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable.. The analysis of the datasets of Next Generation Sequencing needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Also Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the  6th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, 28-30th March, 2012 which brought together interdisciplinary scientists that have a strong background in the biological and computational sciences.

  13. Logistics Cost Calculation of Implementation Warehouse Management System: A Case Study

    Directory of Open Access Journals (Sweden)

    Kučera Tomáš

    2017-01-01

    Full Text Available Warehouse management system can take full advantage of the resources and provide efficient warehousing services. The paper aims to show advantages and disadvantages of the warehouse management system in a chosen enterprise, which is focused on logistics services and transportation. The paper can bring new innovative approach for warehousing and presents how logistics enterprise can reduce logistics costs. This approach includes cost reduction of the establishment, operation and savings in the overall assessment of the implementation of the warehouse management system. The innovative warehouse management system will be demonstrated as the case study, which is classified as a qualitative scientific method, in the chosen logistics enterprise. The paper is based on the research of the world literature, analyses of the internal logistics processes, data and finally enterprise documents. The paper discovers costs related to personnel costs, handling equipment costs and costs for material identification. Implementation of the warehouse management system will reduce overall logistics costs of warehousing and extend the warehouse management system to other parts of the logistics chain.

  14. A Novel Optimization Method on Logistics Operation for Warehouse & Port Enterprises Based on Game Theory

    Directory of Open Access Journals (Sweden)

    Junyang Li

    2013-09-01

    Full Text Available Purpose: The following investigation aims to deal with the competitive relationship among different warehouses & ports in the same company. Design/methodology/approach: In this paper, Game Theory is used in carrying out the optimization model. Genetic Algorithm is used to solve the model. Findings: Unnecessary competition will rise up if there is little internal communication among different warehouses & ports in one company. This paper carries out a novel optimization method on warehouse & port logistics operation model. Originality/value: Warehouse logistics business is a combination of warehousing services and terminal services which is provided by port logistics through the existing port infrastructure on the basis of a port. The newly proposed method can help to optimize logistics operation model for warehouse & port enterprises effectively. We set Sinotrans Guangdong Company as an example to illustrate the newly proposed method. Finally, according to the case study, this paper gives some responses and suggestions on logistics operation in Sinotrans Guangdong warehouse & port for its future development.

  15. Financing agribusiness: Insurance coverage as protection against credit risk of warehouse receipt collateral

    Directory of Open Access Journals (Sweden)

    Jovičić Daliborka

    2017-01-01

    Full Text Available Financing agribusiness by warehouse receipts allows the agricultural producers to obtain working capital on the basis of agricultural products stored in licensed warehouses, as collateral. The implementation of the system of licensed warehouses and issuance of warehouse receipts as collateral for obtaining a bank loan is supported by the European Bank for Reconstruction and Development and it has had positive results in the neighbouring countries. The precondition for financing this project was to establish a Compensation Fund for providing insurance coverage for licensed warehouses against professional liability. However, in the lack of an adequate legal framework, the operational risk is possible to occur. Bearing in mind that Serbia has a tradition in insurance industry and a number of operating insurance companies, the issue is that of the economic benefit and the method of insuring against this risk. The paper will present a detailed analysis of the operation of the Fund, capital requirement, solvency margin and a critical review of the Law on Public Warehouses which regulates the rights and obligations of the Compensation Fund in the case of loss occurrence.

  16. OpenHelix: bioinformatics education outside of a different box.

    Science.gov (United States)

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.

  17. Biofuel Database

    Science.gov (United States)

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  18. Community Database

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This excel spreadsheet is the result of merging at the port level of several of the in-house fisheries databases in combination with other demographic databases such...

  19. Database Administrator

    Science.gov (United States)

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  20. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given on how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  1. Desain Sistem Semantic Data Warehouse dengan Metode Ontology dan Rule Based untuk Mengolah Data Akademik Universitas XYZ di Bali

    Directory of Open Access Journals (Sweden)

    Made Pradnyana Ambara

    2016-06-01

    Full Text Available Data warehouse pada umumnya yang sering dikenal data warehouse tradisional mempunyai beberapa kelemahan yang mengakibatkan kualitas data yang dihasilkan tidak spesifik dan efektif. Sistem semantic data warehouse merupakan solusi untuk menangani permasalahan pada data warehouse tradisional dengan kelebihan antara lain: manajeman kualitas data yang spesifik dengan format data seragam untuk mendukung laporan OLAP yang baik, dan performance pencarian informasi yang lebih efektif dengan kata kunci bahasa alami. Pemodelan sistem semantic data warehouse menggunakan metode ontology menghasilkan model resource description framework schema (RDFS logic yang akan ditransformasikan menjadi snowflake schema. Laporan akademik yang dibutuhkan dihasilkan melalui metode nine step Kimball dan pencarian semantic menggunakan metode rule based. Pengujian dilakukan menggunakan dua metode uji yaitu pengujian dengan black box testing dan angket kuesioner cheklist. Dari hasil penelitian ini dapat disimpulkan bahwa sistem semantic data warehouse dapat membantu proses pengolahan data akademik yang menghasilkan laporan yang berkualitas untuk mendukung proses pengambilan keputusan.

  2. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  3. Progress and challenges in bioinformatics approaches for enhancer identification

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2017-02-03

    Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration.

  4. Accurate Prediction of Coronary Artery Disease Using Bioinformatics Algorithms

    Directory of Open Access Journals (Sweden)

    Hajar Shafiee

    2016-06-01

    Full Text Available Background and Objectives: Cardiovascular disease is one of the main causes of death in developed and Third World countries. According to the statement of the World Health Organization, it is predicted that death due to heart disease will rise to 23 million by 2030. According to the latest statistics reported by Iran’s Minister of health, 3.39% of all deaths are attributed to cardiovascular diseases and 19.5% are related to myocardial infarction. The aim of this study was to predict coronary artery disease using data mining algorithms. Methods: In this study, various bioinformatics algorithms, such as decision trees, neural networks, support vector machines, clustering, etc., were used to predict coronary heart disease. The data used in this study was taken from several valid databases (including 14 data. Results: In this research, data mining techniques can be effectively used to diagnose different diseases, including coronary artery disease. Also, for the first time, a prediction system based on support vector machine with the best possible accuracy was introduced. Conclusion: The results showed that among the features, thallium scan variable is the most important feature in the diagnosis of heart disease. Designation of machine prediction models, such as support vector machine learning algorithm can differentiate between sick and healthy individuals with 100% accuracy.

  5. Progress and challenges in bioinformatics approaches for enhancer identification

    KAUST Repository

    Kleftogiannis, Dimitrios A.; Kalnis, Panos; Bajic, Vladimir B.

    2017-01-01

    Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration.

  6. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1 and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA, two integrin beta (ITGB, and one syndecan (SDC. Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  7. DW4TR: A Data Warehouse for Translational Research.

    Science.gov (United States)

    Hu, Hai; Correll, Mick; Kvecher, Leonid; Osmond, Michelle; Clark, Jim; Bekhash, Anthony; Schwab, Gwendolyn; Gao, De; Gao, Jun; Kubatin, Vladimir; Shriver, Craig D; Hooke, Jeffrey A; Maxwell, Larry G; Kovatich, Albert J; Sheldon, Jonathan G; Liebman, Michael N; Mural, Richard J

    2011-12-01

    The linkage between the clinical and laboratory research domains is a key issue in translational research. Integration of clinicopathologic data alone is a major task given the number of data elements involved. For a translational research environment, it is critical to make these data usable at the point-of-need. Individual systems have been developed to meet the needs of particular projects though the need for a generalizable system has been recognized. Increased use of Electronic Medical Record data in translational research will demand generalizing the system for integrating clinical data to support the study of a broad range of human diseases. To ultimately satisfy these needs, we have developed a system to support multiple translational research projects. This system, the Data Warehouse for Translational Research (DW4TR), is based on a light-weight, patient-centric modularly-structured clinical data model and a specimen-centric molecular data model. The temporal relationships of the data are also part of the model. The data are accessed through an interface composed of an Aggregated Biomedical-Information Browser (ABB) and an Individual Subject Information Viewer (ISIV) which target general users. The system was developed to support a breast cancer translational research program and has been extended to support a gynecological disease program. Further extensions of the DW4TR are underway. We believe that the DW4TR will play an important role in translational research across multiple disease types. Copyright © 2011 Elsevier Inc. All rights reserved.

  8. Bioinformatic Analysis of Strawberry GSTF12 Gene

    Science.gov (United States)

    Wang, Xiran; Jiang, Leiyu; Tang, Haoru

    2018-01-01

    GSTF12 has always been known as a key factor of proanthocyanins accumulate in plant testa. Through bioinformatics analysis of the nucleotide and encoded protein sequence of GSTF12, it is more advantageous to the study of genes related to anthocyanin biosynthesis accumulation pathway. Therefore, we chosen GSTF12 gene of 11 kinds species, downloaded their nucleotide and protein sequence from NCBI as the research object, found strawberry GSTF12 gene via bioinformation analyse, constructed phylogenetic tree. At the same time, we analysed the strawberry GSTF12 gene of physical and chemical properties and its protein structure and so on. The phylogenetic tree showed that Strawberry and petunia were closest relative. By the protein prediction, we found that the protein owed one proper signal peptide without obvious transmembrane regions.

  9. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of softwares already exist for analyzing NGS data. These tools can be fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow.

  10. Combining multiple decisions: applications to bioinformatics

    International Nuclear Information System (INIS)

    Yukinawa, N; Ishii, S; Takenouchi, T; Oba, S

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods

  11. Data mining in bioinformatics using Weka.

    Science.gov (United States)

    Frank, Eibe; Hall, Mark; Trigg, Len; Holmes, Geoffrey; Witten, Ian H

    2004-10-12

    The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection-common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods complemented by graphical user interfaces for data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it. http://www.cs.waikato.ac.nz/ml/weka.

  12. Bioinformatic and Biometric Methods in Plant Morphology

    Directory of Open Access Journals (Sweden)

    Surangi W. Punyasena

    2014-08-01

    Full Text Available Recent advances in microscopy, imaging, and data analyses have permitted both the greater application of quantitative methods and the collection of large data sets that can be used to investigate plant morphology. This special issue, the first for Applications in Plant Sciences, presents a collection of papers highlighting recent methods in the quantitative study of plant form. These emerging biometric and bioinformatic approaches to plant sciences are critical for better understanding how morphology relates to ecology, physiology, genotype, and evolutionary and phylogenetic history. From microscopic pollen grains and charcoal particles, to macroscopic leaves and whole root systems, the methods presented include automated classification and identification, geometric morphometrics, and skeleton networks, as well as tests of the limits of human assessment. All demonstrate a clear need for these computational and morphometric approaches in order to increase the consistency, objectivity, and throughput of plant morphological studies.

  13. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES 27, 28 February 1, 2, 3 March 2006 from 11:00 to 12:00 - Auditorium, bldg. 500 Decoding the Genome A special series of 5 lectures on: Recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  14. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  15. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  16. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    Bioinformatics is an interdisciplinary subject, which uses computer application, statistics, mathematics and engineering for the analysis and management of biological information. It has become an important tool for basic and applied research in veterinary sciences. Bioinformatics has brought about advancements into ...

  17. Recent developments in life sciences research: Role of bioinformatics

    African Journals Online (AJOL)

    Life sciences research and development has opened up new challenges and opportunities for bioinformatics. The contribution of bioinformatics advances made possible the mapping of the entire human genome and genomes of many other organisms in just over a decade. These discoveries, along with current efforts to ...

  18. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  19. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  20. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is lack of bioinformatics training in the universities, expect for short training courses in which ...

  1. Current status and future perspectives of bioinformatics in Tanzania ...

    African Journals Online (AJOL)

    The main bottleneck in advancing genomics in present times is the lack of expertise in using bioinformatics tools and approaches for data mining in raw DNA sequences generated by modern high throughput technologies such as next generation sequencing. Although bioinformatics has been making major progress and ...

  2. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  3. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  4. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  5. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  6. The development and application of bioinformatics core competencies to improve bioinformatics training and education.

    Science.gov (United States)

    Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie

    2018-02-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.

  7. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  8. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  9. Host-parasite interactions and ecology of the malaria parasite-a bioinformatics approach.

    Science.gov (United States)

    Izak, Dariusz; Klim, Joanna; Kaczanowski, Szymon

    2018-04-25

    Malaria remains one of the highest mortality infectious diseases. Malaria is caused by parasites from the genus Plasmodium. Most deaths are caused by infections involving Plasmodium falciparum, which has a complex life cycle. Malaria parasites are extremely well adapted for interactions with their host and their host's immune system and are able to suppress the human immune system, erase immunological memory and rapidly alter exposed antigens. Owing to this rapid evolution, parasites develop drug resistance and express novel forms of antigenic proteins that are not recognized by the host immune system. There is an emerging need for novel interventions, including novel drugs and vaccines. Designing novel therapies requires knowledge about host-parasite interactions, which is still limited. However, significant progress has recently been achieved in this field through the application of bioinformatics analysis of parasite genome sequences. In this review, we describe the main achievements in 'malarial' bioinformatics and provide examples of successful applications of protein sequence analysis. These examples include the prediction of protein functions based on homology and the prediction of protein surface localization via domain and motif analysis. Additionally, we describe PlasmoDB, a database that stores accumulated experimental data. This tool allows data mining of the stored information and will play an important role in the development of malaria science. Finally, we illustrate the application of bioinformatics in the development of population genetics research on malaria parasites, an approach referred to as reverse ecology.

  10. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Full Text Available Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR based tests for pathogenic E. coli strains. A variety of computer tools are used including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, thus are appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  11. Design Criteria in Revitalizing Old Warehouse District on the Kalimas Riverbank Area of Surabaya City

    Directory of Open Access Journals (Sweden)

    Endang Titi Sunarti Darjosanjoto

    2015-09-01

    Full Text Available Neglected warehouse buildings along the Kalimas River have created a poor urban façade in terms of visual quality. However the city government is planning to encourage tourism activities that take advantage of Kalimas River and its surrounding environment. If there is no good plan in accordance with the concept of local identity for old city of Surabaya, it will reduce it as a tourist attraction. In reference to the issue above, design criteria needs to be compiled for revitalizing the old warehouse district, which is expected to revive the identity of this district and be able to support the city’s tourism. This study was conducted by recording field observations, and the data was analyzed using the character appraisal method. The character appraisal analysis method is presented in the form of street picture data, which is divided into determined segments. The results show that there are five components including place attachment, sustainable urban design, green open space design, ecological riverfront design, and activity support that should be considered in the revitalization of the warehouse district. Those components are divided into two parts: building and open space at the riverbank. There are 13 design criteria for building at the riverbank, while there are 14 design criteria for open space at the riverbank. These design criteria can enrich the warehouse district’s revitalization by improving the visual quality of the urban environment.Keywords: design criteria; warehouse district; riverbank; Surabaya; revitalization.

  12. Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution.

    Science.gov (United States)

    Sebaa, Abderrazak; Chikh, Fatima; Nouicer, Amina; Tari, AbdelKamel

    2018-02-19

    The huge increases in medical devices and clinical applications which generate enormous data have raised a big issue in managing, processing, and mining this massive amount of data. Indeed, traditional data warehousing frameworks can not be effective when managing the volume, variety, and velocity of current medical applications. As a result, several data warehouses face many issues over medical data and many challenges need to be addressed. New solutions have emerged and Hadoop is one of the best examples, it can be used to process these streams of medical data. However, without an efficient system design and architecture, these performances will not be significant and valuable for medical managers. In this paper, we provide a short review of the literature about research issues of traditional data warehouses and we present some important Hadoop-based data warehouses. In addition, a Hadoop-based architecture and a conceptual data model for designing medical Big Data warehouse are given. In our case study, we provide implementation detail of big data warehouse based on the proposed architecture and data model in the Apache Hadoop platform to ensure an optimal allocation of health resources.

  13. Design of data warehouse in teaching state based on OLAP and data mining

    Science.gov (United States)

    Zhou, Lijuan; Wu, Minhua; Li, Shuang

    2009-04-01

    The data warehouse and the data mining technology is one of information technology research hot topics. At present the data warehouse and the data mining technology in aspects and so on commercial, financial industry as well as enterprise's production, market marketing obtained the widespread application, but is relatively less in educational fields' application. Over the years, the teaching and management have been accumulating large amounts of data in colleges and universities, while the data can not be effectively used, in the light of social needs of the university development and the current status of data management, the establishment of data warehouse in university state, the better use of existing data, and on the basis dealing with a higher level of disposal --data mining are particularly important. In this paper, starting from the decision-making needs design data warehouse structure of university teaching state, and then through the design structure and data extraction, loading, conversion create a data warehouse model, finally make use of association rule mining algorithm for data mining, to get effective results applied in practice. Based on the data analysis and mining, get a lot of valuable information, which can be used to guide teaching management, thereby improving the quality of teaching and promoting teaching devotion in universities and enhancing teaching infrastructure. At the same time it can provide detailed, multi-dimensional information for universities assessment and higher education research.

  14. Ontology based heterogeneous materials database integration and semantic query

    Science.gov (United States)

    Zhao, Shuai; Qian, Quan

    2017-10-01

    Materials digital data, high throughput experiments and high throughput computations are regarded as three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data is very urgent, that has gradually become a hot topic of materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply in semantic level when adopting the conventional heterogeneous database integration approaches such as federal database or data warehouse. In this paper, a semantic integration method is proposed to create the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated to the ontology by means of relational algebra and the rooted graph. Based on integrated ontology, semantic query can be done using SPARQL. During the experiments, two world famous First Principle Computational databases, OQMD and Materials Project are used as the integration targets, which show the availability and effectiveness of our method.

  15. Database and applications security integrating information security and data management

    CERN Document Server

    Thuraisingham, Bhavani

    2005-01-01

    This is the first book to provide an in-depth coverage of all the developments, issues and challenges in secure databases and applications. It provides directions for data and application security, including securing emerging applications such as bioinformatics, stream information processing and peer-to-peer computing. Divided into eight sections, each of which focuses on a key concept of secure databases and applications, this book deals with all aspects of technology, including secure relational databases, inference problems, secure object databases, secure distributed databases and emerging

  16. Federal databases

    International Nuclear Information System (INIS)

    Welch, M.J.; Welles, B.W.

    1988-01-01

    Accident statistics on all modes of transportation are available as risk assessment analytical tools through several federal agencies. This paper reports on the examination of the accident databases by personal contact with the federal staff responsible for administration of the database programs. This activity, sponsored by the Department of Energy through Sandia National Laboratories, is an overview of the national accident data on highway, rail, air, and marine shipping. For each mode, the definition or reporting requirements of an accident are determined and the method of entering the accident data into the database is established. Availability of the database to others, ease of access, costs, and who to contact were prime questions to each of the database program managers. Additionally, how the agency uses the accident data was of major interest

  17. Bioinformatics research in the Asia Pacific: a 2007 update.

    Science.gov (United States)

    Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee

    2008-01-01

    We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 at Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides a scientific meeting at Hong Kong, satellite events organized are a pre-conference training workshop at Hanoi, Vietnam and a post-conference workshop at Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region, to global bioinformatics endeavours.

  18. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  19. A geodata warehouse: Using denormalisation techniques as a tool for delivering spatially enabled integrated geological information to geologists

    Science.gov (United States)

    Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham

    2016-11-01

    New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and delivery tools to deliver this. At the BGS, relational database management systems (RDBMS) has facilitated effective data management using normalised subject-based database designs with business rules in a centralised, vocabulary controlled, architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery as they struggled to resolve the complex underlying normalised structures providing poor data access speeds. Users developed bespoke access tools to structures they did not fully understand sometimes delivering them incorrect results. Therefore, BGS has developed PropBase, a generic denormalised data structure within an RDBMS to store property data, to facilitate rapid and standardised data discovery and access, incorporating 2D and 3D physical and chemical property data, with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture includes, an optimised query object, to deliver geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance to deliver data in multiple standardised formats using a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources facilitating searching of related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases incorporating over 50 property data types with a vocabulary set that includes 557 property terms. By enabling property data searches across multiple databases PropBase has facilitated new scientific research, previously

  20. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    OpenAIRE

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; Oven, Mannis; Wallace, D.C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J.; Gai, Xiaowu

    2016-01-01

    textabstractMSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR ...

  1. Scale out databases for CERN use cases

    International Nuclear Information System (INIS)

    Baranowski, Zbigniew; Grzybek, Maciej; Canali, Luca; Garcia, Daniel Lanza; Surdy, Kacper

    2015-01-01

    Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log database. (paper)

  2. Some Considerations about Modern Database Machines

    Directory of Open Access Journals (Sweden)

    Manole VELICANU

    2010-01-01

    Full Text Available Optimizing the two computing resources of any computing system - time and space - has al-ways been one of the priority objectives of any database. A current and effective solution in this respect is the computer database. Optimizing computer applications by means of database machines has been a steady preoccupation of researchers since the late seventies. Several information technologies have revolutionized the present information framework. Out of these, those which have brought a major contribution to the optimization of the databases are: efficient handling of large volumes of data (Data Warehouse, Data Mining, OLAP – On Line Analytical Processing, the improvement of DBMS – Database Management Systems facilities through the integration of the new technologies, the dramatic increase in computing power and the efficient use of it (computer networks, massive parallel computing, Grid Computing and so on. All these information technologies, and others, have favored the resumption of the research on database machines and the obtaining in the last few years of some very good practical results, as far as the optimization of the computing resources is concerned.

  3. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

    Science.gov (United States)

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  4. Order Picking Process in Warehouse: Case Study of Dairy Industry in Croatia

    Directory of Open Access Journals (Sweden)

    Josip Habazin

    2017-02-01

    Full Text Available The proper functioning of warehouse processes is fundamental for operational improvement and overall logistic supply chain improvement. Order picking is considered one of the most important from the group. Throughout picking orders in warehouses, the presence of human work is highly reflected, with the main goal to reduce the process time as much as possible, that is, to the very minimum. There are several different order picking methods, and nowadays, the most common ones are being developed and are significantly dependent on the type of goods, the warehouse equipment, etc., and those that stand out are scanning and picking by voice. This paper will provide information regarding the dairy industry in the Republic of Croatia with the analysis of order picking process in the observed company. Overall research highlighted the problem and resulted in proposals of solutions.

  5. Data Delivery and Mapping Over the Web: National Water-Quality Assessment Data Warehouse

    Science.gov (United States)

    Bell, Richard W.; Williamson, Alex K.

    2006-01-01

    The U.S. Geological Survey began its National Water-Quality Assessment (NAWQA) Program in 1991, systematically collecting chemical, biological, and physical water-quality data from study units (basins) across the Nation. In 1999, the NAWQA Program developed a data warehouse to better facilitate national and regional analysis of data from 36 study units started in 1991 and 1994. Data from 15 study units started in 1997 were added to the warehouse in 2001. The warehouse currently contains and links the following data: -- Chemical concentrations in water, sediment, and aquatic-organism tissues and related quality-control data from the USGS National Water Information System (NWIS), -- Biological data for stream-habitat and ecological-community data on fish, algae, and benthic invertebrates, -- Site, well, and basin information associated with thousands of descriptive variables derived from spatial analysis, like land use, soil, and population density, and -- Daily streamflow and temperature information from NWIS for selected sampling sites.

  6. Roadmap to a Comprehensive Clinical Data Warehouse for Precision Medicine Applications in Oncology.

    Science.gov (United States)

    Foran, David J; Chen, Wenjin; Chu, Huiqi; Sadimin, Evita; Loh, Doreen; Riedlinger, Gregory; Goodell, Lauri A; Ganesan, Shridar; Hirshfield, Kim; Rodriguez, Lorna; DiPaola, Robert S

    2017-01-01

    Leading institutions throughout the country have established Precision Medicine programs to support personalized treatment of patients. A cornerstone for these programs is the establishment of enterprise-wide Clinical Data Warehouses. Working shoulder-to-shoulder, a team of physicians, systems biologists, engineers, and scientists at Rutgers Cancer Institute of New Jersey have designed, developed, and implemented the Warehouse with information originating from data sources, including Electronic Medical Records, Clinical Trial Management Systems, Tumor Registries, Biospecimen Repositories, Radiology and Pathology archives, and Next Generation Sequencing services. Innovative solutions were implemented to detect and extract unstructured clinical information that was embedded in paper/text documents, including synoptic pathology reports. Supporting important precision medicine use cases, the growing Warehouse enables physicians to systematically mine and review the molecular, genomic, image-based, and correlated clinical information of patient tumors individually or as part of large cohorts to identify changes and patterns that may influence treatment decisions and potential outcomes.

  7. From capturing nursing knowledge to retrieval of data from a data warehouse.

    Science.gov (United States)

    Thoroddsen, Asta; Guðjónsdóttir, Hanna K; Guðjónsdóttir, Elisabet

    2014-01-01

    The purpose of the project was to capture nursing data and knowledge, represent it for use and re-use by retrieval from a data warehouse, which contains both clinical and financial hospital data. Today nurses at LUH use standardized nursing terminologies to document information related to patients and the nursing care in the EHR at all times. Pre-defined order sets for nursing care have been developed using best practice where available and tacit nursing knowledge has been captured and coded with standardized nursing terminologies and made explicit for dissemination in the EHR. All patient-nursing data is permanently stored in a data repository. Core nursing data elements have been selected for transfer and storage in the data warehouse and patient-nursing data are now captured, stored, can be related to other data elements from the warehouse and be retrieved for use and re-use.

  8. Clinical Data Warehouse: An Effective Tool to Create Intelligence in Disease Management.

    Science.gov (United States)

    Karami, Mahtab; Rahimi, Azin; Shahmirzadi, Ali Hosseini

    Clinical business intelligence tools such as clinical data warehouse enable health care organizations to objectively assess the disease management programs that affect the quality of patients' life and well-being in public. The purpose of these programs is to reduce disease occurrence, improve patient care, and decrease health care costs. Therefore, applying clinical data warehouse can be effective in generating useful information about aspects of patient care to facilitate budgeting, planning, research, process improvement, external reporting, benchmarking, and trend analysis, as well as to enable the decisions needed to prevent the progression or appearance of the illness aligning with maintaining the health of the population. The aim of this review article is to describe the benefits of clinical data warehouse applications in creating intelligence for disease management programs.

  9. Ontology-Based Big Dimension Modeling in Data Warehouse Schema Design

    DEFF Research Database (Denmark)

    Iftikhar, Nadeem

    2013-01-01

    During data warehouse schema design, designers often encounter how to model big dimensions that typically contain a large number of attributes and records. To investigate effective approaches for modeling big dimensions is necessary in order to achieve better query performance, with respect...... partitioning, vertical partitioning and their hybrid. We formalize the design methods and propose an algorithm that describes the modeling process from an OWL ontology to a data warehouse schema. In addition, this paper also presents an effective ontology-based tool to automate the modeling process. The tool...... can automatically generate the data warehouse schema from the ontology of describing the terms and business semantics for the big dimension. In case of any change in the requirements, we only need to modify the ontology, and re-generate the schema using the tool. This paper also evaluates the proposed...

  10. The Impact of E-Commerce Development on the Warehouse Space Market in Poland

    Directory of Open Access Journals (Sweden)

    Dembińska Izabela

    2016-12-01

    Full Text Available The subject of discussion in the article is the impact of e-commerce sector on the warehouse space market. On the basis of available reports, the development of e-commerce has been characterized in Poland, showing the dynamics and the type of change. The needs of e-commerce sector in the field of logistics, in particular in the area of storage, have been presented in the paper. These needs have been characterized and at the same time, how representatives of the warehouse space market are prepared to support companies in the e-commerce sector is also discussed. The considerations are illustrated by the changes that occur as a result of the development of e-commerce on the warehouse space market in Poland.

  11. Real Time Business Analytics for Buying or Selling Transaction on Commodity Warehouse Receipt System

    Science.gov (United States)

    Djatna, Taufik; Teniwut, Wellem A.; Hairiyah, Nina; Marimin

    2017-10-01

    The requirement for smooth information such as buying and selling is essential for commodity warehouse receipt system such as dried seaweed and their stakeholders to transact for an operational transaction. Transactions of buying or selling a commodity warehouse receipt system are a risky process due to the fluctuations in dynamic commodity prices. An integrated system to determine the condition of the real time was needed to make a decision-making transaction by the owner or prospective buyer. The primary motivation of this study is to propose computational methods to trace market tendency for either buying or selling processes. The empirical results reveal that feature selection gain ratio and k-NN outperforms other forecasting models, implying that the proposed approach is a promising alternative to the stock market tendency of warehouse receipt document exploration with accurate level rate is 95.03%.

  12. To the question about the layout of the racks in the warehouse

    Directory of Open Access Journals (Sweden)

    Ilesaliev D.I.

    2017-03-01

    Full Text Available Warehouses, which are located at points of transshipment of cargo from one type of transport on the other, play a sig-nificant role in the transformation of cargo to further the most effective transportation of goods. The location of racks and longitudinal passages are important in the work of transhipment warehouse. Typically, racks and longitudinal pas-sages are perpendicular to each other, the article proposes a radical change with the "euclidean advantage". This is an-other way of designing warehouses for efficiency overload packaged cargo in the supply chain. Purpose is to reduce the mileage for one cycle of the loader from loading and unloading areas to storage areas.

  13. Bioinformatics in cancer therapy and drug design

    International Nuclear Information System (INIS)

    Horbach, D.Y.; Usanov, S.A.

    2005-01-01

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  14. Bioinformatics in cancer therapy and drug design

    Energy Technology Data Exchange (ETDEWEB)

    Horbach, D Y [International A. Sakharov environmental univ., Minsk (Belarus); Usanov, S A [Inst. of bioorganic chemistry, National academy of sciences of Belarus, Minsk (Belarus)

    2005-05-15

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  15. Bioinformatics study of the mangrove actin genes

    Science.gov (United States)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes the bioinformatics methods to analyze eight actin genes from mangrove plants on DDBJ/EMBL/GenBank as well as predicted the structure, composition, subcellular localization, similarity, and phylogenetic. The physical and chemical properties of eight mangroves showed variation among the genes. The percentage of the secondary structure of eight mangrove actin genes followed the order of a helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast to this observation, the remaining actin genes were random coil > extended chain structure > a helix. This study, therefore, shown the prediction of secondary structure was performed for necessary structural information. The values of chloroplast or signal peptide or mitochondrial target were too small, indicated that no chloroplast or mitochondrial transit peptide or signal peptide of secretion pathway in mangrove actin genes. These results suggested the importance of understanding the diversity and functional of properties of the different amino acids in mangrove actin genes. To clarify the relationship among the mangrove actin gene, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed, the first group contains B. gymnorrhiza BgAct and R. stylosa RsActl. The second cluster which consists of 5 actin genes the largest group, and the last branch consist of one gene, B. sexagula Act. The present study, therefore, supported the previous results that plant actin genes form distinct clusters in the tree.

  16. Parallel evolutionary computation in bioinformatics applications.

    Science.gov (United States)

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle its inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of its efficient execution on a wide range of parallel architectures. The proposed approach focuses on the easiness of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows the user to easily configure its environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  17. Heuristics for multi-item two-echelon spare parts inventory control problem with batch ordering in the central warehouse

    NARCIS (Netherlands)

    Topan, E.; Bayindir, Z.P.; Tan, T.

    2010-01-01

    We consider a multi-item two-echelon inventory system in which the central warehouse operates under a (Q;R) policy, and each local warehouse implements (S ¡ 1; S) policy. The objective is to find the policy parameters minimizing expected system-wide inventory holding and fixed ordering costs subject

  18. Quantitative performance of E-Scribe warehouse in detecting quality issues with digital annotated ECG data from healthy subjects.

    Science.gov (United States)

    Sarapa, Nenad; Mortara, Justin L; Brown, Barry D; Isola, Lamberto; Badilini, Fabio

    2008-05-01

    The US Food and Drug Administration recommends submission of digital electrocardiograms in the standard HL7 XML format into the electrocardiogram warehouse to support preapproval review of new drug applications. The Food and Drug Administration scrutinizes electrocardiogram quality by viewing the annotated waveforms and scoring electrocardiogram quality by the warehouse algorithms. Part of the Food and Drug Administration warehouse is commercially available to sponsors as the E-Scribe Warehouse. The authors tested the performance of E-Scribe Warehouse algorithms by quantifying electrocardiogram acquisition quality, adherence to QT annotation protocol, and T-wave signal strength in 2 data sets: "reference" (104 digital electrocardiograms from a phase I study with sotalol in 26 healthy subjects with QT annotations by computer-assisted manual adjustment) and "test" (the same electrocardiograms with an intentionally introduced predefined number of quality issues). The E-Scribe Warehouse correctly detected differences between the 2 sets expected from the number and pattern of errors in the "test" set (except for 1 subject with QT misannotated in different leads of serial electrocardiograms) and confirmed the absence of differences where none was expected. E-Scribe Warehouse scores below the threshold value identified individual electrocardiograms with questionable T-wave signal strength. The E-Scribe Warehouse showed satisfactory performance in detecting electrocardiogram quality issues that may impair reliability of QTc assessment in clinical trials in healthy subjects.

  19. Creating databases for biological information: an introduction.

    Science.gov (United States)

    Stein, Lincoln

    2013-06-01

    The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by JohnWiley & Sons, Inc.

  20. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine and generate new hypothesis to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, resulting researchers to access multiple applications and manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in public domain including model organism-specific databases to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with point-and-click interface containing links for support information including user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.