WorldWideScience

Sample records for bioinformatics database warehouse

  1. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
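
The application example in this abstract (counting enzyme activities that have no sequence in any loaded database) can be pictured as a single cross-source SQL query. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 rather than MySQL or Oracle, and the table names `enzyme_activity` and `protein`, their columns, and the rows are hypothetical toy values, not the BioWarehouse schema or its data.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein (
            id        INTEGER PRIMARY KEY,
            ec_number TEXT,   -- activity the protein is annotated with
            source_db TEXT,   -- which loaded database the row came from
            sequence  TEXT    -- NULL when no sequence is known
        );
        """
    )
    # Toy rows only: placeholder EC labels, not real enzyme data.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("EC-A",), ("EC-B",), ("EC-C",), ("EC-D",)],
    )
    conn.executemany(
        "INSERT INTO protein (ec_number, source_db, sequence) VALUES (?, ?, ?)",
        [
            ("EC-A", "UniProt", "MKT"),
            ("EC-B", "GenBank", "MSA"),
            ("EC-C", "UniProt", None),  # annotated, but no sequence
        ],
    )
    return conn


def fraction_without_sequence(conn):
    """Fraction of enzyme activities with no sequence in any source."""
    total = conn.execute("SELECT COUNT(*) FROM enzyme_activity").fetchone()[0]
    missing = conn.execute(
        """
        SELECT COUNT(*)
        FROM enzyme_activity e
        WHERE NOT EXISTS (
            SELECT 1 FROM protein p
            WHERE p.ec_number = e.ec_number
              AND p.sequence IS NOT NULL
        )
        """
    ).fetchone()[0]
    return missing / total if total else 0.0


if __name__ == "__main__":
    print(fraction_without_sequence(build_toy_warehouse()))
```

Because all sources sit in one database, the anti-join runs as one query instead of requiring a separate lookup against each source. The fraction printed for the toy rows says nothing about the 36% figure reported in the abstract.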

  2. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the...

  3. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Background: We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description: The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion: The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First...

  4. Database Vs Data Warehouse

    Directory of Open Access Journals (Sweden)

    2007-01-01

    Data warehouse technology includes a set of concepts and methods that offer users useful information for decision making. The need to build a data warehouse arises from the need to improve the quality of information in the organization. Data coming from different sources, in a variety of forms, both structured and unstructured, are filtered according to business rules and integrated into a single large data collection. Using informatics solutions, managers have understood that data stored in operational systems, including databases, are an informational gold mine that must be exploited. Data warehouses have been developed to answer the increasing demand for complex analysis, which could not be properly achieved with operational databases. The present paper emphasizes some of the criteria that information application developers can use to choose between a database solution and a data warehouse solution.

  5. Geminivirus data warehouse: a database enriched with machine learning approaches.

    Science.gov (United States)

    Silva, Jose Cleydson F; Carvalho, Thales F M; Basso, Marcos F; Deguchi, Michihito; Pereira, Welison A; Sobrinho, Roberto R; Vidigal, Pedro M P; Brustolini, Otávio J B; Silva, Fabyano F; Dal-Bianco, Maximiller; Fontes, Renildes L F; Santos, Anésia A; Zerbini, Francisco Murilo; Cerqueira, Fabio R; Fontes, Elizabeth P B

    2017-05-05

    The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.

  6. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and range in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  7. Optimized Database of Higher Education Management Using Data Warehouse

    Directory of Open Access Journals (Sweden)

    Spits Warnars

    2010-04-01

    The emergence of new higher education institutions has created competition in the higher education market, and a data warehouse can be used as an effective technological tool for increasing competitiveness in that market. A data warehouse produces reliable reports for the institution’s high-level management in a short time for faster and better decision making, not only on increasing student admissions, but also on the possibility of finding extraordinary, unconventional funds for the institution. The efficiency comparison was based on the length and number of processed records, total bytes processed, number of tables processed, query run time, and records produced on the OLTP database and the data warehouse. Efficiency percentages were measured by the formula for percentage increase, and the average efficiency percentage of 461.801,04% shows that using a data warehouse is more powerful and efficient than using an OLTP database. The data warehouse was modeled on a hypercube built from the limited set of high-demand reports usually used by high-level management. In every fact and dimension table, fields are inserted that represent the constructive-merge loading, in which the ETL (Extraction, Transformation and Loading) process is run based on the old and new files.

  8. Design of a Multi Dimensional Database for the Archimed DataWarehouse.

    Science.gov (United States)

    Bréant, Claudine; Thurler, Gérald; Borst, François; Geissbuhler, Antoine

    2005-01-01

    The Archimed data warehouse project started in 1993 at the Geneva University Hospital. It has progressively integrated seven data marts (or domains of activity) archiving medical data such as Admission/Discharge/Transfer (ADT) data, laboratory results, radiology exams, diagnoses, and procedure codes. The objective of the Archimed data warehouse is to facilitate access to an integrated and coherent view of patient medical data in order to support analytical activities such as medical statistics, clinical studies, retrieval of similar cases and data mining processes. This paper discusses three principal design aspects of the data warehouse's database: 1) the granularity of the database, which refers to the level of detail or summarization of data, 2) the database model and architecture, describing how data will be presented to end users and how new data is integrated, 3) the life cycle of the database, in order to ensure long term scalability of the environment. Both the organization of patient medical data using a standardized elementary fact representation and the use of the multi dimensional model have proved to be powerful design tools for integrating data coming from the multiple heterogeneous database systems that are part of the transactional Hospital Information System (HIS). Concurrently, building the data warehouse incrementally has helped to control the evolution of the data content. These three design aspects bring clarity and performance regarding data access. They also provide long term scalability to the system and resilience to further changes that may occur in source systems feeding the data warehouse.

  9. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  10. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  11. Architectural design of a data warehouse to support operational and analytical queries across disparate clinical databases.

    Science.gov (United States)

    Chelico, John D; Wilcox, Adam; Wajngurt, David

    2007-10-11

    As the clinical data warehouse of the New York Presbyterian Hospital has evolved, innovative methods of integrating new data sources and providing more effective and efficient data reporting and analysis need to be explored. We designed and implemented a new clinical data warehouse architecture to handle the integration of disparate clinical databases in the institution. By examining the way downstream systems are populated and streamlining the way data is stored, we create a virtual clinical data warehouse that is adaptable to the future needs of the organization.

  12. The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases

    OpenAIRE

    Bultet, Lisandra Aguilar; Aguilar Rodriguez, Jose; Ahrens, Christian H; Ahrne, Erik Lennart; Ai, Ni; Aimo, Lucila; Akalin, Altuna; Aleksiev, Tyanko; Alocci, Davide; Altenhoff, Adrian; Alves, Isabel; Ambrosini, Giovanna; Pedone, Pascale Anderle; Angelina, Paolo; Anisimova, Maria

    2016-01-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB'...

  13. Influenza research database: an integrated bioinformatics resource for influenza virus research

    Science.gov (United States)

    The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...

  14. Envirofacts Data Warehouse

    Science.gov (United States)

    The Envirofacts Data Warehouse contains information from select EPA Environmental program office databases and provides access about environmental activities that may affect air, water, and land anywhere in the United States. The Envirofacts Warehouse supports its own web enabled tools as well as a host of other EPA applications.

  15. FaceWarehouse: a 3D facial expression database for visual computing.

    Science.gov (United States)

    Cao, Chen; Weng, Yanlin; Zhou, Shun; Tong, Yiying; Zhou, Kun

    2014-03-01

    We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7-80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour, and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-3 tensor to build a bilinear face model with two attributes: identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions. We demonstrate the potential of FaceWarehouse for visual computing with four applications: facial image manipulation, face component transfer, real-time performance-based facial image animation, and facial animation retargeting from video to image.

  16. Design database for quantitative trait loci (QTL) data warehouse, data mining, and meta-analysis.

    Science.gov (United States)

    Hu, Zhi-Liang; Reecy, James M; Wu, Xiao-Lin

    2012-01-01

    A database can be used to warehouse quantitative trait loci (QTL) data from multiple sources for comparison, genomic data mining, and meta-analysis. A robust database design involves sound data structure logistics, meaningful data transformations, normalization, and proper user interface designs. This chapter starts with a brief review of relational database basics and concentrates on issues associated with curation of QTL data into a relational database, with emphasis on the principles of data normalization and structure optimization. In addition, some simple examples of QTL data mining and meta-analysis are included. These examples are provided to help readers better understand the potential and importance of sound database design.

  17. Metadata to Support Data Warehouse Evolution

    Science.gov (United States)

    Solodovnikova, Darja

    The focus of this chapter is metadata necessary to support data warehouse evolution. We present the data warehouse framework that is able to track evolution process and adapt data warehouse schemata and data extraction, transformation, and loading (ETL) processes. We discuss the significant part of the framework, the metadata repository that stores information about the data warehouse, logical and physical schemata and their versions. We propose the physical implementation of multiversion data warehouse in a relational DBMS. For each modification of a data warehouse schema, we outline the changes that need to be made to the repository metadata and in the database.

  18. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.

  19. Shared Bioinformatics Databases within the Unipro UGENE Platform

    Directory of Open Access Journals (Sweden)

    Protsyuk Ivan V.

    2015-03-01

    Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, use a local data model while processing different types of data. Such an approach is inconvenient for scientists who work cooperatively and rely on the same data: multiple copies of certain files must be made for every workplace and kept synchronized whenever they are modified. Therefore, we focused on bringing collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type supported by UGENE, such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server, so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html.

  20. Envirofacts Data Warehouse

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Envirofacts Data Warehouse contains information from select EPA Environmental program office databases and provides access about environmental activities that...

  1. Microsoft Enterprise Consortium: A Resource for Teaching Data Warehouse, Business Intelligence and Database Management Systems

    Science.gov (United States)

    Kreie, Jennifer; Hashemi, Shohreh

    2012-01-01

    Data is a vital resource for businesses; therefore, it is important for businesses to manage and use their data effectively. Because of this, businesses value college graduates with an understanding of and hands-on experience working with databases, data warehouses and data analysis theories and tools. Faculty in many business disciplines try to…

  2. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  3. Databases and Associated Bioinformatic Tools in Studies of Food Allergens, Epitopes and Haptens – a Review

    Directory of Open Access Journals (Sweden)

    Bucholska Justyna

    2018-06-01

    Allergies and/or food intolerances are a growing problem of the modern world. Difficulties associated with the correct diagnosis of food allergies result in the need to classify the factors causing allergies and allergens themselves. Therefore, internet databases and other bioinformatic tools play a special role in deepening knowledge of biologically-important compounds. Internet repositories, as a source of information on different chemical compounds, including those related to allergy and intolerance, are increasingly being used by scientists. Bioinformatic methods play a significant role in biological and medical sciences, and their importance in food science is increasing. This study aimed at presenting selected databases and tools of bioinformatic analysis useful in research on food allergies, allergens (11 databases), epitopes (7 databases), and haptens (2 databases). It also presents examples of the application of computer methods in studies related to allergies.

  4. TRUNCATULIX--a data warehouse for the legume community.

    Science.gov (United States)

    Henckel, Kolja; Runte, Kai J; Bekel, Thomas; Dondrup, Michael; Jakobi, Tobias; Küster, Helge; Goesmann, Alexander

    2009-02-11

    Databases for either sequence, annotation, or microarray experiments data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database - hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data. The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage. The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.

  5. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  6. TRUNCATULIX – a data warehouse for the legume community

    Directory of Open Access Journals (Sweden)

    Runte Kai J

    2009-02-01

    Background: Databases for either sequence, annotation, or microarray experiments data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database – hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data. Results: The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage. Conclusion: The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.

  7. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
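
    The access-control idea in this abstract can be illustrated with a minimal sketch of how a resource server might check a scoped, expiring access token before serving a database request. This is not the IABio implementation and not a full OAuth 2.0 flow: the `TokenStore` class, the scope names, and the in-memory storage are all hypothetical and chosen only for illustration.

```python
import secrets
import time


class TokenStore:
    """Hypothetical in-memory store of access tokens (illustration only)."""

    def __init__(self):
        self._tokens = {}

    def issue(self, user, scopes, ttl_seconds=3600, now=None):
        # Issue an opaque random token bound to a user, a scope set, and an expiry.
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {
            "user": user,
            "scopes": frozenset(scopes),
            "expires_at": now + ttl_seconds,
        }
        return token

    def is_authorized(self, token, required_scope, now=None):
        # A request is allowed only if the token exists, has not expired,
        # and was granted the scope that the requested operation needs.
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None:
            return False
        return now < entry["expires_at"] and required_scope in entry["scopes"]
```

    In a real deployment the token would be issued by an authorization server after user consent and validated by the database front end; the sketch only shows the final scope-and-expiry check.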

  8. Development of a medical informatics data warehouse.

    Science.gov (United States)

    Wu, Cai

    2006-01-01

    This project built a medical informatics data warehouse (MedInfo DDW) in an Oracle database to analyze medical information which has been collected through Baylor Family Medicine Clinic (FCM) Logician application. The MedInfo DDW used Star Schema with dimensional model, FCM database as operational data store (ODS); the data from on-line transaction processing (OLTP) were extracted and transferred to a knowledge based data warehouse through SQLLoad, and the patient information was analyzed by using on-line analytic processing (OLAP) in Crystal Report.
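
    As a rough illustration of the star schema and OLAP-style analysis mentioned above, the following sketch builds one fact table with two dimension tables and runs an aggregate query. It uses SQLite rather than Oracle, and every table name, column name, and data value is a hypothetical stand-in, not the MedInfo DDW design.

```python
import sqlite3


def build_star_schema(conn):
    # One fact table surrounded by dimension tables: the "star" layout.
    conn.executescript("""
        CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, age_group TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_visit (
            patient_key INTEGER REFERENCES dim_patient(patient_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            visit_count INTEGER
        );
    """)


def load_sample(conn):
    # Invented sample rows standing in for data extracted from an OLTP source.
    conn.executemany("INSERT INTO dim_patient VALUES (?, ?)",
                     [(1, "0-17"), (2, "18-64"), (3, "18-64")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(20060101, 2006, 1), (20060201, 2006, 2)])
    conn.executemany("INSERT INTO fact_visit VALUES (?, ?, ?)",
                     [(1, 20060101, 2), (2, 20060101, 1),
                      (3, 20060201, 4), (2, 20060201, 1)])


def visits_by_age_group(conn):
    # OLAP-style roll-up: aggregate the fact measure along one dimension.
    return conn.execute("""
        SELECT p.age_group, SUM(f.visit_count)
        FROM fact_visit AS f
        JOIN dim_patient AS p ON p.patient_key = f.patient_key
        GROUP BY p.age_group
        ORDER BY p.age_group
    """).fetchall()
```

    Grouping by a column of `dim_date` instead would give the same measure rolled up by time, which is the main appeal of the dimensional layout.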

  9. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database

    Directory of Open Access Journals (Sweden)

    Jeongseok Choi

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  10. A clinician friendly data warehouse oriented toward narrative reports: Dr. Warehouse.

    Science.gov (United States)

    Garcelon, Nicolas; Neuraz, Antoine; Salomon, Rémi; Faour, Hassan; Benoit, Vincent; Delapalme, Arthur; Munnich, Arnold; Burgun, Anita; Rance, Bastien

    2018-04-01

    Clinical data warehouses are often oriented toward integration and exploration of coded data. However narrative reports are of crucial importance for translational research. This paper describes Dr. Warehouse®, an open source data warehouse oriented toward clinical narrative reports and designed to support clinicians' day-to-day use. Dr. Warehouse relies on an original database model to focus on documents in addition to facts. Besides classical querying functionalities, the system provides an advanced search engine and Graphical User Interfaces adapted to the exploration of text. Dr. Warehouse is dedicated to translational research with cohort recruitment capabilities, high throughput phenotyping and patient centric views (including similarity metrics among patients). These features leverage Natural Language Processing based on the extraction of UMLS® concepts, as well as negation and family history detection. A survey conducted after 6 months of use at the Necker Children's Hospital shows a high rate of satisfaction among the users (96.6%). During this period, 122 users performed 2837 queries, accessed 4,267 patients' records and included 36,632 patients in 131 cohorts. The source code is available at this github link https://github.com/imagine-bdd/DRWH. A demonstration based on PubMed abstracts is available at https://imagine-plateforme-bdd.fr/dwh_pubmed/. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Model Data Warehouse dan Business Intelligence untuk Meningkatkan Penjualan pada PT. S

    Directory of Open Access Journals (Sweden)

    Rudy Rudy

    2011-06-01

    Today many companies use information systems in every business activity. Every transaction is stored electronically in a transactional database. The transactional database does little to assist executives in making strategic decisions to improve the company's competitiveness. The objective of this research is to analyze the operational database system and the information needed by management in order to design a data warehouse model that fits the executive information needs at PT. S. The research method uses the Nine-Step Methodology for data warehouse design by Ralph Kimball. The result is a data warehouse featuring business intelligence applications that display historical data in tables, graphs, pivot tables, and dashboards, with several points of view for management. This research concludes that a data warehouse that combines multiple transactional databases with a business intelligence application can help executives understand reports and thereby accelerate decision making.

  12. Bioinformatics Database Tools in Analysis of Genetics of Neurodevelopmental Disorders

    Directory of Open Access Journals (Sweden)

    Dibyashree Mallik

    2017-10-01

    Bioinformatics tools are now used in many areas of biology. Many questions about neurodevelopmental disorders, which have recently emerged as a major health issue, can be addressed using bioinformatics databases. Schizophrenia is one such mental disorder; it has become a major threat to young people because it mostly appears in late adolescence or early adulthood. Databases such as DISGENET, GWAS, PHARMGKB, and DRUGBANK hold large repositories of genes associated with schizophrenia. We found that many genes are associated with schizophrenia, but approximately 200 genes are found to be present in any of these databases. After further screening, 20 genes were found to be highly associated with each other and are also common to many other diseases. They also all serve as common target genes for many antipsychotic drugs. Analysis of their biological properties and molecular functions shows that these 20 genes are mostly involved in biological regulation and have receptor activity; they belong mainly to the receptor protein class. Among these 20 genes, CYP2C9, CYP3A4, DRD2, HTR1A, and HTR2A are shown to be the main target genes of most antipsychotic drugs and are associated with more than 40% of diseases. The basic findings of the present study suggest that a suitable combination drug targeting these genes could be designed for better treatment of schizophrenia.

  13. XWeB: The XML Warehouse Benchmark

    Science.gov (United States)

    Mahboubi, Hadj; Darmont, Jérôme

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

  14. InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.

    Science.gov (United States)

    Smith, Richard N; Aleksic, Jelena; Butano, Daniela; Carr, Adrian; Contrino, Sergio; Hu, Fengyuan; Lyne, Mike; Lyne, Rachel; Kalderimis, Alex; Rutherford, Kim; Stepan, Radek; Sullivan, Julie; Wakeling, Matthew; Watkins, Xavier; Micklem, Gos

    2012-12-01

    InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of 'widgets' performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages. Freely available from http://www.intermine.org under the LGPL license. g.micklem@gen.cam.ac.uk Supplementary data are available at Bioinformatics online.

  15. Database Are Not Toasters: A Framework for Comparing Data Warehouse Appliances

    Science.gov (United States)

    Trajman, Omer; Crolotte, Alain; Steinhoff, David; Nambiar, Raghunath Othayoth; Poess, Meikel

    The success of Business Intelligence (BI) applications depends on two factors, the ability to analyze data ever more quickly and the ability to handle ever increasing volumes of data. Data Warehouse (DW) and Data Mart (DM) installations that support BI applications have historically been built using traditional architectures either designed from the ground up or based on customized reference system designs. The advent of Data Warehouse Appliances (DA) brings packaged software and hardware solutions that address performance and scalability requirements for certain market segments. The differences between DAs and custom installations make direct comparisons between them impractical and suggest the need for a targeted DA benchmark. In this paper we review data warehouse appliances by surveying thirteen products offered today. We assess the common characteristics among them and propose a classification for DA offerings. We hope our results will help define a useful benchmark for DAs.

  16. Data Warehouse Emissieregistratie. A new tool to sustainability; Data Warehouse Emissieregistratie. Een nieuw instrument op weg naar duurzaamheid

    Energy Technology Data Exchange (ETDEWEB)

    Van Grootveld, G. [VROM-Inspectie, Den Haag (Netherlands)]; Op den Kamp, A. [OpdenKamp Adviesgroep, Den Haag (Netherlands)]

    2002-12-01

    An overview is given of the possibilities to use and search the title database, which contains data on emissions from pollution sources in different sectors in the Netherlands. The publication illustrates the power of the Data Warehouse by means of seven examples in chapters 3 through 9, each of which also offers a perspective on sustainable development. Chapter 10 presents two cases with a short manual. Chapter 1 gives background information on the environmental policy chain and the place of monitoring within it. Chapter 2 briefly describes the three dimensions of the Data Warehouse and the possibilities it offers (www.emissieregistratie.nl).

  17. Usage of data warehouse for analysing software's bugs

    Science.gov (United States)

    Živanov, Danijel; Krstićev, Danijela Boberić; Mirković, Duško

    2017-07-01

    We analysed the database schema of Bugzilla system and taking into account user's requirements for reporting, we presented a dimensional model for the data warehouse which will be used for reporting software defects. The idea proposed in this paper is not to throw away Bugzilla system because it certainly has many strengths, but to make integration of Bugzilla and the proposed data warehouse. Bugzilla would continue to be used for recording bugs that occur during the development and maintenance of software while the data warehouse would be used for storing data on bugs in an appropriate form, which is more suitable for analysis.

  18. PEMAHAMAN TEORI DATA WAREHOUSE BAGI MAHASISWA TAHUN AWAL JENJANG STRATA SATU BIDANG ILMU KOMPUTER

    Directory of Open Access Journals (Sweden)

    Harco Leslie Hendric Spits Warnars

    2015-01-01

    Computer science students should understand database theory as the foundation of data management. Databases are needed in virtually every real-world computing application, such as information systems, information technology, the internet, games, artificial intelligence, and robotics. Sound data handling and management inevitably lead to better technology implementations. Data warehousing, one of the specialization subjects offered in the final semester of the computer science study program, presents a challenge for computer science students. A survey was conducted of 18 early-year students of the computer science study program at Surya University, with the hypothesis that students who had heard of a data warehouse would be interested in learning about it, whereas students who had never heard of a data warehouse would not be. It is therefore important that lecturers understand how to deliver the data warehouse subject material so that students can understand data warehousing well.

  19. A Relevance-Extended Multi-dimensional Model for a Data Warehouse Contextualized with Documents

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Pedersen, Torben Bach; Berlanga, Rafael

    2005-01-01

    Current data warehouse and OLAP technologies can be applied to analyze the structured data that companies store in their databases. The circumstances that describe the context associated with these data can be found in other internal and external sources of documents. In this paper we propose...... to combine the traditional corporate data warehouse with a document warehouse, resulting in a contextualized warehouse. Thus, contextualized warehouses keep a historical record of the facts and their contexts as described by the documents. In this framework, the user selects an analysis context which...

  20. Application of bioinformatics tools and databases in microbial dehalogenation research (a review).

    Science.gov (United States)

    Satpathy, R; Konkimalla, V B; Ratha, J

    2015-01-01

    Microbial dehalogenation is a biochemical process in which halogenated substances are converted enzymatically into their non-halogenated form. Microorganisms have a wide range of organohalogen degradation abilities, both specific and non-specific in nature. Because most of these halogenated organic compounds are pollutants that need to be remediated, current approaches explore the potential of microbes at the molecular level for effective biodegradation of these substances. Several microorganisms with dehalogenation activity have been identified and characterized. Bioinformatics plays a key role in gaining deeper knowledge of dehalogenation. To facilitate data mining, many tools have been developed to annotate these data from databases. With the discovery of a microorganism, one can therefore predict genes and proteins and perform sequence analysis, structural modelling, metabolic pathway analysis, biodegradation studies, and so on. This review highlights bioinformatics methods and describes the application of various databases and specific tools in microbial dehalogenation, with special focus on dehalogenase enzymes. It also covers some recent applications of in silico modeling methods, comprising gene finding, protein modelling, Quantitative Structure Biodegradability Relationship (QSBR) studies, and reconstruction of metabolic pathways employed in dehalogenation research.

  1. ¿Why Data warehouse & Business Intelligence at Universidad Simon Bolivar?

    Directory of Open Access Journals (Sweden)

    Kamagate Azoumana

    2013-01-01

    Abstract: The data warehouse is supposed to provide storage, functionality, and responsiveness to queries beyond the capabilities of today’s transaction databases. A data warehouse is also built to improve the data access performance of databases.

  2. [Establishment of data warehouse of needling and moxibustion literature based on data mining].

    Science.gov (United States)

    Wang, Jian-Ling; Li, Ren-Ling; Jia, Chun-Sheng

    2012-02-01

    A data warehouse needs to be established in order to explore the efficacy specificity and valuable rules of clinical application of needling and moxibustion methods within the large quantity of information in the literature. On the basis of the original databases of red-hot needle therapy and hydro-acupuncture therapy, and the newly established databases of acupoint catgut embedding therapy, acupoint application therapy, etc., and in accordance with the characteristics of different types of needling-moxibustion literature information, databases on different subjects were established first. These subject databases constitute a general "literature data warehouse on needling moxibustion methods" composed of multiple subjects and multiple dimensions, so that data mining techniques can be used to discover useful regularities about clinical treatment and trials collected in the literature. In the present paper, the authors introduce the design of the data warehouse, determination of subjects, establishment of subject relations, application of the administration platform, and application of data. This data warehouse will provide a standard data representation mode, enlarge data attributes, and create extensive data links among literature information in the network, and may bring considerable convenience and benefits in clinical decision making and scientific research on needling-moxibustion techniques.

  3. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Bioinformatics has recently become a new field of science, indispensable for the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private). These databases contain information on genes, RNA, ORFs, proteins, and intergenic regions, including entire genomes of some species. Analyzing this information requires computer programs, which have been renewed through new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units designed to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.

  4. PERANCANGAN DAN IMPLEMENTASI DATA WAREHOUSE METEOROLOGI, KLIMATOLOGI, GEOFISIKA DAN BENCANA ALAM

    Directory of Open Access Journals (Sweden)

    Agus Safril

    2014-05-01

    BMKG holds data originating from several legacy database systems, stored either in database information systems or in worksheets. These legacy data are often left unused when a new database system is developed. For the legacy data to remain usable, old and new data must be integrated. The data warehouse is the concept used to integrate data into BMKG's unified database storage system. Integration is performed by extracting the required data items from the data sources, which are the existing information systems of the meteorology, climatology, and geophysics groups. The integration process begins with extraction, followed by transformation into a uniform format suited to analysis, and ends with loading into the data warehouse. The prototype data warehouse covers data input through extraction of both legacy and new data using data acquisition software. The output takes the form of data reports covering whatever period is required.
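
    The extract, transform, and load sequence described in this abstract can be sketched in a few lines. The field names (`station`, `date`, `temp`), the DD/MM/YYYY input format, and the `observation` table are assumptions made for illustration; the abstract does not specify BMKG's actual formats or schema.

```python
import sqlite3


def extract(legacy_rows):
    # Extraction: keep only the data items needed downstream.
    for row in legacy_rows:
        yield {"station": row["station"], "date": row["date"], "temp": row["temp"]}


def transform(record):
    # Transformation: bring every record into one uniform format
    # (upper-case station code, ISO date, numeric temperature).
    day, month, year = record["date"].split("/")
    iso_date = f"{year}-{month.zfill(2)}-{day.zfill(2)}"
    return (record["station"].strip().upper(), iso_date, float(record["temp"]))


def load(conn, rows):
    # Loading: store the uniform records in the warehouse table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation "
        "(station TEXT, obs_date TEXT, temp_c REAL)"
    )
    conn.executemany("INSERT INTO observation VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, legacy_rows):
    load(conn, [transform(record) for record in extract(legacy_rows)])
```

    Keeping the three stages as separate functions means a new legacy source only needs its own `extract` step, while the transformation and loading rules stay shared.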

  5. Structuring warehouse management : Exploring the fit between warehouse characteristics and warehouse planning and control structure, and its effect on warehouse performance

    NARCIS (Netherlands)

    N. Faber (Nynke)

    2015-01-01

    This dissertation studies the management processes that plan, control, and optimize warehouse operations. The inventory in warehouses decouples supply from demand. As such, economies of scale can be achieved in production, purchasing, and transport. As warehouses become more and more

  6. Warehouses information system design and development

    Science.gov (United States)

    Darajatun, R. A.; Sukanta

    2017-12-01

    The materials and goods handling industry is fundamental for companies to ensure the smooth running of their warehouses. Efficiency and organization in every aspect of the business are essential to gain a competitive advantage. The purpose of this research is the design and development of a Kanban-based inventory storage and delivery system. The application aims to make inventory stock checks more efficient and effective. Users can easily enter finished goods from the production department, warehouse, customers, and suppliers. Master data are designed to be as complete as possible so that the application can be used across a variety of warehouse logistics processes. The authors use the Java programming language to build the application as a Java web application, with MySQL as the database. The system development methodology is the Waterfall methodology, whose stages are Analysis, System Design, Implementation, Integration, and Operation and Maintenance. Data were collected through observation, interviews, and literature review.

  7. Electronic warehouse receipts registry as a step from paper to electronic warehouse receipts

    Directory of Open Access Journals (Sweden)

    Kovačević Vlado

    2016-01-01

    The aim of this paper is to determine the economic viability of introducing an electronic warehouse receipt registry as a step toward electronic warehouse receipts. Both paper and electronic warehouse receipts exist in practice, but paper receipts are more widespread. In this paper, the dematerialization process is analyzed in two steps. The first step is the dematerialization of the warehouse receipt registry, with warehouse receipts still in paper form. The second step is the introduction of electronic warehouse receipts themselves. Dematerialization of warehouse receipts is more complex than that of financial securities because of the individual characteristics of each warehouse receipt. As a consequence, electronic warehouse receipts are in place for only a handful of commodities, namely cotton and a few grains. Nevertheless, the movement toward the electronic warehouse receipt, which began several decades ago with financial securities, is now taking hold in the agricultural sector. This paper analyzes the Serbian electronic registry, since Serbia is the first country in the EU with an electronic warehouse receipts registry, which was donated by FAO. The analysis shows that establishing an electronic warehouse receipts registry considerably enhances the security of the public warehouse system and advances trade in warehouse receipts.

  8. [Peranesthesic Anaphylactic Shocks: Contribution of a Clinical Data Warehouse].

    Science.gov (United States)

    Osmont, Marie-Noëlle; Campillo-Gimenez, Boris; Metayer, Lucie; Jantzem, Hélène; Rochefort-Morel, Cécile; Cuggia, Marc; Polard, Elisabeth

    2015-10-16

    To evaluate the performance of the collection of cases of anaphylactic shock during anesthesia in the Regional Pharmacovigilance Center of Rennes and the contribution of a query in the biomedical data warehouse of the French University Hospital of Rennes in 2009. Different sources were evaluated: the French pharmacovigilance database (including spontaneous reports and reports from a query in the database of the programme de médicalisation des systèmes d'information [PMSI]), records of patients seen in allergo-anesthesia (source considered as comprehensive as possible) and a query in the data warehouse. Analysis of allergo-anesthesia records detected all cases identified by other methods, as well as two other cases (nine cases in total). The query in the data warehouse enabled detection of seven cases out of the nine. Querying full-text reports and structured data extracted from the hospital information system improves the detection of anaphylaxis during anesthesia and facilitates access to data. © 2015 Société Française de Pharmacologie et de Thérapeutique.

  9. Perancangan Data Warehouse Nilai Mahasiswa Dengan Kimball Nine-Step Methodology

    Directory of Open Access Journals (Sweden)

    Ganda Wijaya

    2017-04-01

    Abstract: Student grades have many components that can be analyzed to support decision making. On this basis, the authors conducted a study of student grades using a database at the Bureau of Academic and Student Affairs Administration of Bina Sarana Informatika (BAAK BSI). The research question is: "How can a data warehouse be modeled to meet management's need for student grade data in support of evaluation, planning, and decision making?" A student grade data warehouse is needed to obtain information and reports and to perform multidimensional analysis, which in turn can assist management in making policy. The system was developed using the System Development Life Cycle (SDLC) with a Waterfall approach, and the data warehouse was designed using the Kimball nine-step methodology. The results are a star schema and a student grade data warehouse. Data warehouses can provide fast, accurate, and continuous summary information to assist management in making policies for the future. In general, this research serves as an additional reference for building a data warehouse using the Kimball nine-step methodology.   Keywords: Data Warehouse, Kimball Nine-Step Methodology.

  10. Study and application of data mining and data warehouse in CIMS

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Liu, Daxin

    2003-03-01

    Interest in analyzing data has grown tremendously in recent years. Analyzing data requires a multitude of technologies, namely those from the fields of data warehousing, data mining, and On-line Analytical Processing (OLAP). This paper presents a new data warehouse architecture for CIMS based on CRGC-CIMS application engineering. The data source of this architecture is the database of the CRGC-CIMS system. The data are placed in a global data set by extraction, filtering, and integration, and are then transferred to the data warehouse according to information requests. We describe two advantages of the new model in the CRGC-CIMS application. In addition, a data warehouse contains many materialized views over the data provided by distributed heterogeneous databases for the purpose of efficiently implementing decision support, OLAP queries, or data mining. It is important to select the right views to materialize in order to answer a given set of queries. In this paper, we also design algorithms for selecting a set of views to materialize in a data warehouse so as to answer the most queries under a given space constraint. First, we give a cost model for selecting materialized views. Then we give algorithms that adopt a gradually recursive method from bottom to top, and describe their design and implementation. Finally, we discuss the advantages and shortcomings of our approach and future work.
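
    To make the view-selection problem concrete, here is a minimal sketch of one common heuristic: rank candidate views by benefit per unit of storage and pick greedily until the space budget is used up. This is a generic greedy heuristic substituted for illustration, not the bottom-up recursive algorithm of the paper, and the view names, sizes, and benefit values in the usage note are invented.

```python
def select_views(candidates, space_budget):
    """Greedily choose views to materialize under a space constraint.

    candidates: list of (name, size, benefit) tuples, where size > 0 is the
    storage the view needs and benefit is the estimated query-cost saving.
    Returns the names of the chosen views in the order they were picked.
    """
    chosen = []
    remaining = space_budget
    # Rank by benefit per unit of space, best first.
    ranked = sorted(candidates, key=lambda view: view[2] / view[1], reverse=True)
    for name, size, _benefit in ranked:
        if size <= remaining:
            chosen.append(name)
            remaining -= size
    return chosen
```

    For example, with candidates `("sales_by_month", 40, 120)`, `("sales_by_region", 30, 60)`, and `("sales_by_product", 50, 200)` and a budget of 80, the sketch picks `sales_by_product` first and then `sales_by_region`, because `sales_by_month` no longer fits. A greedy choice like this is fast but is not guaranteed to be optimal.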

  11. Data warehouse for assessing animal health, welfare, risk management and -communication.

    Science.gov (United States)

    Nielsen, Annette Cleveland

    2011-01-01

    The objective of this paper is to give an overview of existing databases in Denmark and describe some of the most important of these in relation to establishment of the Danish Veterinary and Food Administrations' veterinary data warehouse. The purpose of the data warehouse and possible use of the data are described. Finally, sharing of data and validity of data is discussed. There are databases in other countries describing animal husbandry and veterinary antimicrobial consumption, but Denmark will be the first country relating all data concerning animal husbandry, -health and -welfare in Danish production animals to each other in a data warehouse. Moreover, creating access to these data for researchers and authorities will hopefully result in easier and more substantial risk based control, risk management and risk communication by the authorities and access to data for researchers for epidemiological studies in animal health and welfare.

  12. Developing a standardized healthcare cost data warehouse.

    Science.gov (United States)

    Visscher, Sue L; Naessens, James M; Yawn, Barbara P; Reinalda, Megan S; Anderson, Stephanie S; Borah, Bijan J

    2017-06-12

    Research addressing value in healthcare requires a measure of cost. While there are many sources and types of cost data, each has strengths and weaknesses. Many researchers appear to create study-specific cost datasets, but the explanations of their costing methodologies are not always clear, causing their results to be difficult to interpret. Our solution, described in this paper, was to use widely accepted costing methodologies to create a service-level, standardized healthcare cost data warehouse from an institutional perspective that includes all professional and hospital-billed services for our patients. The warehouse is based on a National Institutes of Research-funded research infrastructure containing the linked health records and medical care administrative data of two healthcare providers and their affiliated hospitals. Since all patients are identified in the data warehouse, their costs can be linked to other systems and databases, such as electronic health records, tumor registries, and disease or treatment registries. We describe the two institutions' administrative source data; the reference files, which include Medicare fee schedules and cost reports; the process of creating standardized costs; and the warehouse structure. The costing algorithm can create inflation-adjusted standardized costs at the service line level for defined study cohorts on request. The resulting standardized costs contained in the data warehouse can be used to create detailed, bottom-up analyses of professional and facility costs of procedures, medical conditions, and patient care cycles without revealing business-sensitive information. After its creation, a standardized cost data warehouse is relatively easy to maintain and can be expanded to include data from other providers. Individual investigators who may not have sufficient knowledge about administrative data do not have to try to create their own standardized costs on a project-by-project basis because our data

  13. Data Warehouse on the Web for Accelerator Fabrication And Maintenance

    International Nuclear Information System (INIS)

    Chan, A.; Crane, G.; Macgregor, I.; Meyer, S.

    2011-01-01

    A data warehouse grew out of the need for a view of accelerator information from a lab-wide or project-wide standpoint (often needing off-site data access for the multi-lab PEP-II collaborators). A World Wide Web interface is used to link legacy database systems of the various labs and departments related to the PEP-II Accelerator. In this paper, we describe how links are made via the 'Formal Device Name' field(s) in the disparate databases. We also describe the functionality of a data warehouse in an accelerator environment. One can pick devices from the PEP-II Component List and find the actual components filling the functional slots, any calibration measurements, fabrication history, associated cables and modules, and operational maintenance records for the components. Information on inventory, drawings, publications, and purchasing history is also part of the PEP-II Database. A strategy of relying on a small team, and of linking existing databases rather than rebuilding systems, is outlined.
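
    The idea of linking otherwise separate databases through a shared device-name field can be sketched with an in-memory SQLite database. The table names, column names, and device names below are hypothetical stand-ins and are not taken from the PEP-II systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for two legacy databases
    CREATE TABLE component_list (formal_device_name TEXT, slot_function TEXT);
    CREATE TABLE maintenance_log (formal_device_name TEXT, note TEXT);
    INSERT INTO component_list VALUES
        ('QUAD-01', 'focusing'), ('BPM-07', 'beam position');
    INSERT INTO maintenance_log VALUES
        ('QUAD-01', 'coil replaced'), ('QUAD-01', 'realigned');
""")


def records_for(device):
    """Return (slot_function, note) rows linked through the shared device name."""
    return conn.execute(
        """
        SELECT c.slot_function, m.note
        FROM component_list AS c
        LEFT JOIN maintenance_log AS m
          ON m.formal_device_name = c.formal_device_name
        WHERE c.formal_device_name = ?
        ORDER BY m.note
        """,
        (device,),
    ).fetchall()


print(records_for("QUAD-01"))
print(records_for("BPM-07"))
```

    The left join keeps a device in the result even when no maintenance record exists for it, which mirrors browsing from a component list outward.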

  14. Conceptual Data Warehouse Structures

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    changing information needs. We show how the event-entity-relationship model (EVER) can be used for schema design and query formulation in data warehouses. Our work is based on a layered data warehouse architecture in which a global data warehouse is used for flexible long-term organization and storage...... of all warehouse data whereas local data warehouses are used for efficient query formulation and answering. In order to support flexible modeling of global warehouses we use a flexible version of EVER for global schema design. In order to support efficient query formulation in local data warehouses we...

  15. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available. Abstract. Background: SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results: SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions: The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  16. Securing Document Warehouses against Brute Force Query Attacks

    Directory of Open Access Journals (Sweden)

    Sergey Vladimirovich Zapechnikov

    2017-04-01

    Full Text Available The paper presents a data management scheme and protocols for securing a document collection against adversarial users who try to abuse their access rights to learn the full content of confidential documents. The configuration of a secure document retrieval system is described, and a suite of protocols among the clients, warehouse server, audit server, and database management server is specified. The scheme makes it infeasible for clients to establish correspondence between the documents relevant to different search queries until a moderator grants access to these documents. The proposed solution ensures a higher level of security for document warehouses.

  17. Automation in Warehouse Development

    CERN Document Server

    Verriet, Jacques

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and supports the quality of picking processes. Secondly, the development of models to simulate and analyse warehouse designs and their components facilitates the challenging task of developing warehouses that take into account each customer’s individual requirements and logistic processes. Automation in Warehouse Development addresses both types of automation from the innovative perspective of applied science. In particular, it describes the outcomes of the Falcon project, a joint endeavour by a consortium of industrial and academic partners. The results include a model-based approach to automate warehouse control design, analysis models for warehouse design, concepts for robotic item handling and computer vision, and auton...

  18. Warehouse Logistics

    OpenAIRE

    Panibratetc, Anastasiia

    2015-01-01

    This research is a review of warehouse logistics using the example of Kannustalo Oy, located in Kannus, in the western region of Finland. Kannustalo is an international company that designs, manufactures, and assembles block and turn-key houses. The research subject is the logistics process in the warehouse system of an industrial company. In my work I discussed the theoretical aspects of logistics, logistic functions and processes. Later I considered the warehouse as a part of the logistics system and provided inf...

  19. The design and application of data warehouse during modern enterprises environment

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Wang, Chunying

    2006-04-01

    Interest in analyzing data has grown tremendously in recent years. Analyzing data requires a multitude of technologies, namely those from the fields of data warehousing, data mining, and On-line Analytical Processing (OLAP). This paper proposes a system structure model for the data warehouse in a modern enterprise environment, based on the information demands of enterprises and the actual demands of users. It also analyzes the benefits of this kind of model in practical application and describes the process of setting up the data warehouse model. It further proposes an overall design plan for the data warehouses of modern enterprises. The data warehouse that we build in practical application offers high query performance, data efficiency, and independence of logical and physical data. In addition, a data warehouse contains many materialized views over the data provided by distributed heterogeneous databases, for the purpose of efficiently implementing decision support, OLAP queries, or data mining. One of the most important decisions in designing a data warehouse is the selection of the right views to materialize. In this paper, we also design algorithms for selecting a set of views to materialize in a data warehouse. First, we give the algorithms for selecting materialized views. Then we use experiments to demonstrate the power of our approach. The results show that the proposed algorithm delivers an optimal solution. Finally, we discuss the advantages and shortcomings of our approach and future work.

  20. Warehouse Sanitation Workshop Handbook.

    Science.gov (United States)

    Food and Drug Administration (DHHS/PHS), Washington, DC.

    This workshop handbook contains information and reference materials on proper food warehouse sanitation. The materials have been used at Food and Drug Administration (FDA) food warehouse sanitation workshops, and are selected by the FDA for use by food warehouse operators and for training warehouse sanitation employees. The handbook is divided…

  1. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  2. Quality controls in integrative approaches to detect errors and inconsistencies in biological databases

    Directory of Open Access Journals (Sweden)

    Ghisalberti Giorgio

    2010-12-01

    Full Text Available Numerous biomolecular data are available, but they are scattered across many databases and only some of them are curated by experts. Most available data are computationally derived and include errors and inconsistencies. Effective use of available data to derive new knowledge hence requires data integration and quality improvement. Many approaches for data integration have been proposed. Data warehousing seems to be the most adequate when comprehensive analysis of integrated data is required. This also makes it the most suitable for implementing comprehensive quality controls on integrated data. We previously developed GFINDer (http://www.bioinformatics.polimi.it/GFINDer/), a web system that supports scientists in effectively using available information. It allows comprehensive statistical analysis and mining of functional and phenotypic annotations of gene lists, such as those identified by high-throughput biomolecular experiments. The GFINDer backend is composed of a multi-organism genomic and proteomic data warehouse (GPDW). Within the GPDW, several controlled terminologies and ontologies, which describe gene and gene product related biomolecular processes, functions and phenotypes, are imported and integrated, together with their associations with genes and proteins of several organisms. To ease keeping the GPDW up to date and to ensure the best possible quality of the data integrated in subsequent updates of the data warehouse, we developed several automatic procedures. Within them, we implemented numerous data quality control techniques to test the integrated data for a variety of possible errors and inconsistencies. Among other features, the implemented controls check data structure and completeness, ontological data consistency, ID format and evolution, unexpected data quantification values, and consistency of data from single and multiple sources. We use the implemented controls to analyze the quality of data available from several
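
    Three of the control types listed above (completeness, ID format, and consistency across sources) could look roughly like the following. This is a minimal sketch and not the GPDW procedures themselves. The record layout, field names, and sample annotations are hypothetical; the only external fact assumed is that Gene Ontology identifiers have the form "GO:" followed by seven digits.

```python
import re

# Gene Ontology-style identifier, e.g. "GO:0008150"
GO_ID = re.compile(r"^GO:\d{7}$")


def check_records(records, required=("gene", "term_id", "source")):
    """Return a list of (record_index, problem) pairs found in the records."""
    problems = []
    seen = {}  # (gene, term_id) -> evidence value first seen
    for i, rec in enumerate(records):
        # Completeness: every required field must be present and non-empty
        missing = [f for f in required if not rec.get(f)]
        if missing:
            problems.append((i, "missing fields: " + ", ".join(missing)))
            continue
        # ID format: the term identifier must match the expected pattern
        if not GO_ID.match(rec["term_id"]):
            problems.append((i, "malformed ID: " + rec["term_id"]))
        # Cross-source consistency: same annotation, different evidence
        key = (rec["gene"], rec["term_id"])
        evidence = rec.get("evidence")
        if key in seen and seen[key] != evidence:
            problems.append((i, "conflicting evidence across sources"))
        seen.setdefault(key, evidence)
    return problems


records = [
    {"gene": "TP53", "term_id": "GO:0006915", "source": "A", "evidence": "IDA"},
    {"gene": "TP53", "term_id": "GO:0006915", "source": "B", "evidence": "IEA"},
    {"gene": "BRCA1", "term_id": "GO-0006281", "source": "A", "evidence": "IDA"},
    {"gene": "", "term_id": "GO:0008150", "source": "B", "evidence": "IEA"},
]
problems = check_records(records)
for problem in problems:
    print(problem)
```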

  3. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available. Abstract. Background: The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results: This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion: MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  4. YPED: an integrated bioinformatics suite and database for mass spectrometry-based proteomics research.

    Science.gov (United States)

    Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R

    2015-02-01

    We report a significantly enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, through groups of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  5. Design of a Metadata System for a Data Warehouse: A Revenue Tracking Case Study at PT. Telkom Divre V Jawa Timur

    Directory of Open Access Journals (Sweden)

    Yudhi Purwananto

    2004-07-01

    Full Text Available A data warehouse is an enterprise data store that draws data from various systems and can be used for purposes such as analysis and reporting. At PT Telkom Divre V Jawa Timur, a data warehouse called the Regional Database has been built. The Regional Database needs an important data warehouse component, namely metadata. Put simply, metadata is "data about data". In this study, a metadata system is designed with Revenue Tracking, the analysis and reporting component of the Regional Database, as the case study. Metadata is essential for managing a data warehouse and for providing information about it. The processes within a data warehouse and the components related to it must be integrated with one another to realize the data warehouse characteristics of being subject-oriented, integrated, time-variant, and non-volatile. Metadata must therefore also be able to exchange information among the components of the data warehouse. A web service is used as this exchange mechanism. The web service uses XML technology and the HTTP protocol to communicate. With the web service, each component

  6. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Taylor, Ronald C.

    2010-01-01

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
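
    The MapReduce programming style mentioned above can be illustrated without Hadoop itself. The sketch below runs the map, shuffle, and reduce steps in a single Python process to count k-mers in a few short reads. It does not use any Hadoop or HBase API; the function names and the sample reads are made up for illustration.

```python
from collections import defaultdict


def map_read(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts for one k-mer."""
    return key, sum(values)


reads = ["ACGTAC", "GTACGT"]
mapped = [pair for read in reads for pair in map_read(read)]
counts = dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

    On a cluster, the map calls would run in parallel on different chunks of the input and the framework would perform the shuffle across machines; the per-record logic stays the same.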

  7. Designing a Data Warehouse to Support Higher Education Marketing Planning

    Directory of Open Access Journals (Sweden)

    Agung Prasetyo

    2017-02-01

    Full Text Available One indication of a large higher education institution is the number of its students. New students are therefore one of the resources that determine whether an institution can operate. Every year STMIK AMIKOM Purwokerto admits prospective students. The data on these new students is very useful to the marketing department as information for evaluating subsequent marketing activities. With a data warehouse and OLAP application built using Pentaho Data Integration/Kettle as the ETL tool and Pentaho Workbench, an Online Analytical Processing (OLAP) tool, as the database processor, management at STMIK AMIKOM Purwokerto can obtain information such as the number of applicants per period or intake wave, per school of origin, and per source of information through which prospective students learned of the institution, as well as trends in interest in the study programs chosen by prospective students. The data warehouse can analyze transaction data, produce dynamic reports, and provide information along multiple dimensions about new student admissions at STMIK AMIKOM Purwokerto. Keywords: Data Warehouse, OLAP, Pentaho, Prospective Student Admissions.

  8. Automation in Warehouse Development

    NARCIS (Netherlands)

    Hamberg, R.; Verriet, J.

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and

  9. Efficient data management tools for the heterogeneous big data warehouse

    Science.gov (United States)

    Alekseev, A. A.; Osipova, V. V.; Ivanov, M. A.; Klimentov, A.; Grigorieva, N. V.; Nalamwar, H. S.

    2016-09-01

    The traditional RDBMS has been well suited to normalized data structures. RDBMSs have served well for decades, but the technology is not optimal for data processing and analysis in data-intensive fields such as social networks, the oil and gas industry, and experiments at the Large Hadron Collider. Several challenges have recently been raised regarding the scalability of data-warehouse-like workloads against the transactional schema, in particular for the analysis of archived data or the aggregation of data for summary and accounting purposes. The paper evaluates new database technologies such as HBase, Cassandra, and MongoDB, commonly referred to as NoSQL databases, for handling messy, varied, and large amounts of data. The evaluation is based on the performance, throughput, and scalability of these technologies for several scientific and industrial use cases. This paper outlines the technologies and architectures needed for processing Big Data, and describes the back-end application that implements data migration from an RDBMS to a NoSQL data warehouse, the NoSQL database organization, and how it could be useful for further data analytics.

  10. Bioinformatics tools and database resources for systems genetics analysis in mice-a short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, Caroline; Swertz, Morris A.; Alberts, Rudi; Arends, Danny; Moeller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K. Joeri; Jansen, Ritsert C.; Schughart, Klaus

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  11. PATRIC, the bacterial bioinformatics database and analysis resource.

    Science.gov (United States)

    Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.

  12. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biological databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not

  13. Developing a Data Warehouse Using a Data-Driven Approach to Support Human Resource Management

    Directory of Open Access Journals (Sweden)

    Mujiono Mujiono

    2016-01-01

    Full Text Available The basis of bureaucratic reform is the reform of human resource management. One supporting factor is the development of an employee database. Supporting human resource management requires, among other things, a data warehouse and business intelligence tools. A data warehouse is an integrated concept of reliable data storage that supports all data analysis needs. In this study, a data warehouse was developed using a data-driven approach, with source data coming from SIMPEG, SAPK, and electronic attendance records. The data warehouse is designed using the nine-step methodology and Unified Modeling Language (UML) notation. Extract, transform, load (ETL) is done with Pentaho Data Integration by applying transformation maps. Furthermore, to help human resource management, the system is built to perform online analytical processing (OLAP) to provide web-based information. The study produced a BI application development framework with a Model-View-Controller (MVC) architecture, with OLAP operations built using dynamic query generation, PivotTable, and HighChart to present information about PNS, CPNS, Retirement, Kenpa and Presence.

  14. A Framework for Designing a Healthcare Outcome Data Warehouse

    Science.gov (United States)

    Parmanto, Bambang; Scotch, Matthew; Ahmad, Sjarif

    2005-01-01

    Many healthcare processes involve a series of patient visits or a series of outcomes. The modeling of outcomes associated with these types of healthcare processes is different from and not as well understood as the modeling of standard industry environments. For this reason, the typical multidimensional data warehouse designs that are frequently seen in other industries are often not a good match for data obtained from healthcare processes. Dimensional modeling is a data warehouse design technique that uses a data structure similar to the easily understood entity-relationship (ER) model but is sophisticated in that it supports high-performance data access. In the context of rehabilitation services, we implemented a slight variation of the dimensional modeling technique to make a data warehouse more appropriate for healthcare. One of the key aspects of designing a healthcare data warehouse is finding the right grain (scope) for different levels of analysis. We propose three levels of grain that enable the analysis of healthcare outcomes from highly summarized reports on episodes of care to fine-grained studies of progress from one treatment visit to the next. These grains allow the database to support multiple levels of analysis, which is imperative for healthcare decision making. PMID:18066371
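
    The notion of grain can be sketched by storing facts at the finest level (one row per treatment visit) and deriving a coarser episode-of-care summary from them. The abstract names only the two ends of this range, so the sketch shows just those two levels. The table, columns, and scores are hypothetical and are not the schema from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Finest grain: one row per treatment visit (hypothetical columns)
    CREATE TABLE visit_fact (
        patient_id INTEGER, episode_id INTEGER, visit_no INTEGER, score INTEGER
    );
    INSERT INTO visit_fact VALUES
        (1, 10, 1, 40), (1, 10, 2, 55), (1, 10, 3, 70),
        (2, 20, 1, 30), (2, 20, 2, 35);
""")

# Coarser grain: one row per episode of care, derived from the visit grain
episode_summary = conn.execute("""
    SELECT episode_id,
           COUNT(*) AS visits,
           MAX(score) - MIN(score) AS score_range
    FROM visit_fact
    GROUP BY episode_id
    ORDER BY episode_id
""").fetchall()
print(episode_summary)
```

    Keeping the visit-level rows means visit-to-visit progress can still be studied, while the grouped query serves the summarized reporting level.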

  15. A framework for designing a healthcare outcome data warehouse.

    Science.gov (United States)

    Parmanto, Bambang; Scotch, Matthew; Ahmad, Sjarif

    2005-09-06

    Many healthcare processes involve a series of patient visits or a series of outcomes. The modeling of outcomes associated with these types of healthcare processes is different from and not as well understood as the modeling of standard industry environments. For this reason, the typical multidimensional data warehouse designs that are frequently seen in other industries are often not a good match for data obtained from healthcare processes. Dimensional modeling is a data warehouse design technique that uses a data structure similar to the easily understood entity-relationship (ER) model but is sophisticated in that it supports high-performance data access. In the context of rehabilitation services, we implemented a slight variation of the dimensional modeling technique to make a data warehouse more appropriate for healthcare. One of the key aspects of designing a healthcare data warehouse is finding the right grain (scope) for different levels of analysis. We propose three levels of grain that enable the analysis of healthcare outcomes from highly summarized reports on episodes of care to fine-grained studies of progress from one treatment visit to the next. These grains allow the database to support multiple levels of analysis, which is imperative for healthcare decision making.

  16. DICOM Data Warehouse: Part 2.

    Science.gov (United States)

    Langer, Steve G

    2016-06-01

    In 2010, the DICOM Data Warehouse (DDW) was launched as a data warehouse for DICOM meta-data. Its chief design goals were to have a flexible database schema that enabled it to index standard patient and study information and modality-specific tags (public and private), and to create a framework to derive computable information (derived tags) from the former items. Furthermore, it was to map the above information to an internally standard lexicon that enables a non-DICOM-savvy programmer to write standard SQL queries and retrieve the equivalent data from a cohort of scanners, regardless of what tag that data element was found in over the changing epochs of DICOM and the ensuing migration of elements from private to public tags. After 5 years, the original design has scaled astonishingly well. Very little has changed in the database schema. The knowledge base is now fluent in over 90 device types. Also, additional stored procedures have been written to compute data that is derivable from standard or mapped tags. Finally, an early concern that the system would not be able to address the variability of DICOM-SR objects has been addressed. As of this writing, the system is indexing 300 MR, 600 CT, and 2000 other (XA, DR, CR, MG) imaging studies per day. The only remaining issue to be solved is the case of tags that were not prospectively indexed, and indeed this final challenge may lead to a NoSQL, big-data approach in a subsequent version.
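
    The lexicon mapping described above can be sketched with an in-memory SQLite database. This is a minimal sketch, not the DDW schema: the table names (`tag_map`, `tag_value`), the device names, the lexicon term `dose_index`, and the tag numbers are all placeholders chosen for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tag map and some indexed tag values."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tag_map (
            device_type  TEXT,   -- hypothetical scanner model identifier
            dicom_tag    TEXT,   -- where this device stores the element
            lexicon_term TEXT    -- internal, device-independent name
        );
        CREATE TABLE tag_value (
            study_id    TEXT,
            device_type TEXT,
            dicom_tag   TEXT,
            value       TEXT
        );
    """)
    # Two devices store the same quantity under different tags (placeholder numbers).
    con.executemany("INSERT INTO tag_map VALUES (?, ?, ?)", [
        ("scanner_a", "(0019,10AA)", "dose_index"),
        ("scanner_b", "(0018,9345)", "dose_index"),
    ])
    con.executemany("INSERT INTO tag_value VALUES (?, ?, ?, ?)", [
        ("s1", "scanner_a", "(0019,10AA)", "12.5"),
        ("s2", "scanner_b", "(0018,9345)", "9.8"),
        ("s2", "scanner_b", "(0008,0060)", "CT"),
    ])
    return con


def lookup(con, lexicon_term):
    """Return (study_id, value) pairs for a lexicon term, whatever tag held them."""
    return con.execute(
        """
        SELECT v.study_id, v.value
        FROM tag_value AS v
        JOIN tag_map AS m
          ON m.device_type = v.device_type AND m.dicom_tag = v.dicom_tag
        WHERE m.lexicon_term = ?
        ORDER BY v.study_id
        """,
        (lexicon_term,),
    ).fetchall()
```

    A caller only needs the lexicon term, for example `lookup(build_demo_db(), "dose_index")`; the per-device tag locations stay inside the mapping table.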

  17. Análisis de rendimiento académico estudiantil usando data warehouse y redes neuronales Analysis of students' academic performance using data warehouse and neural networks

    Directory of Open Access Journals (Sweden)

    Carolina Zambrano Matamala

    2011-12-01

    Full Text Available Every day organizations hold more information because their systems produce a large number of daily operations, which are stored in transactional databases. To analyze this historical information, an interesting alternative is to implement a Data Warehouse. On the other hand, Data Warehouses cannot perform predictive analysis by themselves, but machine learning techniques can be used to classify, group, and predict from historical information in order to improve the quality of the analysis. This paper describes a Data Warehouse architecture for analyzing students' academic performance. The Data Warehouse is used as the input to a neural network in order to analyze historical information and trends over time. The results show the viability of using a Data Warehouse for academic performance analysis and the feasibility of predicting the number of courses passed by students using only their own historical information.

  18. Warehouse location and freight attraction in the greater El Paso region.

    Science.gov (United States)

    2013-12-01

    This project analyzes the current and future warehouse and distribution center locations in the El Paso-Juarez region on the U.S.-Mexico border. This research has developed a comprehensive database to aid in the decision support process for ide...

  19. Perancangan Model Data Warehouse dan Perangkat Analitik untuk Memaksimalkan Proses Pemasaran Hotel: Studi Kasus pada Hotel Abc

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2013-06-01

    Full Text Available Increasing competition in the hotel business forces every hotel to be equipped with analysis tools that can maximize its marketing performance. This paper discusses the development of a data warehouse model and analytic tools to enhance the company's competitive advantage by using the variety of data, information, and knowledge held by the company as raw material in the decision-making process. The study was done at ABC Hotel, which uses a database to store its transactional records. However, the database cannot be used directly to support analysis and decision making. Based on this issue, the company needs a data warehouse model and analytic tools that can store large amounts of data, offer a new perspective on data distribution, provide reporting and answers to users' ad hoc questions, and assist managers in making decisions. Furthermore, the data warehouse model and analytic tools can help managers formulate planning and marketing strategies. Data were collected through interviews and a literature study, followed by data analysis to analyze business processes and to identify the problems and the information needed to support the analysis process. The data warehouse was then designed using an analysis of records related to activities in the hotel's marketing area and a data warehouse model. The result of this paper is a data warehouse model and analytic tools to analyze external and transactional data and to support the decision-making process in the marketing area.

  20. Intelligent environmental data warehouse

    International Nuclear Information System (INIS)

    Ekechukwu, B.

    1998-01-01

    Making quick and effective decisions in environmental management depends on multiple, complex parameters, and a data warehouse is a powerful tool for the overall management of massive environmental information. Selecting the right data from a warehouse is an important consideration for end-users. This paper proposes an intelligent environmental data warehouse system. It consists of a data warehouse that supplies environmental researchers and managers with the environmental information they need for their research studies and decisions, in the form of geometric and attribute data for the study area, and metadata for the other sources of environmental information. In addition, the proposed intelligent search engine works according to a set of rules, which enables the system to be aware of the environmental data wanted by the end-user. The system development process passes through four stages: data preparation, warehouse development, intelligent engine development, and internet platform system development. (author)

  1. PATRIC, the bacterial bioinformatics database and analysis resource

    Science.gov (United States)

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  2. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange between bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.
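
    Two of the differences between a DTD and an XML Schema can be seen in a small sketch: a DTD is written in its own declaration syntax and only says that leaf elements hold character data, whereas an XML Schema is itself an XML document and can attach data types to elements. The `gene` record and its fields below are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# A DTD for a hypothetical gene record: leaf content is untyped character data.
DTD = """<!ELEMENT gene (symbol, length)>
<!ELEMENT symbol (#PCDATA)>
<!ELEMENT length (#PCDATA)>"""

# An XML Schema for the same record: it is XML and carries data types.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="gene">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="symbol" type="xs:string"/>
        <xs:element name="length" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"


def is_xml(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def declared_types(xsd_text):
    """Map each typed element name declared in the schema to its declared type."""
    root = ET.fromstring(xsd_text)
    return {
        el.get("name"): el.get("type")
        for el in root.iter(XS + "element")
        if el.get("type") is not None
    }
```

    Because the schema is ordinary XML, generic XML tooling can inspect it, as `declared_types(XSD)` does here. Validating instance documents against the schema needs a schema-aware library, which this sketch deliberately leaves out.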

  3. Web-enabled Data Warehouse and Data Webhouse

    Directory of Open Access Journals (Sweden)

    Cerasela PIRVU

    2008-01-01

    Full Text Available In this paper, our objectives are to understand what a data warehouse means, examine the reasons for building one, appreciate the implications of the convergence of Web technologies and those of the data warehouse, and examine the steps for building a Web-enabled data warehouse. The web revolution has propelled the data warehouse out onto the main stage, because in many situations the data warehouse must be the engine that controls or analyzes the web experience. In order to step up to this new responsibility, the data warehouse must adjust. The nature of the data warehouse needs to be somewhat different. As a result, our data warehouses are becoming data webhouses. The data warehouse is becoming the infrastructure that supports customer relationship management (CRM). And the data warehouse is being asked to make the customer clickstream available for analysis. This rebirth of data warehousing architecture is called the data webhouse.

  4. Contextualizing Data Warehouses with Documents

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    2008-01-01

    warehouse with a document warehouse, resulting in a contextualized warehouse. Thus, the user first selects an analysis context by supplying some keywords. Then, the analysis is performed on a novel type of OLAP cube, called an R-cube, which is materialized by retrieving and ranking the documents...

  5. Study on resources and environmental data integration towards data warehouse construction covering trans-boundary area of China, Russia and Mongolia

    Science.gov (United States)

    Wang, J.; Song, J.; Gao, M.; Zhu, L.

    2014-02-01

    The trans-boundary area between Northern China, Mongolia and eastern Siberia of Russia is a continuous geographical area located in north eastern Asia. Many common issues in this region need to be addressed on the basis of a uniform resources and environmental data warehouse. Based on the practice of a joint scientific expedition, the paper presents a data integration solution comprising 3 steps: drawing up data collection standards and specifications, data reorganization and processing, and data warehouse design and development. A series of data collection standards and specifications covering more than 10 domains was drawn up first. According to the uniform standard, 20 regional-scale resources and environmental survey databases and 11 in-situ observation databases were reorganized and integrated. The North East Asia Resources and Environmental Data Warehouse was designed with 4 layers: a resources layer, a core business logic layer, an internet interoperation layer, and a web portal layer. The data warehouse prototype was developed and initially deployed. All the integrated data in this area can be accessed online.

  6. Study on resources and environmental data integration towards data warehouse construction covering trans-boundary area of China, Russia and Mongolia

    International Nuclear Information System (INIS)

    Wang, J; Song, J; Gao, M; Zhu, L

    2014-01-01

    The trans-boundary area between Northern China, Mongolia and eastern Siberia of Russia is a continuous geographical area located in north eastern Asia. Many common issues in this region need to be addressed on the basis of a uniform resources and environmental data warehouse. Based on the practice of a joint scientific expedition, the paper presents a data integration solution comprising 3 steps: drawing up data collection standards and specifications, data reorganization and processing, and data warehouse design and development. A series of data collection standards and specifications covering more than 10 domains was drawn up first. According to the uniform standard, 20 regional-scale resources and environmental survey databases and 11 in-situ observation databases were reorganized and integrated. The North East Asia Resources and Environmental Data Warehouse was designed with 4 layers: a resources layer, a core business logic layer, an internet interoperation layer, and a web portal layer. The data warehouse prototype was developed and initially deployed. All the integrated data in this area can be accessed online.

  7. Fire detection in warehouse facilities

    CERN Document Server

    Dinaburg, Joshua

    2013-01-01

    Automatic sprinkler systems are the primary fire protection system in warehouse and storage facilities. The effectiveness of this strategy has come into question due to the challenges presented by modern warehouse facilities, including increased storage heights and areas, automated storage and retrieval systems (ASRS), limitations on water supplies, and changes in firefighting strategies. The application of fire detection devices used to provide early warning and notification of incipient warehouse fire events is being considered as a component of modern warehouse fire protection. Fire Detection i

  8. The Data Warehouse Lifecycle Toolkit

    CERN Document Server

    Kimball, Ralph; Thornthwaite, Warren; Mundy, Joy; Becker, Bob

    2011-01-01

    A thorough update to the industry standard for designing, developing, and deploying data warehouse and business intelligence systems. The world of data warehousing has changed remarkably since the first edition of The Data Warehouse Lifecycle Toolkit was published in 1998. In that time, the data warehouse industry has reached full maturity and acceptance, hardware and software have made staggering advances, and the techniques promoted in the premiere edition of this book have been adopted by nearly all data warehouse vendors and practitioners. In addition, the term "business intelligence" emerge

  9. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2 with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have
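
    The redundancy-reduction step can be illustrated with a toy version of greedy incremental clustering, the general strategy that CD-HIT-style tools follow: sequences are processed from longest to shortest, and each one either joins the first representative it matches above the threshold or becomes a new representative. This sketch is not CD-HIT-EST. The similarity measure (`difflib.SequenceMatcher.ratio`) is a crude stand-in for a real sequence-identity computation, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude stand-in for sequence identity; real tools use alignment-based identity."""
    return SequenceMatcher(None, a, b).ratio()


def greedy_cluster(sequences, threshold=0.98):
    """Greedy incremental clustering: the longest sequences become representatives."""
    clusters = {}  # representative sequence -> list of member sequences
    for seq in sorted(sequences, key=len, reverse=True):
        for rep in clusters:
            if similarity(rep, seq) >= threshold:
                clusters[rep].append(seq)  # redundant with an existing representative
                break
        else:
            clusters[seq] = [seq]  # no match found, so start a new cluster
    return clusters
```

    Only the dictionary keys (the representatives) would be kept in a reduced database. The quadratic comparison loop is fine for a demonstration but would not scale to GenBank-sized inputs.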

  10. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
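
    The non-rigid schema and the view mechanism of a document database can be sketched without a running server. CouchDB stores JSON documents and builds views from map functions (written in JavaScript in CouchDB itself) that emit key/value pairs. The Python code below only emulates that idea in memory; the document fields are hypothetical and do not reproduce the geneSmash schema.

```python
import json
from collections import defaultdict

# Hypothetical gene-centric documents; the second carries a field the first lacks.
DOCS = [
    {"_id": "gene:TP53", "type": "gene", "symbol": "TP53",
     "aliases": ["P53"], "chromosome": "17"},
    {"_id": "gene:EGFR", "type": "gene", "symbol": "EGFR",
     "aliases": ["ERBB1", "HER1"], "chromosome": "7",
     "drug_targets": ["gefitinib"]},
]


def map_names(doc):
    """Emulated map function: emit (name, document id) for the symbol and each alias."""
    if doc.get("type") == "gene":
        yield doc["symbol"], doc["_id"]
        for alias in doc.get("aliases", []):
            yield alias, doc["_id"]


def build_view(docs, map_fn):
    """Build an in-memory index from the emitted key/value pairs."""
    index = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            index[key].append(value)
    return dict(index)


def to_json(doc):
    """Serialize a document the way it would travel over an HTTP/JSON interface."""
    return json.dumps(doc, sort_keys=True)
```

    Looking up `build_view(DOCS, map_names)["HER1"]` resolves an alias to its document id, and adding `drug_targets` to one document required no schema migration for the other.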

  11. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  12. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  13. Building a Data Warehouse.

    Science.gov (United States)

    Levine, Elliott

    2002-01-01

    Describes how to build a data warehouse, using the Schools Interoperability Framework (www.sifinfo.org), that supports data-driven decision making and complies with the Freedom of Information Act. Provides several suggestions for building and maintaining a data warehouse. (PKP)

  14. Modelling of Data Warehouse on Food Distribution Center and Reserves in the Ministry of Agriculture

    Directory of Open Access Journals (Sweden)

    Edi Purnomo Putra

    2015-09-01

    Full Text Available The purpose of this study is to carry out database planning that supports prototype modeling of a Data Warehouse in the Ministry of Agriculture, especially in the Distribution Center and Reserves, in the fields of distribution, reserves, and price. With the Data Warehouse prototype, data analysis and decision making by top management will be easier and more accurate. The research methods used were data collection and design. The data warehouse was designed using Kimball's nine-step methodology. The database was designed using an ERD (Entity Relationship Diagram) and an activity diagram. The data used for the analysis were obtained from an interview with the head of Distribution, Reserve and Food Price. The results obtained through the analysis were incorporated into the Data Warehouse prototype, which has been designed to support decision making. To conclude, the Data Warehouse prototype facilitates data analysis, the searching of historical data, and decision making by top management.

  15. Implementasi Data Warehouse dan Data Mining: Studi Kasus Analisis Peminatan Studi Siswa

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2011-06-01

    Full Text Available This paper discusses the implementation of data mining and its role in supporting decision making related to students' selection of a specialization program. Currently, the university uses a database to store transaction records, which cannot be used directly to assist analysis and decision making. To address this, a data warehouse was designed to store large amounts of data, offer new perspectives on data distribution, answer ad hoc questions, and support data analysis. The method consists of record analysis related to students' academic achievement, data warehouse design, and data mining. The paper's results take the form of a data warehouse and data mining design and its implementation with classification techniques and association rules. These results reveal students' tendencies and background patterns in choosing a specialization, which can help them make decisions.

  16. Development of prostate cancer research database with the clinical data warehouse technology for direct linkage with electronic medical record system.

    Science.gov (United States)

    Choi, In Young; Park, Seungho; Park, Bumjoon; Chung, Byung Ha; Kim, Choung-Soo; Lee, Hyun Moo; Byun, Seok-Soo; Lee, Ji Youl

    2013-01-01

    Despite the increasing number of prostate cancer patients, little is known about the impact of treatments for prostate cancer patients or the outcomes of different treatments based on nationwide data. In order to obtain more comprehensive information on Korean prostate cancer patients, many professionals have urged the creation of a national system to monitor the quality of prostate cancer care. To achieve this objective, the prostate cancer database system was planned and carefully accommodated the views of various professions. This prostate cancer research database system incorporates information about prostate cancer research, including demographics, medical history, operation information, laboratory results, and quality of life surveys. The system includes three different methods of clinical data collection to produce a comprehensive database: direct data extraction from the electronic medical record (EMR) system, manual data entry after linking EMR documents such as magnetic resonance imaging findings, and paper-based data collection for patient surveys. We implemented clinical data warehouse technology to test the direct EMR link method with the St. Mary's Hospital system. Using this method, the total number of eligible patients from 1997 until 2012 was 2,300. Among them, 538 patients underwent surgery and the others received different treatments. Our database system could provide the infrastructure for collecting error-free data to support various retrospective and prospective studies.

  17. Creation of Warehouse Models for Different Layout Designs

    OpenAIRE

    Köhler, Mirko; Lukić, Ivica; Nenadić, Krešimir

    2014-01-01

    A warehouse is one of the most important components in the logistics of a supply chain network. The efficiency of warehouse operations is influenced by many different factors. One of the key factors is the rack layout configuration. A warehouse with a good rack layout may significantly reduce the cost of warehouse servicing. The objective of this paper is to give a scheme for building warehouse models with one-block and two-block layouts for future research in warehouse optimization. An algorithm ...

  18. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.

  19. University Accreditation using Data Warehouse

    Science.gov (United States)

    Sinaga, A. S.; Girsang, A. S.

    2017-01-01

    Accreditation aims to assure the quality of an educational institution. The institution needs comprehensive documents to provide accurate information before being reviewed by an assessor. Therefore, academic documents should be stored effectively to make it easier to fulfill the accreditation requirements. However, the data are generally derived from various sources and are of various types, unstructured, and dispersed. This paper proposes designing a data warehouse to integrate all the various data in order to prepare good academic documents for accreditation in a university. The data warehouse is built using the nine steps introduced by Kimball. This method is applied to produce a data warehouse based on the accreditation assessment, focusing on the academic part. The results show that the data warehouse can analyse the data to prepare the accreditation assessment documents.
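
    A dimensional (star) schema of the kind Kimball's methodology is generally associated with can be sketched with SQLite: one fact table at a chosen grain, surrounded by dimension tables. The abstract does not give the paper's actual tables, so the dimensions, the fact table, the measure names, and the sample figures below are all hypothetical.

```python
import sqlite3


def build_star():
    """Create a tiny star schema: two dimensions and one fact table."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dim_program (program_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_year    (year_key    INTEGER PRIMARY KEY, year INTEGER);
        -- Grain: one row per study program per academic year.
        CREATE TABLE fact_graduation (
            program_key INTEGER REFERENCES dim_program(program_key),
            year_key    INTEGER REFERENCES dim_year(year_key),
            graduates   INTEGER,
            enrolled    INTEGER
        );
        INSERT INTO dim_program VALUES (1, 'Informatics'), (2, 'Statistics');
        INSERT INTO dim_year    VALUES (1, 2015), (2, 2016);
        INSERT INTO fact_graduation VALUES
            (1, 1, 80, 100), (1, 2, 90, 100), (2, 1, 45, 50);
    """)
    return con


def graduation_rate_by_program(con):
    """Aggregate the fact table along the program dimension."""
    rows = con.execute("""
        SELECT p.name, 1.0 * SUM(f.graduates) / SUM(f.enrolled)
        FROM fact_graduation AS f
        JOIN dim_program AS p USING (program_key)
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()
    return {name: round(rate, 2) for name, rate in rows}
```

    The same fact table could be rolled up by year instead by joining `dim_year`, which is the practical benefit of fixing the grain before choosing dimensions and measures.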

  20. Building the Readiness Data Warehouse

    National Research Council Canada - National Science Library

    Tysor, Sue

    2000-01-01

    .... This is the role of the data warehouse. The data warehouse will deliver business intelligence based on operational data, decision support data and external data to all business units in the organization...

  1. Evaluating a healthcare data warehouse for cancer diseases

    OpenAIRE

    Sheta, Dr. Osama E.; Eldeen, Ahmed Nour

    2013-01-01

    This paper presents the evaluation of the architecture of a healthcare data warehouse specific to cancer diseases. This data warehouse contains relevant cancer medical information and patient data. The data warehouse provides the source for all current and historical health data to help executive managers and doctors improve the decision-making process for cancer patients. An evaluation model based on Bill Inmon's definition of a data warehouse is proposed to evaluate the cancer data warehouse.

  2. Event-Entity-Relationship Modeling in Data Warehouse Environments

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    We use the event-entity-relationship model (EVER) to illustrate the use of entity-based modeling languages for conceptual schema design in data warehouse environments. EVER is a general-purpose information modeling language that supports the specification of both general schema structures and multi......-dimensional schemes that are customized to serve specific information needs. EVER is based on an event concept that is very well suited for multi-dimensional modeling because measurement data often represent events in multi-dimensional databases...

  3. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene

  4. miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal

    Science.gov (United States)

    Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily

    2018-01-01

    Hundreds of bioinformatics tools have been developed for microRNA (miRNA) investigations, including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, locating the appropriate tool is expedited because entries are searchable, filterable and rankable. The ranking feature is vital to quickly identify and prioritize the more useful tools over the obscure ones. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355
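
    One of the ranking criteria named above, PageRank, can be sketched as a plain power iteration. The abstract does not say what graph the portal ranks over, so the link graph in the usage note, the tool names, and the parameter defaults are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict mapping each node to its outgoing links."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share from random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank
```

    With a hypothetical graph such as `{"toolA": ["toolC"], "toolB": ["toolC"], "toolC": []}`, the tool that the others point to ends up with the highest score, and the scores sum to 1.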

  5. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  6. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  7. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  8. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  9. 27 CFR 24.141 - Bonded wine warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Bonded wine warehouse. 24..., DEPARTMENT OF THE TREASURY LIQUORS WINE Establishment and Operations Permanent Discontinuance of Operations § 24.141 Bonded wine warehouse. Where all operations at a bonded wine warehouse are to be permanently...

  10. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. 7 CFR 735.302 - Paper warehouse receipts.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 7 2010-01-01 2010-01-01 false Paper warehouse receipts. 735.302 Section 735.302... § 735.302 Paper warehouse receipts. Paper warehouse receipts must be issued as follows: (a) On distinctive paper specified by DACO; (b) Printed by a printer authorized by DACO; and (c) Issued, identified...

  12. Protection of warehouses and plants under capacity constraint

    International Nuclear Information System (INIS)

    Bricha, Naji; Nourelfath, Mustapha

    2015-01-01

    While warehouses may be subjected to less protection effort than plants, their unavailability may have substantial impact on the supply chain performance. This paper presents a method for protection of plants and warehouses against intentional attacks in the context of the capacitated plant and warehouses location and capacity acquisition problem. A non-cooperative two-period game is developed to find the equilibrium solution and the optimal defender strategy under capacity constraints. The defender invests in the first period to minimize the expected damage and the attacker moves in the second period to maximize the expected damage. Extra-capacity of neighboring functional plants and warehouses is used after attacks, to satisfy all customer demand and to avoid backorders. The contest success function is used to evaluate the success probability of an attack on plants and warehouses. A numerical example is presented to illustrate an application of the model. The defender strategy obtained by our model is compared to the case where warehouses are subjected to less protection effort than the plants. This comparison allows us to measure how much better our method is, and illustrates the effect of direct investments in protection and indirect protection by warehouse extra-capacities to reduce the expected damage. - Highlights: • Protection of warehouses and plants against intentional attacks. • Capacitated plant and warehouse location and capacity acquisition problem. • A non-cooperative two-period game between the defender and the attacker. • A method to evaluate the utilities and determine the optimal defender strategy. • Using warehouse extra-capacities to reduce the expected damage

  13. System and method for integrating and accessing multiple data sources within a data warehouse architecture

    Science.gov (United States)

    Musick, Charles R [Castro Valley, CA; Critchlow, Terence [Livermore, CA; Ganesh, Madhaven [San Jose, CA; Slezak, Tom [Livermore, CA; Fidelis, Krzysztof [Brentwood, CA

    2006-12-19

    A system and method is disclosed for integrating and accessing multiple data sources within a data warehouse architecture. The metadata formed by the present method provide a way to declaratively present domain specific knowledge, obtained by analyzing data sources, in a consistent and useable way. Four types of information are represented by the metadata: abstract concepts, databases, transformations and mappings. A mediator generator automatically generates data management computer code based on the metadata. The resulting code defines a translation library and a mediator class. The translation library provides a data representation for domain specific knowledge represented in a data warehouse, including "get" and "set" methods for attributes that call transformation methods and derive a value of an attribute if it is missing. The mediator class defines methods that take "distinguished" high-level objects as input and traverse their data structures and enter information into the data warehouse.

  14. Data Warehouse Discovery Framework: The Foundation

    Science.gov (United States)

    Apanowicz, Cas

    The cost of building an Enterprise Data Warehouse Environment usually runs into millions of dollars and takes years to complete. The cost, as big as it is, is not the primary problem for a given corporation. The risk that all the money allocated for planning, design and implementation of the Data Warehouse and Business Intelligence Environment may not bring the expected result far outweighs the cost of the entire effort [2,10]. The combination of the two factors above is the main reason that Data Warehouse/Business Intelligence is often the single most expensive and most risky IT endeavor for companies [13]. That situation was the author's main inspiration behind the founding of Infobright Corp and, later on, the concept of the Data Warehouse Discovery Framework.

  15. Application of XML in real-time data warehouse

    Science.gov (United States)

    Zhao, Yanhong; Wang, Beizhan; Liu, Lizhao; Ye, Su

    2009-07-01

    At present, XML is one of the most widely used technologies for describing and exchanging data, and the need for real-time data has made the real-time data warehouse a popular area of data warehouse research. What is the effect of applying XML technology to real-time data warehouse research? XML technology solves many technical problems that cannot be addressed in a traditional real-time data warehouse, and realizes the integration of the OLAP (On-Line Analytical Processing) and OLTP (On-Line Transaction Processing) environments. The real-time data warehouse can then truly be called "real time".

  16. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio Hernández

    2015-06-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail to detect the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations (it requires the existence of multialigned family protein sequences) but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.

  17. The impact of e-commerce on warehouse operations

    Directory of Open Access Journals (Sweden)

    Wiktor Żuchowski

    2016-03-01

    Background: We often encounter opinions concerning the unusual nature of warehouses used for the purposes of e-commerce, most often spread by providers of modern technological equipment and designers of such solutions. Of course, in the case of newly built facilities, it is advisable to consider innovative technologies, especially in terms of order picking. However, in many cases, the differences between "standard" warehouses, serving, for example, the vehicle spare parts market, and warehouses that are ready to handle retail orders placed electronically (defined as e-commerce) are negligible. The scale of the differences between the existing "standard" warehouses and those adapted to handle e-commerce depends on the industry and on the structure of the customers served. Methods: On the basis of experience and examples of enterprises, two cases of the impact of a hypothetical e-commerce implementation on warehouse organization and technology have been analysed. Results: The introduction of e-commerce into warehouses entails respective changes to previously handled orders. Warehouses serving the retail market are in principle prepared to process electronic orders. In this case, the introduction of (direct) electronic sales is justified and feasible with relatively little effort. Conclusions: It cannot be said with certainty that the introduction of e-commerce in the warehouse is a revolution for its employees and managers. It depends on the markets in which the company operates, and on customers served by the warehouse prior to the introduction of e-commerce.

  18. Warehouse order-picking process. Order-picker routing problem

    Directory of Open Access Journals (Sweden)

    E. V. Korobkov

    2015-01-01

    This article continues the “Warehouse order-picking process” cycle and describes the order-picker routing sub-problem of a warehouse order-picking process. It draws analogies between the order-pickers’ routing problem and the traveling salesman problem, shows differences between the standard traveling salesman problem statement and the routing problem of warehouse order-pickers, and gives the particular Steiner traveling salesman problem statement. A warehouse layout with a typical order is represented by a graph, with some of its vertices corresponding to mandatory order-picker visits and some others being non-compulsory. The paper describes an optimal Ratliff-Rosenthal algorithm to solve the order-picker routing problem for single-block warehouses, i.e. warehouses with only two crossing aisles, and defines seven equivalence classes of partial routing sub-graphs and five transitions used to obtain an optimal routing sub-graph of an order-picker. An extension of the optimal Ratliff-Rosenthal order-picker routing algorithm for multi-block warehouses is presented, and reasons for using routing heuristics instead of exact optimal algorithms are given. The paper offers an algorithmic description of the following seven routing heuristics: S-shaped, return, midpoint, largest gap, aisle-by-aisle, composite, and combined, as well as a modification of the combined heuristic. The comparison of order-picker routing heuristics for one- and two-block warehouses is to be described in the next article of the “Warehouse order-picking process” cycle.

  19. Congestion-Aware Warehouse Flow Analysis and Optimization

    KAUST Repository

    AlHalawani, Sawsan

    2015-12-18

    Generating realistic configurations of urban models is a vital part of the modeling process, especially if these models are used for evaluation and analysis. In this work, we address the problem of assigning objects to their storage locations inside a warehouse which has a great impact on the quality of operations within a warehouse. Existing storage policies aim to improve the efficiency by minimizing travel time or by classifying the items based on some features. We go beyond existing methods as we analyze warehouse layout network in an attempt to understand the factors that affect traffic within the warehouse. We use simulated annealing based sampling to assign items to their storage locations while reducing traffic congestion and enhancing the speed of order picking processes. The proposed method enables a range of applications including efficient storage assignment, warehouse reliability evaluation and traffic congestion estimation.
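
    The simulated-annealing-based assignment mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the cost function (pick frequency times slot distance), the swap move, the cooling schedule and all names and numbers are assumptions, and the congestion-aware objective of the original work is not reproduced here.

```python
import math
import random


def assignment_cost(assignment, frequency, distance):
    """Stand-in objective: pick frequency of each item times distance of its slot."""
    return sum(frequency[item] * distance[slot] for item, slot in enumerate(assignment))


def anneal(frequency, distance, steps=5000, start_temp=10.0, cooling=0.999, seed=0):
    """Search for a low-cost item-to-slot assignment by simulated annealing."""
    rng = random.Random(seed)
    current = list(range(len(frequency)))  # item i starts in slot i
    current_cost = assignment_cost(current, frequency, distance)
    best, best_cost = current[:], current_cost
    temp = start_temp
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]  # propose swapping two slots
        new_cost = assignment_cost(current, frequency, distance)
        delta = new_cost - current_cost
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo the swap
        temp *= cooling
    return best, best_cost


# Hypothetical pick frequencies per item and travel distances per slot.
FREQUENCY = [9, 1, 7, 3, 5]
DISTANCE = [10, 2, 8, 4, 6]
best, best_cost = anneal(FREQUENCY, DISTANCE)
```

    Here `best[i]` is the slot assigned to item `i`. Because the best assignment seen so far is tracked separately, the returned cost is never worse than that of the starting layout.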

  20. Congestion-Aware Warehouse Flow Analysis and Optimization

    KAUST Repository

    AlHalawani, Sawsan; Mitra, Niloy J.

    2015-01-01

    Generating realistic configurations of urban models is a vital part of the modeling process, especially if these models are used for evaluation and analysis. In this work, we address the problem of assigning objects to their storage locations inside a warehouse which has a great impact on the quality of operations within a warehouse. Existing storage policies aim to improve the efficiency by minimizing travel time or by classifying the items based on some features. We go beyond existing methods as we analyze warehouse layout network in an attempt to understand the factors that affect traffic within the warehouse. We use simulated annealing based sampling to assign items to their storage locations while reducing traffic congestion and enhancing the speed of order picking processes. The proposed method enables a range of applications including efficient storage assignment, warehouse reliability evaluation and traffic congestion estimation.

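  1. Modeling the integration of a nearly real time data warehouse with service oriented architecture to support retail information systems; PEMODELAN INTEGRASI NEARLY REAL TIME DATA WAREHOUSE DENGAN SERVICE ORIENTED ARCHITECTURE UNTUK MENUNJANG SISTEM INFORMASI RETAIL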

    Directory of Open Access Journals (Sweden)

    I Made Dwi Jendra Sulastra

    2015-12-01

    Data in a data warehouse are traditionally not updated with every transaction. Retail information systems require the latest data, accessible from anywhere, for business analysis needs. This study therefore builds a data warehouse model that is able to produce information in near real time and can be accessed from anywhere by end-user applications. The integration of a nearly real time data warehouse (NRTDWH) with a service oriented architecture (SOA) to support the retail information system is modeled in two stages. In the first stage, the NRTDWH is modeled using Change Data Capture (CDC) based on the transaction log. In the second stage, the integration of the NRTDWH with an SOA-based web service is modeled. Testing was conducted by simulation with test applications: a retail information system and web-based, desktop, and mobile web service clients. The results of this study were: (1) the CDC-based ETL captures changes to the source table and then stores them in the NRTDWH database with the help of a scheduler; (2) the web service middleware provides 6 services based on data contained in the NRTDWH database, and each of these services is accessible to and implemented by the web service clients.
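
    The log-based Change Data Capture step described in this abstract can be sketched in a few lines. The in-memory log format, the operation names and the function `apply_changes` are hypothetical and chosen only for illustration; a real deployment would read the transaction log of the source DBMS, and the log is assumed here to be ordered by sequence number.

```python
def apply_changes(log, warehouse, last_seen):
    """Apply log entries newer than last_seen to the warehouse table.

    log       -- list of (sequence_number, operation, key, row), ordered by sequence_number
    warehouse -- dict standing in for the target table, keyed by primary key
    last_seen -- highest sequence number already applied
    Returns the new high-water mark, to be stored until the next scheduled run.
    """
    for seq, op, key, row in log:
        if seq <= last_seen:
            continue  # already captured in an earlier run
        if op == "delete":
            warehouse.pop(key, None)
        else:  # "insert" or "update"
            warehouse[key] = row
        last_seen = seq
    return last_seen


# Hypothetical transaction log of a retail source table.
LOG = [
    (1, "insert", "sku1", {"qty": 5}),
    (2, "update", "sku1", {"qty": 3}),
    (3, "insert", "sku2", {"qty": 8}),
    (4, "delete", "sku2", None),
]

warehouse = {}
mark = apply_changes(LOG[:2], warehouse, 0)   # first scheduled run sees two entries
mark = apply_changes(LOG, warehouse, mark)    # second run applies only entries 3 and 4
```

    Keeping the high-water mark between runs is what lets a scheduler call the function repeatedly without re-applying old changes.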

  2. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitope prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  3. Handling Imprecision in Qualitative Data Warehouse: Urban Building Sites Annoyance Analysis Use Case

    Science.gov (United States)

    Amanzougarene, F.; Chachoua, M.; Zeitouni, K.

    2013-05-01

    A data warehouse is a decision support database allowing integration, organization, historization, and management of data from heterogeneous sources, with the aim of exploiting them for decision-making. Data warehouses are essentially based on the multidimensional model. This model organizes data into facts (subjects of analysis) and dimensions (axes of analysis). In classical data warehouses, facts are composed of numerical measures and the dimensions which characterize them. Dimensions are organized into hierarchical levels of detail. Based on the navigation and aggregation mechanisms offered by OLAP (On-Line Analytical Processing) tools, facts can be analyzed according to the desired level of detail. In real-world applications, facts are not always numerical, and can be of a qualitative nature. In addition, sometimes a human expert or a learned model such as a decision tree provides a qualitative evaluation of a phenomenon based on its different parameters, i.e. dimensions. Conventional data warehouses are thus not adapted to qualitative reasoning and lack the ability to deal with qualitative data. In previous work, we have proposed an original approach to qualitative data warehouse modeling, which permits integrating qualitative measures. Based on the computing with words methodology, we have extended the classical multidimensional data model to allow the aggregation and analysis of qualitative data in an OLAP environment. We have implemented this model in a Spatial Decision Support System to help managers of public spaces to reduce annoyances and improve the quality of life of the citizens. In this paper, we will focus our study on the representation and management of imprecision in the annoyance analysis process. The main objective of this process consists in determining the least harmful scenario of urban building sites, particularly in dense urban environments.
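
    The classical multidimensional model summarized in this abstract (facts with numerical measures, dimensions with hierarchical levels, aggregation to a chosen level of detail) can be illustrated with a small roll-up sketch. The site and district names, the complaint counts and the function `roll_up` are hypothetical; the qualitative extension proposed by the authors is not covered.

```python
from collections import defaultdict

# Hypothetical dimension hierarchy: building site -> district (coarser level of detail).
SITE_TO_DISTRICT = {"site1": "north", "site2": "north", "site3": "south"}

# Hypothetical fact rows: (site, month, number of complaints as a numerical measure).
FACTS = [
    ("site1", "2013-01", 4),
    ("site2", "2013-01", 6),
    ("site3", "2013-01", 1),
    ("site1", "2013-02", 2),
]


def roll_up(facts, hierarchy):
    """Aggregate the measure from the site level up to the (district, month) level."""
    totals = defaultdict(int)
    for site, month, measure in facts:
        totals[(hierarchy[site], month)] += measure
    return dict(totals)


by_district = roll_up(FACTS, SITE_TO_DISTRICT)
```

    Summation works because the measure is numerical; a qualitative measure such as "very annoying" has no such built-in aggregate, which is the gap the abstract addresses.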

  4. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Designing a Clinical Data Warehouse Architecture to Support Quality Improvement Initiatives.

    Science.gov (United States)

    Chelico, John D; Wilcox, Adam B; Vawdrey, David K; Kuperman, Gilad J

    2016-01-01

    Clinical data warehouses, initially directed towards clinical research or financial analyses, are evolving to support quality improvement efforts, and must now address the quality improvement life cycle. In addition, data that are needed for quality improvement often do not reside in a single database, requiring easier methods to query data across multiple disparate sources. We created a virtual data warehouse at NewYork-Presbyterian Hospital that allowed us to bring together data from several source systems throughout the organization. We also created a framework to match the maturity of a data request in the quality improvement life cycle to the proper tools needed for each request. As projects progress through the Define, Measure, Analyze, Improve, Control stages of quality improvement, resources are matched to the data needs at each step. We describe the analysis and design that create a robust model for applying clinical data warehousing to quality improvement.

  6. Outpatient health care statistics data warehouse--implementation.

    Science.gov (United States)

    Zilli, D

    1999-01-01

    Data warehouse implementation is assumed to be a very knowledge-demanding, expensive and long-lasting process. As such it requires senior management sponsorship, involvement of experts, a big budget and probably years of development time. The presented Outpatient Health Care Statistics Data Warehouse implementation research provides ample evidence against the infallibility of the above statements. New, inexpensive, but powerful technology, which provides an outstanding platform for On-Line Analytical Processing (OLAP), has emerged recently. Presumably, it will be the basis for the estimated future growth of the data warehouse market, both in the medical and in other business fields. Methods and tools for building, maintaining and exploiting data warehouses are also briefly discussed in the paper.

  7. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  8. Energy Finance Data Warehouse Manual

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Chinthavali, Supriya [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shankar, Mallikarjun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Zeng, Claire [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Hendrickson, Stephen [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-11-30

    The Office of Energy Policy and Systems Analysis's finance team (EPSA-50) requires a suite of automated applications that can extract specific data from a flexible data warehouse (where datasets characterizing energy-related finance, economics and markets are maintained and integrated), perform relevant operations and creatively visualize them to provide a better understanding of what policy options affect various operators/sectors of the electricity system. In addition, the underlying data warehouse should be structured in the most effective and efficient way so that it can become increasingly valuable over time. This report describes the Energy Finance Data Warehouse (EFDW) framework that has been developed to accomplish the requirement defined above. We also specifically dive into the Sankey generator use-case scenario to explain the components of the EFDW framework and their roles. An Excel-based data warehouse was used in the creation of the energy finance Sankey diagram and other detailed data finance visualizations to support energy policy analysis. The framework also captures the methodology, calculations and estimations analysts used for the calculation as well as relevant sources so newer analysts can build on work done previously.

  9. Work prioritization by using data warehouse solution; Priorizacao de obras usando solucao de data warehouse

    Energy Technology Data Exchange (ETDEWEB)

    Grupelli Junior, Fernando Antonio; Azoni, Edivar Garcia [Companhia Paranaense de Energia (COPEL), Curitiba, PR (Brazil)

    2000-07-01

    This work proposes the use of data warehouse technology to help gather adequate and reliable information and to allow the calculation of cost-benefit ratios of works in the primary distribution network. The paper also suggests a better integration and use of the possibilities of a data warehouse and its future integration with a geoprocessing system.

  10. Database Description - SSBD | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    base Description General information of database Database name SSBD Alternative nam...ss 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan, RIKEN Quantitative Biology Center Shuichi Onami E-mail: Database... classification Other Molecular Biology Databases Database classification Dynamic databa...elegans Taxonomy ID: 6239 Taxonomy Name: Escherichia coli Taxonomy ID: 562 Database description Systems Scie...i Onami Journal: Bioinformatics/April, 2015/Volume 31, Issue 7 External Links: Original website information Database

  11. The Implementation of Data Warehouse and OLAP for Rehabilitation Outcome Evaluation: ReDWinE System

    Science.gov (United States)

    Guo, Fei-Ran; Parmanto, Bambang; Irrgang, James J.; Wang, Jiunjie; Fang, Huaijin

    2000-01-01

    We created a data warehouse and OLAP system on the web for outcome evaluation of rehabilitation services. Thirteen outcome indicators were used in this research. Efficiency of therapists and clinics, expected utility of treatments and graphic patterns were generated for data exploration, data mining and decision support. Users can retrieve plenty of graphs and statistical tables without knowing the database structure or attributes. Our experience showed that a multi-dimensional database and OLAP could serve as a decision support system.

  12. Implementation of Lean Warehouse to Minimize Wastes in Finished Goods Warehouse of PT Charoen Pokphand Indonesia Semarang

    Directory of Open Access Journals (Sweden)

    Nia Budi Puspitasari

    2016-03-01

    PT Charoen Pokphand Indonesia Semarang is one of the largest poultry feed companies in Indonesia. A finished goods warehouse is needed to store the finished products that are ready to be distributed. To minimize the wastes that occur in the process of warehousing the finished goods, the implementation of a lean warehouse is required. The core processes of the finished goods warehouse are placing bags that have been through pallet packing, transporting the pallets containing bags of feed within the finished goods warehouse, and unloading feed from the finished goods warehouse onto the distribution truck. With the implementation of the lean warehouse, we can determine whether activities add value or not, and later identify which types of waste occur. Stakeholder opinions on which waste must be eliminated first are determined by questionnaires. Based on the results of the questionnaires, the three top wastes are selected and their causes are identified using a fishbone diagram. They can be remedied by implementing 5S, namely Seiri, Seiton, Seiso, Seiketsu, and Shitsuke. Defect waste can be minimized by selecting pallets, placing sacks correctly, clearing forklift lanes, applying working procedures, and creating a cleaning schedule. Overprocessing waste is minimized by removing unnecessary items, placing feed based on the date of manufacture, and preparing a feed manufacturing plan. Inventory waste is minimized by removing junk, placing feed based on the expiry date, and cleaning the barn.

  13. Benchmarking distributed data warehouse solutions for storing genomic variant information

    Science.gov (United States)

    Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

    2017-01-01

    Abstract Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer a good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu on the other hand, is the only solution that guarantees a sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries.
In summary, research and clinical applications that require

  14. Integrating Data Warehouses with Web Data

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query and retrieve web data, and their application to data warehouses. The paper addresses the problem of integrating...

  15. Information Architecture: The Data Warehouse Foundation.

    Science.gov (United States)

    Thomas, Charles R.

    1997-01-01

    Colleges and universities are initiating data warehouse projects to provide integrated information for planning and reporting purposes. A survey of 40 institutions with active data warehouse projects reveals the kinds of tools, contents, data cycles, and access currently used. Essential elements of an integrated information architecture are…

  16. Development of Auto-Stacking Warehouse Truck

    Directory of Open Access Journals (Sweden)

    Kuo-Hsien Hsia

    2018-03-01

    Full Text Available Warehouse automation is a very important issue for the promotion of traditional industries. For the production of larger and stackable products, it is usually necessary for a skilled person to operate a forklift for the stacking and storage of the products. The general autonomous warehouse truck does not have the ability to stack objects. In this paper, we develop a prototype of an auto-stacking warehouse truck that can work without direct operation by a skilled person. With a command issued by an RFID card, the stacker truck can take the packaged product to the warehouse on the pre-planned route and store it in a stacking way in the designated storage area, or deliver the product to the shipping area or into the container from the storage area. It can significantly reduce the need for skilled forklift operators and improve the safety of the warehousing area.

  17. Establishment of the Integrated Plant Data Warehouse

    International Nuclear Information System (INIS)

    Oota, Yoshimi; Yoshinaga, Toshiaki

    1999-01-01

    This paper presents 'The Establishment of the Integrated Plant Data Warehouse and Verification Tests on Inter-corporate Electronic Commerce based on the Data Warehouse (PDWH)', one of the 'Shared Infrastructure for the Electronic Commerce Consolidation Project', promoted by the Ministry of International Trade and Industry (MITI) through the Information-Technology Promotion Agency (IPA), Japan. A study group called Japan Plant EC (PlantEC) was organized to perform relevant activities. One of the main activities of PlantEC involves the construction of the Integrated (including manufacturers, engineering companies, plant construction companies, and machinery and parts manufacturers, etc.) Data Warehouse which is an essential part of the infrastructure necessary for a system to share information on industrial life cycle ranging from planning/designing to operation/maintenance. Another activity is the utilization of this warehouse for the purpose of conducting verification tests to prove its usefulness. Through these verification tests, PlantEC will endeavor to establish a warehouse with standardized data which can be used for the infrastructure of EC in the process plant industry. (author)

  18. Establishment of the Integrated Plant Data Warehouse

    Energy Technology Data Exchange (ETDEWEB)

    Oota, Yoshimi; Yoshinaga, Toshiaki [Hitachi Works, Hitachi Ltd., Hitachi, Ibaraki (Japan)]

    1999-07-01

    This paper presents 'The Establishment of the Integrated Plant Data Warehouse and Verification Tests on Inter-corporate Electronic Commerce based on the Data Warehouse (PDWH)', one of the 'Shared Infrastructure for the Electronic Commerce Consolidation Project', promoted by the Ministry of International Trade and Industry (MITI) through the Information-Technology Promotion Agency (IPA), Japan. A study group called Japan Plant EC (PlantEC) was organized to perform relevant activities. One of the main activities of PlantEC involves the construction of the Integrated (including manufacturers, engineering companies, plant construction companies, and machinery and parts manufacturers, etc.) Data Warehouse which is an essential part of the infrastructure necessary for a system to share information on industrial life cycle ranging from planning/designing to operation/maintenance. Another activity is the utilization of this warehouse for the purpose of conducting verification tests to prove its usefulness. Through these verification tests, PlantEC will endeavor to establish a warehouse with standardized data which can be used for the infrastructure of EC in the process plant industry. (author)

  19. Importance of public warehouse system for financing agribusiness sector

    Directory of Open Access Journals (Sweden)

    Zakić Vladimir

    2014-01-01

    Full Text Available The aim of this study was to determine the economic viability of the use of warehouse receipts for the storage of wheat and corn, based on the analysis of trends in product prices, storage costs in public warehouses and interest rates of loans against warehouse receipts. Agricultural producers are pressed to sell grain at harvest time, when the price of agricultural products is usually lowest, mostly because of their need for financing. Instead of selling products, farmers can store them in public warehouses and use short-term financing by borrowing against warehouse receipts, usually at the lowest interest rate. In the following months, farmers can sell the products at a higher price and repay the short-term loan. This study showed that using public warehouses and postponing the sale of grain after harvest is a profitable strategy for agricultural producers.
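
    The trade-off described in this abstract can be illustrated with a small calculation. The sketch below is only an illustration: the function name, the simple (non-compounded) interest model, and every number in it are hypothetical assumptions, not figures or methods taken from the study.

```python
def net_gain_from_storage(harvest_price, later_price, storage_cost_per_month,
                          annual_loan_rate, loan_share, months):
    """Net gain per tonne from storing grain and selling later.

    Compares selling later (after paying storage and loan interest)
    with selling immediately at the harvest price. All inputs are
    hypothetical illustration values.
    """
    # Loan taken against the warehouse receipt, as a share of harvest value.
    loan = loan_share * harvest_price
    # Simple interest over the storage period (an assumption of this sketch).
    interest = loan * annual_loan_rate * months / 12
    storage = storage_cost_per_month * months
    return (later_price - harvest_price) - storage - interest


if __name__ == "__main__":
    gain = net_gain_from_storage(
        harvest_price=150.0, later_price=180.0,
        storage_cost_per_month=1.5, annual_loan_rate=0.06,
        loan_share=0.7, months=6,
    )
    print(f"Net gain per tonne: {gain:.2f}")
```

    Under these assumed inputs, storage pays off only when the price rise exceeds storage cost plus interest; a positive return value means postponing the sale is the better choice.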

  20. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  1. Building a Data Warehouse step by step

    Directory of Open Access Journals (Sweden)

    2007-01-01

    Full Text Available Data warehouses have been developed to answer the increasing demands for quality information required by the top managers and economic analysts of organizations. Their importance in today's business environment is unanimously recognized, being the foundation for developing business intelligence systems. Data warehouses offer support for the decision-making process, allowing complex analyses which cannot be properly achieved from operational systems. This paper presents the ways in which a data warehouse may be developed and the stages of building it.

  2. The Data Warehouse: Keeping It Simple. MIT Shares Valuable Lessons Learned from a Successful Data Warehouse Implementation.

    Science.gov (United States)

    Thorne, Scott

    2000-01-01

    Explains why the data warehouse is important to the Massachusetts Institute of Technology community, describing its basic functions and technical design points; sharing some non-technical aspects of the school's data warehouse implementation that have proved to be important; examining the importance of proper training in a successful warehouse…

  3. Designing a Data Warehouse for Cyber Crimes

    Directory of Open Access Journals (Sweden)

    Il-Yeol Song

    2006-09-01

    Full Text Available One of the greatest challenges facing modern society is the rising tide of cyber crimes. These crimes, since they rarely fit the model of conventional crimes, are difficult to investigate, hard to analyze, and difficult to prosecute. Collecting data in a unified framework is a mandatory step that will assist the investigator in sorting through the mountains of data. In this paper, we explore designing a dimensional model for a data warehouse that can be used in analyzing cyber crime data. We also present some interesting queries and the types of cyber crime analyses that can be performed based on the data warehouse. We discuss several ways of utilizing the data warehouse using OLAP and data mining technologies. We finally discuss legal issues and data population issues for the data warehouse.

  4. Public Refrigerated Warehouses

    Data.gov (United States)

    Department of Homeland Security — The International Association of Refrigerated Warehouses (IARW) came into existence in 1891 when a number of conventional warehousemen took on the demands of storing...

  5. Protecting privacy in a clinical data warehouse.

    Science.gov (United States)

    Kong, Guilan; Xiao, Zhichun

    2015-06-01

    Peking University has several prestigious teaching hospitals in China. To make secondary use of massive medical data for research purposes, construction of a clinical data warehouse is imperative in Peking University. However, a big concern for clinical data warehouse construction is how to protect patient privacy. In this project, we propose to use a combination of symmetric block ciphers, asymmetric ciphers, and cryptographic hashing algorithms to protect patient privacy information. The novelty of our privacy protection approach lies in message-level data encryption, the key caching system, and the cryptographic key management system. The proposed privacy protection approach is scalable to clinical data warehouse construction with any size of medical data. With the composite privacy protection approach, the clinical data warehouse can be secure enough to keep the confidential data from leaking to the outside world. © The Author(s) 2014.
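
    As an illustration of how cryptographic hashing can hide patient identifiers while still allowing records to be joined, here is a minimal sketch. It assumes a keyed hash (HMAC-SHA256) from the Python standard library; that choice, the function name, and the example key are assumptions of this sketch, and the abstract does not state which algorithms or key management scheme the authors used.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a patient identifier.

    The same (identifier, key) pair always yields the same pseudonym,
    so records can still be linked inside the warehouse, while the
    original identifier cannot be read back from the pseudonym.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    key = b"hypothetical-secret-key"  # in practice, kept in a key management system
    print(pseudonymize("patient-0001", key))
```

    Using a keyed hash rather than a plain hash means that someone without the key cannot rebuild the mapping by hashing guessed identifiers.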

  6. Managing dual warehouses with an incentive policy for deteriorating items

    Science.gov (United States)

    Yu, Jonas C. P.; Wang, Kung-Jeng; Lin, Yu-Siang

    2016-02-01

    Distributors in a supply chain usually limit the capacity of their own warehouse to reduce costs, and excess stock is held in a rented warehouse. In this study, we examine inventory control for deteriorating items in a two-warehouse setting. Assuming that there is an incentive offered by a rented warehouse that allows the rental fee to decrease over time, the objective of this study is to maximise the joint profit of the manufacturer and the distributor. An optimisation procedure is developed to derive the optimal joint economic lot size policy. Several criteria are identified to select the most appropriate warehouse configuration and inventory policy on the basis of storage duration of materials in a rented warehouse. Sensitivity analysis is performed to examine the robustness of the model. The proposed model enables a manufacturer with a channel distributor to coordinate the use of alternative warehouses, and to maximise the joint profit of the manufacturer and the distributor.

  7. What Academia Can Gain from Building a Data Warehouse.

    Science.gov (United States)

    Wierschem, David; McMillen, Jeremy; McBroom, Randy

    2003-01-01

    Describes how, when used effectively, data warehouses can be a significant component of strategic decision making on campus. Discusses what a data warehouse is and what its informational contents may include, environmental drivers and obstacles, and strategies to justify developing a data warehouse for an academic institution. (EV)

  8. Subcritical calculation of the nuclear material warehouse

    International Nuclear Information System (INIS)

    Garcia M, T.; Mazon R, R.

    2009-01-01

    In this work the subcritical calculation of the nuclear material warehouse of the TRIGA Mark III reactor labyrinth at the Mexico Nuclear Center is presented. During the adaptation of the nuclear warehouse (vault I), the fuel was temporarily moved to the warehouse (vault II), and the subcritical calculation was also carried out for this temporary arrangement. The code used to calculate the effective multiplication factor was the Monte Carlo N-Particle Extended code, known as MCNPX, developed by Los Alamos National Laboratory for particle transport. (Author)

  9. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  10. Integrating Brazilian health information systems in order to support the building of data warehouses

    Directory of Open Access Journals (Sweden)

    Sergio Miranda Freire

    Full Text Available Abstract Introduction: This paper's aim is to develop a data warehouse from the integration of the files of three Brazilian health information systems concerned with the production of ambulatory and hospital procedures for cancer care, and cancer mortality. These systems do not have a unique patient identification, which makes their integration difficult even within a single system. Methods: Data from the Brazilian Public Hospital Information System (SIH-SUS), the Oncology Module for the Outpatient Information System (APAC-ONCO) and the Mortality Information System (SIM) for the State of Rio de Janeiro, in the period from January 2000 to December 2004 were used. Each of the systems has the monthly data production compiled in dBase files (dbf). All the files pertaining to the same system were then read into a corresponding table in a MySQL Server 5.1. The SIH-SUS and APAC-ONCO tables were linked internally and with one another through record linkage methods. The APAC-ONCO table was linked to the SIM table. Afterwards a data warehouse was built using Pentaho and the MySQL database management system. Results: The sensitivities and specificities of the linkage processes were above 95% and close to 100% respectively. The data warehouse provided several analytical views that are accessed through the Pentaho Schema Workbench. Conclusion: This study presented a proposal for the integration of Brazilian Health Systems to support the building of data warehouses and provide information beyond those currently available with the individual systems.
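
    The sensitivity and specificity reported for a linkage process can be computed from a set of predicted links and a reference set of true links. The sketch below shows one way to do this; the function name, the representation of links as pairs of record identifiers, and the toy data are all hypothetical and are not taken from the study.

```python
def linkage_accuracy(predicted_links, true_links, candidate_pairs):
    """Return (sensitivity, specificity) of a record linkage result.

    Each link is a (record_a, record_b) pair. candidate_pairs is the
    full set of pairs that were compared, including non-matches.
    """
    predicted = set(predicted_links)
    true = set(true_links)
    candidates = set(candidate_pairs)

    tp = len(predicted & true)               # true matches that were linked
    fn = len(true - predicted)               # true matches that were missed
    fp = len(predicted - true)               # non-matches wrongly linked
    tn = len(candidates - predicted - true)  # non-matches correctly left apart

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


if __name__ == "__main__":
    pairs = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
    truth = [("a1", "b1"), ("a2", "b2")]
    linked = [("a1", "b1"), ("a1", "b2")]
    print(linkage_accuracy(linked, truth, pairs))
```

    The sketch assumes there is at least one true match and one true non-match among the candidate pairs; otherwise the divisions are undefined.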

  11. Nigerian Concept Of Bonded Warehouses And Dry Ports | Ndikom ...

    African Journals Online (AJOL)

    The bonded warehouse in Nigeria is a strategic expansion of ordinary warehouses that are usually developed in the ports and related cities, strictly meant for safe-keeping of cargoes for owners before final take-over by consignees after payment of some customs duties. A bonded warehouse and ICDs are seen as ...

  12. Automatic generation of warehouse mediators using an ontology engine

    Energy Technology Data Exchange (ETDEWEB)

    Critchlow, T., LLNL

    1998-04-01

    Data warehouses created for dynamic scientific environments, such as genetics, face significant challenges to their long-term feasibility. One of the most significant of these is the high frequency of schema evolution resulting from both technological advances and scientific insight. Failure to quickly incorporate these modifications will quickly render the warehouse obsolete, yet each evolution requires significant effort to ensure the changes are correctly propagated. DataFoundry utilizes a mediated warehouse architecture with an ontology infrastructure to reduce the maintenance requirements of a warehouse. Among other things, the ontology is used as an information source for automatically generating mediators, the methods that transfer data between the data sources and the warehouse. The identification, definition and representation of the metadata required to perform this task is a primary contribution of this work.

  13. An efficiency improvement in warehouse operation using simulation analysis

    Science.gov (United States)

    Samattapapong, N.

    2017-11-01

    In general, industry requires an efficient system for warehouse operation. There are many important factors that must be considered when designing an efficient warehouse system. The most important is an effective warehouse operation system that can help transfer raw material, reduce costs and support transportation. Given these factors, researchers are interested in studying work systems and warehouse distribution. We start by collecting the important data for storage, such as information on products, size and location, data collection and production, and use all this information to build a simulation model in Flexsim® simulation software. The simulation analysis found that the conveyor belt was a bottleneck in the warehouse operation. Therefore, many scenarios to address that problem were generated and tested through simulation analysis. The results showed that the average queuing time was reduced from 89.8% to 48.7% and the ability to transport the product increased from 10.2% to 50.9%. Thus, it can be stated that this is the best method for increasing efficiency in the warehouse operation.

  14. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  15. HRSA Data Warehouse

    Data.gov (United States)

    U.S. Department of Health & Human Services — The HRSA Data Warehouse is the go-to source for data, maps, reports, locators, and dashboards on HRSA's public health programs. This website provides a wide variety...

  16. Database Resources of the BIG Data Center in 2018.

    Science.gov (United States)

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  18. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  19. Konsolidasi Data Warehouse untuk Aplikasi Business Intelligence

    Directory of Open Access Journals (Sweden)

    Rudy Rudy

    2012-12-01

    Full Text Available As business competition grows stronger, corporate leaders need complete data as a basis for determining future business strategies. The same holds for the management of company "A", a pharmaceutical company which has three distribution companies. Each distribution company already has a data warehouse to generate its own reports. For business operations and corporate strategy, the chairman of PT "A" requires an integrated report, so that analysis of the data owned by the three distribution companies can be done in one full report to answer the problems faced by management. Thus, data warehouse consolidation can be used as a solution for company "A". The methodology starts with analysis of the information needs to be displayed in the business intelligence application, followed by data warehouse consolidation, ETL (extract, transform and load), data warehousing, OLAP and dashboard. Using data warehouse consolidation, information access by the management of company "A" can be done in a single presentation, which can display data comparison between the three distribution companies.

  20. Analisis Dan Perancangan Data Warehouse Pada PT Gajah Tunggal Prakarsa

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2010-12-01

    Full Text Available The purpose of this study was to analyze the database support in helping decision making, identifying needs and designing a data warehouse. With the support of a data warehouse, company leaders can be better supported in making decisions more quickly and precisely. Research methodology includes analysis of current systems, library research, and designing a data warehouse using star schema. The result of this research is the availability of a data warehouse that can generate information quickly and precisely, thus helping the company in making decisions. The conclusion of this research is that the application of a data warehouse can be a supporting medium for related parties at PT. Gajah Tunggal Prakarsa in decision making.
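
    A star schema places one fact table of measurements at the center and joins it to small dimension tables that describe each measurement. The following sketch builds a tiny in-memory example with Python's standard `sqlite3` module. The table names, columns, and rows are hypothetical and are not taken from the paper's design.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory star schema with two dimensions and one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            quantity INTEGER,
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Product A"), (2, "Product B")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2010, 1), (2, 2010, 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 10, 500.0), (1, 2, 5, 250.0), (2, 1, 3, 300.0)])
    return conn


def sales_by_product(conn):
    """Aggregate the fact table along the product dimension."""
    return conn.execute("""
        SELECT p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()


if __name__ == "__main__":
    print(sales_by_product(build_demo_warehouse()))
```

    Reports along another dimension, such as month, follow the same pattern: join the fact table to that dimension and group by one of its columns.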

  1. 7 CFR 1421.106 - Warehouse-stored marketing assistance loan collateral.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Warehouse-stored marketing assistance loan collateral... Marketing Assistance Loans § 1421.106 Warehouse-stored marketing assistance loan collateral. (a) A commodity may be pledged as collateral for a warehouse-stored marketing assistance loan in the quantity...

  2. Automated Data Aggregation for Time-Series Analysis: Study Case on Anaesthesia Data Warehouse.

    Science.gov (United States)

    Lamer, Antoine; Jeanne, Mathieu; Ficheur, Grégoire; Marcilly, Romaric

    2016-01-01

    Data stored in operational databases are not reusable directly. Aggregation modules are necessary to facilitate secondary use. They decrease the volume of data while increasing the amount of available information. In this paper, we present four automated engines of aggregation, integrated into an anaesthesia data warehouse. Four instances of clinical questions illustrate the use of those engines for various improvements of quality of care: duration of procedure, drug administration, assessment of hypotension and its related treatment.
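
    To illustrate what aggregating a time series into a single reusable value can look like, the sketch below computes the total time a signal spends below a threshold, for example a blood pressure reading during a procedure. The function name, the sample-and-hold assumption, the threshold, and the data are hypothetical; they do not describe the engines presented in the paper.

```python
def time_below_threshold(samples, threshold):
    """Total duration for which the signal is below the threshold.

    samples is a list of (time_in_seconds, value) tuples sorted by time.
    Each value is assumed to hold until the next sample; the last
    sample therefore contributes no duration.
    """
    total = 0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        if v0 < threshold:
            total += t1 - t0
    return total


if __name__ == "__main__":
    # Hypothetical mean arterial pressure readings, one per minute.
    readings = [(0, 80), (60, 60), (120, 62), (180, 70), (240, 64), (300, 75)]
    print(time_below_threshold(readings, threshold=65))
```

    The result collapses many raw measurements into one figure per procedure, which is the kind of value that can be stored in a warehouse and compared across cases.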

  3. Simple re-instantiation of small databases using cloud computing.

    Science.gov (United States)

    Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

    2013-01-01

    Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

  4. The importance of data warehouses for physician executives.

    Science.gov (United States)

    Ruffin, M

    1994-11-01

    Soon, most physicians will begin to learn about data warehouses and clinical and financial data about their patients stored in them. What is a data warehouse? Why are we seeing their emergence in health care only now? How does a hospital, or group practice, or health plan acquire or create a data warehouse? Who should be responsible for it, and what sort of training is needed by those in charge of using it for the edification of the sponsoring organization? I'll try to answer these questions in this article.

  5. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Science.gov (United States)

    Fristensky, Brian

    2007-01-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere. PMID:17291351

  6. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  7. Data warehouse til elbilers opladning og elpriser [A data warehouse for electric vehicle charging and electricity prices]

    DEFF Research Database (Denmark)

    Andersen, Ove; Krogh, Benjamin Bjerre; Torp, Kristian

    This report presents how GPS and CAN bus measurements from the charging of electric vehicles have been cleaned of typical errors and stored in a data warehouse. In the data warehouse, the GPS and CAN bus measurements are integrated with the prices from the Northern European electricity spot market, Nord Pool Spot. This integration makes possible...... measurements of electric vehicle charging have been loaded, together with the prices from the electricity spot market, into a data warehouse that is fully implemented. The logical data model for this data warehouse is presented in detail. The handling of the GPS and CAN bus measurements is generic and can be extended to new data sources...

  8. Expanding Post-Harvest Finance Through Warehouse Receipts and Related Instruments

    OpenAIRE

    Baldwin, Marisa; Bryla, Erin; Langenbucher, Anja

    2006-01-01

    Warehouse receipt financing and similar types of collateralized lending provide an alternative to traditional lending requirements of banks and other financiers and could provide opportunities to expand this lending in emerging economies for agricultural trade. The main contents include: what is warehouse receipt financing; what is the value of warehouse receipt financing; other collater...

  9. Warehouse Order-Picking Process. Review

    Directory of Open Access Journals (Sweden)

    E. V. Korobkov

    2015-01-01

    Full Text Available This article describes basic warehousing activities, namely movement, information storage and transfer, as well as the connections between typical warehouse operations (reception, transfer, storage position assignment and put-away, order-picking, hoarding and sorting, cross-docking, shipping). It presents a classification of warehouse order-picking systems in terms of the manual labor involved, as well as the external factors (marketing channels, consumers' demand structure, suppliers' replenishment structure and inventory level, total production demand, economic situation) and internal factors (mechanization level, information accessibility, warehouse dimensionality, method of dispatch for shipping, zoning, batching, storage assignment method, routing method) that affect the complexity of system design. Basic optimization considerations are described. The literature is reviewed for the following sub-problems of planning and control of order-picking processes. The layout design problem is considered at two levels: external (the facility layout problem) and internal (the aisle configuration problem). For the problem of distributing goods or stock keeping units, the following methods are emphasized: random, nearest open storage position, and dedicated (COI-based, frequency-based) distribution, as well as class-based and family-grouped (complementary- and contact-based) distribution. The batching problem can be solved by two main methods: proximity order batching (seed and savings algorithms) and time-window order batching. There are two strategies for the zoning problem, progressive and synchronized, and also a special case of zoning, the bucket brigades method. The hoarding/sorting problem is briefly reviewed. The order-picking routing problem will be described thoroughly in the next article of the cycle “Warehouse order-picking process”.
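
    The dedicated, COI-based distribution mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the article's own procedure: it assumes the usual definition of the cube-per-order index (COI) as required storage space divided by pick frequency, it gives each SKU exactly one location, and the SKU names, numbers and location labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sku:
    name: str
    space_required: float   # storage space the SKU needs (hypothetical units)
    picks_per_period: int   # how often the SKU is picked (hypothetical)


def coi(sku: Sku) -> float:
    """Cube-per-order index: required space divided by pick frequency."""
    return sku.space_required / sku.picks_per_period


def assign_dedicated(skus, locations_nearest_first):
    """Give the SKU with the lowest COI the location nearest the depot."""
    ranked = sorted(skus, key=coi)
    return {sku.name: loc for sku, loc in zip(ranked, locations_nearest_first)}


# Hypothetical data: three SKUs and three locations sorted nearest-first.
skus = [Sku("bolts", 2.0, 40), Sku("pallet wrap", 6.0, 10), Sku("gloves", 1.0, 50)]
locations = ["A1", "A2", "B1"]
assignment = assign_dedicated(skus, locations)
print(assignment)
```

    Sorting by COI places small, frequently picked items closest to the depot, which is the intuition behind this family of storage assignment rules.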

  10. Minimizing Warehouse Space through Inventory Reduction at Reckitt Benckiser

    OpenAIRE

    KILINC, IZGI SELEN

    2009-01-01

    This dissertation represents a ten week internship at pharmaceutical plant of Reckitt Benckiser for the Warehouse Stock Reduction Project. Due to foreseeable growth by the factory, there is increasing pressure to utilise existing warehouse space by reducing the existing stock level by 50 %. Therefore, this study aims to identify the opportunities to reduce the physical stock held in raw/pack materials in the warehouse and save space for additional manufacturing resources. The analysis demo...

  11. Decision method for optimal selection of warehouse material handling strategies by production companies

    Science.gov (United States)

    Dobos, P.; Tamás, P.; Illés, B.

    2016-11-01

    Adequate establishment and operation of warehouse logistics significantly determines a company's competitiveness, because it greatly affects the quality and the selling price of the goods that production companies produce. Implementing and managing an adequate warehouse system requires an appropriate warehouse location, stock management model, warehouse technology, a motivated workforce committed to process improvement, and a material handling strategy. In practice, companies have paid little attention to selecting the warehouse strategy properly, although it has a major influence on production in the case of a material warehouse and on smooth customer service in the case of a finished goods warehouse, since a poor choice can cause large losses in material handling. Because the production structure changes dynamically, warehouse activities need frequent reorganization, yet the majority of companies essentially do not react to this need. This work presents the framework of a simulation test system for selecting a suitable warehouse material handling strategy, together with the decision method for the selection.

  12. Design and Applications of a Multimodality Image Data Warehouse Framework

    Science.gov (United States)

    Wong, Stephen T.C.; Hoo, Kent Soo; Knowlton, Robert C.; Laxer, Kenneth D.; Cao, Xinhau; Hawkins, Randall A.; Dillon, William P.; Arenson, Ronald L.

    2002-01-01

    A comprehensive data warehouse framework is needed, which encompasses imaging and non-imaging information in supporting disease management and research. The authors propose such a framework, describe general design principles and system architecture, and illustrate a multimodality neuroimaging data warehouse system implemented for clinical epilepsy research. The data warehouse system is built on top of a picture archiving and communication system (PACS) environment and applies an iterative object-oriented analysis and design (OOAD) approach and recognized data interface and design standards. The implementation is based on a Java CORBA (Common Object Request Broker Architecture) and Web-based architecture that separates the graphical user interface presentation, data warehouse business services, data staging area, and backend source systems into distinct software layers. To illustrate the practicality of the data warehouse system, the authors describe two distinct biomedical applications—namely, clinical diagnostic workup of multimodality neuroimaging cases and research data analysis and decision threshold on seizure foci lateralization. The image data warehouse framework can be modified and generalized for new application domains. PMID:11971885

  13. Multidimensi Pada Data Warehouse Dengan Menggunakan Rumus Kombinasi [Multidimensionality in Data Warehouses Using the Combination Formula]

    OpenAIRE

    Hendric, Spits Warnars Harco Leslie

    2006-01-01

    Multidimensionality is compulsory in a data warehouse and is the most important aspect of information delivery; without it, a data warehouse is incomplete. Multidimensionality gives the ability to analyze business measurements in many different ways. It is also synonymous with online analytical processing (OLAP).

  14. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  15. A Multidimensional Data Warehouse for Community Health Centers.

    Science.gov (United States)

    Kunjan, Kislaya; Toscos, Tammy; Turkcan, Ayten; Doebbeling, Brad N

    2015-01-01

    Community health centers (CHCs) play a pivotal role in healthcare delivery to vulnerable populations, but have not yet benefited from a data warehouse that can support improvements in clinical and financial outcomes across the practice. We have developed a multidimensional clinic data warehouse (CDW) by working with 7 CHCs across the state of Indiana and integrating their operational, financial and electronic patient records to support ongoing delivery of care. We describe in detail the rationale for the project, the data architecture employed, the content of the data warehouse, along with a description of the challenges experienced and strategies used in the development of this repository that may help other researchers, managers and leaders in health informatics. The resulting multidimensional data warehouse is highly practical and is designed to provide a foundation for wide-ranging healthcare data analytics over time and across the community health research enterprise.

  16. Chronic Condition Data Warehouse

    Data.gov (United States)

    U.S. Department of Health & Human Services — The CMS Chronic Condition Data Warehouse (CCW) provides researchers with Medicare and Medicaid beneficiary, claims, and assessment data linked by beneficiary across...

  17. Combining information from a clinical data warehouse and a pharmaceutical database to generate a framework to detect comorbidities in electronic health records.

    Science.gov (United States)

    Sylvestre, Emmanuelle; Bouzillé, Guillaume; Chazard, Emmanuel; His-Mahier, Cécil; Riou, Christine; Cuggia, Marc

    2018-01-24

    Medical coding is used for a variety of activities, from observational studies to hospital billing. However, comorbidities tend to be under-reported by medical coders. The aim of this study was to develop an algorithm to detect comorbidities in electronic health records (EHR) by using a clinical data warehouse (CDW) and a knowledge database. We enriched the Theriaque pharmaceutical database with the French national Comorbidities List to identify drugs associated with at least one major comorbid condition and diagnoses associated with a drug indication. Then, we compared the drug indications in the Theriaque database with the ICD-10 billing codes in EHR to detect potentially missing comorbidities based on drug prescriptions. Finally, we improved comorbidity detection by matching drug prescriptions and laboratory test results. We tested the obtained algorithm by using two retrospective datasets extracted from the Rennes University Hospital (RUH) CDW. The first dataset included all adult patients hospitalized in the ear, nose, throat (ENT) surgical ward between October and December 2014 (ENT dataset). The second included all adult patients hospitalized at RUH between January and February 2015 (general dataset). We reviewed medical records to find written evidence of the suggested comorbidities in current or past stays. Among the 22,132 Common Units of Dispensation (CUD) codes present in the Theriaque database, 19,970 drugs (90.2%) were associated with one or several ICD-10 diagnoses, based on their indication, and 11,162 (50.4%) with at least one of the 4878 comorbidities from the comorbidity list. Among the 122 patients of the ENT dataset, 75.4% had at least one drug prescription without corresponding ICD-10 code. The comorbidity diagnoses suggested by the algorithm were confirmed in 44.6% of the cases. Among the 4312 patients of the general dataset, 68.4% had at least one drug prescription without corresponding ICD-10 code. 
The comorbidity diagnoses suggested by the
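
    The comparison step described in this abstract (drug indications versus billed ICD-10 codes) can be sketched as below. This is only an illustration under stated assumptions, not the authors' algorithm: the drug-to-indication table is a hypothetical stand-in for the enriched Theriaque data, matching is done on three-character ICD-10 categories, and the laboratory-result refinement is left out.

```python
# Hypothetical knowledge base: drug name -> ICD-10 categories of its indications.
DRUG_INDICATIONS = {
    "metformin": {"E11"},         # type 2 diabetes mellitus
    "levothyroxine": {"E03"},     # hypothyroidism
    "salbutamol": {"J44", "J45"}, # COPD, asthma
}


def suggest_missing_comorbidities(prescriptions, billed_codes):
    """Return drugs whose indications match none of the billed ICD-10 codes.

    Each returned drug maps to the candidate diagnoses a coder could review.
    """
    suggestions = {}
    for drug in prescriptions:
        indications = DRUG_INDICATIONS.get(drug, set())
        if indications and not (indications & billed_codes):
            suggestions[drug] = sorted(indications)
    return suggestions


# Hypothetical hospital stay: asthma and otitis media were coded, diabetes was not.
prescriptions = ["metformin", "salbutamol"]
billed_codes = {"J45", "H66"}
result = suggest_missing_comorbidities(prescriptions, billed_codes)
print(result)
```

    The output is a list of suggestions for review, which matches the abstract's point that suggested diagnoses still had to be confirmed against the medical record.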

  18. CHOmine: an integrated data warehouse for CHO systems biology and modeling.

    Science.gov (United States)

    Gerstl, Matthias P; Hanscho, Michael; Ruckerbauer, David E; Zanghellini, Jürgen; Borth, Nicole

    2017-01-01

    The last decade has seen a surge in published genome-scale information for Chinese hamster ovary (CHO) cells, which are the main production vehicles for therapeutic proteins. While a single access point is available at www.CHOgenome.org, the primary data is distributed over several databases at different institutions. Currently, research is frequently hampered by a plethora of gene names and IDs that vary between published draft genomes and databases, making systems biology analyses cumbersome and elaborate. Here we present CHOmine, an integrative data warehouse connecting data from various databases and links to other ones. Furthermore, we introduce CHOmodel, a web-based resource that provides access to recently published CHO cell line specific metabolic reconstructions. Both resources allow users to query CHO-relevant data and find interconnections between different types of data, and thus provide a simple, standardized entry point to the world of CHO systems biology. http://www.chogenome.org. © The Author(s) 2017. Published by Oxford University Press.

  19. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  20. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However,

  1. Operational management system for warehouse logistics of metal trading companies

    Directory of Open Access Journals (Sweden)

    Khayrullin Rustam Zinnatullovich

    2014-07-01

    Full Text Available Logistics is an effective tool in business management. The metal trading business is part of the chain that moves metal from producer to consumer; it serves as a link connecting the interests of steel producers and end users. The specifics of warehouse trading must be taken into account. Warehouse metal trading is distinctive primarily in that purchases are made in large lots, while sales are made in medium and small lots. Cars and trucks are loaded and unloaded by overhead cranes. Some of the purchased goods are shipped in relatively large lots without presale preparation, while the rest undergo presale preparation. Indoor and outdoor warehouses with an address storage system are used. During prolonged storage the metal rusts. Some of the goods undergo final processing (cutting, welding, painting) in service centers and small factories, usually located at the warehouse. The number of cars shipped simultaneously and the number of workers in a loading crew can each reach a few dozen. It is therefore necessary to supervise the loading workers, to coordinate and monitor loading and unloading operations, to analyze their work daily, and to evaluate warehouse operations as a whole. The movement of cars and trucks on the warehouse territory also needs to be managed and controlled in order to reduce storage and transport costs and improve customer service. Widely used ERP and WMS systems do not fully cover the functions and processes of warehouse trading and do not manage all logistics processes effectively. This paper proposes specialized software intended for operational logistics management in warehouse trading of metal products. The basic functions and processes of metal warehouse trading are described. 
The effectiveness indices for logistics processes and key performance indicators of warehouse trading are proposed

  2. Implementasi Data Warehouse pada Bagian Pemasaran Perguruan Tinggi [Data Warehouse Implementation in the Marketing Division of Higher Education Institutions]

    Directory of Open Access Journals (Sweden)

    Eka Miranda

    2012-06-01

    Full Text Available Higher education institutions hold large amounts of transactional data, but these data are not yet used to their full potential in supporting decision making. Higher education institutions therefore need analysis tools to improve their decision-making processes. To address this issue, a data warehouse design was created to: (1) store large amounts of data; (2) potentially gain new perspectives on distributed data; (3) provide reports and answers to users’ ad hoc questions; (4) perform data analysis of external conditions and transactional data from the marketing activities of universities, since marketing is a supporting field as well as the cutting edge of higher education institutions. The methods used to design and implement the data warehouse are analysis of records related to the marketing activities of higher education institutions and data warehouse design. The study results in a data warehouse design and its implementation for analyzing the external data and transactional data from university marketing activities to support decision making.

  3. Warehouse receipts functioning to reduce market risk

    Directory of Open Access Journals (Sweden)

    Jovičić Daliborka

    2014-01-01

    Full Text Available Cereal production is exposed to market risk to a great extent because of its elastic demand. Grain prices move cyclically and decline significantly in harvest periods as a result of insufficient supply and high demand. The specific nature of agricultural production means that farmers are forced to sell their products under unfavorable conditions in order to resume production. The Public Warehouses System allows farmers who were previously unable to use bank loans to finance the continuation of their production to acquire the necessary funds efficiently, with warehouse receipts serving as collateral. Based on the results obtained by applying statistical methods (variance and standard deviation as measures of market risk), under the assumption that warehouse receipt prices will approximately follow the overall consumer price index, it can be concluded that warehouse receipt trading will have a significant impact on risk reduction in cereal production. Positive effects can appear as price stabilization, reduced cyclic movements in the production of basic grains and, ultimately, improved food security for the country.
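
    The abstract names variance and standard deviation as its measures of market risk. A minimal sketch of that calculation is given below; the two price series are hypothetical and are not taken from the study, and the population forms of both statistics are an assumption made for illustration.

```python
from statistics import pstdev, pvariance


def market_risk(prices):
    """Return (variance, standard deviation) of a price series."""
    return pvariance(prices), pstdev(prices)


# Hypothetical grain prices per tonne over four periods.
cash_market_prices = [100.0, 120.0, 80.0, 100.0]
receipt_backed_prices = [100.0, 105.0, 95.0, 100.0]

cash_var, cash_sd = market_risk(cash_market_prices)
receipt_var, receipt_sd = market_risk(receipt_backed_prices)
print(cash_var, cash_sd)
print(receipt_var, receipt_sd)
```

    A lower variance for the second series would be read as lower price risk, which is the kind of comparison the abstract describes.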

  4. Information Support of Processes in Warehouse Logistics

    Directory of Open Access Journals (Sweden)

    Gordei Kirill

    2013-11-01

    Full Text Available Under globalization and worldwide economic links, the role of information support for business processes is growing in many industries and fields of activity, and warehousing is no exception. Such information support is implemented in warehouse logistics systems. At the level of a territorial administrative entity, the warehouse logistics system takes the form of a complex socio-economic structure that controls the economic flows covering intermediary, trade and transport organizations and enterprises in other industries and spheres. The spatial movement of inventory items places new demands on participants in merchandising. Warehousing, in the sense of storage, is one of the operations of logistics activity required to organize a material flow. Treating warehousing as "management of the spatial movement of stocks" is therefore justified. Understood in this way, warehousing seeks to shed the perception that holding stock is an expensive business. This aim is reflected in logistics systems that work on principles such as "just in time", "lean production" and others. The role of warehouses as places of storage is therefore transformed into an understanding of warehousing as an innovative logistics system.

  5. Harvesting Information from a Library Data Warehouse

    Directory of Open Access Journals (Sweden)

    Siew-Phek T. Su

    2017-09-01

    Full Text Available Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1) This concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2) These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3) Little has been done, however, on the application of this cutting edge technology in the library environment using library data.

  6. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT: The traditional methods for mining foods for bioactive peptides are tedious and long. As in the drug industry, identifying and delivering a commercial health ingredient that reduces disease symptoms can take anywhere between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology and efficient genome mining, has been emerging as the long-awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives. The absence of food-specific bioinformatic mining tools, the slow integration of experimental mining and bioinformatics, and the important differences between experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and, more specifically, in bioactive peptide discovery. In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, thereby achieving a higher success rate.

  7. 19 CFR 19.1 - Classes of customs warehouses.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Classes of customs warehouses. 19.1 Section 19.1 Customs Duties U.S. CUSTOMS AND BORDER PROTECTION, DEPARTMENT OF HOMELAND SECURITY; DEPARTMENT OF THE TREASURY CUSTOMS WAREHOUSES, CONTAINER STATIONS AND CONTROL OF MERCHANDISE THEREIN § 19.1 Classes of...

  8. A study on building data warehouse of hospital information system.

    Science.gov (United States)

    Li, Ping; Wu, Tao; Chen, Mu; Zhou, Bin; Xu, Wei-guo

    2011-08-01

    Existing hospital information systems with simple statistical functions cannot meet current management needs. It is well known that hospital resources are distributed with private property rights among hospitals, such as in the case of the regional coordination of medical services. In this study, to integrate and make full use of medical data effectively, we propose a data warehouse modeling method for the hospital information system. The method can also be employed for a distributed-hospital medical service system. To ensure that hospital information supports the diverse needs of health care, the framework of the hospital information system has three layers: datacenter layer, system-function layer, and user-interface layer. This paper discusses the role of a data warehouse management system in handling hospital information from the establishment of the data theme to the design of a data model to the establishment of a data warehouse. Online analytical processing tools assist user-friendly multidimensional analysis from a number of different angles to extract the required data and information. Use of the data warehouse improves online analytical processing and mitigates deficiencies in the decision support system. The hospital information system based on a data warehouse effectively employs statistical analysis and data mining technology to handle massive quantities of historical data, and summarizes from clinical and hospital information for decision making. This paper proposes the use of a data warehouse for a hospital information system, specifically a data warehouse for the theme of hospital information to determine latitude, modeling and so on. The processing of patient information is given as an example that demonstrates the usefulness of this method in the case of hospital information management. 
Data warehouse technology is an evolving technology, and more and more decision support information extracted by data mining and with decision-making technology is

  9. Warehouse Performance Improvement at Linfox Logistics Indonesia

    OpenAIRE

    Pratama, Riyan Galuh; Simatupang, Togar M

    2013-01-01

    The objective of this research is to provide alternative solutions for Linfox Logistics Indonesia (LLI) in facing warehouse performance issues. The main warehouse performance indicators called Customer Case Filling on Time (CCFOT) and Case Picking Productivity failed to achieve the target. Several analyses were carried out regarding current dispatch process, value stream mapping, and root causes identification. The results find that much waste occurred in dispatch process. Proposed improvemen...

  10. The eBioKit, a stand-alone educational platform for bioinformatics.

    Science.gov (United States)

    Hernández-de-Diego, Rafael; de Villiers, Etienne P; Klingström, Tomas; Gourlé, Hadrien; Conesa, Ana; Bongcam-Rudloff, Erik

    2017-09-01

    Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

  11. Column-Oriented Databases, an Alternative for Analytical Environment

    Directory of Open Access Journals (Sweden)

    Gheorghe MATEI

    2010-12-01

    It is widely accepted that a data warehouse is the central component of a Business Intelligence system. It stores all data relevant to the company, acquired from both internal and external sources. Such a repository holds data spanning more years than a transactional system can, and offers its users valuable information for making the best decisions based on accurate and reliable data. As the volume of data stored in an enterprise data warehouse grows, new approaches are needed to make the analytical system more efficient. This paper presents column-oriented databases, which are considered an element of the new generation of DBMS technology. The paper emphasizes the need for and the advantages of these databases in an analytical environment and briefly presents two DBMSs built on a columnar approach.
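
    The advantage of a columnar layout for analytical queries can be illustrated with a toy sketch. It is not taken from the paper: the table, its field names (`order_id`, `region`, `amount`) and the helper `to_columns` are hypothetical, and real column stores add compression and vectorized execution that are not modeled here.

```python
# Toy comparison of row-oriented and column-oriented layouts (illustrative only).
# The records and field names below are invented for this sketch.
rows = [
    {"order_id": 1, "region": "north", "amount": 120.0},
    {"order_id": 2, "region": "south", "amount": 80.0},
    {"order_id": 3, "region": "north", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited even though only one field is needed.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the aggregate reads a single list and ignores the other columns.
total_from_columns = sum(columns["amount"])
```

    Both totals are equal; the difference is that the columnar version touches only the `amount` values, which is the access pattern typical of analytical aggregates.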

  12. Transaction Outlier Detection Using OLAP Visualization in a Private University Data Warehouse

    Directory of Open Access Journals (Sweden)

    Gusti Ngurah Mega Nata

    2016-07-01

    Detecting outliers in a data warehouse is important. Data in a data warehouse is already aggregated and follows a multidimensional model. Aggregation is performed because the data warehouse is used by top-level management to analyze data quickly, while the multidimensional data model is used to view data from the various dimensions of business objects. Detecting outliers in a data warehouse therefore requires techniques that can identify outliers in aggregated data and view them from various dimensions of business objects, which makes it a new challenge. Meanwhile, On-line Analytical Processing (OLAP) visualization is an important task for presenting trend information (reports) from a data warehouse in the form of data visualization. In this study, OLAP visualization is used to detect transaction outliers. The OLAP operation used is drill-down. The visualization types used are one-dimensional, two-dimensional and multidimensional visualizations produced with the Weave Desktop tool. The data warehouse was built bottom-up. The case study was conducted at a private university, and the case addressed was detecting outliers in student payment transactions in each semester. Outlier detection in data visualizations that use one dimension table is easier to analyze than outlier detection in visualizations that use two or more dimension tables. In other words, the more dimension tables are involved, the harder the outlier detection analysis becomes. Keywords: Outlier Detection, OLAP Visualization, Data Warehouse
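
    The drill-down idea can be sketched in a few lines. This is only an illustration under stated assumptions: the study inspects outliers visually, whereas the sketch substitutes a simple numeric rule (distance from the median in units of the median absolute deviation), and the payment records, program names and the helpers `roll_up` and `flag_outliers` are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical payment facts: (semester, study program, amount).
payments = [
    ("2015-1", "CS", 100.0), ("2015-1", "Law", 100.0), ("2015-1", "Med", 100.0),
    ("2015-2", "CS", 105.0), ("2015-2", "Law", 100.0), ("2015-2", "Med", 105.0),
    ("2016-1", "CS", 100.0), ("2016-1", "Law", 105.0), ("2016-1", "Med", 695.0),
    ("2016-2", "CS", 100.0), ("2016-2", "Law", 105.0), ("2016-2", "Med", 100.0),
]


def roll_up(records, key_index):
    """Aggregate amounts along one dimension (0 = semester, 1 = program)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)


def flag_outliers(totals, threshold=3.0):
    """Flag keys whose total lies far from the median, measured in MAD units."""
    values = list(totals.values())
    center = median(values)
    mad = median([abs(value - center) for value in values])
    if mad == 0:
        return []
    return [key for key, value in totals.items()
            if abs(value - center) / mad > threshold]


# Top level: one aggregate per semester.
suspect_semesters = flag_outliers(roll_up(payments, 0))

# Drill-down: re-aggregate only the suspect semester along the program dimension.
drill_down = {
    semester: flag_outliers(
        roll_up([r for r in payments if r[0] == semester], 1))
    for semester in suspect_semesters
}
```

    With this invented data the semester level flags `2016-1`, and drilling down into that semester isolates the `Med` program as the source of the unusual total.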

  13. E-MSD: the European Bioinformatics Institute Macromolecular Structure Database.

    Science.gov (United States)

    Boutselakis, H; Dimitropoulos, D; Fillon, J; Golovin, A; Henrick, K; Hussain, A; Ionides, J; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Oldfield, T; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, J; Tagari, M; Tate, J; Tromm, S; Velankar, S; Vranken, W

    2003-01-01

    The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.

  14. Data warehouse based decision support system in nuclear power plants

    International Nuclear Information System (INIS)

    Nadinic, B.

    2004-01-01

    Safety is an important element in business decision making processes in nuclear power plants. Information about component reliability, structures and systems, data recorded during the nuclear power plant's operation and outage periods, as well as experiences from other power plants are located in different database systems throughout the power plant. It would be possible to create a decision support system which would collect data, transform it into a standardized form and store it in a single location in a format more suitable for analyses and knowledge discovery. This single location where the data would be stored would be a data warehouse. Such a data warehouse based decision support system could help make decision making processes more efficient by providing more information about business processes and predicting possible consequences of different decisions. The two main functionalities in this decision support system would be an OLAP (On Line Analytical Processing) system and a data mining system. An OLAP system would enable users to perform fast, simple and efficient multidimensional analysis of existing data and identify trends. Data mining techniques and algorithms would help discover new, previously unknown information from the data as well as hidden dependencies between various parameters. Data mining would also enable analysts to create relevant prediction models that could predict the behaviour of different systems during operation and inspection results during outages. The basic characteristics and theoretical foundations of such a decision support system are described and the reasons for choosing a data warehouse as the underlying structure are explained. The article analyzes the obvious business benefits of such a system as well as potential uses of OLAP and data mining technologies. Possible implementation methodologies and problems that may arise, especially in the field of data integration, are discussed and analyzed.(author)

  15. The Androgen Receptor Gene Mutations Database.

    Science.gov (United States)

    Gottlieb, B; Lehvaslaiho, H; Beitel, L K; Lumbroso, R; Pinsky, L; Trifiro, M

    1998-01-01

    The current version of the androgen receptor (AR) gene mutations database is described. The total number of reported mutations has risen from 272 to 309 in the past year. We have expanded the database: (i) by giving each entry an accession number; (ii) by adding information on the length of polymorphic polyglutamine (polyGln) and polyglycine (polyGly) tracts in exon 1; (iii) by adding information on large gene deletions; (iv) by providing a direct link with a completely searchable database (courtesy EMBL-European Bioinformatics Institute). The addition of the exon 1 polymorphisms is discussed in light of their possible relevance as markers for predisposition to prostate or breast cancer. The database is also available on the internet (http://www.mcgill.ca/androgendb/), from EMBL-European Bioinformatics Institute (ftp.ebi.ac.uk/pub/databases/androgen), or as a Macintosh FilemakerPro or Word file (MC33@musica.mcgill.ca).

  16. Logistics Cost Calculation of Implementation Warehouse Management System: A Case Study

    Directory of Open Access Journals (Sweden)

    Kučera Tomáš

    2017-01-01

    A warehouse management system can take full advantage of resources and provide efficient warehousing services. The paper aims to show the advantages and disadvantages of a warehouse management system in a chosen enterprise that focuses on logistics services and transportation. The paper offers an innovative approach to warehousing and shows how a logistics enterprise can reduce logistics costs. This approach covers the costs of establishment and operation, and the savings, in the overall assessment of implementing the warehouse management system. The warehouse management system is demonstrated through a case study, classified as a qualitative scientific method, in the chosen logistics enterprise. The paper is based on research of the world literature, analyses of internal logistics processes and data, and enterprise documents. The paper identifies costs related to personnel, handling equipment and material identification. Implementing the warehouse management system will reduce the overall logistics costs of warehousing and extend the warehouse management system to other parts of the logistics chain.

  17. 27 CFR 46.236 - Articles in a warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 2 2010-04-01 2010-04-01 false Articles in a warehouse... Tubes Held for Sale on April 1, 2009 Filing Requirements § 46.236 Articles in a warehouse. (a) Articles... articles will be offered for sale. (b) Articles offered for sale at several locations must be reported on a...

  18. Demand Response Opportunities in Industrial Refrigerated Warehouses in California

    Energy Technology Data Exchange (ETDEWEB)

    Goli, Sasank; McKane, Aimee; Olsen, Daniel

    2011-06-14

    Industrial refrigerated warehouses that have implemented energy efficiency measures and have centralized control systems can be excellent candidates for Automated Demand Response (Auto-DR) due to equipment synergies and the receptivity of facility managers to strategies that control energy costs without disrupting facility operations. Auto-DR utilizes the OpenADR protocol for continuous and open communication signals over the internet, allowing facilities to automate their Demand Response (DR). Refrigerated warehouses were selected for research because: they have significant power demand, especially during utility peak periods; most processes are not sensitive to short-term (2-4 hours) lower power, and DR activities are often not disruptive to facility operations; the number of processes is limited and well understood; and past experience with some DR strategies successful in commercial buildings may apply to refrigerated warehouses. This paper presents an overview of the potential for load sheds and shifts from baseline electricity use in response to DR events, along with physical configurations and operating characteristics of refrigerated warehouses. Analysis of data from two case studies and nine facilities in Pacific Gas and Electric territory confirmed the DR abilities inherent to refrigerated warehouses but showed significant variation across facilities. Further, while load from California's refrigerated warehouses in 2008 was 360 MW with an estimated DR potential of 45-90 MW, the DR actually achieved was much less due to low participation. Efforts to overcome barriers to increased participation may include improved marketing and recruitment of potential DR sites, better alignment and emphasis on financial benefits of participation, and use of Auto-DR to increase consistency of participation.

  19. Bioinformatics on the Cloud Computing Platform Azure

    Science.gov (United States)

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk-through in Appendix S1 that demonstrates explicitly how Azure can be used to perform these analyses, and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers, who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  20. Health Claims Data Warehouse (HCDW)

    Data.gov (United States)

    Office of Personnel Management — The Health Claims Data Warehouse (HCDW) will receive and analyze health claims data to support management and administrative purposes. The Federal Employee Health...

  1. THE DEVELOPMENT OF THE APPLICATION OF A DATA WAREHOUSE AT PT JKL

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2012-05-01

    One rapidly evolving technology today is information technology, which can help decision-making in an organization or a company. The data warehouse is one form of information technology that supports those needs, as one of the right solutions for companies in decision-making. The objective of this research is the development of a data warehouse at PT JKL in order to support executives in analyzing the organization and to support the decision-making process. The methodology of this research consists of interviews with related units, a literature study and document examination. This research also used the Nine Step Methodology developed by Kimball to design the data warehouse. The result is an application that can summarize the data warehouse, integrating and presenting historical data in multidimensional form. The conclusion of this research is that the data warehouse can help companies analyze data through flexible, fast and effective data access. Keywords: Data Warehouse; Inventory; Contract Approval; Dashboard

  2. DATA WAREHOUSES SECURITY IMPLEMENTATION

    Directory of Open Access Journals (Sweden)

    Burtescu Emil

    2009-05-01

    Data warehouses were initially implemented and developed by big firms, which used them for solving managerial problems and for making decisions. Later on, because of the economic tendencies and of the technological progress, the data warehou

  3. Evolutionary Multiobjective Query Workload Optimization of Cloud Data Warehouses

    Science.gov (United States)

    Dokeroglu, Tansel; Sert, Seyyit Alper; Cinar, Muhammet Serkan

    2014-01-01

    With the advent of Cloud databases, query optimizers need to find Pareto-optimal solutions in terms of response time and monetary cost. Our novel approach minimizes both objectives by deploying alternative virtual resources and query plans, making use of the virtual resource elasticity of the Cloud. We propose an exact multiobjective branch-and-bound algorithm and a robust multiobjective genetic algorithm for the optimization of distributed data warehouse query workloads on the Cloud. In order to investigate the effectiveness of our approach, we incorporate the devised algorithms into a prototype system. Finally, through several experiments conducted with different workloads and virtual resource configurations, we report notable findings on alternative deployments as well as the advantages and disadvantages of the multiobjective algorithms we propose. PMID:24892048
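
    What "Pareto-optimal" means for the two objectives named here can be shown with a minimal sketch. It is not the branch-and-bound or genetic algorithm of the paper, only a brute-force filter over a handful of candidates; the `Plan` type, the plan names and the time and cost figures are hypothetical.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    """A candidate deployment: a query plan on some virtual resource (hypothetical)."""
    name: str
    response_time_s: float
    cost_usd: float


def dominates(a, b):
    """True if plan a is no worse than b on both objectives and better on one."""
    no_worse = a.response_time_s <= b.response_time_s and a.cost_usd <= b.cost_usd
    better = a.response_time_s < b.response_time_s or a.cost_usd < b.cost_usd
    return no_worse and better


def pareto_front(plans):
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


plans = [
    Plan("small-vm", 40.0, 1.0),
    Plan("medium-vm", 20.0, 2.5),
    Plan("large-vm", 10.0, 6.0),
    Plan("wasteful", 25.0, 3.0),  # slower and more expensive than medium-vm
]
front = pareto_front(plans)
```

    The three plans left in `front` are trade-offs: none can be made faster without paying more. The quadratic filter is adequate for a few candidates; a search space of plans and resource configurations is what motivates dedicated multiobjective algorithms.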

  4. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  5. Selecting materialized views in a data warehouse

    Science.gov (United States)

    Zhou, Lijuan; Liu, Chi; Liu, Daxin

    2003-01-01

    A data warehouse contains many materialized views over the data provided by distributed heterogeneous databases, for the purpose of efficiently implementing decision-support or OLAP queries. It is important to select the right views to materialize in order to answer a given set of queries. In this paper, we design an algorithm that selects a set of views to materialize so as to answer the most queries under a given space constraint. The algorithm aims to find a minimum set of views with which as many user query requests as possible can be answered directly. We use experiments to demonstrate our approach. We implemented the algorithm, and a performance study shows that it offers lower complexity, higher speed and feasible expandability.
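
    The selection problem described here (answer as many queries as possible within a space budget) can be sketched with a simple greedy heuristic. This is not the authors' algorithm, whose details the abstract does not give; the view names, sizes, query identifiers and the function `select_views` are hypothetical, and view sizes are assumed to be positive.

```python
def select_views(views, space_budget):
    """Greedy sketch: repeatedly pick the view that answers the most
    not-yet-covered queries per unit of space, while it still fits.

    views maps a view name to (size, set of query ids the view can answer).
    """
    selected = []
    covered = set()
    remaining = space_budget
    candidates = dict(views)
    while candidates:
        best_name, best_ratio = None, 0.0
        for name, (size, queries) in candidates.items():
            gain = len(queries - covered)
            if size <= remaining and gain > 0 and gain / size > best_ratio:
                best_name, best_ratio = name, gain / size
        if best_name is None:
            break  # nothing useful fits in the remaining space
        size, queries = candidates.pop(best_name)
        selected.append(best_name)
        covered |= queries
        remaining -= size
    return selected, covered


# Hypothetical candidate views: (size in GB, queries answered).
views = {
    "v_sales_by_day": (40, {"q1", "q2", "q3"}),
    "v_sales_by_region": (30, {"q3", "q4"}),
    "v_full_cube": (100, {"q1", "q2", "q3", "q4", "q5"}),
}
selected, covered = select_views(views, space_budget=80)
```

    With an 80 GB budget the full cube does not fit, so the sketch materializes the two smaller views and leaves `q5` unanswered. A greedy choice like this is not guaranteed to be optimal in general.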

  6. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    DEFF Research Database (Denmark)

    Babbitt, Patricia C.; Bagos, Pantelis G.; Bairoch, Amos

    2015-01-01

    During 11–12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from...... protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication...

  7. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  8. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools. This activity is designed to provide secondary and undergraduate biology students with a hands-on activity for exploring and understanding gene structure using basic bioinformatics tools. Students are given an “unknown” sequence and asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  9. A PROPOSAL OF DATA QUALITY FOR DATA WAREHOUSES ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    Leo Willyanto Santoso

    2006-01-01

    The quality of the data provided is critical to the success of data warehousing initiatives. There is strong evidence that many organisations have significant data quality problems, and that these have substantial social and economic impacts. This paper describes a study which explores modeling of the dynamic parts of the data warehouse. This metamodel enables data warehouse management, design and evolution based on a high level conceptual perspective, which can be linked to the actual structural and physical aspects of the data warehouse architecture. Moreover, this metamodel is capable of modeling complex activities, their interrelationships, the relationship of activities with data sources and execution details.

  10. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  11. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  12. Building a data warehouse with examples in SQL server

    CERN Document Server

    Rainardi, Vincent

    2008-01-01

    "Building a Data Warehouse: With Examples in SQL Server" describes how to build a data warehouse completely from scratch and shows practical examples of how to do it. Author Rainardi also describes some practical issues that developers are likely to encounter in their first data warehousing project, along with solutions and advice.

  13. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  14. Quality assessment of digital annotated ECG data from clinical trials by the FDA ECG Warehouse.

    Science.gov (United States)

    Sarapa, Nenad

    2007-09-01

    The FDA mandates that digital electrocardiograms (ECGs) from 'thorough' QTc trials be submitted into the ECG Warehouse in Health Level 7 Extensible Markup Language (XML) format with annotated onset and offset points of waveforms. The FDA did not disclose the exact Warehouse metrics and minimal acceptable quality standards. The author describes the Warehouse scoring algorithms and metrics used by the FDA, points out ways to improve FDA review and suggests Warehouse benefits for pharmaceutical sponsors. The Warehouse ranks individual ECGs according to their score for each quality metric and produces histogram distributions with Warehouse-specific thresholds that identify ECGs of questionable quality. Automatic Warehouse algorithms assess the quality of QT annotation and duration of manual QT measurement by the central ECG laboratory.

  15. MouseMine: a new data warehouse for MGI.

    Science.gov (United States)

    Motenko, H; Neuhauser, S B; O'Keefe, M; Richardson, J E

    2015-08-01

    MouseMine (www.mousemine.org) is a new data warehouse for accessing mouse data from Mouse Genome Informatics (MGI). Based on the InterMine software framework, MouseMine supports powerful query, reporting, and analysis capabilities, the ability to save and combine results from different queries, easy integration into larger workflows, and a comprehensive Web Services layer. Through MouseMine, users can access a significant portion of MGI data in new and useful ways. Importantly, MouseMine is also a member of a growing community of online data resources based on InterMine, including those established by other model organism databases. Adopting common interfaces and collaborating on data representation standards are critical to fostering cross-species data analysis. This paper presents a general introduction to MouseMine, presents examples of its use, and discusses the potential for further integration into the MGI interface.

  16. The Data Warehouse in Service Oriented Architectures and Network Centric Warfare

    National Research Council Canada - National Science Library

    Lenahan, Jack

    2005-01-01

    ... at all policy and command levels to support superior decision making? Analyzing the anticipated massive amount of GIG data will almost certainly require data warehouses and federated data warehouses...

  17. Optimal time policy for deteriorating items of two-warehouse

    Indian Academy of Sciences (India)

    ... goods in which the first is rented warehouse and the second is own warehouse that deteriorates with two different rates. The aim of this study is to determine the optimal order quantity to maximize the profit of the projected model. Finally, some numerical examples and sensitivity analysis of parameters are made to validate ...

  18. FCDD: A Database for Fruit Crops Diseases.

    Science.gov (United States)

    Chauhan, Rupal; Jasrai, Yogesh; Pandya, Himanshu; Chaudhari, Suman; Samota, Chand Mal

    2014-01-01

    Fruit Crops Diseases Database (FCDD) requires a number of biotechnology and bioinformatics tools. The FCDD is a unique bioinformatics resource that compiles 162 details on fruit crop diseases, including disease type, causal organism, images, symptoms and control. The FCDD contains 171 phytochemicals from 25 fruits, their 2D images and their 20 possible sequences. This information has been manually extracted and manually verified from numerous sources, including other electronic databases, textbooks and scientific journals. FCDD is fully searchable and supports extensive text search. The main focus of the FCDD is on providing information on fruit crop diseases, which will help in the discovery of potential drugs from a common bioresource, fruits. The database was developed using MySQL. The database interface is developed in PHP, HTML and JAVA. FCDD is freely available. http://www.fruitcropsdd.com/

  19. A novel approach for intelligent distribution of data warehouses

    Directory of Open Access Journals (Sweden)

    Abhay Kumar Agarwal

    2016-07-01

    With the continuous growth in the amount of data, data storage systems have come a long way from flat file systems to RDBMS, Data Warehousing (DW), and Distributed Data Warehousing systems. This paper proposes a new distributed data warehouse model. The model is built on a novel approach for the intelligent distribution of data warehouses. Overall, the model is named the Intelligent and Distributed Data Warehouse (IDDW). The proposed model has N levels and is based on a top-down hierarchical design approach to building a distributed data warehouse. The building process of IDDW starts with the identification of various locations where a DW may be built. Initially, a single location is considered at the top-most level of IDDW where a DW is built. Thereafter, a DW at any other location of any level may be built. A method to transfer the relevant data from any upper-level DW to the relevant lower-level DW is also presented in the paper. The paper also presents IDDW modeling, its architecture based on that modeling, and the internal organization of IDDW via which all the operations within IDDW are performed.

  20. Computer modeling of commercial refrigerated warehouse facilities

    International Nuclear Information System (INIS)

    Nicoulin, C.V.; Jacobs, P.C.; Tory, S.

    1997-01-01

    The use of computer models to simulate the energy performance of large commercial refrigeration systems typically found in food processing facilities is an area of engineering practice that has seen little development to date. Current techniques employed in predicting energy consumption by such systems have focused on temperature bin methods of analysis. Existing simulation tools such as DOE2 are designed to model commercial buildings and grocery store refrigeration systems. The HVAC and Refrigeration system performance models in these simulation tools model equipment common to commercial buildings and groceries, and respond to energy-efficiency measures likely to be applied to these building types. The applicability of traditional building energy simulation tools to model refrigerated warehouse performance and analyze energy-saving options is limited. The paper will present the results of modeling work undertaken to evaluate energy savings resulting from incentives offered by a California utility to its Refrigerated Warehouse Program participants. The TRNSYS general-purpose transient simulation model was used to predict facility performance and estimate program savings. Custom TRNSYS components were developed to address modeling issues specific to refrigerated warehouse systems, including warehouse loading door infiltration calculations, an evaporator model, single-stage and multi-stage compressor models, evaporative condenser models, and defrost energy requirements. The main focus of the paper will be on the modeling approach. The results from the computer simulations, along with overall program impact evaluation results, will also be presented.

  1. The visit-data warehouse: enabling novel secondary use of health information exchange data.

    Science.gov (United States)

    Fleischman, William; Lowry, Tina; Shapiro, Jason

    2014-01-01

    Health Information Exchange (HIE) efforts face challenges with data quality and performance, and this becomes especially problematic when data is leveraged for uses beyond primary clinical use. We describe a secondary data infrastructure focusing on patient-encounter, nonclinical data that was built on top of a functioning HIE platform to support novel secondary data uses and prevent potentially negative impacts these uses might have otherwise had on HIE system performance. HIE efforts have generally formed for the primary clinical use of individual clinical providers searching for data on individual patients under their care, but many secondary uses have been proposed and are being piloted to support care management, quality improvement, and public health. This infrastructure review describes a module built into the Healthix HIE. Healthix, based in the New York metropolitan region, comprises 107 participating organizations with 29,946 acute-care beds in 383 facilities, and includes more than 9.2 million unique patients. The primary infrastructure is based on the InterSystems proprietary Caché data model distributed across servers in multiple locations, and uses a master patient index to link individual patients' records across multiple sites. We built a parallel platform, the "visit data warehouse," of patient encounter data (demographics, date, time, and type of visit) using a relational database model to allow accessibility using standard database tools and flexibility for developing secondary data use cases. These four secondary use cases include the following: (1) tracking encounter-based metrics in a newly established geriatric emergency department (ED), (2) creating a dashboard to provide a visual display as well as a tabular output of near-real-time de-identified encounter data from the data warehouse, (3) tracking frequent ED users as part of a regional-approach to case management intervention, and (4) improving an existing quality improvement program

  2. 19 CFR 19.14 - Materials for use in manufacturing warehouse.

    Science.gov (United States)

    2010-04-01

    ... warehouse is located under an immediate transportation without appraisement entry or warehouse withdrawal for transportation, whichever is applicable. (b) Bond required. Before the transfer of the merchandise... the manufacture of articles as authorized by law. Port Director (d) Domestic spirits and wines. For...

  3. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  4. 27 CFR 28.286 - Receipt in customs bonded warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Receipt in customs bonded... in Customs Bonded Warehouse § 28.286 Receipt in customs bonded warehouse. On receipt of the distilled spirits or wine and the related TTB Form 5100.11 or 5110.30 as the case may be, the customs officer in...

  5. Criticality calculation of the nuclear material warehouse of the ININ

    International Nuclear Information System (INIS)

    Garcia, T.; Angeles, A.; Flores C, J.

    2013-10-01

    In this work, nuclear safety conditions were determined, for both normal operation and accident events, for the nuclear fuel warehouse of the TRIGA Mark III reactor of the Instituto Nacional de Investigaciones Nucleares (ININ). The warehouse contains standard fuel elements Leu - 8.5/20, a control rod with follower of standard fuel type Leu - 8.5/20, fuel elements Leu - 30/20, and the reactor fuel Sur-100. To verify that the warehouse is subcritical, the effective multiplication factor (keff) was calculated. The keff calculation was carried out with the code MCNPX. (Author)

  6. Design and Control of Warehouse Order Picking: a literature review

    NARCIS (Netherlands)

    M.B.M. de Koster (René); T. Le-Duc (Tho); K.J. Roodbergen (Kees-Jan)

    2006-01-01

    Order picking has long been identified as the most labour-intensive and costly activity for almost every warehouse; the cost of order picking is estimated to be as much as 55% of the total warehouse operating expense. Any underperformance in order picking can lead to unsatisfactory

  7. Promotion bureau warehouse system design. Case study in University of AA

    Science.gov (United States)

    Parwati, N.; Qibtiyah, M.

    2017-12-01

    The warehouse is one of the important parts of an industry. With a good warehousing system, an industry can improve the effectiveness of its performance, so that profits for the company can continue to increase. Conversely, a poorly organized warehouse system is likely to reduce the effectiveness of the industry itself. In this research, the object was the warehousing system in the promotion bureau of University AA. To improve the effectiveness of the warehousing system, the warehouse layout was designed by specifying categories of goods based on the flow of goods in and out of the warehouse, using the ABC analysis method. In addition, an information system was designed to assist in controlling the system and to support the demands of every bureau and department in the university.
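    The ABC analysis mentioned in this abstract can be illustrated with a small sketch. The record does not give the authors' thresholds or data, so the 80%/95% cut-offs, the item names, and the function name `abc_classify` below are all assumptions chosen for illustration, not the study's actual procedure.

```python
from typing import Dict


def abc_classify(movement: Dict[str, float],
                 a_cut: float = 0.80,
                 b_cut: float = 0.95) -> Dict[str, str]:
    """Assign each item to class A, B, or C by cumulative share of movement.

    `movement` maps an item name to its flow in and out of the warehouse
    (hypothetical units). The cut-offs are common textbook defaults and
    are assumptions here, not values taken from the abstract.
    """
    total = sum(movement.values())
    if total <= 0:
        raise ValueError("total movement must be positive")

    classes: Dict[str, str] = {}
    cumulative = 0.0
    # Walk items from the most to the least moved.
    for item, value in sorted(movement.items(),
                              key=lambda kv: kv[1], reverse=True):
        share_before = cumulative / total  # share covered by earlier items
        if share_before < a_cut:
            classes[item] = "A"
        elif share_before < b_cut:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += value
    return classes


# Hypothetical promotion-bureau goods and their yearly movement counts.
example = {"brochures": 500, "banners": 250, "pens": 150,
           "folders": 60, "stickers": 40}
print(abc_classify(example))
```

    In a layout design of this kind, class A items would typically be placed closest to the warehouse entrance, since they account for most of the traffic.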

  8. A Performance Evaluation of Online Warehouse Update Algorithms

    Science.gov (United States)

    1998-01-01

    able to present a fully consistent version of the warehouse to the queries while the warehouse is being updated. Multiversioning has been used...LST97]). Specialized multiversion access structures have also been proposed ([LS89, LS90, dBS96, BC97, VV97, MOPW98]) In the context of OLTP systems...collection processes. 2.1 Multiversioning MVNL supports multiple versions by using Time Travel ([Sto87]). Each row has two extra attributes, Tmin

  9. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools into their research. We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  10. Data warehouse governance programs in healthcare settings: a literature review and a call to action.

    Science.gov (United States)

    Elliott, Thomas E; Holmes, John H; Davidson, Arthur J; La Chance, Pierre-Andre; Nelson, Andrew F; Steiner, John F

    2013-01-01

    Given the extensive data stored in healthcare data warehouses, data warehouse governance policies are needed to ensure data integrity and privacy. This review examines the current state of the data warehouse governance literature as it applies to healthcare data warehouses, identifies knowledge gaps, provides recommendations, and suggests approaches for further research. A comprehensive literature search using five databases, journal article title-search, and citation searches was conducted between 1997 and 2012. Data warehouse governance documents from two healthcare systems in the USA were also reviewed. A modified version of nine components from the Data Governance Institute Framework for data warehouse governance guided the qualitative analysis. Fifteen articles were retrieved. Only three were related to healthcare settings, each of which addressed only one of the nine framework components. Of the remaining 12 articles, 10 addressed between one and seven framework components and the remainder addressed none. Each of the two data warehouse governance plans obtained from healthcare systems in the USA addressed a subset of the framework components, and between them they covered all nine. While published data warehouse governance policies are rare, the 15 articles and two healthcare organizational documents reviewed in this study may provide guidance to creating such policies. Additional research is needed in this area to ensure that data warehouse governance policies are feasible and effective. The gap between the development of data warehouses in healthcare settings and formal governance policies is substantial, as evidenced by the sparse literature in this domain.

  11. Financing agribusiness: Insurance coverage as protection against credit risk of warehouse receipt collateral

    Directory of Open Access Journals (Sweden)

    Jovičić Daliborka

    2017-01-01

    Financing agribusiness by warehouse receipts allows agricultural producers to obtain working capital on the basis of agricultural products stored in licensed warehouses, as collateral. The implementation of the system of licensed warehouses and issuance of warehouse receipts as collateral for obtaining a bank loan is supported by the European Bank for Reconstruction and Development and it has had positive results in the neighbouring countries. The precondition for financing this project was to establish a Compensation Fund for providing insurance coverage for licensed warehouses against professional liability. However, in the absence of an adequate legal framework, operational risk may arise. Bearing in mind that Serbia has a tradition in the insurance industry and a number of operating insurance companies, the issue is that of the economic benefit and the method of insuring against this risk. The paper will present a detailed analysis of the operation of the Fund, capital requirement, solvency margin and a critical review of the Law on Public Warehouses which regulates the rights and obligations of the Compensation Fund in the case of loss occurrence.

  12. Data Warehouse for support to the electric energy commercialization; Data Warehouse para apoio a comercializacao de energia eletrica

    Energy Technology Data Exchange (ETDEWEB)

    Lanzotti, Carla R.; Correia, Paulo B. [Universidade Estadual de Campinas (UNICAMP), SP (Brazil). Faculdade de Engenharia Mecanica]. E-mails: clanzotti@yahoo.com; pcorreia@fem.unicamp.br

    2006-07-01

    This paper specifies a database, implemented as a data warehouse, containing energy market, electric system, and economic information, allowing the visualization and analysis of the data through tables and dynamic charts. This data warehouse corresponds to the 'Information Base' module of the platform for supporting electric power commercialization. The platform is a computer program intended to help agents interested in commercializing energy and is formed by three modules: Information Base, Contracting Strategies and Contracting Process. It is expected that the use of this database, together with the platform, will establish favorable conditions for agents interested in electric energy commercialization.

  13. Data Warehouse Architecture for Army Installations

    National Research Council Canada - National Science Library

    Reddy, Prameela

    1999-01-01

    .... A data warehouse is a single store of information to answer complex queries from management using cross-functional data to perform advanced data analysis methods and to compare with historical data...

  14. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods that have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation.

  15. 27 CFR 28.28 - Withdrawal of wine and distilled spirits from customs bonded warehouses.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Withdrawal of wine and... Miscellaneous Provisions Customs Bonded Warehouses § 28.28 Withdrawal of wine and distilled spirits from customs bonded warehouses. Wine and bottled distilled spirits entered into customs bonded warehouses as provided...

  16. Analisis Dan Perancangan Data Warehouse Pada PT Pelita Tatamas Jaya

    Directory of Open Access Journals (Sweden)

    Choirul Huda

    2010-12-01

    The purpose of this research is to assist in providing information to support decision-making processes in sales, purchasing and inventory control at PT Tatamas Pelita Jaya. With the support of a data warehouse, business leaders can make decisions more quickly and precisely. The research methodology includes analysis of current systems, library research, and designing a data warehouse schema using a star schema. The result of this research is the availability of a data warehouse that can generate information quickly and precisely, thus helping the company in making decisions. The conclusion of this research is that the data warehouse can serve as an aid to the relevant parties at PT Tatamas Pelita Jaya in decision making.
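    The schema design mentioned in this abstract ("bintang" is Indonesian for star, i.e. a star schema) can be sketched as one fact table surrounded by dimension tables. The record does not describe the actual tables, so every table name, column name, and data row below is hypothetical; the sketch uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_star_schema(conn: sqlite3.Connection) -> None:
    """Create a minimal star schema: two dimensions and one sales fact table.

    All names are illustrative assumptions, not the schema from the paper.
    """
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY,
            year    INTEGER NOT NULL,
            month   INTEGER NOT NULL
        );
        CREATE TABLE fact_sales (
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
            quantity   INTEGER NOT NULL,
            amount     REAL    NOT NULL
        );
    """)


def sales_by_product_and_year(conn: sqlite3.Connection):
    """Aggregate the fact table along two dimensions."""
    return conn.execute("""
        SELECT p.name, d.year, SUM(f.quantity), SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        GROUP BY p.name, d.year
        ORDER BY p.name, d.year
    """).fetchall()


def demo():
    """Load a few made-up rows and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "widget"), (2, "gadget")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2009, 12), (2, 2010, 1)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 10, 100.0), (1, 2, 5, 50.0),
                      (2, 2, 3, 90.0), (1, 2, 2, 20.0)])
    rows = sales_by_product_and_year(conn)
    conn.close()
    return rows


for row in demo():
    print(row)
```

    Purchasing and inventory could be modeled the same way, as further fact tables sharing the product and date dimensions.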

  17. A database for coconut crop improvement.

    Science.gov (United States)

    Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam

    2005-12-08

    Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running on the Linux operating system. The database interface is developed in PHP, HTML and JAVA. http://www.bioinfcpcri.org.

  18. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  19. Mastering data warehouse design relational and dimensional techniques

    CERN Document Server

    Imhoff, Claudia; Geiger, Jonathan G

    2003-01-01

    A cutting-edge response to Ralph Kimball's challenge to the data warehouse community that answers some tough questions about the effectiveness of the relational approach to data warehousing. Written by one of the best-known exponents of the Bill Inmon approach to data warehousing. Addresses head-on the tough issues raised by Kimball and explains how to choose the best modeling technique for solving common data warehouse design problems. Weighs the pros and cons of relational vs. dimensional modeling techniques. Focuses on tough modeling problems, including creating and maintaining keys and modeling c

  20. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable. The analysis of the datasets of Next Generation Sequencing needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the 6th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, 28-30th March, 2012, which brought together interdisciplinary scientists that have a strong background in the biological and computational sciences.

  1. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat.

    Science.gov (United States)

    Babbitt, Patricia C; Bagos, Pantelis G; Bairoch, Amos; Bateman, Alex; Chatonnet, Arnaud; Chen, Mark Jinan; Craik, David J; Finn, Robert D; Gloriam, David; Haft, Daniel H; Henrissat, Bernard; Holliday, Gemma L; Isberg, Vignir; Kaas, Quentin; Landsman, David; Lenfant, Nicolas; Manning, Gerard; Nagano, Nozomi; Srinivasan, Narayanaswamy; O'Donovan, Claire; Pruitt, Kim D; Sowdhamini, Ramanathan; Rawlings, Neil D; Saier, Milton H; Sharman, Joanna L; Spedding, Michael; Tsirigos, Konstantinos D; Vastermark, Ake; Vriend, Gerrit

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.

  2. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  3. Development of a clinical data warehouse from an intensive care clinical information system.

    Science.gov (United States)

    de Mul, Marleen; Alons, Peter; van der Velde, Peter; Konings, Ilse; Bakker, Jan; Hazelzet, Jan

    2012-01-01

    There are relatively few institutions that have developed clinical data warehouses, containing patient data from the point of care. Because of the various care practices, data types and definitions, and the perceived incompleteness of clinical information systems, the development of a clinical data warehouse is a challenge. In order to deal with managerial and clinical information needs, as well as educational and research aims that are important in the setting of a university hospital, Erasmus Medical Center Rotterdam, The Netherlands, developed a data warehouse incrementally. In this paper we report on the in-house development of an integral part of the data warehouse specifically for the intensive care units (ICU-DWH). It was modeled using the Atos Origin Metadata Frame method. The paper describes the methodology, the development process and the content of the ICU-DWH, and discusses the need for (clinical) data warehouses in intensive care.

  4. Selection of Forklift Unit for Warehouse Operation by Applying Multi-Criteria Analysis

    Directory of Open Access Journals (Sweden)

    Predrag Atanasković

    2013-07-01

    This paper presents research related to the choice of the criteria that can be used to perform an optimal selection of the forklift unit for warehouse operation. The analysis has been done with the aim of exploring the requirements and defining relevant criteria that are important when investment decision is made for forklift procurement, and based on the conducted research by applying multi-criteria analysis, to determine the appropriate parameters and their relative weights that form the input data and database for selection of the optimal handling unit. This paper presents an example of choosing the optimal forklift based on the selected criteria for the purpose of making the relevant investment decision.
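    The idea of combining criteria with relative weights can be illustrated with a simple weighted-sum model. The abstract does not say which multi-criteria method, criteria, or weights the author used, so the method choice, the criteria (price, lift capacity), the weights, and all figures below are assumptions for illustration only.

```python
from typing import Dict


def weighted_sum_scores(alternatives: Dict[str, Dict[str, float]],
                        weights: Dict[str, float],
                        benefit: Dict[str, bool]) -> Dict[str, float]:
    """Score alternatives with a weighted sum of min-max normalized criteria.

    `benefit[c]` is True when larger values of criterion `c` are better
    and False when smaller values are better (a cost criterion).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    scores = {name: 0.0 for name in alternatives}
    for criterion, weight in weights.items():
        values = [alt[criterion] for alt in alternatives.values()]
        lo, hi = min(values), max(values)
        for name, alt in alternatives.items():
            if hi == lo:
                norm = 1.0  # criterion does not discriminate
            elif benefit[criterion]:
                norm = (alt[criterion] - lo) / (hi - lo)
            else:
                norm = (hi - alt[criterion]) / (hi - lo)
            scores[name] += weight * norm
    return scores


# Hypothetical forklift candidates; none of these figures come from the paper.
forklifts = {
    "forklift_a": {"price": 20000.0, "capacity_kg": 2000.0},
    "forklift_b": {"price": 25000.0, "capacity_kg": 2500.0},
    "forklift_c": {"price": 30000.0, "capacity_kg": 3000.0},
}
weights = {"price": 0.6, "capacity_kg": 0.4}
benefit = {"price": False, "capacity_kg": True}

scores = weighted_sum_scores(forklifts, weights, benefit)
best = max(scores, key=scores.get)
print(best, scores)
```

    Changing the weights changes the ranking, which is why the derivation of the relative weights matters as much as the scoring rule itself.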

  5. 7 CFR 1427.16 - Movement and protection of warehouse-stored cotton.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Movement and protection of warehouse-stored cotton. 1427.16 Section 1427.16 Agriculture Regulations of the Department of Agriculture (Continued) COMMODITY... Cotton Loan and Loan Deficiency Payments § 1427.16 Movement and protection of warehouse-stored cotton. (a...

  6. Warehouse Plan for the Multi-Canister Overpacks (MCO) and Baskets

    International Nuclear Information System (INIS)

    MARTIN, M.K.

    2000-01-01

    The Multi-Canister Overpacks (MCO) will contain spent nuclear fuel (SNF) removed from the K East and West Basins. The SNF will be placed in fuel storage baskets that will be stacked inside the MCOs. Approximately 400 MCOs and 2170 baskets will be fabricated for this purpose. These MCOs, loaded with SNF, will be placed in interim storage in the Canister Storage Building (CSB) located in the 200 Area of the Hanford Site. The MCOs consist of different components/sub-assemblies that will be manufactured by one or more vendors. All component/sub-assemblies will be shipped to the Hanford Site Central Stores Warehouse, 2355 Stevens Drive, Building 1163 in the 1100 Area, for inspection and storage until these components are required at the CSB and K Basins. The MCO fuel storage baskets will be manufactured in the MCO basket fabrication shop located in Building 328 of the Hanford Site 300 Area. The MCO baskets will be inspected at the fabrication shop before shipment to the Central Stores Warehouse for storage. The MCO components and baskets will be stored as received from the manufacturer with specified protective coatings, wrappings, and packaging intact to maintain mechanical integrity of the components and to prevent corrosion. The components and baskets will be shipped as needed from the warehouse to the CSB and K Basins. This warehouse plan includes the requirements for receipt of MCO components and baskets from the manufacturers and storage at the Hanford Site Central Stores Warehouse. Transportation of the MCO components and baskets from the warehouse, unwrapping, and assembly of the MCOs are the responsibility of SNF Operations and are not included in this plan.

  7. Aspects of Data Warehouse Technologies for Complex Web Data

    DEFF Research Database (Denmark)

    Thomsen, Christian

    This thesis is about aspects of specification and development of data warehouse technologies for complex web data. Today, large amounts of data exist in different web resources and in different formats. But it is often hard to analyze and query the often big and complex data or data about the data (i.e., metadata). It is therefore interesting to apply Data Warehouse (DW) technology to the data. But to apply DW technology to complex web data is not straightforward and the DW community faces new and exciting challenges. This thesis considers some of these challenges. The work leading to this thesis has primarily been done in relation to the project European Internet Accessibility Observatory (EIAO) where a data warehouse for accessibility data (roughly data about how usable web resources are for disabled users) has been specified and developed. But the results of the thesis can also...

  8. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  9. 27 CFR 28.27 - Entry of wine into customs bonded warehouses.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Entry of wine into customs... TRADE BUREAU, DEPARTMENT OF THE TREASURY LIQUORS EXPORTATION OF ALCOHOL Miscellaneous Provisions Customs Bonded Warehouses § 28.27 Entry of wine into customs bonded warehouses. Upon filing of the application or...

  10. A Novel Optimization Method on Logistics Operation for Warehouse & Port Enterprises Based on Game Theory

    Directory of Open Access Journals (Sweden)

    Junyang Li

    2013-09-01

    Purpose: The following investigation aims to deal with the competitive relationship among different warehouses & ports in the same company. Design/methodology/approach: In this paper, Game Theory is used to build the optimization model, and a Genetic Algorithm is used to solve it. Findings: Unnecessary competition will arise if there is little internal communication among different warehouses & ports in one company. This paper proposes a novel optimization method for the warehouse & port logistics operation model. Originality/value: Warehouse logistics business is a combination of warehousing services and terminal services which is provided by port logistics through the existing port infrastructure on the basis of a port. The newly proposed method can help to optimize the logistics operation model for warehouse & port enterprises effectively. We use Sinotrans Guangdong Company as an example to illustrate the newly proposed method. Finally, according to the case study, this paper gives some responses and suggestions on logistics operation in the Sinotrans Guangdong warehouse & port for its future development.

  11. High Performance Protein Sequence Database Scanning on the Cell Broadband Engine

    Directory of Open Access Journals (Sweden)

    Adrianto Wirawan

    2009-01-01

    The enormous growth of biological sequence databases has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing rapidly as well. The recent emergence of low cost parallel multicore accelerator technologies has made it possible to reduce execution times of many bioinformatics applications. In this paper, we demonstrate how the Cell Broadband Engine can be used as a computational platform to accelerate two approaches for protein sequence database scanning: exhaustive and heuristic. We present efficient parallelization techniques for two representative algorithms: the dynamic programming based Smith–Waterman algorithm and the popular BLASTP heuristic. Their implementation on a Playstation®3 leads to significant runtime savings compared to corresponding sequential implementations.
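
    As a point of reference for the exhaustive approach named in this abstract, the following is a minimal sequential sketch of Smith–Waterman local alignment scoring used to scan a small database. It is not the authors' Cell Broadband Engine implementation: the function names, the flat match/mismatch scores and the linear gap penalty are all assumptions made for illustration (real protein scanning typically uses a substitution matrix and affine gap penalties).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b.

    Illustrative scoring only (assumed values): flat match/mismatch
    scores and a linear gap penalty.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def scan_database(query, database):
    """Score the query against every entry; return (name, score) pairs, best first."""
    hits = [(name, smith_waterman_score(query, seq)) for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

    Each database entry is scored independently of the others, which is one reason this kind of scan lends itself to parallel hardware.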

  12. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein...

  13. Protocol for a national blood transfusion data warehouse from donor to recipient

    NARCIS (Netherlands)

    van Hoeven, Loan R; Hooftman, Babette H; Janssen, Mart P; de Bruijne, Martine C; de Vooght, Karen M K; Kemper, Peter; Koopman, Maria M W

    2016-01-01

    INTRODUCTION: Blood transfusion has health-related, economical and safety implications. In order to optimise the transfusion chain, comprehensive research data are needed. The Dutch Transfusion Data warehouse (DTD) project aims to establish a data warehouse where data from donors and transfusion

  14. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. 27 CFR 28.244a - Shipment to a customs bonded warehouse.

    Science.gov (United States)

    2010-04-01

    ... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Shipment to a customs... Export Consignment § 28.244a Shipment to a customs bonded warehouse. Distilled spirits and wine withdrawn for shipment to a customs bonded warehouse shall be consigned in care of the customs officer in charge...

  16. Development of a public health reporting data warehouse: lessons learned.

    Science.gov (United States)

    Rizi, Seyed Ali Mussavi; Roudsari, Abdul

    2013-01-01

    Data warehouse projects are perceived to be risky and prone to failure due to many organizational and technical challenges. However, often iterative and lengthy processes of implementation of data warehouses at an enterprise level provide an opportunity for formative evaluation of these solutions. This paper describes lessons learned from successful development and implementation of the first phase of an enterprise data warehouse to support public health surveillance at British Columbia Centre for Disease Control. Iterative and prototyping approach to development, overcoming technical challenges of extraction and integration of data from large scale clinical and ancillary systems, a novel approach to record linkage, flexible and reusable modeling of clinical data, and securing senior management support at the right time were the main factors that contributed to the success of the data warehousing project.

  17. A simulated annealing approach for redesigning a warehouse network problem

    Science.gov (United States)

    Khairuddin, Rozieana; Marlizawati Zainuddin, Zaitul; Jiun, Gan Jia

    2017-09-01

    Nowadays, several companies consider downsizing their distribution networks in ways that involve consolidation or phase-out of some of their current warehousing facilities, owing to increasing competition, mounting cost pressure and the advantages of economies of scale. Consequently, changes in the economic situation after a certain period of time require an adjustment of the network model in order to obtain the optimal cost under the current economic conditions. This paper aimed to develop a mixed-integer linear programming model for a two-echelon warehouse network redesign problem with capacitated plant and uncapacitated warehouses. The main contribution of this study is considering the capacity constraint for existing warehouses. A Simulated Annealing algorithm is proposed to tackle the proposed model. The numerical solution showed that the proposed model and solution method were practical.
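
    To make the solution technique concrete, here is a minimal simulated annealing sketch for a toy version of the problem: choosing which warehouses to keep open so that fixed costs plus shipping costs are minimized. It does not reproduce the paper's two-echelon mixed-integer model or its capacity constraints; the cost structure, function names, cooling schedule and parameter values are all assumptions chosen for illustration.

```python
import math
import random


def total_cost(open_set, fixed_cost, ship_cost):
    """Fixed cost of the open warehouses plus the cheapest shipping per customer.

    ship_cost[c][w] is the (assumed) cost of serving customer c from warehouse w.
    """
    fixed = sum(fixed_cost[w] for w in open_set)
    shipping = sum(min(row[w] for w in open_set) for row in ship_cost)
    return fixed + shipping


def anneal(fixed_cost, ship_cost, steps=2000, t0=10.0, cooling=0.995, seed=0):
    """Return (best_open_set, best_cost) found by simulated annealing."""
    rng = random.Random(seed)
    n = len(fixed_cost)
    current = set(range(n))  # start with every warehouse open
    cur_cost = total_cost(current, fixed_cost, ship_cost)
    best, best_cost = set(current), cur_cost
    temp = t0
    for _ in range(steps):
        candidate = current ^ {rng.randrange(n)}  # open or close one warehouse
        if candidate:  # at least one warehouse must stay open
            cand_cost = total_cost(candidate, fixed_cost, ship_cost)
            delta = cand_cost - cur_cost
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature falls.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, cur_cost = candidate, cand_cost
                if cur_cost < best_cost:
                    best, best_cost = set(current), cur_cost
        temp *= cooling  # geometric cooling schedule (an assumed choice)
    return best, best_cost
```

    Because the best solution seen so far is tracked separately from the current one, the returned cost can never exceed that of the starting configuration.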

  18. E-MSD: an integrated data resource for bioinformatics.

    Science.gov (United States)

    Velankar, S; McNeil, P; Mittard-Runte, V; Suarez, A; Barrell, D; Apweiler, R; Henrick, K

    2005-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.

  19. Scale out databases for CERN use cases

    International Nuclear Information System (INIS)

    Baranowski, Zbigniew; Grzybek, Maciej; Canali, Luca; Garcia, Daniel Lanza; Surdy, Kacper

    2015-01-01

    Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log database. (paper)

  20. Clinical Use of an Enterprise Data Warehouse

    Science.gov (United States)

    Evans, R. Scott; Lloyd, James F.; Pierce, Lee A.

    2012-01-01

    The enormous amount of data being collected by electronic medical records (EMR) has found additional value when integrated and stored in data warehouses. The enterprise data warehouse (EDW) allows all data from an organization with numerous inpatient and outpatient facilities to be integrated and analyzed. We have found the EDW at Intermountain Healthcare to not only be an essential tool for management and strategic decision making, but also for patient specific clinical decision support. This paper presents the structure and two case studies of a framework that has provided us the ability to create a number of decision support applications that are dependent on the integration of previous enterprise-wide data in addition to a patient’s current information in the EMR. PMID:23304288

  1. 27 CFR 24.126 - Change in proprietorship involving a bonded wine warehouse.

    Science.gov (United States)

    2010-04-01

    ... involving a bonded wine warehouse. 24.126 Section 24.126 Alcohol, Tobacco Products and Firearms ALCOHOL AND TOBACCO TAX AND TRADE BUREAU, DEPARTMENT OF THE TREASURY LIQUORS WINE Establishment and Operations Changes Subsequent to Original Establishment § 24.126 Change in proprietorship involving a bonded wine warehouse...

  2. Ontology-Based Big Dimension Modeling in Data Warehouse Schema Design

    DEFF Research Database (Denmark)

    Iftikhar, Nadeem

    2013-01-01

    During data warehouse schema design, designers often face the problem of how to model big dimensions that typically contain a large number of attributes and records. It is necessary to investigate effective approaches for modeling big dimensions in order to achieve better query performance, with respect...... partitioning, vertical partitioning and their hybrid. We formalize the design methods and propose an algorithm that describes the modeling process from an OWL ontology to a data warehouse schema. In addition, this paper also presents an effective ontology-based tool to automate the modeling process. The tool...... can automatically generate the data warehouse schema from the ontology describing the terms and business semantics for the big dimension. In case of any change in the requirements, we only need to modify the ontology, and re-generate the schema using the tool. This paper also evaluates the proposed...

  3. Some Considerations about Modern Database Machines

    Directory of Open Access Journals (Sweden)

    Manole VELICANU

    2010-01-01

    Optimizing the two computing resources of any computing system - time and space - has always been one of the priority objectives of any database. A current and effective solution in this respect is the computer database. Optimizing computer applications by means of database machines has been a steady preoccupation of researchers since the late seventies. Several information technologies have revolutionized the present information framework. Out of these, those which have brought a major contribution to the optimization of databases are: efficient handling of large volumes of data (Data Warehouse, Data Mining, OLAP – On Line Analytical Processing), the improvement of DBMS (Database Management Systems) facilities through the integration of the new technologies, and the dramatic increase in computing power and the efficient use of it (computer networks, massive parallel computing, Grid Computing and so on). All these information technologies, and others, have favored the resumption of research on database machines and, in the last few years, the achievement of some very good practical results as far as the optimization of computing resources is concerned.

  4. Reliability in Warehouse-Scale Computing: Why Low Latency Matters

    DEFF Research Database (Denmark)

    Nannarelli, Alberto

    2015-01-01

    , the limiting factor of these warehouse-scale data centers is the power dissipation. Power is dissipated not only in the computation itself, but also in heat removal (fans, air conditioning, etc.) to keep the temperature of the devices within the operating ranges. The need to keep the temperature low within......Warehouse sized buildings are nowadays hosting several types of large computing systems: from supercomputers to large clusters of servers to provide the infrastructure to the cloud. Although the main target, especially for high-performance computing, is still to achieve high throughput...

  5. WAREHOUSE PERFORMANCE MEASUREMENT - A CASE STUDY

    Directory of Open Access Journals (Sweden)

    Crisan Emil

    2009-05-01

    Companies could gain cost advantage using their logistics area of the business. Warehouse management is a possible source of cost improvements from logistics that companies could use during this economic crisis. The goal of this article is to expose a few

  6. Development of a Data Warehouse and On-Line Analytical Processing (OLAP) for Information Discovery and Data Analysis (Case Study: New Student Admission Information System of STMIK AMIKOM PURWOKERTO)

    Directory of Open Access Journals (Sweden)

    Giat Karyono

    2011-08-01

    ... databases (OLTP). The SIPMB data warehouse is created on a separate machine by replicating the database used by SIPMB. For data analysis needs, an OLAP application (OLAP PMB) was built. This application can present multidimensional data in a grid view. Analysis of these data may include analysis of specialization in new student enrollment based on the period of new admissions, enrollment surge, home school, home province and district, registration information, day, month, or year. The results of the data analysis can thus support management in determining appropriate marketing strategies to increase the number of freshman applicants, either for the current admission or for admission in the coming year.

  7. Database and Bioinformatics Studies of Probiotics.

    Science.gov (United States)

    Tao, Lin; Wang, Bohua; Zhong, Yafen; Pow, Siok Hoon; Zeng, Xian; Qin, Chu; Zhang, Peng; Chen, Shangying; He, Weidong; Tan, Ying; Liu, Hongxia; Jiang, Yuyang; Chen, Weiping; Chen, Yu Zong

    2017-09-06

    Probiotics have been widely explored for health benefits, animal care, and agricultural applications. Recent advances in microbiome, microbiota, and microbial dark matter research have fueled greater interest in, and paved the way for, the study of the mechanisms of probiotics and the discovery of new probiotics from uncharacterized microbial sources. A probiotics database named PROBIO was developed to facilitate these efforts and to meet the need for information on the known probiotics. It provides comprehensive information about the probiotic functions of 448 marketed, 167 clinical trial/field trial, and 382 research probiotics for use or being studied for use in humans, animals, and plants. The potential applications of the probiotics data are illustrated by several literature-reported investigations, which have used the relevant information for probing the function and mechanism of the probiotics and for discovering new probiotics. PROBIO can be accessed free of charge at http://bidd2.nus.edu.sg/probio/homepage.htm.

  8. CRISP methodology for Data Warehouse implementation

    Directory of Open Access Journals (Sweden)

    Octavio José Salcedo

    2010-06-01

    Currently, the generation of clear, concise reports based on true corporate information is a fundamental element in decision making. From this need the data warehouse arises as an essential resource for conducting the process, founded primarily on the OLAP concept together with EIS and DSS for producing reports. The construction of the data warehouse mainly involves the extraction, processing and handling of information, followed by the definition of the metadata, which in turn are used to define the data warehouse as an integrated system. The trend in BI is towards disseminating information both to management and to all who need it, across different dimensions and associated levels, in order to obtain consolidated or detailed reports that facilitate the synthesis of business processes that directly impact decision making, which is ultimately the purpose of the data warehouse itself. Implementing this process requires an appropriate methodology, so the project was designed under the structure of international standards, which are the foundation for obtaining excellent results in project implementation.

  9. Improving warehouse responsiveness by job priority management : A European distribution centre field study

    NARCIS (Netherlands)

    T.Y. Kim (Thai Young)

    2018-01-01

    Warehouses employ order cut-off times to ensure sufficient time for fulfilment. To satisfy higher consumer expectations, these cut-off times are gradually postponed to improve order responsiveness. Warehouses therefore have to allocate jobs more efficiently to meet compressed response

  10. Development of a clinical data warehouse for hospital infection control.

    Science.gov (United States)

    Wisniewski, Mary F; Kieszkowski, Piotr; Zagorski, Brandon M; Trick, William E; Sommers, Michael; Weinstein, Robert A

    2003-01-01

    Existing data stored in a hospital's transactional servers have enormous potential to improve performance measurement and health care quality. Accessing, organizing, and using these data to support research and quality improvement projects are evolving challenges for hospital systems. The authors report development of a clinical data warehouse that they created by importing data from the information systems of three affiliated public hospitals. They describe their methodology; difficulties encountered; responses from administrators, computer specialists, and clinicians; and the steps taken to capture and store patient-level data. The authors provide examples of their use of the clinical data warehouse to monitor antimicrobial resistance, to measure antimicrobial use, to detect hospital-acquired bloodstream infections, to measure the cost of infections, and to detect antimicrobial prescribing errors. In addition, they estimate the amount of time and money saved and the increased precision achieved through the practical application of the data warehouse.

  11. Development of a Clinical Data Warehouse for Hospital Infection Control

    Science.gov (United States)

    Wisniewski, Mary F.; Kieszkowski, Piotr; Zagorski, Brandon M.; Trick, William E.; Sommers, Michael; Weinstein, Robert A.

    2003-01-01

    Existing data stored in a hospital's transactional servers have enormous potential to improve performance measurement and health care quality. Accessing, organizing, and using these data to support research and quality improvement projects are evolving challenges for hospital systems. The authors report development of a clinical data warehouse that they created by importing data from the information systems of three affiliated public hospitals. They describe their methodology; difficulties encountered; responses from administrators, computer specialists, and clinicians; and the steps taken to capture and store patient-level data. The authors provide examples of their use of the clinical data warehouse to monitor antimicrobial resistance, to measure antimicrobial use, to detect hospital-acquired bloodstream infections, to measure the cost of infections, and to detect antimicrobial prescribing errors. In addition, they estimate the amount of time and money saved and the increased precision achieved through the practical application of the data warehouse. PMID:12807807

  12. Scale out databases for CERN use cases

    CERN Document Server

    Baranowski, Zbigniew; Canali, Luca; Garcia, Daniel Lanza; Surdy, Kacper

    2015-01-01

    Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log dat...

  13. Validating the extract, transform, load process used to populate a large clinical research database.

    Science.gov (United States)

    Denney, Michael J; Long, Dustin M; Armistead, Matthew G; Anderson, Jamie L; Conway, Baqiyyah N

    2016-10-01

    Informaticians at any institution that are developing clinical research support infrastructure are tasked with populating research databases with data extracted and transformed from their institution's operational databases, such as electronic health records (EHRs). These data must be properly extracted from these source systems, transformed into a standard data structure, and then loaded into the data warehouse while maintaining the integrity of these data. We validated the correctness of the extract, transform, and load (ETL) process of the extracted data of West Virginia Clinical and Translational Science Institute's Integrated Data Repository (IDR), a clinical data warehouse that includes data extracted from two EHR systems. Four hundred ninety-eight observations were randomly selected from the integrated data repository and compared with the two source EHR systems. Of the 498 observations, there were 479 concordant and 19 discordant observations. The discordant observations fell into three general categories: a) design decision differences between the IDR and source EHRs, b) timing differences, and c) user interface settings. After resolving apparent discordances, our integrated data repository was found to be 100% accurate relative to its source EHR systems. Any institution that uses a clinical data warehouse that is developed based on extraction processes from operational databases, such as EHRs, employs some form of an ETL process. As secondary use of EHR data begins to transform the research landscape, the importance of the basic validation of the extracted EHR data cannot be underestimated and should start with the validation of the extraction process itself. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
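
    The sampling-and-comparison step described in this abstract can be sketched as follows. This is only an illustration under assumed data structures (plain dictionaries keyed by a hypothetical observation identifier); the abstract does not describe the authors' actual tooling, and the function names are invented here. With the figures reported above, 479 concordant observations out of 498 correspond to an initial concordance of about 96.2%.

```python
import random


def draw_sample(keys, k, seed=0):
    """Randomly select k observation identifiers (seeded so the draw is repeatable)."""
    return random.Random(seed).sample(sorted(keys), k)


def concordance(warehouse, source, sample_keys):
    """Compare sampled warehouse values with the corresponding source values.

    `warehouse` and `source` are assumed to map an observation id to its value.
    """
    discordant = [k for k in sample_keys if warehouse.get(k) != source.get(k)]
    n = len(sample_keys)
    concordant = n - len(discordant)
    return {
        "n": n,
        "concordant": concordant,
        "discordant": discordant,  # ids to review by hand
        "rate": concordant / n if n else 0.0,
    }
```

    Returning the discordant identifiers, rather than only a rate, supports the manual follow-up the abstract describes, in which each apparent discordance is traced to a cause.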

  14. Two warehouse inventory model for deteriorating item with exponential demand rate and permissible delay in payment

    Directory of Open Access Journals (Sweden)

    Kaliraman Naresh Kumar

    2017-01-01

    A two warehouse inventory model for deteriorating items is considered with exponential demand rate and permissible delay in payment. Shortage is not allowed and deterioration rate is constant. In the model, one warehouse is rented and the other is owned. The rented warehouse is provided with better facility for the stock than the owned warehouse, but is charged more. The objective of this model is to find the best replenishment policies for minimizing the total appropriate inventory cost. A numerical illustration and sensitivity analysis are provided.

  15. Multidimensional Databases and Data Warehousing

    CERN Document Server

    Jensen, Christian

    2010-01-01

    The present book's subject is multidimensional data models and data modeling concepts as they are applied in real data warehouses. The book aims to present the most important concepts within this subject in a precise and understandable manner. The book's coverage of fundamental concepts includes data cubes and their elements, such as dimensions, facts, and measures and their representation in a relational setting; it includes architecture-related concepts; and it includes the querying of multidimensional databases. The book also covers advanced multidimensional concepts that are considered to b

  16. The Challenges of Data Quality Evaluation in a Joint Data Warehouse.

    Science.gov (United States)

    Bae, Charles J; Griffith, Sandra; Fan, Youran; Dunphy, Cheryl; Thompson, Nicolas; Urchek, John; Parchman, Alandra; Katzan, Irene L

    2015-01-01

    The use of clinically derived data from electronic health records (EHRs) and other electronic clinical systems can greatly facilitate clinical research as well as operational and quality initiatives. One approach for making these data available is to incorporate data from different sources into a joint data warehouse. When using such a data warehouse, it is important to understand the quality of the data. The primary objective of this study was to determine the completeness and concordance of common types of clinical data available in the Knowledge Program (KP) joint data warehouse, which contains feeds from several electronic systems including the EHR. A manual review was performed of specific data elements for 250 patients from an EHR, and these were compared with corresponding elements in the KP data warehouse. Completeness and concordance were calculated for five categories of data including demographics, vital signs, laboratory results, diagnoses, and medications. In general, data elements for demographics, vital signs, diagnoses, and laboratory results were present in more cases in the source EHR compared to the KP. When data elements were available in both sources, there was a high concordance. In contrast, the KP data warehouse documented a higher prevalence of deaths and medications compared to the EHR. Several factors contributed to the discrepancies between data in the KP and the EHR, including the start date and frequency of data feed updates into the KP, inability to transfer data located in nonstructured formats (e.g., free text or scanned documents), as well as incomplete and missing data variables in the source EHR. When evaluating the quality of a data warehouse with multiple data sources, assessing completeness and concordance between data set and source data may be better than designating one to be a gold standard. This will allow the user to optimize the method and timing of data transfer in order to capture data with better accuracy.
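
    As a small illustration of the completeness measure discussed in this abstract, the sketch below computes, for each data element, the fraction of patient records in which a value is present. The record layout, field names and the rule for what counts as missing (None or an empty string) are assumptions for illustration, not the study's definitions.

```python
def completeness(records, fields):
    """Fraction of records holding a non-missing value for each field.

    A value counts as missing here if it is None or an empty string
    (an assumed convention).
    """
    total = len(records)
    result = {}
    for field in fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = present / total if total else 0.0
    return result
```

    Running the same function over the source system and over the warehouse for the same patients gives two completeness profiles that can be compared side by side, without treating either one as the gold standard.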

  17. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  18. A study of multidimensional modeling approaches for data warehouse

    Science.gov (United States)

    Yusof, Sharmila Mat; Sidi, Fatimah; Ibrahim, Hamidah; Affendey, Lilly Suriani

    2016-08-01

    A data warehouse system is used to support the process of organizational decision making. Hence, the system must extract and integrate information from heterogeneous data sources in order to uncover relevant knowledge suitable for the decision making process. However, the development of a data warehouse is a difficult and complex process, especially in its conceptual design (multidimensional modeling). Thus, various approaches have been proposed to overcome the difficulty. This study surveys and compares the approaches to multidimensional modeling and highlights the issues, trends and solutions proposed to date. The contribution is on the state of the art of multidimensional modeling design.

  19. To the question about the layout of the racks in the warehouse

    Directory of Open Access Journals (Sweden)

    Ilesaliev D.I.

    2017-03-01

    Warehouses located at points where cargo is transshipped from one mode of transport to another play a significant role in transforming cargo flows for more effective onward transportation of goods. The location of racks and longitudinal passages is important in the operation of a transshipment warehouse. Typically, racks and longitudinal passages are perpendicular to each other; the article proposes a radical change that exploits the "Euclidean advantage". This is another way of designing warehouses for efficient handling of packaged cargo in the supply chain. The purpose is to reduce the distance travelled by the loader in one cycle between the loading and unloading areas and the storage areas.

  20. Security Data Warehouse Application

    Science.gov (United States)

    Vernon, Lynn R.; Hennan, Robert; Ortiz, Chris; Gonzalez, Steve; Roane, John

    2012-01-01

    The Security Data Warehouse (SDW) is used to aggregate and correlate all JSC IT security data. This includes IT asset inventory such as operating systems and patch levels, users, user logins, remote access dial-in and VPN, and vulnerability tracking and reporting. The correlation of this data allows for an integrated understanding of current security issues and systems by providing this data in a format that associates it to an individual host. The cornerstone of the SDW is its unique host-mapping algorithm that has undergone extensive field tests, and provides a high degree of accuracy. The algorithm comprises two parts. The first part employs fuzzy logic to derive a best-guess host assignment using incomplete sensor data. The second part is logic to identify and correct errors in the database, based on subsequent, more complete data. Host records are automatically split or merged, as appropriate. The process had to be refined and thoroughly tested before the SDW deployment was feasible. Complexity was increased by adding the dimension of time. The SDW correlates all data with its relationship to time. This lends support to forensic investigations, audits, and overall situational awareness. Another important feature of the SDW architecture is that all of the underlying complexities of the data model and host-mapping algorithm are encapsulated in an easy-to-use and understandable Perl language Application Programming Interface (API). This allows the SDW to be quickly augmented with additional sensors using minimal coding and testing. It also supports rapid generation of ad hoc reports and integration with other information systems.
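
    The first part of the host-mapping idea described above (a best-guess host assignment from incomplete sensor data) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the SDW's fuzzy-logic rules, so simple weighted attribute matching stands in for them, and the attribute names, weights, threshold and the `Host`, `match_score` and `assign_host` names are all hypothetical. The second part (splitting or merging host records when later data contradicts an earlier guess) and the time dimension are not modelled here.

```python
from dataclasses import dataclass

# Hypothetical weights: how strongly each attribute identifies a host.
# These values are assumptions, not taken from the SDW.
WEIGHTS = {"mac": 0.6, "hostname": 0.3, "ip": 0.1}
THRESHOLD = 0.5  # hypothetical minimum score to accept a match


@dataclass
class Host:
    host_id: int
    attrs: dict  # known attributes, e.g. {"mac": "...", "ip": "..."}


def match_score(observation: dict, host: Host) -> float:
    """Sum the weights of observed attributes that agree with the host record."""
    return sum(
        weight
        for key, weight in WEIGHTS.items()
        if observation.get(key) is not None
        and observation.get(key) == host.attrs.get(key)
    )


def assign_host(observation: dict, hosts: list[Host]) -> Host:
    """Return the best-matching host, or create a new record if none is good enough."""
    best = max(hosts, key=lambda h: match_score(observation, h), default=None)
    if best is not None and match_score(observation, best) >= THRESHOLD:
        # Enrich the record with attributes it did not have yet.
        for key, value in observation.items():
            if value is not None:
                best.attrs.setdefault(key, value)
        return best
    new_host = Host(
        host_id=len(hosts) + 1,
        attrs={k: v for k, v in observation.items() if v is not None},
    )
    hosts.append(new_host)
    return new_host
```

    In this sketch an observation that shares only a weak attribute (such as an IP address, which can be reassigned) does not reach the threshold and opens a new host record; a later correction pass would be responsible for merging it if more complete data shows it is the same machine.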

  1. Development of global data warehouse for beam diagnostics at SSRF

    International Nuclear Information System (INIS)

    Lai Longwei; Leng Yongbin; Yan Yingbing; Chen Zhichu

    2015-01-01

    The beam diagnostic system is adequate during the daily operation and machine study at the Shanghai Synchrotron Radiation Facility (SSRF). Without the effective event detecting mechanism, it is difficult to dump and analyze abnormal phenomena such as the global orbital disturbance, the malfunction of the BPM and the noise of the DCCT. The global beam diagnostic data warehouse was built in order to monitor the status of the accelerator and the beam instruments. The data warehouse was designed as a Soft IOC hosted on an independent server. Once abnormal phenomena happen it will be triggered and will store the relevant data for further analysis. The results show that the data warehouse can detect abnormal phenomena of the machine and the beam diagnostic system effectively, and can be used for calculating confidential indicators of the beam instruments. It provides an efficient tool for the improvement of the beam diagnostic system and accelerator. (authors)

  2. Semantic integration of medication data into the EHOP Clinical Data Warehouse.

    Science.gov (United States)

    Delamarre, Denis; Bouzille, Guillaume; Dalleau, Kevin; Courtel, Denis; Cuggia, Marc

    2015-01-01

    Reusing medication data is crucial for many medical research domains. Semantic integration of such data in a clinical data warehouse (CDW) is quite challenging. Our objective was to develop a reliable and scalable method for integrating prescription data into EHOP (a French CDW). PN13/PHAST was used as the semantic interoperability standard during the ETL process, and to store the prescriptions as documents in the CDW. Theriaque was used as a drug knowledge database (DKDB), to annotate the prescription dataset with the finest granularity, and to provide semantic capabilities to the EHOP query workbench. The system was evaluated on a clinical data set. Depending on the use case, precision ranged from 52% to 100%; recall was always 100%. Interoperability standards and a DKDB, the document approach, and the finest-granularity approach are the key factors for successful drug data integration in a CDW.

  3. Expiration of Historical Databases

    DEFF Research Database (Denmark)

    Toman, David

    2001-01-01

    We present a technique for automatic expiration of data in a historical data warehouse that preserves answers to a known and fixed set of first-order queries. In addition, we show that for queries with output size bounded by a function of the active data domain size (the number of values that have ever appeared in the warehouse), the size of the portion of the data warehouse history needed to answer the queries is also bounded by a function of the active data domain size and therefore does not depend on the age of the warehouse (the length of the history).

  4. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  5. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  6. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...

  7. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  8. Saccharomyces genome database informs human biology

    OpenAIRE

    Skrzypek, Marek S; Nash, Robert S; Wong, Edith D; MacPherson, Kevin A; Hellerstedt, Sage T; Engel, Stacia R; Karra, Kalpana; Weng, Shuai; Sheppard, Travis K; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Cherry, J Michael

    2017-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is an expertly curated database of literature-derived functional information for the model organism budding yeast, Saccharomyces cerevisiae. SGD constantly strives to synergize new types of experimental data and bioinformatics predictions with existing data, and to organize them into a comprehensive and up-to-date information resource. The primary mission of SGD is to facilitate research into the biology of yeast and...

  9. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  10. Warehouse design and product assignment and allocation: A mathematical programming model

    OpenAIRE

    Geraldes, Carla A. S.; Carvalho, Maria Sameiro; Pereira, Guilherme

    2012-01-01

    Warehouses can be considered one of the most important nodes in supply chains. The dynamic nature of today's markets compels organizations to an incessant reassessment in an effort to respond to continuous challenges. Therefore warehouses must be continually re-evaluated to ensure that they are consistent with both market's demands and management's strategies. In this paper we discuss a mathematical programming model aiming to support product assignment and allocation to the functional areas ...

  11. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  12. REDfly: a Regulatory Element Database for Drosophila.

    Science.gov (United States)

    Gallo, Steven M; Li, Long; Hu, Zihua; Halfon, Marc S

    2006-02-01

    Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution.

  13. Warehouse hazardous and toxic waste design in Karingau Balikpapan

    Science.gov (United States)

    Pratama, Bayu Rendy; Kencanawati, Martheana

    2017-11-01

    PT. Balikpapan Environmental Services (PT. BES) is a company whose core business is hazardous and toxic waste management services, comprising storage and transport, in Balikpapan. This research started with data collection, including the type of waste, the quantity of waste, the dimensions of the existing building, and the waste packaging (drum, IBC tank, wooden box and bulk bag). The data processing covers redesigning the warehouse dimensions and the waste layout, and specifying the capacity, the quantity, type and placement of detectors, and the quantity, type and position of fire extinguishers, with reference to Bapedal Regulation No. 01 of 1995, SNI 03-3985-2000 and Employee Minister Regulation RI No. Per-04/Men/1980. The resulting design has warehouse dimensions of 23 m × 22 m × 5 m, with a waste layout arranged according to the type of waste. The design requires 56 detectors. The appropriate type of fire extinguisher for this design is dry powder containing sodium carbonate and alkali salts; about 18 units of 12 kg each are needed.

  14. Finding patients using similarity measures in a rare diseases-oriented clinical data warehouse: Dr. Warehouse and the needle in the needle stack.

    Science.gov (United States)

    Garcelon, Nicolas; Neuraz, Antoine; Benoit, Vincent; Salomon, Rémi; Kracker, Sven; Suarez, Felipe; Bahi-Buisson, Nadia; Hadj-Rabia, Smail; Fischer, Alain; Munnich, Arnold; Burgun, Anita

    2017-09-01

    In the context of rare diseases, it may be helpful to detect patients with similar medical histories, diagnoses and outcomes from a large number of cases with automated methods. To reduce the time to find new cases, we developed a method to find similar patients given an index case, leveraging data from the electronic health records. We used the clinical data warehouse of an academic children's hospital in Paris, France (Necker-Enfants Malades), containing about 400,000 patients. Our model was based on a vector space model (VSM) to compute the similarity distance between an index patient and all the patients of the data warehouse. The dimensions of the VSM were built upon Unified Medical Language System concepts extracted from clinical narratives stored in the clinical data warehouse. The VSM was enhanced using three parameters: a pertinence score (TF-IDF of the concepts), the polarity of the concept (negated/not negated) and the minimum number of concepts in common. We evaluated this model by displaying the most similar patients for five different rare diseases: Lowe Syndrome (LOWE), Dystrophic Epidermolysis Bullosa (DEB), Activated PI3K delta Syndrome (APDS), Rett Syndrome (RETT) and Dowling Meara (EBS-DM), from the clinical data warehouse representing 18, 103, 21, 84 and 7 patients respectively. The percentages of index patients returning at least one true positive similar patient in the Top30 similar patients were 94% for LOWE, 97% for DEB, 86% for APDS, 71% for EBS-DM and 99% for RETT. The mean number of patients with the exact same genetic diseases among the 30 returned patients was 51%. This tool offers new perspectives in a translational context to identify patients for genetic research. Moreover, when new molecular bases are discovered, our strategy will help to identify additional eligible patients for genetic screening. Copyright © 2017. Published by Elsevier Inc.
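
    The vector space model with its three parameters can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the TF-IDF variant or the similarity function, so raw term frequency times log(N/df) and cosine similarity are assumed here. Polarity is modelled by treating a negated concept as a distinct token with a hypothetical `NOT:` prefix, and the concept labels, `min_common` default and function names are illustrative placeholders rather than real UMLS identifiers.

```python
import math
from collections import Counter


def tfidf_vectors(patients: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Build one sparse TF-IDF vector per patient (assumed variant: tf * log(N/df))."""
    n = len(patients)
    # Document frequency: in how many patients each concept appears.
    df = Counter(c for concepts in patients.values() for c in set(concepts))
    vectors = {}
    for pid, concepts in patients.items():
        tf = Counter(concepts)
        vectors[pid] = {c: tf[c] * math.log(n / df[c]) for c in tf}
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(index_id, patients, top_k=30, min_common=2):
    """Rank other patients by similarity to the index patient.

    Patients sharing fewer than `min_common` concepts with the index
    patient are skipped before scoring.
    """
    vectors = tfidf_vectors(patients)
    index_concepts = set(patients[index_id])
    scored = []
    for pid, concepts in patients.items():
        if pid == index_id:
            continue
        if len(index_concepts & set(concepts)) < min_common:
            continue
        scored.append((cosine(vectors[index_id], vectors[pid]), pid))
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:top_k]]
```

    Because a negated concept is a different token from its affirmed form, a patient whose record says a finding is absent does not gain similarity with a patient in whom the finding is present.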

  15. Refrigerated Warehouse Demand Response Strategy Guide

    Energy Technology Data Exchange (ETDEWEB)

    Scott, Doug [VaCom Technologies, San Luis Obispo, CA (United States); Castillo, Rafael [VaCom Technologies, San Luis Obispo, CA (United States); Larson, Kyle [VaCom Technologies, San Luis Obispo, CA (United States); Dobbs, Brian [VaCom Technologies, San Luis Obispo, CA (United States); Olsen, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-11-01

    This guide summarizes demand response measures that can be implemented in refrigerated warehouses. In an appendix, it also addresses related energy efficiency opportunities. Reducing overall grid demand during peak periods and energy consumption has benefits for facility operators, grid operators, utility companies, and society. Statewide demand response potential for the refrigerated warehouse sector in California is estimated to be over 22.1 Megawatts. Two categories of demand response strategies are described in this guide: load shifting and load shedding. Load shifting can be accomplished via pre-cooling, capacity limiting, and battery charger load management. Load shedding can be achieved by lighting reduction, demand defrost and defrost termination, infiltration reduction, and shutting down miscellaneous equipment. Estimation of the costs and benefits of demand response participation yields simple payback periods of 2-4 years. To improve demand response performance, the guide suggests installing air curtains and another form of infiltration barrier, such as a rollup door, for the passageways. Further modifications to increase efficiency of the refrigeration unit are also analyzed. A larger condenser can maintain the minimum saturated condensing temperature (SCT) for more hours of the day. Lowering the SCT reduces the compressor lift, which results in an overall increase in refrigeration system capacity and energy efficiency. Another way of saving energy in refrigerated warehouses is eliminating the use of under-floor resistance heaters. A more energy efficient alternative to resistance heaters is to utilize the heat that is being rejected from the condenser through a heat exchanger. These energy efficiency measures improve efficiency either by reducing the required electric energy input for the refrigeration system, by helping to curtail the refrigeration load on the system, or by reducing both the load and required energy input.

  16. Protocol for a national blood transfusion data warehouse from donor to recipient

    Science.gov (United States)

    van Hoeven, Loan R; Hooftman, Babette H; Janssen, Mart P; de Bruijne, Martine C; de Vooght, Karen M K; Kemper, Peter; Koopman, Maria M W

    2016-01-01

    Introduction: Blood transfusion has health-related, economical and safety implications. In order to optimise the transfusion chain, comprehensive research data are needed. The Dutch Transfusion Data warehouse (DTD) project aims to establish a data warehouse where data from donors and transfusion recipients are linked. This paper describes the design of the data warehouse, challenges and illustrative applications. Study design and methods: Quantitative data on blood donors (eg, age, blood group, antibodies) and products (type of product, processing, storage time) are obtained from the national blood bank. These are linked to data on the transfusion recipients (eg, transfusions administered, patient diagnosis, surgical procedures, laboratory parameters), which are extracted from hospital electronic health records. Applications: Expected scientific contributions are illustrated for 4 applications: determine risk factors, predict blood use, benchmark blood use and optimise process efficiency. For each application, examples of research questions are given and analyses planned. Conclusions: The DTD project aims to build a national, continuously updated transfusion data warehouse. These data have a wide range of applications, on the donor/production side, recipient studies on blood usage and benchmarking and donor–recipient studies, which ultimately can contribute to the efficiency and safety of blood transfusion. PMID:27491665

  17. Warehouse site selection in an international environment

    Directory of Open Access Journals (Sweden)

    Sebastjan ŠKERLIČ

    2013-01-01

    The changed conditions in the automotive industry as the market and the production are moving from west to east, both at global and at European level, require constant adjustment from Slovenian companies. The companies strive to remain close to their customers and suppliers, as only by maintaining a high quality and streamlined supply chain, their existence within the demanding automotive industry is guaranteed in the long term. Choosing the right location for a warehouse in an international environment is therefore one of the most important strategic decisions that takes into account a number of interrelated factors such as transport networks, transport infrastructure, trade flows and the total cost. This paper aims to explore the important aspects of selecting a location for a warehouse and to identify potential international strategic locations, which could have a significant impact on the future operations of Slovenian companies in the global automotive industry.

  18. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  19. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  20. The ATLAS Distributed Data Management System & Databases

    CERN Document Server

    Garonne, V; The ATLAS collaboration; Barisits, M; Beermann, T; Vigne, R; Serfon, C

    2013-01-01

    The ATLAS Distributed Data Management (DDM) System is responsible for the global management of petabytes of high energy physics data. The current system, DQ2, has a critical dependency on Relational Database Management Systems (RDBMS), like Oracle. RDBMS are well suited to enforcing data integrity in online transaction processing applications; however, concerns have been raised about the scalability of its data warehouse-like workload. In particular, analysis of archived data or aggregation of transactional data for summary purposes is problematic. Therefore, we have evaluated new approaches to handle vast amounts of data. We have investigated a class of database technologies commonly referred to as NoSQL databases. This includes distributed filesystems, like HDFS, that support parallel execution of computational tasks on distributed data, as well as schema-less approaches via key-value stores, like HBase. In this talk we will describe our use cases in ATLAS, share our experiences with various databases used ...

  1. Quantitative performance of E-Scribe warehouse in detecting quality issues with digital annotated ECG data from healthy subjects.

    Science.gov (United States)

    Sarapa, Nenad; Mortara, Justin L; Brown, Barry D; Isola, Lamberto; Badilini, Fabio

    2008-05-01

    The US Food and Drug Administration recommends submission of digital electrocardiograms in the standard HL7 XML format into the electrocardiogram warehouse to support preapproval review of new drug applications. The Food and Drug Administration scrutinizes electrocardiogram quality by viewing the annotated waveforms and scoring electrocardiogram quality by the warehouse algorithms. Part of the Food and Drug Administration warehouse is commercially available to sponsors as the E-Scribe Warehouse. The authors tested the performance of E-Scribe Warehouse algorithms by quantifying electrocardiogram acquisition quality, adherence to QT annotation protocol, and T-wave signal strength in 2 data sets: "reference" (104 digital electrocardiograms from a phase I study with sotalol in 26 healthy subjects with QT annotations by computer-assisted manual adjustment) and "test" (the same electrocardiograms with an intentionally introduced predefined number of quality issues). The E-Scribe Warehouse correctly detected differences between the 2 sets expected from the number and pattern of errors in the "test" set (except for 1 subject with QT misannotated in different leads of serial electrocardiograms) and confirmed the absence of differences where none was expected. E-Scribe Warehouse scores below the threshold value identified individual electrocardiograms with questionable T-wave signal strength. The E-Scribe Warehouse showed satisfactory performance in detecting electrocardiogram quality issues that may impair reliability of QTc assessment in clinical trials in healthy subjects.

  2. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  3. Benefits of a clinical data warehouse with data mining tools to collect data for a radiotherapy trial.

    Science.gov (United States)

    Roelofs, Erik; Persoon, Lucas; Nijsten, Sebastiaan; Wiessler, Wolfgang; Dekker, André; Lambin, Philippe

    2013-07-01

    Collecting trial data in a medical environment is at present mostly performed manually and therefore time-consuming, prone to errors and often incomplete with the complex data considered. Faster and more accurate methods are needed to improve the data quality and to shorten data collection times where information is often scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field. In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method. The average time per case to collect the NSCLC data manually was 10.4 ± 2.1 min, versus 4.3 ± 1.1 min when using the automated method (p [...] data collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method. Aggregating multiple data sources in a data warehouse combined with tools for extraction of relevant parameters is beneficial for data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be compensated due to the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  4. Rancang Bangun Data Warehouse Untuk Analisis Kinerja Penjualan Pada Industri Dengan Model Spa-Dw [Design of a data warehouse for sales performance analysis in industry using the SPA-DW model]

    Directory of Open Access Journals (Sweden)

    Randy Oktrima Putra

    2014-02-01

    A company, especially a profit-oriented commercial company, needs to analyze its sales performance in order to improve it. One method is to collect historical sales-related data and process it into information that shows the company's sales performance. A data warehouse is a collection of data that is subject oriented, time variant, integrated and nonvolatile, and that helps company management in decision making. Data warehouse design starts with collecting sales-related data such as products, customers, sales areas and sales transactions. The next steps are data extraction and transformation. Data extraction is the process of selecting the data that will be loaded into the data warehouse. Data transformation applies changes to the extracted data to make it more consistent. After transformation, the data is loaded into the data warehouse, where it is processed by OLAP (On-Line Analytical Processing) to produce information. The information produced by OLAP takes the form of chart and query reports. Chart reports comprise sales charts by cement type, by sales area and by plant, monthly and yearly sales charts, and charts based on customer feedback. Query reports cover sales by cement type, sales area, plant and customer. Keywords: Data warehouse; OLAP; Sales performance analysis; Ready mix market
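
    The extract, transform, load and OLAP-style reporting steps described in this abstract can be sketched in a few lines. This is a minimal illustration, not the SPA-DW model itself: an in-memory SQLite table stands in for the data warehouse, a SQL GROUP BY stands in for the OLAP tool, and the table name, column names, sample rows and function names are all hypothetical.

```python
import sqlite3

# Hypothetical raw sales records with inconsistent formatting.
raw_sales = [
    {"date": "2013-01-15", "cement_type": " OPC ", "area": "north", "tons": "120"},
    {"date": "2013-01-20", "cement_type": "opc", "area": "North", "tons": "80"},
    {"date": "2013-02-03", "cement_type": "PCC", "area": "south", "tons": "50"},
    {"date": "2013-02-10", "cement_type": "PCC", "area": "north", "tons": None},
]


def extract(rows):
    """Extraction: select only the rows that are complete enough to load."""
    return [r for r in rows if r["tons"] is not None]


def transform(rows):
    """Transformation: make codes, casing and types consistent."""
    return [
        (
            r["date"][:7],                    # month, e.g. "2013-01"
            r["cement_type"].strip().upper(),
            r["area"].strip().title(),
            float(r["tons"]),
        )
        for r in rows
    ]


def load(rows):
    """Load: insert the cleaned rows into a single fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales (month TEXT, cement_type TEXT, area TEXT, tons REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    return conn


def sales_by(conn, dimension):
    """Roll sales up along one dimension, as a query report would."""
    if dimension not in {"month", "cement_type", "area"}:
        raise ValueError(f"unknown dimension: {dimension}")
    cursor = conn.execute(
        f"SELECT {dimension}, SUM(tons) FROM fact_sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return cursor.fetchall()
```

    Calling `sales_by(load(transform(extract(raw_sales))), "area")` returns one total per sales area; the same function with `"month"` or `"cement_type"` gives the other report views, and the resulting rows could be handed to any charting tool.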

  5. Warehouse operations planning model for Bausch & Lomb

    NARCIS (Netherlands)

    Atilgan, Ceren

    2009-01-01

    Operations planning is a major part of the Sales & Operations Planning (S&OP) process. It provides an overview on the operations capacity requirements by considering the supply and demand plan. However, Bausch & Lomb does not have a structured operations planning process for their warehouse

  6. Unexpected levels and movement of radon in a large warehouse

    International Nuclear Information System (INIS)

    Gammage, R.B.; Espinosa, G.

    2004-01-01

    Alpha-track detectors, used in screening for radon, identified a large warehouse with levels of radon as high as 20 pCi/L. This circumstance was unexpected because large bay doors were left open for much of the day to admit 18-wheeler trucks, and exhaust fans in the roof produced good ventilation. More detailed temporal and spatial investigations of radon and air-flow patterns were made with electret chambers, Lucas-cell flow chambers, tracer gas, smoke pencils and pressure sensing micrometers. An oval-dome shaped zone of radon (>4 pCi/L) persisted in the central region of each of four separate bays composing the warehouse. Detailed studies of air movement in the bay with the highest levels of radon showed clockwise rotation of air near the outer walls with a central dead zone. Sub-slab, radon-laden air ingresses the building through expansion joints between the floor slabs to produce the measured radon. The likely source of radon is air within porous, karst bedrock that underlies much of north-central Tennessee where the warehouse is situated.

  7. Clinical Data Warehouse: An Effective Tool to Create Intelligence in Disease Management.

    Science.gov (United States)

    Karami, Mahtab; Rahimi, Azin; Shahmirzadi, Ali Hosseini

    Clinical business intelligence tools such as clinical data warehouse enable health care organizations to objectively assess the disease management programs that affect the quality of patients' life and well-being in public. The purpose of these programs is to reduce disease occurrence, improve patient care, and decrease health care costs. Therefore, applying clinical data warehouse can be effective in generating useful information about aspects of patient care to facilitate budgeting, planning, research, process improvement, external reporting, benchmarking, and trend analysis, as well as to enable the decisions needed to prevent the progression or appearance of the illness aligning with maintaining the health of the population. The aim of this review article is to describe the benefits of clinical data warehouse applications in creating intelligence for disease management programs.

  8. Population dynamics of stored maize insect pests in warehouses in two districts of Ghana

    Science.gov (United States)

    Understanding what insect species are present and their temporal and spatial patterns of distribution is important for developing a successful integrated pest management strategy for food storage in warehouses. Maize in many countries in Africa is stored in bags in warehouses, but little monitoring ...

  9. A Simulation Modeling Approach Method Focused on the Refrigerated Warehouses Using Design of Experiment

    Science.gov (United States)

    Cho, G. S.

    2017-09-01

    For performance optimization of Refrigerated Warehouses, design parameters are selected based on the physical parameters such as number of equipment and aisles, speeds of forklift for ease of modification. This paper provides a comprehensive framework approach for the system design of Refrigerated Warehouses. We propose a modeling approach which aims at the simulation optimization so as to meet required design specifications using the Design of Experiment (DOE) and analyze a simulation model using integrated aspect-oriented modeling approach (i-AOMA). As a result, this suggested method can evaluate the performance of a variety of Refrigerated Warehouses operations.

  10. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework to design MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
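
    For readers unfamiliar with the accuracy measures named in this abstract, the sketch below computes sensitivity, PPV, F-score and MCC for a prediction on a binary space, using their standard confusion-matrix definitions. It does not implement the paper's estimators; the function name and the example vectors are hypothetical.

```python
import math

def accuracy_measures(truth, pred):
    """Return sensitivity, PPV, F-score and MCC for two equal-length 0/1 vectors."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    # F-score is the harmonic mean of sensitivity and PPV.
    f_score = (2 * ppv * sensitivity / (ppv + sensitivity)
               if ppv + sensitivity else 0.0)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f_score": f_score, "mcc": mcc}

# Illustrative reference and predicted binary vectors (e.g. base-pair indicators).
truth = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]
measures = accuracy_measures(truth, pred)
print(measures)
```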

  11. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. The potency of these resources should be explored to discover new drugs for human wealth. However, bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method is based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potency of biomaterials from Indonesian biodiversity by comparing species with well-known taxa that contain the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anti-cancer agent from a marine invertebrate. It was explored based on the isolated compounds of the marine invertebrate reported in published articles and databases; the protein target was then identified, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactives from Indonesian biodiversity for source of drug and another

  12. A Million Cancer Genome Warehouse

    Science.gov (United States)

    2012-11-20

    of a national program for Cancer Information Donors, the American Society for Clinical Oncology (ASCO) has proposed a rapid learning system for... or Scala and Spark; “scrum” organization of small programming teams; calculating “velocity” to predict time to develop new features; and Agile...

  13. Contextual snowflake modelling for pattern warehouse logical design

    Indian Academy of Sciences (India)

    being managed by the pattern warehouse management system (PWMS) ... The authors pointed out that the necessity to find out the relationship between patterns .... (i) Some customer queries can only be satisfied by specific DM technique.

  14. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: (1) collect statistics from biological data; (2) build a computational model; (3) solve a computational modeling problem; (4) test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  15. IR and OLAP in XML document warehouses

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Pedersen, Torben Bach; Berlanga, Rafael

    2005-01-01

    In this paper we propose to combine IR and OLAP (On-Line Analytical Processing) technologies to exploit a warehouse of text-rich XML documents. In the system we plan to develop, a multidimensional implementation of a relevance modeling document model will be used for interactively querying...

  16. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  17. Configurable Web Warehouses construction through BPM Systems

    Directory of Open Access Journals (Sweden)

    Andrea Delgado

    2016-08-01

    The process of building Data Warehouses (DW) is well known and has well-defined stages, but it is still mostly carried out manually by IT people in conjunction with business people. Web Warehouses (WW) are DW whose data sources are taken from the web. We define a flexible WW, which can be configured for different domains through the selection of the web sources and the definition of data processing characteristics. A Business Process Management (BPM) System allows modeling and executing Business Processes (BPs), providing support for the automation of processes. To support the process of building flexible WW we propose two levels of BPs: a configuration process to support the selection of web sources and the definition of schemas and mappings, and a feeding process which takes the defined configuration and loads the data into the WW. In this paper we present a proof of concept of both processes, with focus on the configuration process and the defined data.

  18. Design of data warehouse in teaching state based on OLAP and data mining

    Science.gov (United States)

    Zhou, Lijuan; Wu, Minhua; Li, Shuang

    2009-04-01

    Data warehousing and data mining are among the most active research topics in information technology. Both are widely applied in commerce, finance, manufacturing and marketing, but comparatively little in education. Colleges and universities have accumulated large amounts of teaching and management data over the years, yet these data are not used effectively. Given the social demands on university development and the current state of data management, it is particularly important to establish a data warehouse of university teaching state, make better use of existing data, and on that basis carry out higher-level processing, namely data mining. Starting from decision-making needs, this paper designs the structure of a data warehouse for university teaching state, builds the warehouse model through structural design and data extraction, conversion and loading, and finally applies an association rule mining algorithm to obtain results that can be used in practice. The data analysis and mining yield valuable information that can guide teaching management, thereby improving teaching quality, promoting commitment to teaching and enhancing teaching infrastructure. It can also provide detailed, multi-dimensional information for university assessment and higher education research.
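
    The abstract does not say which association rule algorithm was used, so the sketch below only illustrates the two quantities any such algorithm relies on, support and confidence, by enumerating single-item rules over a handful of hypothetical student records. The item names and thresholds are assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical teaching-state records, one set of observed attributes per student.
transactions = [
    {"attends_lectures", "submits_homework", "passes"},
    {"attends_lectures", "passes"},
    {"submits_homework", "passes"},
    {"attends_lectures", "submits_homework", "passes"},
    {"attends_lectures"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base

def pairwise_rules(transactions, min_support, min_confidence):
    """Enumerate rules A -> B over single items that pass both thresholds."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in permutations(items, 2):
        s = support({a, b}, transactions)
        c = confidence({a}, {b}, transactions)
        if s >= min_support and c >= min_confidence:
            found.append((a, b, s, c))
    return found

rules = pairwise_rules(transactions, min_support=0.5, min_confidence=0.8)
for a, b, s, c in rules:
    print(f"{a} -> {b}: support={s:.2f}, confidence={c:.2f}")
```

    A full miner such as Apriori would extend this to larger itemsets and prune candidates by support; the brute-force enumeration here is only practical for very small item sets.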

  19. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzhen, China, September 20-22, 2017.

  20. Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution.

    Science.gov (United States)

    Sebaa, Abderrazak; Chikh, Fatima; Nouicer, Amina; Tari, AbdelKamel

    2018-02-19

    The huge increases in medical devices and clinical applications which generate enormous data have raised a big issue in managing, processing, and mining this massive amount of data. Indeed, traditional data warehousing frameworks cannot be effective when managing the volume, variety, and velocity of current medical applications. As a result, several data warehouses face many issues over medical data and many challenges need to be addressed. New solutions have emerged, and Hadoop is one of the best examples: it can be used to process these streams of medical data. However, without an efficient system design and architecture, its performance will not be significant and valuable for medical managers. In this paper, we provide a short review of the literature about research issues of traditional data warehouses and we present some important Hadoop-based data warehouses. In addition, a Hadoop-based architecture and a conceptual data model for designing a medical Big Data warehouse are given. In our case study, we provide implementation details of a big data warehouse based on the proposed architecture and data model in the Apache Hadoop platform to ensure an optimal allocation of health resources.

  1. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  2. Two-warehouse partial backlogging inventory model for deteriorating items with linear trend in demand under inflationary conditions

    Science.gov (United States)

    Jaggi, Chandra K.; Khanna, Aditi; Verma, Priyanka

    2011-07-01

    In today's business transactions, there are various reasons, namely, bulk purchase discounts, re-ordering costs, seasonality of products, inflation induced demand, etc., which force the buyer to order more than the warehouse capacity. Such situations call for additional storage space to store the excess units purchased. This additional storage space is typically a rented warehouse. Inflation plays a very interesting and significant role here: It increases the cost of goods. To safeguard from the rising prices, during the inflation regime, the organisation prefers to keep a higher inventory, thereby increasing the aggregate demand. This additional inventory needs additional storage space, which is facilitated by a rented warehouse. Ignoring the effects of the time value of money and inflation might yield misleading results. In this study, a two-warehouse inventory model with linear trend in demand under inflationary conditions having different rates of deterioration has been developed. Shortages at the owned warehouse are also allowed subject to partial backlogging. The solution methodology provided in the model helps to decide on the feasibility of renting a warehouse. Finally, findings have been illustrated with the help of numerical examples. Comprehensive sensitivity analysis has also been provided.

  3. Database and applications security integrating information security and data management

    CERN Document Server

    Thuraisingham, Bhavani

    2005-01-01

    This is the first book to provide an in-depth coverage of all the developments, issues and challenges in secure databases and applications. It provides directions for data and application security, including securing emerging applications such as bioinformatics, stream information processing and peer-to-peer computing. Divided into eight sections, each of which focuses on a key concept of secure databases and applications, this book deals with all aspects of technology, including secure relational databases, inference problems, secure object databases, secure distributed databases and emerging

  4. The Pediatrix BabySteps® Data Warehouse--a unique national resource for improving outcomes for neonates.

    Science.gov (United States)

    Spitzer, Alan R; Ellsbury, Dan; Clark, Reese H

    2015-01-01

    The Pediatrix Medical Group Clinical Data Warehouse represents a unique electronic data capture system for the assessment of outcomes, the management of quality improvement (CQI) initiatives, and the resolution of important research questions in the neonatal intensive care unit (NICU). This system is described in detail and the manner in which the Data Warehouse has been used to measure and improve patient outcomes through CQI projects and research is outlined. The Pediatrix Data Warehouse now contains more than 1 million patients, serving as an exceptional tool for evaluating NICU care. Examples are provided of how significant outcome improvement has been achieved and several papers are cited that have used the "Big Data" contained in the Data Warehouse for novel observations that could not be made otherwise.

  5. Warehousing performance improvement using Frazelle Model and per group benchmarking: A case study in retail warehouse in Yogyakarta and Central Java

    Directory of Open Access Journals (Sweden)

    Kusrini Elisa

    2018-01-01

    Warehouse performance management plays an important role in improving logistics business activities. Good warehouse management can increase profit, delivery timeliness, quality and customer service. This study assesses the performance of retail warehouses of several supermarkets located in Central Java and Yogyakarta. Performance improvement is proposed based on warehouse measurement using the Frazelle model (2002), which measures five indicators, namely Financial, Productivity, Utility, Quality and Cycle Time, across five warehousing business processes, i.e. Receiving, Put Away, Storage, Order Picking and Shipping. To obtain a more precise performance measure, the indicators are weighted using the Analytic Hierarchy Process (AHP) method. Warehouse performance is then measured and the final score is determined using the SNORM method. The study yields the final score of each warehouse and opportunities to improve warehouse performance using peer group benchmarking.
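
    The weighting and scoring steps can be sketched as below. The abstract gives no formulas, so this assumes two commonly described forms: AHP weights approximated by averaging the column-normalized pairwise comparison matrix, and SNORM as linear rescaling of each indicator to 0-100 between its worst and best values. The comparison matrix, indicator names and measurements are hypothetical.

```python
def ahp_weights(matrix):
    """Approximate AHP priority weights by averaging column-normalized rows."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [sum(matrix[r][c] / col_sums[c] for c in range(n)) / n for r in range(n)]

def snorm(actual, worst, best):
    """Rescale a measurement to 0-100; works for both directions.

    For larger-is-better indicators pass worst < best; for
    lower-is-better indicators (e.g. cycle time) pass worst > best.
    """
    return (actual - worst) / (best - worst) * 100

# Hypothetical pairwise comparisons among three indicators
# (productivity, cycle time, quality); entry [i][j] says how much
# more important indicator i is than indicator j.
comparisons = [
    [1, 2, 6],
    [1 / 2, 1, 3],
    [1 / 6, 1 / 3, 1],
]
weights = ahp_weights(comparisons)

# Hypothetical measurements as (actual, worst, best).
scores = [
    snorm(80, 50, 100),  # productivity, larger is better
    snorm(30, 60, 20),   # cycle time in minutes, lower is better
    snorm(95, 90, 100),  # quality in percent, larger is better
]

final_score = sum(w * s for w, s in zip(weights, scores))
print([round(w, 3) for w in weights], scores, round(final_score, 2))
```

    Comparing `final_score` across warehouses in the same peer group is what would then point to the indicators with the most room for improvement.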

  6. Order Picking Process in Warehouse: Case Study of Dairy Industry in Croatia

    Directory of Open Access Journals (Sweden)

    Josip Habazin

    2017-02-01

    The proper functioning of warehouse processes is fundamental for operational improvement and overall logistic supply chain improvement. Order picking is considered one of the most important from the group. Throughout picking orders in warehouses, the presence of human work is highly reflected, with the main goal to reduce the process time as much as possible, that is, to the very minimum. There are several different order picking methods, and nowadays, the most common ones are being developed and are significantly dependent on the type of goods, the warehouse equipment, etc., and those that stand out are scanning and picking by voice. This paper will provide information regarding the dairy industry in the Republic of Croatia with the analysis of order picking process in the observed company. Overall research highlighted the problem and resulted in proposals of solutions.

  7. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  8. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  9. Creating databases for biological information: an introduction.

    Science.gov (United States)

    Stein, Lincoln

    2013-06-01

    The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by John Wiley & Sons, Inc.

  10. The Impact of E-Commerce Development on the Warehouse Space Market in Poland

    Directory of Open Access Journals (Sweden)

    Dembińska Izabela

    2016-12-01

    The subject of discussion in the article is the impact of e-commerce sector on the warehouse space market. On the basis of available reports, the development of e-commerce has been characterized in Poland, showing the dynamics and the type of change. The needs of e-commerce sector in the field of logistics, in particular in the area of storage, have been presented in the paper. These needs have been characterized and at the same time, how representatives of the warehouse space market are prepared to support companies in the e-commerce sector is also discussed. The considerations are illustrated by the changes that occur as a result of the development of e-commerce on the warehouse space market in Poland.

  11. Worldwide Warehouse: A Customer Perspective

    Science.gov (United States)

    1994-09-01

    Management Office (PMO) and the customers (returnees and buyers) will be developed or adapted from existing software programs. The hardware could be... customer requirements and desires is the first aspect to be approached. Sections 4.7 to 4.11 were dedicated to investigate those relationships and...

  12. Minimizing Warehouse Space with a Dedicated Storage Policy

    Directory of Open Access Journals (Sweden)

    Andrea Fumi

    2013-07-01

    inevitably be supported by warehouse management system software. On the contrary, the proposed methodology relies upon a dedicated storage policy, which is easily implementable by companies of all sizes without the need for investing in expensive IT tools.

  13. Ontology based heterogeneous materials database integration and semantic query

    Science.gov (United States)

    Zhao, Shuai; Qian, Quan

    2017-10-01

    Materials digital data, high throughput experiments and high throughput computations are regarded as three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data has become urgent and has gradually become a hot topic of materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply at the semantic level when adopting conventional heterogeneous database integration approaches such as federated databases or data warehouses. In this paper, a semantic integration method is proposed that creates the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated into the ontology by means of relational algebra and the rooted graph. Based on the integrated ontology, semantic queries can be performed using SPARQL. In the experiments, two well-known first-principles computational databases, OQMD and Materials Project, are used as the integration targets, which shows the availability and effectiveness of our method.

  14. The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database

    Directory of Open Access Journals (Sweden)

    Okba Selama

    2013-01-01

    Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly, the taxonomy and the geographical area origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
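
    The abstract mentions E-utilities and Python but not the actual queries, so the following is only a sketch of how an E-utilities `esearch` request against the NCBI Nucleotide database (`nuccore`) can be assembled. The helper name, search term and `retmax` value are assumptions, and the code builds the URL without sending it.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(term, db="nuccore", retmax=20):
    """Assemble an esearch URL; the term and defaults are illustrative only."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"

# Hypothetical query: nucleotide records assigned to the Proteobacteria.
url = build_esearch_url("Proteobacteria[Organism]")
print(url)
```

    Fetching this URL (for example with `urllib.request.urlopen`) would return matching record identifiers, whose metadata could then be retrieved with further E-utilities calls; NCBI's usage guidelines on request rates apply to any real run.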

  15. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    Science.gov (United States)

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.

  16. Design of a Semantic Data Warehouse System Using Ontology and Rule-Based Methods to Process Academic Data of University XYZ in Bali

    Directory of Open Access Journals (Sweden)

    Made Pradnyana Ambara

    2016-06-01

    Conventional data warehouses, often referred to as traditional data warehouses, have several weaknesses that result in data whose quality is neither specific nor effective. A semantic data warehouse system is a solution to the problems of traditional data warehouses, with advantages that include specific data quality management with a uniform data format to support good OLAP reports, and more effective information search performance using natural-language keywords. Modeling the semantic data warehouse system with the ontology method produces a Resource Description Framework Schema (RDFS) logic model, which is then transformed into a snowflake schema. The required academic reports are produced with Kimball's nine-step method, and semantic search uses a rule-based method. Testing was carried out with two methods: black-box testing and a checklist questionnaire. From the results of this study it can be concluded that the semantic data warehouse system can support the processing of academic data and produce quality reports that support decision making.

  17. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  18. The ESID Online Database network.

    Science.gov (United States)

    Guzman, D; Veit, D; Knerr, V; Kindle, G; Gathmann, B; Eades-Perner, A M; Grimbacher, B

    2007-03-01

    Primary immunodeficiencies (PIDs) belong to the group of rare diseases. The European Society for Immunodeficiencies (ESID) is establishing an innovative European patient and research database network for continuous long-term documentation of patients, in order to improve the diagnosis, classification, prognosis and therapy of PIDs. The ESID Online Database is a web-based system aimed at data storage, data entry, reporting and the import of pre-existing data sources in an enterprise business-to-business integration (B2B). The online database is based on Java 2 Enterprise System (J2EE) with high-standard security features, which comply with data protection laws and the demands of a modern research platform. The ESID Online Database is accessible via the official website (http://www.esid.org/). Supplementary data are available at Bioinformatics online.

  19. The Analytic Information Warehouse (AIW): a platform for analytics using electronic health record data.

    Science.gov (United States)

    Post, Andrew R; Kurc, Tahsin; Cholleti, Sharath; Gao, Jingjing; Lin, Xia; Bornstein, William; Cantrell, Dedra; Levine, David; Hohmann, Sam; Saltz, Joel H

    2013-06-01

    To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30 days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in 5 years of data from our institution's clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies. Copyright © 2013 Elsevier Inc. All rights reserved.
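
    To make the idea of a derived variable concrete, the sketch below computes a 30-day readmission flag from encounter records that are already in one common shape, and enforces a simple invariant before computing. The field names (`patient_id`, `admit`, `discharge`) and the sample encounters are assumptions for illustration. This is not the AIW software or its data model.

```python
from datetime import date


def flag_readmissions(encounters, window_days=30):
    """Return (patient_id, discharge, readmit, gap_days) for each stay
    that is followed by a new admission within `window_days`."""
    by_patient = {}
    for enc in encounters:
        # Invariant: a stay cannot end before it starts.
        if enc["discharge"] < enc["admit"]:
            raise ValueError(f"discharge precedes admission: {enc}")
        by_patient.setdefault(enc["patient_id"], []).append(enc)

    flagged = []
    for patient_id, stays in by_patient.items():
        stays = sorted(stays, key=lambda e: e["admit"])
        # Compare each stay with the next one for the same patient.
        for prev, nxt in zip(stays, stays[1:]):
            gap = (nxt["admit"] - prev["discharge"]).days
            if 0 <= gap <= window_days:
                flagged.append((patient_id, prev["discharge"], nxt["admit"], gap))
    return flagged


# Hypothetical encounters, invented for this example.
encounters = [
    {"patient_id": "A", "admit": date(2012, 1, 5), "discharge": date(2012, 1, 10)},
    {"patient_id": "A", "admit": date(2012, 1, 25), "discharge": date(2012, 1, 28)},
    {"patient_id": "B", "admit": date(2012, 3, 1), "discharge": date(2012, 3, 3)},
    {"patient_id": "B", "admit": date(2012, 6, 1), "discharge": date(2012, 6, 2)},
]
readmissions = flag_readmissions(encounters)
```

    Defining the variable once against a common record shape is what allows it to be reused across source systems, which is the property the abstract emphasizes.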

  20. Benefits of a clinical data warehouse with data mining tools to collect data for a radiotherapy trial

    International Nuclear Information System (INIS)

    Roelofs, Erik; Persoon, Lucas; Nijsten, Sebastiaan; Wiessler, Wolfgang; Dekker, André; Lambin, Philippe

    2013-01-01

    Introduction: Collecting trial data in a medical environment is at present mostly performed manually and therefore time-consuming, prone to errors and often incomplete with the complex data considered. Faster and more accurate methods are needed to improve the data quality and to shorten data collection times where information is often scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field. Material and methods: In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method. Results: The average time per case to collect the NSCLC data manually was 10.4 ± 2.1 min and 4.3 ± 1.1 min when using the automated method (p < 0.001). For rectal cancer, these times were 13.5 ± 4.1 and 6.8 ± 2.4 min, respectively (p < 0.001). In 3.2% of the data collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method. Conclusions: Aggregating multiple data sources in a data warehouse combined with tools for extraction of relevant parameters is beneficial for data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be compensated due to the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases

  1. From capturing nursing knowledge to retrieval of data from a data warehouse.

    Science.gov (United States)

    Thoroddsen, Asta; Guðjónsdóttir, Hanna K; Guðjónsdóttir, Elisabet

    2014-01-01

    The purpose of the project was to capture nursing data and knowledge, represent it for use and re-use by retrieval from a data warehouse, which contains both clinical and financial hospital data. Today nurses at LUH use standardized nursing terminologies to document information related to patients and the nursing care in the EHR at all times. Pre-defined order sets for nursing care have been developed using best practice where available and tacit nursing knowledge has been captured and coded with standardized nursing terminologies and made explicit for dissemination in the EHR. All patient-nursing data is permanently stored in a data repository. Core nursing data elements have been selected for transfer and storage in the data warehouse and patient-nursing data are now captured, stored, can be related to other data elements from the warehouse and be retrieved for use and re-use.

  2. 78 FR 65300 - Notice of Availability (NOA) for General Purpose Warehouse and Information Technology Center...

    Science.gov (United States)

    2013-10-31

    ... (NOA) for General Purpose Warehouse and Information Technology Center Construction (GPW/IT)--Tracy Site... proposed action to construct a General Purpose Warehouse and Information Technology Center at Defense..., Suite 02G09, Alexandria, VA 22350- 3100. FOR FURTHER INFORMATION CONTACT: Ann Engelberger at (703) 767...

  3. Analyzing the field of bioinformatics with the multi-faceted topic modeling technique.

    Science.gov (United States)

    Heo, Go Eun; Kang, Keun Young; Song, Min; Lee, Jeong-Hoon

    2017-05-31

    Bioinformatics is an interdisciplinary field at the intersection of molecular biology and computing technology. To characterize the field as a convergent domain, researchers have used bibliometrics, augmented with text-mining techniques for content analysis. In previous studies, Latent Dirichlet Allocation (LDA) was the most representative topic modeling technique for identifying the topic structure of subject areas. However, as opposed to revealing the topic structure in relation to metadata such as authors, publication date, and journals, LDA only displays the simple topic structure. In this paper, we adopt Tang et al.'s Author-Conference-Topic (ACT) model to study the field of bioinformatics from the perspective of keyphrases, authors, and journals. The ACT model is capable of incorporating the paper, author, and conference into the topic distribution simultaneously. To obtain more meaningful results, we use journals and keyphrases instead of conferences and bag-of-words. For analysis, we use PubMed to collect forty-six bioinformatics journals from the MEDLINE database. We conducted time series topic analysis over four periods from 1996 to 2015 to further examine the interdisciplinary nature of bioinformatics. We analyze the ACT model results in each period. Additionally, for further integrated analysis, we conduct a time series analysis among the top-ranked keyphrases, journals, and authors according to their frequency. We also examine the patterns in the top journals by simultaneously identifying the topical probability in each period, as well as the top authors and keyphrases. The results indicate that in recent years diversified topics have become more prevalent and convergent topics have become more clearly represented. The results of our analysis imply that over time the field of bioinformatics becomes more interdisciplinary, with a steady increase in peripheral fields such as conceptual, mathematical, and system biology. These results are

  4. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology

    Directory of Open Access Journals (Sweden)

    Fernando González-Nilo

    2011-01-01

    After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  5. Migration from relational to NoSQL database

    Science.gov (United States)

    Ghotiya, Sunita; Mandal, Juhi; Kandasamy, Saravanakumar

    2017-11-01

    Data generated by real-time applications, social networking sites and sensor devices is very large in volume and unstructured, which makes it difficult for relational database management systems to handle. Data is a precious component of any application and needs to be analysed after it has been arranged in some structure. Relational databases can only deal with structured data, so there is a need for NoSQL database management systems, which can also deal with semi-structured data. Relational databases provide the easiest way to manage data, but as the use of NoSQL increases it is becoming necessary to migrate data from relational to NoSQL databases. Various frameworks have been proposed that provide mechanisms for migrating data stored in SQL warehouses, as well as middle-layer solutions that allow data to be stored in NoSQL databases in order to handle data that is not structured. This paper provides a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Some researchers have proposed mechanisms for the co-existence of NoSQL and relational databases. This paper provides a summary of mechanisms that can be used for mapping data stored in relational databases to NoSQL databases. Various techniques for data transformation and middle-layer solutions are summarised in the paper.
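
    One common way to map relational data to a document store is to embed child rows inside their parent row, so that a foreign-key join becomes a nested list. The sketch below illustrates that single mapping under assumed table and column names. It is not one of the frameworks reviewed in the paper, which are not detailed in this abstract.

```python
def embed_children(parents, children, parent_key, foreign_key, field):
    """Turn parent rows into documents that embed their child rows.

    `parents` and `children` are lists of dicts (one dict per row);
    `field` is the name of the nested list added to each parent.
    """
    docs = {p[parent_key]: {**p, field: []} for p in parents}
    for child in children:
        parent = docs.get(child[foreign_key])
        if parent is not None:
            # Drop the foreign key: nesting already records the link.
            nested = {k: v for k, v in child.items() if k != foreign_key}
            parent[field].append(nested)
    return list(docs.values())


# Hypothetical rows from two relational tables.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
    {"order_id": 12, "customer_id": 2, "total": 15.0},
]
documents = embed_children(customers, orders, "id", "customer_id", "orders")
```

    Embedding suits one-to-many data that is read together. Many-to-many relations are usually kept as references instead, which is one reason such migrations are rarely fully automatic.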

  6. Biopython: freely available Python tools for computational molecular biology and bioinformatics

    DEFF Research Database (Denmark)

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T

    2009-01-01

    SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments......, dealing with 3D macro molecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. AVAILABILITY: Biopython is freely available, with documentation and source code at (www...

  7. SysBioCube: A Data Warehouse and Integrative Data Analysis Platform Facilitating Systems Biology Studies of Disorders of Military Relevance.

    Science.gov (United States)

    Chowbina, Sudhir; Hammamieh, Rasha; Kumar, Raina; Chakraborty, Nabarun; Yang, Ruoting; Mudunuri, Uma; Jett, Marti; Palma, Joseph M; Stephens, Robert

    2013-01-01

    SysBioCube is an integrated data warehouse and analysis platform for experimental data relating to diseases of military relevance developed for the US Army Medical Research and Materiel Command Systems Biology Enterprise (SBE). It brings together, under a single database environment, pathophysio-, psychological, molecular and biochemical data from mouse models of post-traumatic stress disorder and (pre-) clinical data from human PTSD patients. SysBioCube will organize, centralize and normalize this data and provide an access portal for subsequent analysis to the SBE. It provides new or expanded browsing, querying and visualization to provide better understanding of the systems biology of PTSD, all brought about through the integrated environment. We employ Oracle database technology to store the data using an integrated hierarchical database schema design. The web interface provides researchers with systematic information and the option to interrogate the profiles of pan-omics components across different data types, experimental designs and other covariates.

  8. Protocol for a national blood transfusion data warehouse from donor to recipient.

    Science.gov (United States)

    van Hoeven, Loan R; Hooftman, Babette H; Janssen, Mart P; de Bruijne, Martine C; de Vooght, Karen M K; Kemper, Peter; Koopman, Maria M W

    2016-08-04

    Blood transfusion has health-related, economical and safety implications. In order to optimise the transfusion chain, comprehensive research data are needed. The Dutch Transfusion Data warehouse (DTD) project aims to establish a data warehouse where data from donors and transfusion recipients are linked. This paper describes the design of the data warehouse, challenges and illustrative applications. Quantitative data on blood donors (eg, age, blood group, antibodies) and products (type of product, processing, storage time) are obtained from the national blood bank. These are linked to data on the transfusion recipients (eg, transfusions administered, patient diagnosis, surgical procedures, laboratory parameters), which are extracted from hospital electronic health records. Expected scientific contributions are illustrated for 4 applications: determine risk factors, predict blood use, benchmark blood use and optimise process efficiency. For each application, examples of research questions are given and analyses planned. The DTD project aims to build a national, continuously updated transfusion data warehouse. These data have a wide range of applications, on the donor/production side, recipient studies on blood usage and benchmarking and donor-recipient studies, which ultimately can contribute to the efficiency and safety of blood transfusion. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

  9. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  10. On sustainable operation of warehouse order picking systems

    NARCIS (Netherlands)

    Andriansyah, R.; Etman, L.F.P.; Rooda, J.E.

    2009-01-01

    Sustainable development calls for an efficient utilization of natural and human resources. This issue also arises for warehouse systems, where typically extensive capital investment and labor intensive work are involved. It is therefore important to assess and continuously monitor the performance of

  11. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  12. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  13. Routing Optimization of Intelligent Vehicle in Automated Warehouse

    Directory of Open Access Journals (Sweden)

    Yan-cong Zhou

    2014-01-01

    Routing optimization is a key technology in intelligent warehouse logistics. In order to obtain an optimal route for an intelligent warehouse vehicle, routing optimization in a complex, global, dynamic environment is studied. A new evolutionary ant colony algorithm based on RFID and knowledge refinement is proposed. The new algorithm obtains environmental information in a timely manner through RFID technology and updates the environment map at the same time. It adopts elite-ant retention, fallback, and pheromone-limit adjustment strategies. The current optimal route in the population space is optimized based on experiential knowledge. The experimental results show that the new algorithm converges faster and can easily escape U-type or V-type obstacle traps. It can also find the global optimal route, or an approximately optimal one, with higher probability in a complex dynamic environment. Simulation results show the new algorithm to be feasible and effective.
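
    The abstract names three strategies (keeping an elite ant, falling back from dead ends, and limiting pheromone levels) without giving their details. The sketch below is a generic ant colony search on a small grid that implements plain versions of those three ideas. The grid, parameters and update rule are assumptions for illustration, and the RFID map updates and knowledge refinement of the paper are not modelled.

```python
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def neighbors(cell, grid):
    """Yield the free (value 0) 4-connected neighbours of a cell."""
    r, c = cell
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
            yield (nr, nc)


def walk(grid, start, goal, tau, rng, max_steps=500):
    """Let one ant walk from start to goal, stepping back out of dead ends."""
    path, visited = [start], {start}
    for _ in range(max_steps):
        cur = path[-1]
        if cur == goal:
            return path
        options = [n for n in neighbors(cur, grid) if n not in visited]
        if not options:
            path.pop()  # fallback: retreat from a dead end
            if not path:
                return None
            continue
        weights = [tau[(cur, n)] for n in options]
        nxt = rng.choices(options, weights=weights)[0]
        visited.add(nxt)
        path.append(nxt)
    return path if path and path[-1] == goal else None


def aco_route(grid, start, goal, n_ants=8, n_iter=20, rho=0.3,
              tau_min=0.1, tau_max=5.0, seed=0):
    rng = random.Random(seed)
    free = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v == 0]
    tau = {(a, b): tau_max for a in free for b in neighbors(a, grid)}
    best = None
    for _ in range(n_iter):
        for _ in range(n_ants):
            path = walk(grid, start, goal, tau, rng)
            if path and (best is None or len(path) < len(best)):
                best = path  # keep the elite ant's route
        for edge in tau:
            tau[edge] *= 1.0 - rho  # evaporation
        if best:
            for edge in zip(best, best[1:]):
                tau[edge] += 1.0  # only the elite route deposits
        for edge in tau:
            tau[edge] = min(tau_max, max(tau_min, tau[edge]))  # pheromone limits
    return best


# 1 marks an obstacle; row 2 leads into a U-shaped trap opening to the left.
grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
route = aco_route(grid, start=(2, 0), goal=(2, 5))
```

    An ant that wanders into the trap in row 2 steps back out instead of being lost, and the pheromone limits keep every edge selectable so the search does not lock onto one early route.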

  14. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  15. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  16. The Xeno-glycomics database (XDB): a relational database of qualitative and quantitative pig glycome repertoire.

    Science.gov (United States)

    Park, Hae-Min; Park, Ju-Hyeong; Kim, Yoon-Woo; Kim, Kyoung-Jin; Jeong, Hee-Jin; Jang, Kyoung-Soon; Kim, Byung-Gee; Kim, Yun-Gon

    2013-11-15

    In recent years, the improvement of mass spectrometry-based glycomics techniques (i.e. highly sensitive, quantitative and high-throughput analytical tools) has enabled us to obtain a large dataset of glycans. Here we present a database named Xeno-glycomics database (XDB) that contains cell- or tissue-specific pig glycomes analyzed with mass spectrometry-based techniques, including comprehensive pig glycan information on chemical structures, mass values, types and relative quantities. It was designed as a user-friendly web-based interface that allows users to query the database according to pig tissue/cell types or glycan masses. This database will contribute by providing qualitative and quantitative information on glycomes characterized from various pig cells/organs in xenotransplantation and might eventually provide new targets in the era of α1,3-galactosyltransferase gene-knockout pigs. The database can be accessed on the web at http://bioinformatics.snu.ac.kr/xdb.

  17. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  18. Event-Entity-Relationship Modeling in Data Warehouse Environments

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    We use the event-entity-relationship model (EVER) to illustrate the use of entity-based modeling languages for conceptual schema design in data warehouse environments. EVER is a general-purpose information modeling language that supports the specification of both general schema structures and multi...

  19. Using Data Warehouses to extract knowledge from Agro-Hydrological simulations

    Science.gov (United States)

    Bouadi, Tassadit; Gascuel-Odoux, Chantal; Cordier, Marie-Odile; Quiniou, René; Moreau, Pierre

    2013-04-01

    In recent years, simulation models have been used more and more in hydrology to test the effect of scenarios and help stakeholders in decision making. Agro-hydrological models have oriented agricultural water management by testing the effect of landscape structure and farming system changes on water and chemical emissions in rivers. Such models generate a large amount of data, while only a small part of it, such as daily concentrations at the outlet of the catchment or annual budgets regarding soil, water and atmosphere emissions, is stored and analyzed. Thus, a great amount of information is lost from the simulation process. This is due to the large volumes of simulated data, but also to the difficulties in analyzing and transforming the data into usable information. In this talk we illustrate a data warehouse which has been built to store and manage simulation data coming from the agro-hydrological model TNT (Topography-based nitrogen transfer and transformations; Beaujouan et al., 2002). This model simulates the transfer and transformation of nitrogen in agricultural catchments. TNT was used over 10 years on the Yar catchment (western France), a 50 km² area which presents a detailed data set and faces an environmental issue (coastal eutrophication). Forty-four key simulated output variables are stored at a daily time step, i.e., 8 GB of storage, which allows users to explore N emissions in space and time, to quantify all the processes of transfer and transformation with regard to the cropping systems, their location within the catchment and the emissions to water and atmosphere, and finally to gain new knowledge and help in making specific and detailed decisions in space and time. We present the dimensional modeling process of the Nitrogen in catchment data warehouse (i.e. the snowflake model). After identifying the set of multileveled dimensions with complex hierarchical structures and relationships among related dimension levels, we chose the snowflake model to
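
    A snowflake model normalizes each dimension hierarchy into its own chain of tables, so a fact row points to the finest level and coarser levels are reached by joins. The sketch below shows this for a time dimension (day, month, year) with an in-memory SQLite database. All table names, column names and values are invented for illustration. The actual schema of the Nitrogen in catchment warehouse is not given in this abstract.

```python
import sqlite3

# Time dimension split into three normalized levels: day -> month -> year.
SCHEMA = """
CREATE TABLE dim_year  (year_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE dim_month (month_id INTEGER PRIMARY KEY, month INTEGER,
                        year_id INTEGER REFERENCES dim_year(year_id));
CREATE TABLE dim_day   (day_id INTEGER PRIMARY KEY, day INTEGER,
                        month_id INTEGER REFERENCES dim_month(month_id));
CREATE TABLE fact_n_flux (day_id INTEGER REFERENCES dim_day(day_id),
                          plot_id INTEGER, n_kg REAL);
"""


def build_demo():
    """Create the schema and load a few made-up daily values."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO dim_year VALUES (1, 2002)")
    con.executemany("INSERT INTO dim_month VALUES (?, ?, ?)",
                    [(1, 1, 1), (2, 2, 1)])
    con.executemany("INSERT INTO dim_day VALUES (?, ?, ?)",
                    [(1, 1, 1), (2, 2, 1), (3, 1, 2)])
    con.executemany("INSERT INTO fact_n_flux VALUES (?, ?, ?)",
                    [(1, 10, 0.5), (2, 10, 1.5), (3, 10, 2.0)])
    return con


def monthly_totals(con):
    """Roll daily facts up to months by walking the dimension chain."""
    query = """
        SELECT y.year, m.month, SUM(f.n_kg)
        FROM fact_n_flux AS f
        JOIN dim_day   AS d ON d.day_id = f.day_id
        JOIN dim_month AS m ON m.month_id = d.month_id
        JOIN dim_year  AS y ON y.year_id = m.year_id
        GROUP BY y.year, m.month
        ORDER BY y.year, m.month
    """
    return con.execute(query).fetchall()


totals = monthly_totals(build_demo())
```

    The same pattern would apply to a spatial hierarchy such as plot, sub-catchment and catchment. The price of the normalization is the extra joins in every roll-up query.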

  20. An Advanced Data Warehouse for Integrating Large Sets of GPS Data

    DEFF Research Database (Denmark)

    Andersen, Ove; Krogh, Benjamin Bjerre; Thomsen, Christian

    2014-01-01

    GPS data recorded from driving vehicles is available from many sources and is a very good data foundation for answering traffic related queries. However, most approaches so far have not considered combining GPS data from many sources into a single data warehouse. Further, the integration of GPS...... data with fuel consumption data (from the so-called CAN bus in the vehicles) and weather data has not been done. In this paper, we propose a data warehouse design for handling GPS data, fuel consumption data, and weather data. The design is fully implemented in a running system using the Postgre...

  1. [Construction and realization of real world integrated data warehouse from HIS on re-evaluation of post-marketing traditional Chinese medicine].

    Science.gov (United States)

    Zhuang, Yan; Xie, Bangtie; Weng, Shengxin; Xie, Yanming

    2011-10-01

    To construct a real-world integrated data warehouse for the re-evaluation of post-marketing traditional Chinese medicine, in support of research on key techniques of clinical re-evaluation, which mainly covers indications of traditional Chinese medicine, dosage and usage, course of treatment, unit medication, combined disease and adverse reactions. The warehouse provides data for reviewed research on safety, availability and economy, and provides a foundation for prospective research. The integrated data warehouse extracts and integrates data from HIS by means of an information collection system and data warehouse techniques, and forms standard structures and data. Further research is based on processing these data. A data warehouse and several sub data warehouses were built, which focused on patients' main records, doctors' orders, disease diagnoses, laboratory results and economic indications in hospital. These data warehouses can provide research data for the re-evaluation of post-marketing traditional Chinese medicine and have clinical value. In addition, they point out the direction for further research.

  2. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, forcing researchers to use multiple applications and manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed not only to query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links to support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Contact: Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  3. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  4. Diseño, elaboración y explotación de un data warehouse para una institución sanitaria

    OpenAIRE

    Castillo Hernández, Iván

    2014-01-01

    Design, development and exploitation of a data warehouse for a healthcare institution. Bachelor thesis for the Computer Science program on Data warehouse.

  5. ECG-ViEW II, a freely accessible electrocardiogram database

    Science.gov (United States)

    Park, Man Young; Lee, Sukhoon; Jeon, Min Seok; Yoon, Dukyong; Park, Rae Woong

    2017-01-01

    The Electrocardiogram Vigilance with Electronic data Warehouse II (ECG-ViEW II) is a large, single-center database comprising numeric parameter data of the surface electrocardiograms of all patients who underwent testing from 1 June 1994 to 31 July 2013. The electrocardiographic data include the test date, clinical department, RR interval, PR interval, QRS duration, QT interval, QTc interval, P axis, QRS axis, and T axis. These data are connected with patient age, sex, ethnicity, comorbidities, age-adjusted Charlson comorbidity index, prescribed drugs, and electrolyte levels. This longitudinal observational database contains 979,273 electrocardiograms from 461,178 patients over a 19-year study period. This database can provide an opportunity to study electrocardiographic changes caused by medications, disease, or other demographic variables. ECG-ViEW II is freely available at http://www.ecgview.org. PMID:28437484

  6. Statewide Transportation Engineering Warehouse for Archived Regional Data (STEWARD).

    Science.gov (United States)

    2009-12-01

    This report documents Phase III of the development and operation of a prototype for the Statewide Transportation Engineering Warehouse for Archived Regional Data (STEWARD). It reflects the progress on the development and operation of STEWARD sinc...

  7. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  8. Multidimensional Analysis and Location Intelligence Application for Spatial Data Warehouse Hotspot in Indonesia using SpagoBI

    Science.gov (United States)

    Uswatun Hasanah, Gamma; Trisminingsih, Rina

    2016-01-01

    A spatial data warehouse is a data warehouse with a spatial component that represents the geographic location of a position or an object on the Earth's surface. A spatial data warehouse can be visualized in the form of crosstab tables, graphs, and maps. A spatial data warehouse of hotspots in Indonesia has been constructed by researchers from FIRM NASA 2006-2015. This research develops a multidimensional analysis module and a location intelligence module using SpagoBI. The multidimensional analysis module is able to visualize online analytical processing (OLAP). The location intelligence module creates dynamic map visualizations as map zones and map points. A map zone can display different colors based on the number of hotspots in each region, and a map point can display points of different sizes to represent the number of hotspots in each region. This research is expected to help users present hotspot data as needed.

  9. What Information Does Your EHR Contain? Automatic Generation of a Clinical Metadata Warehouse (CMDW) to Support Identification and Data Access Within Distributed Clinical Research Networks.

    Science.gov (United States)

    Bruland, Philipp; Doods, Justin; Storck, Michael; Dugas, Martin

    2017-01-01

    Data dictionaries provide structural meta-information about data definitions in health information technology (HIT) systems. In this regard, reusing healthcare data for secondary purposes offers several advantages (e.g. reduced documentation times or increased data quality). Prerequisites for data reuse are its quality, availability and identical meaning of data. In diverse projects, research data warehouses serve as core components between heterogeneous clinical databases and various research applications. Given the complexity (high number of data elements) and dynamics (regular updates) of electronic health record (EHR) data structures, we propose a clinical metadata warehouse (CMDW) based on a metadata registry standard. Metadata of two large hospitals were automatically inserted into two CMDWs containing 16,230 forms and 310,519 data elements. Automatic updates of metadata are possible as well as semantic annotations. A CMDW allows metadata discovery, data quality assessment and similarity analyses. Common data models for distributed research networks can be established based on similarity analyses.

  10. Opportunities for Energy Efficiency and Automated Demand Response in Industrial Refrigerated Warehouses in California

    Energy Technology Data Exchange (ETDEWEB)

    Lekov, Alex; Thompson, Lisa; McKane, Aimee; Rockoff, Alexandra; Piette, Mary Ann

    2009-05-11

    This report summarizes the Lawrence Berkeley National Laboratory's research to date in characterizing energy efficiency and open automated demand response opportunities for industrial refrigerated warehouses in California. The report describes refrigerated warehouse characteristics, energy use and demand, and control systems. It also discusses energy efficiency and open automated demand response opportunities and provides analysis results from three demand response studies. In addition, several energy efficiency, load management, and demand response case studies are provided for refrigerated warehouses. This study shows that refrigerated warehouses can be excellent candidates for open automated demand response and that facilities which have implemented energy efficiency measures and have centralized control systems are well-suited to shift or shed electrical loads in response to financial incentives, utility bill savings, and/or opportunities to enhance reliability of service. Control technologies installed for energy efficiency and load management purposes can often be adapted for open automated demand response (OpenADR) at little additional cost. These improved controls may prepare facilities to be more receptive to OpenADR due to both increased confidence in the opportunities for controlling energy cost/use and access to real-time data.

  11. CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Ussery, David

    2004-01-01

    ...and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results count to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues....

  12. Access to DNA and protein databases on the Internet.

    Science.gov (United States)

    Harper, R

    1994-02-01

    During the past year, the number of biological databases that can be queried via Internet has dramatically increased. This increase has resulted from the introduction of networking tools, such as Gopher and WAIS, that make it easy for research workers to index databases and make them available for on-line browsing. Biocomputing in the nineties will see the advent of more client/server options for the solution of problems in bioinformatics.

  13. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of researchers, who lack these skills. A portal enabling these researchers to profit from new technologies is still missing. We designed biowep, a web-based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. The biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  14. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics – LITBIO.

  15. Integr8: enhanced inter-operability of European molecular biology databases.

    Science.gov (United States)

    Kersey, P J; Morris, L; Hermjakob, H; Apweiler, R

    2003-01-01

    The increasing production of molecular biology data in the post-genomic era, and the proliferation of databases that store it, require the development of an integrative layer in database services to facilitate the synthesis of related information. The solution of this problem is made more difficult by the absence of universal identifiers for biological entities, and the breadth and variety of available data. Integr8 was modelled using UML (Unified Modeling Language). Integr8 is being implemented as an n-tier system using a modern object-oriented programming language (Java). An object-relational mapping tool, OJB, is being used to specify the interface between the upper layers and an underlying relational database. The European Bioinformatics Institute is launching the Integr8 project. Integr8 will be an automatically populated database in which we will maintain stable identifiers for biological entities, describe their relationships with each other (in accordance with the central dogma of biology), and store equivalences between identified entities in the source databases. Only core data will be stored in Integr8, with web links to the source databases providing further information. Integr8 will provide the integrative layer of the next generation of bioinformatics services from the EBI. Web-based interfaces will be developed to offer gene-centric views of the integrated data, presenting (where known) the links between genome, proteome and phenotype.

  16. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

    Directory of Open Access Journals (Sweden)

    Katayama Toshiaki

    2010-08-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

  17. Location of Farmers Warehouse at Adaklu Traditional Area, Volta Region, Ghana

    Directory of Open Access Journals (Sweden)

    Vincent Tulasi

    2016-01-01

    Postharvest loss is a major problem that farmers in the Adaklu Traditional Area, like most Ghanaian farmers, face. As a result, many farmers wallow in abject poverty. Warehouses are important facilities that help to reduce postharvest loss. In this research, the Beresnev pseudo-Boolean Simple Plant Location Problem (SPLP) model is used to locate a warehouse in the Adaklu Traditional Area, Volta Region, Ghana. This model was used because it gives a straightforward computation and requires no iteration, in contrast to other models. The SPLP is the problem of selecting a site from candidate sites to locate a plant so that customers can be supplied from the plant at minimum cost. The model is made up of fixed cost and transportation cost. A location index ordering matrix was developed from the transportation cost matrix and used with the fixed cost and the differences between variable costs to formulate the Beresnev function. A linear term developed from the function, which was partial, is pegged to obtain a complete solution. Of the 14 notable communities considered, Adaklu Waya is found most suitable for the siting of the warehouse. The total cost involved is Gh₵ 78,180.00.
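
    The SPLP objective described above (fixed cost of each opened site plus the cost of serving every customer from its cheapest open site) can be illustrated with a small sketch. This is not the Beresnev pseudo-Boolean method used in the paper: it is a plain exhaustive enumeration, practical only for a handful of candidate sites, and the function name and all cost figures are hypothetical.

```python
from itertools import combinations


def solve_splp(fixed_costs, transport_costs):
    """Exhaustively solve a small Simple Plant Location Problem.

    fixed_costs[i]        -- cost of opening candidate site i
    transport_costs[j][i] -- cost of supplying customer j from site i
    Returns (minimum total cost, tuple of opened site indices).
    """
    n_sites = len(fixed_costs)
    best_cost, best_sites = float("inf"), ()
    # Try every non-empty subset of candidate sites.
    for k in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), k):
            cost = sum(fixed_costs[i] for i in sites)
            # Each customer is served by its cheapest open site.
            cost += sum(min(row[i] for i in sites) for row in transport_costs)
            if cost < best_cost:
                best_cost, best_sites = cost, sites
    return best_cost, best_sites


# Hypothetical data: 3 candidate sites, 4 customer communities.
fixed = [10, 12, 8]
transport = [
    [2, 5, 9],
    [4, 1, 7],
    [6, 3, 2],
    [8, 4, 1],
]

best_cost, best_sites = solve_splp(fixed, transport)
print(best_cost, best_sites)
```

    With these made-up numbers, opening only the second site (index 1) is cheapest: its fixed cost of 12 plus transport costs of 5 + 1 + 3 + 4 gives a total of 25. Enumeration grows as 2^n in the number of sites, which is the cost that specialised formulations such as the pseudo-Boolean one aim to avoid.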

  18. Functional proteomics with new mass spectrometric and bioinformatics tools

    International Nuclear Information System (INIS)

    Kesners, P.W.A.

    2001-01-01

    A comprehensive range of mass spectrometric tools is required to investigate today's life science applications and a strong focus is on addressing the needs of functional proteomics. Application examples are given showing the streamlined process of protein identification from low femtomole amounts of digests. Sample preparation is achieved with a convertible robot for automated 2D gel picking, and MALDI target dispensing. MALDI-TOF or ESI-MS subsequent to enzymatic digestion. A choice of mass spectrometers including Q-q-TOF with multipass capability, MALDI-MS/MS with unsegmented PSD, Ion Trap and FT-MS are discussed for their respective strengths and applications. Bioinformatics software that allows both database work and novel peptide mass spectra interpretation is reviewed. The automated database searching uses either entire digest LC-MS^n ESI Ion Trap data or MALDI MS and MS/MS spectra. It is shown how post-translational modifications are interactively uncovered and de novo sequencing of peptides is facilitated.

  19. The Analytic Information Warehouse (AIW): a Platform for Analytics using Electronic Health Record Data

    Science.gov (United States)

    Post, Andrew R.; Kurc, Tahsin; Cholleti, Sharath; Gao, Jingjing; Lin, Xia; Bornstein, William; Cantrell, Dedra; Levine, David; Hohmann, Sam; Saltz, Joel H.

    2013-01-01

    Objective: To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. Materials and Methods: We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. Results: We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30 days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in five years of data from our institution's clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. Discussion and Conclusion: A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies. PMID:23402960

  20. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    International Nuclear Information System (INIS)

    Roche-Lima, Abiel; Thulasiram, Ruppa K

    2012-01-01

    Finite automata in which each transition is augmented with an output label in addition to the familiar input label are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignment of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on computing conditional probabilities, using techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were carried out using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experiment varied the precision parameter; in this case, we obtain smaller execution times using the parallel algorithm. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was varied. In this last experiment, speedup increases considerably when more threads are used; however, it converges for 16 or more threads.
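
    The definition of a finite-state transducer given above (an automaton whose transitions carry an output label as well as an input label) can be made concrete with a minimal sketch. It is deterministic and unweighted, so it does not represent the weighted or conditional transducers, nor the learning algorithm, discussed in the paper; the class name and the DNA-complement example are illustrative assumptions.

```python
class Transducer:
    """A deterministic, unweighted finite-state transducer."""

    def __init__(self, start, finals, transitions):
        # transitions maps (state, input_symbol) -> (next_state, output_symbol)
        self.start = start
        self.finals = set(finals)
        self.transitions = dict(transitions)

    def transduce(self, inputs):
        """Return the output string, or None if the input is rejected."""
        state = self.start
        outputs = []
        for symbol in inputs:
            key = (state, symbol)
            if key not in self.transitions:
                return None  # no transition defined for this symbol
            state, out = self.transitions[key]
            outputs.append(out)
        return "".join(outputs) if state in self.finals else None


# Hypothetical example: a one-state transducer that reads a DNA strand
# and writes its base-by-base complement.
complement = Transducer(
    start="q0",
    finals={"q0"},
    transitions={
        ("q0", "A"): ("q0", "T"),
        ("q0", "T"): ("q0", "A"),
        ("q0", "C"): ("q0", "G"),
        ("q0", "G"): ("q0", "C"),
    },
)

print(complement.transduce("ATGC"))
```

    Reading "ATGC" emits "TACG", one output symbol per transition. A weighted transducer would additionally attach a score or probability to each transition, and it is those weights that the learning techniques named in the abstract estimate.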

  1. Capturing Complex Multidimensional Data in Location-Based Data Warehouses

    DEFF Research Database (Denmark)

    Timko, Igor; Pedersen, Torben Bach

    2004-01-01

    Motivated by the increasing need to handle complex multidimensional data in location-based data warehouses, this paper proposes a powerful data model that is able to capture the complexities of such data. The model provides a foundation for handling complex transportation infrastructures...

  2. TA-60 Warehouse and Salvage SWPPP Rev 2 Jan 2017-Final

    Energy Technology Data Exchange (ETDEWEB)

    Burgin, Jillian Elizabeth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-02-07

    The Stormwater Pollution Prevention Team (PPT) for the TA-60-0002 Salvage and Warehouse Area consists of operations and management personnel from the facility, Multi-Sector General Permitting (MSGP) stormwater personnel from the Environmental Compliance Programs (EPC-CP) organization, and Deployed Environmental Professionals. The EPC-CP representative is responsible for Laboratory compliance under the National Pollutant Discharge Elimination System (NPDES) permit regulations. The team members are selected on the basis of their familiarity with the activities at the facility and the potential impacts of those activities on stormwater runoff. The Warehouse and Salvage Yard are a single-shift operation; therefore, a member of the PPT is always present during operations.

  3. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  4. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  5. Design Criteria in Revitalizing Old Warehouse District on the Kalimas Riverbank Area of Surabaya City

    Directory of Open Access Journals (Sweden)

    Endang Titi Sunarti Darjosanjoto

    2015-09-01

    Neglected warehouse buildings along the Kalimas River have created a poor urban façade in terms of visual quality. However, the city government is planning to encourage tourism activities that take advantage of the Kalimas River and its surrounding environment. Without a good plan in accordance with the concept of local identity for the old city of Surabaya, the district's appeal as a tourist attraction will be reduced. In response to this issue, design criteria need to be compiled for revitalizing the old warehouse district, which is expected to revive the identity of this district and support the city's tourism. This study was conducted by recording field observations, and the data were analyzed using the character appraisal method. The character appraisal analysis is presented in the form of street picture data, which is divided into determined segments. The results show that there are five components, namely place attachment, sustainable urban design, green open space design, ecological riverfront design, and activity support, that should be considered in the revitalization of the warehouse district. Those components are divided into two parts: buildings and open space at the riverbank. There are 13 design criteria for buildings at the riverbank and 14 design criteria for open space at the riverbank. These design criteria can enrich the warehouse district's revitalization by improving the visual quality of the urban environment. Keywords: design criteria; warehouse district; riverbank; Surabaya; revitalization.

  6. Developing and Marketing a Client/Server-Based Data Warehouse.

    Science.gov (United States)

    Singleton, Michele; And Others

    1993-01-01

    To provide better access to information, the University of Arizona information technology center has designed a data warehouse accessible from the desktop computer. A team approach has proved successful in introducing and demonstrating a prototype to the campus community. (Author/MSE)

  7. Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.

    Science.gov (United States)

    Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar

    2017-01-01

    Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used for the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases. Data have been integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database consists of oncogenomic information on lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.

  8. OralCard: a bioinformatic tool for the study of oral proteome.

    Science.gov (United States)

    Arrais, Joel P; Rosa, Nuno; Melo, José; Coelho, Edgar D; Amaral, Diana; Correia, Maria José; Barros, Marlene; Oliveira, José Luís

    2013-07-01

    The molecular complexity of the human oral cavity can only be clarified through identification of components that participate within it. However, current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome both of human (OralOme) and microbial origin (MicroOralOme) structured in protein, disease and organism. This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with diabetes mellitus type 2. OralCard is available at http://bioinformatics.ua.pt/oralcard. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Data Delivery and Mapping Over the Web: National Water-Quality Assessment Data Warehouse

    Science.gov (United States)

    Bell, Richard W.; Williamson, Alex K.

    2006-01-01

    The U.S. Geological Survey began its National Water-Quality Assessment (NAWQA) Program in 1991, systematically collecting chemical, biological, and physical water-quality data from study units (basins) across the Nation. In 1999, the NAWQA Program developed a data warehouse to better facilitate national and regional analysis of data from 36 study units started in 1991 and 1994. Data from 15 study units started in 1997 were added to the warehouse in 2001. The warehouse currently contains and links the following data: -- Chemical concentrations in water, sediment, and aquatic-organism tissues and related quality-control data from the USGS National Water Information System (NWIS), -- Biological data for stream-habitat and ecological-community data on fish, algae, and benthic invertebrates, -- Site, well, and basin information associated with thousands of descriptive variables derived from spatial analysis, like land use, soil, and population density, and -- Daily streamflow and temperature information from NWIS for selected sampling sites.

  10. Roadmap to a Comprehensive Clinical Data Warehouse for Precision Medicine Applications in Oncology.

    Science.gov (United States)

    Foran, David J; Chen, Wenjin; Chu, Huiqi; Sadimin, Evita; Loh, Doreen; Riedlinger, Gregory; Goodell, Lauri A; Ganesan, Shridar; Hirshfield, Kim; Rodriguez, Lorna; DiPaola, Robert S

    2017-01-01

    Leading institutions throughout the country have established Precision Medicine programs to support personalized treatment of patients. A cornerstone for these programs is the establishment of enterprise-wide Clinical Data Warehouses. Working shoulder-to-shoulder, a team of physicians, systems biologists, engineers, and scientists at Rutgers Cancer Institute of New Jersey has designed, developed, and implemented the Warehouse with information originating from data sources including Electronic Medical Records, Clinical Trial Management Systems, Tumor Registries, Biospecimen Repositories, Radiology and Pathology archives, and Next Generation Sequencing services. Innovative solutions were implemented to detect and extract unstructured clinical information that was embedded in paper/text documents, including synoptic pathology reports. Supporting important precision medicine use cases, the growing Warehouse enables physicians to systematically mine and review the molecular, genomic, image-based, and correlated clinical information of patient tumors individually or as part of large cohorts to identify changes and patterns that may influence treatment decisions and potential outcomes.

  11. The Importance of Biological Databases in Biological Discovery.

    Science.gov (United States)

    Baxevanis, Andreas D; Bateman, Alex

    2015-06-19

    Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.

  12. 19 CFR 19.13 - Requirements for establishment of warehouse.

    Science.gov (United States)

    2010-04-01

    ...; DEPARTMENT OF THE TREASURY CUSTOMS WAREHOUSES, CONTAINER STATIONS AND CONTROL OF MERCHANDISE THEREIN... secured area separated from the remainder of the premises to be used exclusively for the storage of imported merchandise, domestic spirits, and merchandise subject to internal-revenue tax transferred into...

  13. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given on how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  14. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR-based tests for pathogenic E. coli strains. A variety of computer tools are used, including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG) database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, and are thus appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  15. A Clinical Data Warehouse Based on OMOP and i2b2 for Austrian Health Claims Data.

    Science.gov (United States)

    Rinner, Christoph; Gezgin, Deniz; Wendl, Christopher; Gall, Walter

    2018-01-01

    To develop simulation models for healthcare-related questions, clinical data can be reused. The objective was to develop a clinical data warehouse that harmonizes different data sources in a standardized manner and provides a reproducible interface for clinical data reuse. The Kimball life cycle for data warehouse development was used. The development is split into the technical, the data and the business intelligence pathways. Sample data was persisted in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The i2b2 clinical data warehouse tools were used to query the OMOP CDM by applying the new i2b2 multi-fact table feature. A clinical data warehouse was set up and sample data, data dimensions and ontologies for Austrian health claims data were created. The ability of the standardized data access layer to create and apply simulation models will be evaluated next.

  16. Real Time Business Analytics for Buying or Selling Transaction on Commodity Warehouse Receipt System

    Science.gov (United States)

    Djatna, Taufik; Teniwut, Wellem A.; Hairiyah, Nina; Marimin

    2017-10-01

    Smooth information flow about buying and selling is essential for a commodity warehouse receipt system, such as one for dried seaweed, and for its stakeholders to carry out operational transactions. Buying or selling in a commodity warehouse receipt system is a risky process because of fluctuations in commodity prices. An integrated system that reports conditions in real time is needed to support transaction decisions by the owner or prospective buyer. The primary motivation of this study is to propose computational methods to trace market tendency for either buying or selling processes. The empirical results reveal that gain-ratio feature selection combined with k-NN outperforms other forecasting models, implying that the proposed approach is a promising alternative for exploring market tendency in warehouse receipt documents, with an accuracy rate of 95.03%.

  17. Managing data quality in an existing medical data warehouse using business intelligence technologies.

    Science.gov (United States)

    Eaton, Scott; Ostrander, Michael; Santangelo, Jennifer; Kamal, Jyoti

    2008-11-06

    The Ohio State University Medical Center (OSUMC) Information Warehouse (IW) is a comprehensive data warehousing facility that provides data integration, management, mining, training, and development services to a diversity of customers across the clinical, education, and research sectors of the OSUMC. Providing accurate and complete data is a must for these purposes. In order to monitor the data quality of targeted data sets, an online scorecard has been developed to allow visualization of the critical measures of data quality in the Information Warehouse.

  18. ON PROBLEM OF REGIONAL WAREHOUSE AND TRANSPORT INFRASTRUCTURE OPTIMIZATION

    Directory of Open Access Journals (Sweden)

    I. Yu. Miretskiy

    2017-01-01

    The article suggests an approach to solving the problem of warehouse and transport infrastructure optimization in a region. The task is to determine the optimal capacity and location of the support network of warehouses in the region, as well as the power, composition and location of motor fleets. Optimization is carried out using mathematical models of a regional warehouse network and a network of motor fleets. These models are presented as mathematical programming problems with separable functions. Finding the optimal solution is complicated by high dimensionality, the non-linearity of the functions, and the fact that some variables are constrained to integer values while others can take values only from a discrete set. Given these complications, the search for an exact solution was abandoned. The article suggests an approximate approach to solving the problems. This approach employs effective computational schemes for solving multidimensional optimization problems. We use the continuous relaxation of the original problem to obtain its approximate solution. An approximately optimal solution of the continuous relaxation is taken as an approximate solution of the original problem. The suggested solution method involves linearization of the obtained continuous relaxation and use of the separable programming scheme and the branch-and-bound scheme. We describe the use of the simplex method for solving the linearized continuous relaxation of the original problem and specific aspects of implementing the branch-and-bound method. The paper shows the finiteness of the algorithm and recommends how to accelerate the process of finding a solution.

  19. The Microsoft Data Warehouse Toolkit With SQL Server 2008 R2 and the Microsoft Business Intelligence Toolset

    CERN Document Server

    Mundy, Joy; Kimball, Ralph

    2011-01-01

    Best practices and invaluable advice from world-renowned data warehouse experts. In this book, leading data warehouse experts from the Kimball Group share best practices for using the upcoming "Business Intelligence release" of SQL Server, referred to as SQL Server 2008 R2. In this new edition, the authors explain how SQL Server 2008 R2 provides a collection of powerful new tools that extend the power of its BI toolset to Excel and SharePoint users and they show how to use SQL Server to build a successful data warehouse that supports the business intelligence requirements that are common to most

  20. Risk control for staff planning in e-commerce warehouses

    NARCIS (Netherlands)

    Wruck, Susanne; Vis, Iris F A; Boter, Jaap

    2016-01-01

    Internet sales supply chains often need to quickly fulfil small orders for many customers. The resulting high demand and planning uncertainties pose new challenges for e-commerce warehouse operations. Here, we develop a decision support tool to assist managers in selecting appropriate risk policies

  1. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  2. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  3. DATABASES DEVELOPED IN INDIA FOR BIOLOGICAL SCIENCES

    Directory of Open Access Journals (Sweden)

    Gitanjali Yadav

    2017-09-01

    The complexity of biological systems requires the use of a variety of experimental methods with ever-increasing sophistication to probe various cellular processes at molecular and atomic resolution. The availability of technologies for determining nucleic acid sequences of genes and atomic resolution structures of biomolecules prompted the development of major biological databases like GenBank and PDB almost four decades ago. India was one of the few countries to realize early the utility of such databases for progress in modern biology/biotechnology. The Department of Biotechnology (DBT), India, established the Biotechnology Information System (BTIS) network in the late eighties. Starting with the genome sequencing revolution at the turn of the century, the application of high-throughput sequencing technologies in biology and medicine for the analysis of genomes, transcriptomes, epigenomes and microbiomes has generated massive volumes of sequence data. The BTIS network has not only provided state-of-the-art computational infrastructure to research institutes and universities for utilizing various biological databases developed abroad in their research, it has also actively promoted research and development (R&D) projects in Bioinformatics to develop a variety of biological databases in diverse areas. It is encouraging to note that a large number of biological databases or data-driven software tools developed in India have been published in leading peer-reviewed international journals like Nucleic Acids Research, Bioinformatics, Database, and BMC, PLoS and NPG series publications. Some of these databases are not only unique, they are also highly accessed, as reflected in the number of citations. Apart from databases developed by individual research groups, BTIS has initiated consortium projects to develop major India-centric databases on Mycobacterium tuberculosis, Rice and Mango, which can potentially have practical applications in health and agriculture. Many of these biological

  4. Knowledge Management through a Fully Extensible, Schema Independent, XML Database

    National Research Council Canada - National Science Library

    Direen, H

    2001-01-01

    ... (databases in particular) is that the context must be predefined. In a field that is developing as fast as bioinformatics, it is as impossible to predefine all of the context as it is to predefine all of the data that is being...

  5. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  6. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  7. Automated realtime data import for the i2b2 clinical data warehouse: introducing the HL7 ETL cell.

    Science.gov (United States)

    Majeed, Raphael W; Röhrig, Rainer

    2012-01-01

    Clinical data warehouses are used to consolidate all available clinical data from one or multiple organizations. They represent an important source for clinical research, quality management and controlling. Since its introduction, the data warehouse i2b2 has gathered a large user base in the research community. Yet, little work has been done on the process of importing clinical data into data warehouses using existing standards. In this article, we present a novel approach of utilizing the clinical integration server, commonly available in most hospitals, as the data source. As information is transmitted through the integration server, the standardized HL7 message is immediately parsed and inserted into the data warehouse. Evaluation of import speeds suggests feasibility of the provided solution for real-time processing of HL7 messages. By using the presented approach of standardized data import, i2b2 can be used as a plug and play data warehouse, without the hurdle of customized import for every clinical information system or electronic medical record. The provided solution is available for download at http://sourceforge.net/projects/histream/.

  8. Respiratory cancer database: An open access database of respiratory cancer gene and miRNA

    Directory of Open Access Journals (Sweden)

    Jyotsna Choubey

    2017-01-01

    Results and Conclusions: RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database contains the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.

  9. Data-driven warehouse optimization : Deploying skills of order pickers

    NARCIS (Netherlands)

    M. Matusiak (Marek); M.B.M. de Koster (René); J. Saarinen (Jari)

    2015-01-01

    Batching orders and routing order pickers is a commonly studied problem in many picker-to-parts warehouses. The impact of individual differences in picking skills on performance has received little attention. In this paper, we show that taking into account differences in the skills of

  10. Applications and Methods Utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for Bioinformatics Resource Discovery and Disparate Data and Service Integration

    Science.gov (United States)

    Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...

  11. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
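    The dynamic-programming formulation mentioned in this abstract can be illustrated with a minimal sketch of global alignment scoring in the style of Needleman-Wunsch. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for illustration; they are not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Illustrative scoring scheme (an assumption, not from the article):
    +1 per match, -1 per mismatch, -2 per gap position.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Three choices: align a[i-1] with b[j-1], or gap in either string.
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]
```

    The table is filled row by row, so each cell depends only on cells that are already computed; the running time is proportional to the product of the two sequence lengths.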

  12. Can Leader–Member Exchange Contribute to Safety Performance in An Italian Warehouse?

    Directory of Open Access Journals (Sweden)

    Marco G. Mariani

    2017-05-01

    Introduction: The research considers safety climate in a warehouse and analyzes the role of Leader–Member Exchange (LMX) with respect to safety performance. Griffin and Neal’s safety model was adopted and Leader–Member Exchange was inserted as a moderator in the relationships between safety climate and proximal antecedents (motivation and knowledge) of safety performance constructs (compliance and participation). Materials and Methods: Survey data were collected from a sample of 133 full-time employees in an Italian warehouse. The statistical framework of Hayes (2013) was adopted for moderated mediation analysis. Results: Proximal antecedents partially mediated the relationship between safety climate and safety participation, but not safety compliance. Moreover, the results from the moderation analysis showed that Leader–Member Exchange moderated the influence of safety climate on proximal antecedents and that the mediation exists only at the higher level of LMX. Conclusion: The study shows that the different aspects of leadership processes interact in explaining individual proficiency in safety practices. Practical Implications: Organizations such as warehouses should improve the quality of the relationship between a leader and a subordinate, based upon the dimensions of respect, trust, and obligation, to achieve a high level of safety performance.

  13. FungiDB: An Integrated Bioinformatic Resource for Fungi and Oomycetes

    Directory of Open Access Journals (Sweden)

    Evelina Y. Basenko

    2018-03-01

    FungiDB (fungidb.org) is a free online resource for data mining and functional genomics analysis for fungal and oomycete species. FungiDB is part of the Eukaryotic Pathogen Genomics Database Resource (EuPathDB, eupathdb.org) platform that integrates genomic, transcriptomic, proteomic, and phenotypic datasets, and other types of data for pathogenic and nonpathogenic, free-living and parasitic organisms. FungiDB is one of the largest EuPathDB databases containing nearly 100 genomes obtained from GenBank, Aspergillus Genome Database (AspGD), The Broad Institute, Joint Genome Institute (JGI), Ensembl, and other sources. FungiDB offers a user-friendly web interface with embedded bioinformatics tools that support custom in silico experiments that leverage FungiDB-integrated data. In addition, a Galaxy-based workspace enables users to generate custom pipelines for large-scale data analysis (e.g., RNA-Seq, variant calling, etc.). This review provides an introduction to the FungiDB resources and focuses on available features, tools, and queries and how they can be used to mine data across a diverse range of integrated FungiDB datasets and records.

  14. WATCHMAN: A Data Warehouse Intelligent Cache Manager

    Science.gov (United States)

    Scheuermann, Peter; Shim, Junho; Vingralek, Radek

    1996-01-01

    Data warehouses store large volumes of data which are used frequently by decision support applications. Such applications involve complex queries. Query performance in such an environment is critical because decision support applications often require interactive query response time. Because data warehouses are updated infrequently, it becomes possible to improve query performance by caching sets retrieved by queries in addition to query execution plans. In this paper we report on the design of an intelligent cache manager for sets retrieved by queries, called WATCHMAN, which is particularly well suited to data warehousing environments. Our cache manager employs two novel, complementary algorithms for cache replacement and for cache admission. WATCHMAN aims at minimizing query response time, and its cache replacement policy swaps out entire retrieved sets of queries instead of individual pages. The cache replacement and admission algorithms make use of a profit metric, which considers for each retrieved set its average rate of reference, its size, and the execution cost of the associated query. We report on a performance evaluation based on the TPC-D and Set Query benchmarks. These experiments show that WATCHMAN achieves a substantial performance improvement in a decision support environment when compared to a traditional LRU replacement algorithm.
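    As a rough illustration of how a profit metric over retrieved sets might drive cache replacement, the sketch below combines the three quantities named in the abstract as reference rate times execution cost divided by size. The abstract does not state the formula, so this combination, along with the names `RetrievedSet`, `profit`, and `choose_victims`, is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RetrievedSet:
    """A cached query result; all field names are hypothetical."""
    query_id: str
    reference_rate: float   # average references per unit time
    size: int               # space occupied in the cache
    execution_cost: float   # cost of re-running the associated query


def profit(rs: RetrievedSet) -> float:
    # Assumed combination: sets that are referenced often and are expensive
    # to recompute are worth more per unit of cache space they occupy.
    return rs.reference_rate * rs.execution_cost / rs.size


def choose_victims(cache, space_needed):
    """Pick whole retrieved sets to evict, lowest profit first."""
    victims = []
    freed = 0
    for rs in sorted(cache, key=profit):
        if freed >= space_needed:
            break
        victims.append(rs)
        freed += rs.size
    return victims
```

    Evicting whole retrieved sets rather than individual pages mirrors the replacement granularity described in the abstract; an admission policy could reuse the same `profit` function to decide whether a new set is worth caching at all.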

  15. Storage of hazardous substances in bonded warehouses

    International Nuclear Information System (INIS)

    Villalobos Artavia, Beatriz

    2008-01-01

    A variety of special regulations exist in Costa Rica for the registration and transport of hazardous substances; these set the requirements for entry into the country and for the security of transport units. However, these regulations contain no specific rules for storing hazardous substances. Bonded warehouses (tax deposits) are the first place where substances entering the country are stored. Basic rules to regulate the storage of hazardous substances were developed through an analysis of the national and international regulations and laws governing hazardous substances. The regulatory framework that currently exists is established through field research in bonded warehouses in the metropolitan area. The storage and security measures used by the personnel handling the substances are identified, in order to compare the regulations with the way hazardous substances are actually handled in bonded warehouses. A base rule for the storage of hazardous substances in bonded warehouses can then be drafted, protecting the environment in which they are handled and helping to avoid accidents. The rule covers the characteristics of warehouses storing hazardous substances, such as safety standards, labeling standards, infrastructure features, common storage, and the transitional measures that all bonded warehouses must adopt in order to store hazardous substances. (author) [es

  16. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  17. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore specific gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782

  18. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  19. toxoMine: an integrated omics data warehouse for Toxoplasma gondii systems biology research.

    Science.gov (United States)

    Rhee, David B; Croken, Matthew McKnight; Shieh, Kevin R; Sullivan, Julie; Micklem, Gos; Kim, Kami; Golden, Aaron

    2015-01-01

    Toxoplasma gondii (T. gondii) is an obligate intracellular parasite that must monitor for changes in the host environment and respond accordingly; however, it is still not fully known which genetic or epigenetic factors are involved in regulating virulence traits of T. gondii. There are on-going efforts to elucidate the mechanisms regulating the stage transition process via the application of high-throughput epigenomics, genomics and proteomics techniques. Given the range of experimental conditions and the typical yield from such high-throughput techniques, a new challenge arises: how to effectively collect, organize and disseminate the generated data for subsequent data analysis. Here, we describe toxoMine, which provides a powerful interface to support sophisticated integrative exploration of high-throughput experimental data and metadata, providing researchers with a more tractable means toward understanding how genetic and/or epigenetic factors play a coordinated role in determining pathogenicity of T. gondii. As a data warehouse, toxoMine allows integration of high-throughput data sets with public T. gondii data. toxoMine is also able to execute complex queries involving multiple data sets with straightforward user interaction. Furthermore, toxoMine allows users to define their own parameters during the search process that gives users near-limitless search and query capabilities. The interoperability feature also allows users to query and examine data available in other InterMine systems, which would effectively augment the search scope beyond what is available to toxoMine. toxoMine complements the major community database ToxoDB by providing a data warehouse that enables more extensive integrative studies for T. gondii. Given all these factors, we believe it will become an indispensable resource to the greater infectious disease research community. © The Author(s) 2015. Published by Oxford University Press.

  20. Bioinformatics research in the Asia Pacific: a 2007 update.

    Science.gov (United States)

    Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee

    2008-01-01

    We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 at Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides a scientific meeting at Hong Kong, satellite events organized are a pre-conference training workshop at Hanoi, Vietnam and a post-conference workshop at Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region, to global bioinformatics endeavours.

  1. Host-parasite interactions and ecology of the malaria parasite-a bioinformatics approach.

    Science.gov (United States)

    Izak, Dariusz; Klim, Joanna; Kaczanowski, Szymon

    2018-04-25

    Malaria remains one of the highest mortality infectious diseases. Malaria is caused by parasites from the genus Plasmodium. Most deaths are caused by infections involving Plasmodium falciparum, which has a complex life cycle. Malaria parasites are extremely well adapted for interactions with their host and their host's immune system and are able to suppress the human immune system, erase immunological memory and rapidly alter exposed antigens. Owing to this rapid evolution, parasites develop drug resistance and express novel forms of antigenic proteins that are not recognized by the host immune system. There is an emerging need for novel interventions, including novel drugs and vaccines. Designing novel therapies requires knowledge about host-parasite interactions, which is still limited. However, significant progress has recently been achieved in this field through the application of bioinformatics analysis of parasite genome sequences. In this review, we describe the main achievements in 'malarial' bioinformatics and provide examples of successful applications of protein sequence analysis. These examples include the prediction of protein functions based on homology and the prediction of protein surface localization via domain and motif analysis. Additionally, we describe PlasmoDB, a database that stores accumulated experimental data. This tool allows data mining of the stored information and will play an important role in the development of malaria science. Finally, we illustrate the application of bioinformatics in the development of population genetics research on malaria parasites, an approach referred to as reverse ecology.

  2. BioXSD: the common data-exchange format for everyday bioinformatics web services.

    Science.gov (United States)

    Kalas, Matús; Puntervoll, Pål; Joseph, Alexandre; Bartaseviciūte, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-09-15

    The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. BioXSD has been developed as a candidate standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.

  3. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  4. Tabu search approaches for the multi-level warehouse layout problem with adjacency constraints

    Science.gov (United States)

    Zhang, G. Q.; Lai, K. K.

    2010-08-01

    A new multi-level warehouse layout problem, the multi-level warehouse layout problem with adjacency constraints (MLWLPAC), is investigated. The same item type is required to be located in adjacent cells, and horizontal and vertical unit travel costs are product dependent. An integer programming model is proposed to formulate the problem, which is NP hard. Along with a cube-per-order index policy based heuristic, the standard tabu search (TS), greedy TS, and dynamic neighbourhood based TS are presented to solve the problem. The computational results show that the proposed approaches can reduce the transportation cost significantly.
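
    For readers unfamiliar with the standard tabu search mentioned in this abstract, a generic sketch is given below. It is not the paper's MLWLPAC model: the adjacency constraints and multi-level travel costs are omitted, and the toy problem (assign each item to exactly one cell, with `cost[i][c]` the cost of putting item `i` in cell `c`), the swap neighbourhood, the tenure value and the function name are all assumptions made for illustration.

```python
from itertools import combinations


def tabu_search(cost, n_iters=100, tenure=5):
    """Minimise the total assignment cost with a swap-based tabu search."""
    n = len(cost)

    def total(assign):
        return sum(cost[i][assign[i]] for i in range(n))

    current = list(range(n))          # item i starts in cell i
    best, best_cost = current[:], total(current)
    tabu = {}                         # swap (i, j) -> last iteration it is tabu

    for it in range(n_iters):
        candidates = []
        for i, j in combinations(range(n), 2):
            neighbour = current[:]
            neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
            c = total(neighbour)
            is_tabu = tabu.get((i, j), -1) >= it
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if not is_tabu or c < best_cost:
                candidates.append((c, (i, j), neighbour))
        if not candidates:
            break
        # Take the best admissible neighbour even if it is worse than the
        # current solution; this is what lets the search leave local optima.
        c, move, current = min(candidates, key=lambda t: (t[0], t[1]))
        tabu[move] = it + tenure
        if c < best_cost:
            best, best_cost = current[:], c
    return best, best_cost
```

    The greedy and dynamic-neighbourhood variants named in the abstract would differ mainly in how the candidate list is built; they are not shown here.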

  5. Implementation of a pilot of the commercial component of the Etapatelecom data warehouse

    OpenAIRE

    Vélez Iñiguez, Roberto José

    2008-01-01

    The system to be developed is framed within a data warehouse architecture whose objective is to extract information from the transactional systems available at Etapatelecom for use in a decision-oriented database (data warehouse) for the company's Commercial Area. The procedure begins with an analysis of the requirements of the strategic users of Etapatelecom's Commercial Area, in which the business indicators (measures) and the Dimensi...

  6. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  7. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    With the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field covering the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various machine learning algorithms and their applications in bioinformatics.

  8. Identifying and Prioritizing Cleaner Production Strategies in Raw Materials’ Warehouse of Yazdbaf Textile Company in 2015

    Directory of Open Access Journals (Sweden)

    Mohammad Taghi Ghaneian

    2017-03-01

    Introduction: Cleaner production in the textile industry is achieved by reducing water and chemical consumption, saving energy, reducing air pollution and solid waste, and reducing toxicity and noise pollution through many solutions. The purpose of the present research was to apply the Strengths, Weaknesses, Opportunities, Threats (SWOT) and Quality Systems Planning Matrix (QSPM) techniques to identify and prioritize cleaner production strategies in the raw materials’ warehouse of Yazdbaf Textile Factory. Materials and Methods: In this research, internal and external factors affecting cleaner production were identified using information gathered through field visits and interviews with industry managers and supervisors of the raw materials’ warehouse. To form the matrices of internal and external factors, 17 important internal factors and 7 important external factors were identified and selected, respectively. Then, the QSPM matrix was formed to determine the attractiveness and priority of the selected strategies using the results of the internal and external factor matrices and the SWOT matrix. Results: According to the results, the total score of the raw materials’ warehouse in the Internal Factor Evaluation (IFE) matrix is 2.90, which shows that the warehouse is in a good position with respect to internal factors. However, the total score in the External Factor Evaluation (EFE) matrix is 2.14, which indicates that the warehouse is in a relatively weak position with respect to external factors. Conclusion: Based on the obtained results, continuing, monitoring and improving the general quality control (QC) plan for raw materials and the laboratory, as well as placing more emphasis on quality indexes according to their importance in the production processes, were selected as the most important strategies.

  9. Probabilistic Data Modeling and Querying for Location-Based Data Warehouses

    DEFF Research Database (Denmark)

    Timko, Igor; Dyreson, Curtis E.; Pedersen, Torben Bach

    Motivated by the increasing need to handle complex, dynamic, uncertain multidimensional data in location-based warehouses, this paper proposes a novel probabilistic data model that can address the complexities of such data. The model provides a foundation for handling complex hierarchical and unc...

  10. Probabilistic Data Modeling and Querying for Location-Based Data Warehouses

    DEFF Research Database (Denmark)

    Timko, Igor; Dyreson, Curtis E.; Pedersen, Torben Bach

    2005-01-01

    Motivated by the increasing need to handle complex, dynamic, uncertain multidimensional data in location-based warehouses, this paper proposes a novel probabilistic data model that can address the complexities of such data. The model provides a foundation for handling complex hierarchical and unc...

  11. A Foundation for Spatial Data Warehouses on the Semantic Web


    DEFF Research Database (Denmark)

    Gur, Nurefsan; Pedersen, Torben Bach; Zimányi, Esteban

    2017-01-01

    Large volumes of geospatial data are being published on the Semantic Web (SW), yielding a need for advanced analysis of such data. However, existing SW technologies only support advanced analytical concepts such as multidimensional (MD) data warehouses and Online Analytical Processing (OLAP) over...

  12. Ergatis: a web interface and scalable software system for bioinformatics workflows

    Science.gov (United States)

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  13. A framework for information warehouse development processes

    OpenAIRE

    Holten, Roland

    1999-01-01

    Since the terms Data Warehouse and On-Line Analytical Processing were proposed by Inmon and by Codd, Codd and Sally, respectively, the traditional idea of creating information systems in support of management's decisions has become interesting again in theory and practice. Today information warehousing is a strategic market for any database systems vendor. Nevertheless, the theoretical discussions of this topic go back to the early years of the 20th century as far as management science and accounting the...

  14. Heuristics for multi-item two-echelon spare parts inventory control problem with batch ordering in the central warehouse

    NARCIS (Netherlands)

    Topan, E.; Bayindir, Z.P.; Tan, T.

    2010-01-01

    We consider a multi-item two-echelon inventory system in which the central warehouse operates under a (Q, R) policy, and each local warehouse implements an (S - 1, S) policy. The objective is to find the policy parameters minimizing expected system-wide inventory holding and fixed ordering costs subject

  15. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  16. 7 CFR 735.401 - Electronic warehouse receipt and USWA electronic document providers.

    Science.gov (United States)

    2010-01-01

    ... audit level financial statement prepared according to generally accepted accounting standards as defined... warehouse receipt requirements; (3) Liability; (4) Transfer of records protocol; (5) Records; (6) Conflict...

  17. ETL Design with a Higher Education Case Study

    Directory of Open Access Journals (Sweden)

    Spits Warnars

    2009-01-01

    A data warehouse for higher education is a paradigm for helping senior management make effective and efficient strategic decisions based on reliable and trusted reports produced from the data warehouse itself. A data warehouse is not software, hardware or a tool; it is an environment in which the transactional database is modelled in another view for decision-making purposes. ETL (Extraction, Transformation and Loading) is the bridge for building a data warehouse and transforming data from the transactional database. Every fact and dimension table is given additional fields that represent the construction, merging and loading performed by the ETL extraction. ETL needs an ETL table and an ETL process: the ETL table provides the connectivity between tables in the OLTP database and tables in the data warehouse, and the ETL process transforms data from tables in the OLTP database into data warehouse tables based on the ETL table. The extraction process runs with a database table that distinguishes each ETL process and an ETL algorithm that runs automatically while the transactional system is idle, alongside the daily transactional database backup, when the information system is not in use.
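
    The role of the ETL table described in this abstract, a mapping that connects OLTP tables to data warehouse tables and drives the ETL process, can be sketched as below. This is only an illustration under stated assumptions: the table and column names (`student`, `dim_student`, `etl_map`), the `UPPER` transformation and the `run_etl` function are hypothetical and do not come from the paper, and an in-memory SQLite database stands in for both the OLTP system and the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical OLTP table and data warehouse dimension table.
    CREATE TABLE student (id INTEGER, name TEXT, faculty TEXT);
    CREATE TABLE dim_student (student_key INTEGER, student_name TEXT);
    -- The ETL table: one row per source-to-target mapping.
    CREATE TABLE etl_map (source_table TEXT, source_cols TEXT,
                          target_table TEXT, target_cols TEXT);
    INSERT INTO student VALUES (1, 'ana', 'Science'), (2, 'ben', 'Law');
    INSERT INTO etl_map VALUES
        ('student', 'id, UPPER(name)', 'dim_student', 'student_key, student_name');
""")


def run_etl(conn):
    """Copy and transform rows as directed by the etl_map table."""
    loaded = {}
    mappings = conn.execute(
        "SELECT source_table, source_cols, target_table, target_cols FROM etl_map"
    ).fetchall()
    for src, src_cols, tgt, tgt_cols in mappings:
        # Identifiers come from the trusted mapping table, not from user input.
        conn.execute(f"INSERT INTO {tgt} ({tgt_cols}) SELECT {src_cols} FROM {src}")
        loaded[tgt] = conn.execute(f"SELECT COUNT(*) FROM {tgt}").fetchone()[0]
    conn.commit()
    return loaded


loaded = run_etl(conn)
```

    Keeping the mapping in a table means that adding a new source or target only requires a new `etl_map` row; scheduling the run for idle periods, as the abstract describes, is left out of the sketch.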

  18. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  19. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  20. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928
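
    Backward chaining over a rule base, the inference style this abstract attributes to BETSY, can be illustrated with a toy planner. The sketch below is not BETSY's code or rule format: the rule table, the data type names, the step names and the `plan` function are all hypothetical, and each data type is assumed to be producible by at most one rule.

```python
# Hypothetical knowledge base: output data type -> (step name, required inputs).
RULES = {
    "aligned_reads": ("align", ["fastq", "reference_genome"]),
    "gene_counts": ("count_reads", ["aligned_reads", "gene_annotation"]),
    "diff_expression": ("compare_groups", ["gene_counts"]),
}


def plan(goal, available, rules=RULES, _seen=frozenset()):
    """Work backwards from `goal` to data in `available`.

    Returns the ordered list of steps to run, or None if the goal
    cannot be produced from what is available.
    """
    if goal in available:
        return []                      # nothing to do, data already exists
    if goal not in rules or goal in _seen:
        return None                    # no rule produces it, or a cycle
    step, inputs = rules[goal]
    workflow = []
    for needed in inputs:
        sub = plan(needed, available, rules, _seen | {goal})
        if sub is None:
            return None
        for s in sub:
            if s not in workflow:      # avoid repeating shared steps
                workflow.append(s)
    workflow.append(step)
    return workflow
```

    Starting from the desired result rather than from the raw data means only the steps that actually contribute to the goal end up in the workflow.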

  1. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    OpenAIRE

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; Oven, Mannis; Wallace, D.C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J.; Gai, Xiaowu

    2016-01-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR ...

  2. Data warehouse solution for energy management tasks of a power generation corporation in a competitive market; Data-Warehouse-Loesung fuer die Energiemanagementaufgaben eines Stromerzeugungsbereiches in einem wettbewerblichen Umfeld

    Energy Technology Data Exchange (ETDEWEB)

    Brugger, H. [Siemens AG, Vienna (Austria). Bereich Energieuebertragung und -verteilung; Nobach, U. [Siemens AG, Nuernberg (Germany). Bereich Energieuebertragung und -verteilung; Hoenes, R.; Vetter, T. [Neckarwerke Stuttgart AG (Germany)

    1999-04-19

    Liberalization of the energy market places new requirements on the organization of utility departments and on their IT solutions. Using the example of the new energy management system for Neckarwerke Stuttgart AG, the authors explain how a modern data warehouse can provide the platform for the operational and economic tasks of electricity and district heating procurement in a competitive market. (orig.)

  3. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  4. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  5. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  6. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations, one in medicine (drug addictions) and one in bioinformatics (comparison of genomes).

  7. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  8. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  9. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High-throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd), in collaboration with the International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics. Biocomp 2007 provides a common platform for the cross-fertilization of ideas, and to help shape knowledge and

  10. BioXSD: the common data-exchange format for everyday bioinformatics web services

    Science.gov (United States)

    Kalaš, Matúš; Puntervoll, Pål; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for a standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org PMID:20823319

  11. Designing ETL Tools to Feed a Data Warehouse Based on Electronic Healthcare Record Infrastructure.

    Science.gov (United States)

    Pecoraro, Fabrizio; Luzi, Daniela; Ricci, Fabrizio L

    2015-01-01

    The aim of this paper is to propose a methodology for designing Extract, Transform and Load (ETL) tools in a clinical data warehouse architecture based on the Electronic Healthcare Record (EHR). This approach takes advantage of this infrastructure as one of the main sources of information to feed the data warehouse, also taking into account that clinical documents produced by heterogeneous legacy systems are structured using the HL7 CDA standard. This paper describes the main activities to be performed to map the information collected in the different types of documents to the dimensional model primitives.

  12. Combining Data Warehouse and Data Mining Techniques for Web Log Analysis

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Jespersen, Søren; Thorhauge, Jesper

    2008-01-01

    a number of approaches that combine data warehousing and data mining techniques in order to analyze Web logs. After introducing the well-known click and session data warehouse (DW) schemas, the chapter presents the subsession schema, which allows fast queries on sequences...

  13. The development and application of bioinformatics core competencies to improve bioinformatics training and education.

    Science.gov (United States)

    Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie

    2018-02-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society for Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.

  14. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society for Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  15. Development of a Data Warehouse for Riverine and Coastal Flood Risk Management

    Science.gov (United States)

    McGrath, H.; Stefanakis, E.; Nastev, M.

    2014-11-01

    In New Brunswick, flooding typically occurs during the spring freshet, though in recent years midwinter thaws have led to flooding in January or February. Municipalities are therefore facing a pressing need to perform risk assessments in order to identify communities at risk of flooding. In addition to the identification of communities at risk, quantitative measures of potential structural damage and societal losses are necessary for these identified communities. Furthermore, tools which allow for analysis and processing of possible mitigation plans are needed. Natural Resources Canada is in the process of adapting Hazus-MH to respond to the need for risk management. This requires extensive data from a variety of municipal, provincial, and national agencies in order to provide valid estimates. The aim is to establish a data warehouse to store relevant flood prediction data which may be accessed through Hazus. Additionally, this data warehouse will contain tools for On-Line Analytical Processing (OLAP) and knowledge discovery to quantitatively determine areas at risk and discover unexpected dependencies between datasets. The third application of the data warehouse is to provide data for online visualization capabilities: web-based thematic maps of Hazus results, historical flood visualizations, and mitigation tools, thus making flood hazard information and tools more accessible to emergency responders, planners, and residents. This paper represents the first step of the process: locating and collecting the appropriate datasets.

  16. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Baumbach Jan

    2007-11-01

    Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As in previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known

  17. Documenting the emergence of bio-ontologies: or, why researching bioinformatics requires HPSSB.

    Science.gov (United States)

    Leonelli, Sabina

    2010-01-01

    This paper reflects on the analytic challenges emerging from the study of bioinformatic tools recently created to store and disseminate biological data, such as databases, repositories, and bio-ontologies. I focus my discussion on the Gene Ontology, a term that defines three entities at once: a classification system facilitating the distribution and use of genomic data as evidence towards new insights; an expert community specialised in the curation of those data; and a scientific institution promoting the use of this tool among experimental biologists. These three dimensions of the Gene Ontology can be clearly distinguished analytically, but are tightly intertwined in practice. I suggest that this is true of all bioinformatic tools: they need to be understood simultaneously as epistemic, social, and institutional entities, since they shape the knowledge extracted from data and at the same time regulate the organisation, development, and communication of research. This viewpoint has one important implication for the methodologies used to study these tools; that is, the need to integrate historical, philosophical, and sociological approaches. I illustrate this claim through examples of misunderstandings that may result from a narrowly disciplinary study of the Gene Ontology, as I experienced them in my own research.

  18. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  19. OpenHelix: bioinformatics education outside of a different box.

    Science.gov (United States)

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.

  20. Criticality calculation of the nuclear material warehouse of the ININ; Calculo de criticidad del almacen del material nuclear del ININ

    Energy Technology Data Exchange (ETDEWEB)

    Garcia, T.; Angeles, A.; Flores C, J., E-mail: teodoro.garcia@inin.gob.mx [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico)

    2013-10-15

    In this work, nuclear safety conditions were determined, under both normal conditions and accident events, for the nuclear fuel warehouse of the TRIGA Mark III reactor of the Instituto Nacional de Investigaciones Nucleares (ININ). The warehouse contains standard fuel elements Leu - 8.5/20, a control rod with follower of standard fuel type Leu - 8.5/20, fuel elements Leu - 30/20, and the reactor fuel Sur-100. To check the subcritical state of the warehouse, the effective multiplication factor (keff) was calculated. The keff calculation was carried out with the code MCNPX. (Author)

  1. Development of a cloud-based Bioinformatics Training Platform.

    Science.gov (United States)

    Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A

    2017-05-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.

  2. An overview of bioinformatics methods for modeling biological pathways in yeast.

    Science.gov (United States)

    Hou, Jie; Acharya, Lipi; Zhu, Dongxiao; Cheng, Jianlin

    2016-03-01

    The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  3. Update of the androgen receptor gene mutations database.

    Science.gov (United States)

    Gottlieb, B; Beitel, L K; Lumbroso, R; Pinsky, L; Trifiro, M

    1999-01-01

    The current version of the androgen receptor (AR) gene mutations database is described. The total number of reported mutations has risen from 309 to 374 during the past year. We have expanded the database by adding information on AR-interacting proteins; and we have improved the database by identifying those mutation entries that have been updated. Mutations of unknown significance have now been reported in both the 5' and 3' untranslated regions of the AR gene, and in individuals who are somatic mosaics constitutionally. In addition, single nucleotide polymorphisms, including silent mutations, have been discovered in normal individuals and in individuals with male infertility. A mutation hotspot associated with prostatic cancer has been identified in exon 5. The database is available on the internet (http://www.mcgill.ca/androgendb/), from EMBL-European Bioinformatics Institute (ftp.ebi.ac.uk/pub/databases/androgen), or as a Macintosh FilemakerPro or Word file (MC33@musica.mcgill.ca). Copyright 1999 Wiley-Liss, Inc.

  4. Characteristics desired in clinical data warehouse for biomedical research.

    Science.gov (United States)

    Shin, Soo-Yong; Kim, Woo Sung; Lee, Jae-Ho

    2014-04-01

    Due to the unique characteristics of clinical data, clinical data warehouses (CDWs) have not been successful so far. Specifically, the use of CDWs for biomedical research has been relatively unsuccessful thus far. The characteristics necessary for the successful implementation and operation of a CDW for biomedical research have not been clearly defined yet. Three examples of CDWs were reviewed: a multipurpose CDW in a hospital, a CDW for independent multi-institutional research, and a CDW for research use in an institution. After reviewing the three CDW examples, we propose some key characteristics needed in a CDW for biomedical research. A CDW for research should include an honest broker system and an Institutional Review Board approval interface to comply with governmental regulations. It should also include a simple query interface, an anonymized data review tool, and a data extraction tool. Also, it should be a biomedical research platform for data repository use as well as data analysis. The proposed characteristics desired in a CDW may have limited transfer value to organizations in other countries. However, these analysis results are still valid in Korea, and we have developed a clinical research data warehouse based on these desiderata.

  5. Development of hospital data warehouse for cost analysis of DPC based on medical costs.

    Science.gov (United States)

    Muranaga, F; Kumamoto, I; Uto, Y

    2007-01-01

    To develop a data warehouse system for cost analysis, based on the categories of the diagnosis procedure combination (DPC) system, in which medical costs were estimated by DPC category and factors influencing the balance between costs and fees. We developed a data warehouse system for cost analysis using data from the hospital central data warehouse system. The balance data of patients who were discharged from Kagoshima University Hospital from April 2003 to March 2005 were determined in terms of medical procedure, cost per day and patient admission in order to conduct a drill-down analysis. To evaluate this system, we analyzed cash flow by DPC category of patients who were categorized as having malignant tumors and whose DPC category was reevaluated in 2004. The percentages of medical expenses were highest in patients with acute leukemia, non-Hodgkin's lymphoma, and particularly in patients with malignant tumors of the liver and intrahepatic bile duct. Imaging tests degraded the percentages of medical expenses in Kagoshima University Hospital. These results suggested that cost analysis by patient is important for hospital administration in the inclusive evaluation system using a case-mix index such as DPC.

  6. Extending Asia Pacific bioinformatics into new realms in the "-omics" era.

    Science.gov (United States)

    Ranganathan, Shoba; Eisenhaber, Frank; Tong, Joo Chuan; Tan, Tin Wee

    2009-12-03

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation dating back to 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 7-11, 2009 at Biopolis, Singapore. Besides bringing together scientists from the field of bioinformatics in this region, InCoB has actively engaged clinicians and researchers from the area of systems biology, to facilitate greater synergy between these two groups. InCoB2009 followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India), Hong Kong and Taipei (Taiwan), with InCoB2010 scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. The Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and symposia on Clinical Bioinformatics (CBAS), the Singapore Symposium on Computational Biology (SYMBIO) and training tutorials were scheduled prior to the scientific meeting, and provided ample opportunity for in-depth learning and special interest meetings for educators, clinicians and students. We provide a brief overview of the peer-reviewed bioinformatics manuscripts accepted for publication in this supplement, grouped into thematic areas. In order to facilitate scientific reproducibility and accountability, we have, for the first time, introduced minimum information criteria for our publications, including compliance with a Minimum Information about a Bioinformatics Investigation (MIABi). As the regional research expertise in bioinformatics matures, we have delineated a minimum set of bioinformatics skills required for addressing the computational challenges of the "-omics" era.

  7. Implementation of a metadata architecture and knowledge collection to support semantic interoperability in an enterprise data warehouse.

    Science.gov (United States)

    Dhaval, Rakesh; Borlawsky, Tara; Ostrander, Michael; Santangelo, Jennifer; Kamal, Jyoti; Payne, Philip R O

    2008-11-06

    In order to enhance interoperability between enterprise systems, and improve data validity and reliability throughout The Ohio State University Medical Center (OSUMC), we have initiated the development of an ontology-anchored metadata architecture and knowledge collection for our enterprise data warehouse. The metadata and corresponding semantic relationships stored in the OSUMC knowledge collection are intended to promote consistency and interoperability across the heterogeneous clinical, research, business and education information managed within the data warehouse.

  8. Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools

    Science.gov (United States)

    Diaz Acosta, B.

    2011-01-01

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative is comprised of two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework—initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.

  9. Identification of differentially expressed genes and signaling pathways in ovarian cancer by integrated bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Yang X

    2018-03-01

    Xiao Yang,1 Shaoming Zhu,2 Li Li,3 Li Zhang,1 Shu Xian,1 Yanqing Wang,1 Yanxiang Cheng1 1Department of Obstetrics and Gynecology, 2Department of Urology, Renmin Hospital of Wuhan University, 3Department of Pharmacology, Wuhan University Health Science Center, Wuhan, Hubei, People’s Republic of China Background: The mortality rate associated with ovarian cancer ranks the highest among gynecological malignancies. However, the cause and underlying molecular events of ovarian cancer are not clear. Here, we applied integrated bioinformatics to identify key pathogenic genes involved in ovarian cancer and reveal potential molecular mechanisms. Results: The expression profiles of GDS3592, GSE54388, and GSE66957 were downloaded from the Gene Expression Omnibus (GEO) database, which contained 115 samples, including 85 ovarian cancer samples and 30 normal ovarian samples. The three microarray datasets were integrated to obtain differentially expressed genes (DEGs), which were analyzed in depth by bioinformatics methods. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of DEGs were performed by DAVID and KOBAS online analyses, respectively. The protein–protein interaction (PPI) networks of the DEGs were constructed from the STRING database. A total of 190 DEGs were identified in the three GEO datasets, of which 99 genes were upregulated and 91 genes were downregulated. GO analysis showed that the biological functions of DEGs focused primarily on regulating cell proliferation, adhesion, and differentiation and intracellular signal cascades. The main cellular components include cell membranes, exosomes, the cytoskeleton, and the extracellular matrix. The molecular functions include growth factor activity, protein kinase regulation, DNA binding, and oxygen transport activity. KEGG pathway analysis showed that these DEGs were mainly involved in the Wnt signaling pathway, amino acid metabolism, and the

  10. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins. Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  11. 76 FR 13972 - United States Warehouse Act; Export Food Aid Commodities Licensing Agreement

    Science.gov (United States)

    2011-03-15

    ..., nuts, cottonseed, and dry beans. Warehouse operators that apply voluntarily agree to be licensed... program for port and transload facility operators storing EFAC. This proposal is in response to the...

  12. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  13. Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    NARCIS (Netherlands)

    Feng, L.; Dillon, Tharam S.

    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, "whether a total

  14. Managing warehouse efficiency and worker discomfort through enhanced storage assignment decisions

    NARCIS (Netherlands)

    Larco, José Antonio; De Koster, René; Roodbergen, Kees Jan; Dul, Jan

    2017-01-01

    Humans are at the heart of crucial processes in warehouses. Besides the common economic goal of minimising cycle times, we therefore add in this paper the human well-being goal of minimising workers' discomfort in the context of order picking. We propose a methodology for identifying the most

  15. The RHNumtS compilation: Features and bioinformatics approaches to locate and quantify Human NumtS

    Directory of Open Access Journals (Sweden)

    Saccone Cecilia

    2008-06-01

    Background: To a greater or lesser extent, eukaryotic nuclear genomes contain fragments of their mitochondrial genome counterpart, deriving from the random insertion of damaged mtDNA fragments. NumtS (Nuclear mt Sequences) are not equally abundant in all species, and are redundant and polymorphic in terms of copy number. In population and clinical genetics, it is important to have a complete overview of NumtS quantity and location. Searching PubMed for NumtS or Mitochondrial pseudo-genes yields hundreds of papers reporting Human NumtS compilations produced by in silico or wet-lab approaches. A comparison of published compilations clearly shows significant discrepancies among data, due both to unwise application of Bioinformatics methods and to a not yet correctly assembled nuclear genome. To optimize quantification and location of NumtS, we produced a consensus compilation of Human NumtS by applying various bioinformatics approaches. Results: Location and quantification of NumtS may be achieved by applying database similarity searching methods: we have applied various methods such as Blastn, MegaBlast and BLAT, changing both parameters and database; the results were compared, further analysed and checked against the already published compilations, thus producing the Reference Human Numt Sequences (RHNumtS) compilation. The resulting NumtS total 190. Conclusion: The RHNumtS compilation represents a highly reliable reference basis, which may allow designing a lab protocol to test the actual existence of each NumtS. Here we report preliminary results based on PCR amplification and sequencing on 41 NumtS selected from RHNumtS among those with lower score. In parallel, we are currently designing the RHNumtS database structure for implementation in the HmtDB resource. In the future, the same database will host NumtS compilations from other organisms, but these will be generated only when the nuclear genome of a specific organism has reached a high

  16. A clinical data warehouse-based process for refining medication orders alerts.

    Science.gov (United States)

    Boussadi, Abdelali; Caruba, Thibaut; Zapletal, Eric; Sabatier, Brigitte; Durieux, Pierre; Degoulet, Patrice

    2012-01-01

    The objective of this case report is to evaluate the use of a clinical data warehouse coupled with a clinical information system to test and refine alerts for medication orders control before they were fully implemented. A clinical decision rule refinement process was used to assess alerts. The criteria assessed were the frequencies of alerts for initial prescriptions of 10 medications whose dosage levels depend on renal function thresholds. In the first iteration of the process, the frequency of the 'exceeds maximum daily dose' alerts was 7.10% (617/8692), while that of the 'under dose' alerts was 3.14% (273/8692). Indicators were presented to the experts. During the different iterations of the process, 45 (16.07%) decision rules were removed, 105 (37.5%) were changed and 136 new rules were introduced. Extensive retrospective analysis of physicians' medication orders stored in a clinical data warehouse facilitates alert optimization toward the goal of maximizing the safety of the patient and minimizing overridden alerts.
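
    The percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below recomputes the two first-iteration alert frequencies from the reported counts. It also infers the size of the initial rule base from the reported "105 (37.5%)"; that figure of 280 rules is an inference from the quoted numbers, not something the abstract states.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to two decimals."""
    return round(100 * numerator / denominator, 2)

# Counts reported in the abstract for the first iteration of the process.
total_prescriptions = 8692
over_dose_alerts = 617    # 'exceeds maximum daily dose'
under_dose_alerts = 273   # 'under dose'

over_rate = percent(over_dose_alerts, total_prescriptions)
under_rate = percent(under_dose_alerts, total_prescriptions)

# Inference (not stated in the abstract): if 105 changed rules are 37.5%
# of the initial rule base, the base must have held 105 / 0.375 rules.
implied_initial_rules = round(105 / 0.375)
removed_rate = percent(45, implied_initial_rules)

print(over_rate, under_rate, implied_initial_rules, removed_rate)
```

    Under that inferred rule count, the reported share of removed rules (16.07%) is consistent with the share of changed rules (37.5%).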

  17. ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature.

    Science.gov (United States)

    McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry B F; Tipton, Keith F

    2007-07-27

    We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List.

  18. Developing library bioinformatics services in context: the Purdue University Libraries bioinformationist program.

    Science.gov (United States)

    Rein, Diane C

    2006-07-01

    Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making.

  19. C-A1-03: Considerations in the Design and Use of an Oracle-based Virtual Data Warehouse

    Science.gov (United States)

    Bredfeldt, Christine; McFarland, Lela

    2011-01-01

    Background/Aims The amount of clinical data available for research is growing exponentially. As it grows, increasing the efficiency of both data storage and data access becomes critical. Relational database management systems (rDBMS) such as Oracle are ideal solutions for managing longitudinal clinical data because they support large-scale data storage and highly efficient data retrieval. In addition, they can greatly simplify the management of large data warehouses, including security management and regular data refreshes. However, the HMORN Virtual Data Warehouse (VDW) was originally designed based on SAS datasets, and this design choice has a number of implications for both the design and use of an Oracle-based VDW. From a design standpoint, VDW tables are designed as flat SAS datasets, which do not take full advantage of Oracle indexing capabilities. From a data retrieval standpoint, standard VDW SAS scripts do not take advantage of SAS pass-through SQL capabilities to enable Oracle to perform the processing required to narrow datasets to the population of interest. Methods Beginning in 2009, the research department at Kaiser Permanente in the Mid-Atlantic States (KPMA) has developed an Oracle-based VDW according to the HMORN v3 specifications. In order to take advantage of the strengths of relational databases, KPMA introduced an interface layer to the VDW data, using views to provide access to standardized VDW variables. In addition, KPMA has developed SAS programs that provide access to SQL pass-through processing for first-pass data extraction into SAS VDW datasets for processing by standard VDW scripts. Results We discuss both the design and performance considerations specific to the KPMA Oracle-based VDW. We benchmarked performance of the Oracle-based VDW using both standard VDW scripts and an initial pre-processing layer to evaluate speed and accuracy of data return. Conclusions Adapting the VDW for deployment in an Oracle environment required minor
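
    The interface-layer idea described in the Methods (views that expose standardized variables over site-specific tables, with filtering done by the database before data reach the analysis tool) can be sketched as follows. This is a minimal illustration only: Python's built-in sqlite3 stands in for Oracle, and the table, view, and column names are hypothetical, not taken from the HMORN VDW specification or from KPMA's implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical site-specific source table with local column names.
    CREATE TABLE raw_encounters (
        pat_id    TEXT,
        visit_dt  TEXT,
        dept_code TEXT
    );
    -- An index the database can use when narrowing to a population.
    CREATE INDEX idx_raw_enc_pat ON raw_encounters (pat_id);
    -- Interface layer: a view exposing standardized variable names.
    CREATE VIEW vdw_encounters AS
        SELECT pat_id AS mrn, visit_dt AS adate, dept_code AS department
        FROM raw_encounters;
""")
conn.executemany(
    "INSERT INTO raw_encounters VALUES (?, ?, ?)",
    [
        ("P001", "2009-03-01", "CARD"),
        ("P002", "2009-04-15", "ONC"),
        ("P001", "2010-01-20", "ONC"),
    ],
)

def encounters_for(conn, mrns):
    """Let the database narrow the data to the population of interest,
    so only matching rows are handed to the downstream analysis."""
    placeholders = ",".join("?" for _ in mrns)
    sql = (
        "SELECT mrn, adate, department FROM vdw_encounters "
        f"WHERE mrn IN ({placeholders}) ORDER BY adate"
    )
    return conn.execute(sql, list(mrns)).fetchall()

rows = encounters_for(conn, ["P001"])
print(rows)
```

    The design point is that downstream scripts see only the standardized names from the view, while the WHERE clause runs inside the database, which is the role the abstract assigns to SQL pass-through processing.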

  20. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  1. 31 CFR 593.412 - Release of any round log or timber product originating in Liberia from a bonded warehouse or...

    Science.gov (United States)

    2010-07-01

    ... product originating in Liberia from a bonded warehouse or foreign trade zone. 593.412 Section 593.412... Interpretations § 593.412 Release of any round log or timber product originating in Liberia from a bonded... from a bonded warehouse or foreign trade zone of any round log or timber product originating in Liberia...

  2. The International Nucleotide Sequence Database Collaboration.

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

    2011-01-01

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

  3. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is lack of bioinformatics training in the universities, except for short training courses in which ...

  4. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  5. Data Warehouse for Professional Skills Required on the IT Labor Market

    Directory of Open Access Journals (Sweden)

    Cristian GEORGESCU

    2012-11-01

    This paper presents research on adjusting the professional level of informatics graduates to the specific requirements of the IT labor market. It uses data warehouse techniques and models to allow a comparative analysis between the competencies supplied and the skills demanded on the IT labor market.

  6. Determining The Optimal Order Picking Batch Size In Single Aisle Warehouses

    NARCIS (Netherlands)

    T. Le-Duc (Tho); M.B.M. de Koster (René)

    2002-01-01

    This work aims at investigating the influence of picking batch size on the average time in system of orders in a one-aisle warehouse under the assumption that order arrivals follow a Poisson process and items are uniformly distributed over the aisle's length. We model this problem as an

  7. An exact solution procedure for multi-item two-echelon spare parts inventory control problem with batch ordering in the central warehouse

    NARCIS (Netherlands)

    Topan, E.; Bayindir, Z.P.; Tan, T.

    2009-01-01

    We consider a multi-item two-echelon inventory system in which the central warehouse operates under a (Q; R) policy, and the local warehouses implement basestock policy. An exact solution procedure is proposed to find the inventory control policy parameters that minimize the system-wide inventory

  8. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease.

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. The website is implemented in PHP, R, MySQL and Nginx and is freely available from http://rged.wall-eva.net. © The Author(s) 2014. Published by Oxford University Press.

  9. 77 FR 20353 - United States Warehouse Act; Export Food Aid Commodities Licensing Agreement

    Science.gov (United States)

    2012-04-04

    ... licensing agreement include, but are not limited to, corn soy blend, vegetable oil, and pulses such as peas, beans, and lentils. USWA licensing is a voluntary program. Warehouse operators that apply for USWA...

  10. Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome.

    Science.gov (United States)

    Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad

    2010-11-27

    MicroRNAs (miRNAs) are 22-nucleotide non-coding RNAs that play pivotal regulatory roles in diverse organisms, including humans. They are difficult to identify because of a lack of either sequence features or robust algorithms for efficient identification. Therefore, we built a tool, Mi-Discoverer, for the detection of miRNAs in the human genome. The tools used to develop the software are Microsoft Office Access 2003, JDK version 1.6.0, BioJava version 1.0, and NetBeans IDE version 6.0. All existing miRNA software was web based, so the advantage of our project is that it gives the user a desktop facility for sequence alignment searches against the already identified human miRNAs present in the database. The user can also insert and update newly discovered human miRNAs in the database. Mi-Discoverer is a bioinformatics tool that successfully identifies human miRNAs based on multiple sequence alignment searches. It is a non-redundant database containing a large collection of publicly available human miRNAs.

  11. YAdumper: extracting and translating large information volumes from relational databases to structured flat files.

    Science.gov (United States)

    Fernández, José M; Valencia, Alfonso

    2004-10-12

    Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.
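
    The memory concern raised in this abstract comes from building a whole export in memory before writing it. A common alternative is to stream rows from the database cursor straight to the output, one row at a time. The sketch below illustrates that general streaming pattern only. It is not YAdumper's implementation (which is a Java application driven by an XML template and a DTD); the `enzyme` table and the `row` element name are hypothetical.

```python
import io
import sqlite3
from xml.sax.saxutils import XMLGenerator

def dump_table_as_xml(conn, table, out):
    """Stream every row of `table` to `out` as XML without
    materializing the full result set or document in memory."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement(table, {})
    for row in cursor:  # rows are fetched lazily from the cursor
        gen.startElement("row", {})
        for name, value in zip(columns, row):
            gen.startElement(name, {})
            # characters() escapes &, < and > in the cell value
            gen.characters("" if value is None else str(value))
            gen.endElement(name)
        gen.endElement("row")
    gen.endElement(table)
    gen.endDocument()

# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enzyme (ec TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO enzyme VALUES (?, ?)",
    [("1.1.1.1", "alcohol dehydrogenase"), ("2.7.1.1", "A & B kinase")],
)

buffer = io.StringIO()
dump_table_as_xml(conn, "enzyme", buffer)
xml_text = buffer.getvalue()
print(xml_text)
```

    In practice the output would be a file opened for writing rather than an in-memory buffer; the buffer is used here only to keep the example self-contained.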

  12. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  13. Real-time high-level video understanding using data warehouse

    Science.gov (United States)

    Lienard, Bruno; Desurmont, Xavier; Barrie, Bertrand; Delaigle, Jean-Francois

    2006-02-01

    High-level video content analysis, such as video surveillance, is often limited by the computational aspects of automatic image understanding: it requires huge computing resources for reasoning processes like categorization, and a huge amount of data to represent knowledge of objects, scenarios and other models. This article explains how to design and develop a "near real-time adaptive image datamart", used first as a decision support system for vision algorithms and then as a mass storage system. Using the RDF specification as the storage format for vision algorithm metadata, we can optimise the data warehouse concepts for video analysis and add processes able to adapt the current model and pre-process data to speed up queries. In this way, when new data are sent from a sensor to the data warehouse for long-term storage, using remote procedure calls embedded in object-oriented interfaces to simplify queries, they are processed and the in-memory data model is updated. After some processing, possible interpretations of these data can be returned to the sensor. To demonstrate this new approach, we present typical scenarios applied to this architecture, such as people tracking and event detection in a multi-camera network. Finally, we show how this system becomes a high-semantic data container for external data mining.

  14. Tools and data services registry: a community effort to document bioinformatics resources

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  15. How Can My State Benefit from an Educational Data Warehouse?

    Science.gov (United States)

    Bergner, Terry; Smith, Nancy J.

    2007-01-01

    Imagine if, at the start of the school year, a teacher could have detailed information about the academic history of every student in her or his classroom. This is possible if the teacher can log on to a Web site that provides access to an educational data warehouse. The teacher would see not only several years of state assessment results, but…

  16. Processing SPARQL queries with regular expressions in RDF databases

    Science.gov (United States)

    2011-01-01

    Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns. PMID:21489225

  17. Processing SPARQL queries with regular expressions in RDF databases.

    Science.gov (United States)

    Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

    2011-03-29

    As the Resource Description Framework (RDF) data model is widely used for modeling and sharing many online bioinformatics resources such as UniProt (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL, a W3C-recommended query language for RDF databases, has become an important language for querying bioinformatics knowledge bases. Moreover, because users' requests for extracting information from RDF data are diverse and users often do not know the exact value of each fact in the RDF databases, it is desirable to use SPARQL queries with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL queries. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework to existing query optimizers. 3) We build a prototype of the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.
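
    To make the query pattern in this abstract concrete, the following is a minimal sketch of what a SPARQL `FILTER regex(...)` clause asks for, evaluated by a naive full scan over an in-memory list of triples. It is not the framework the authors propose (their prototype is in C++ and is designed to avoid exactly this kind of scan); the triples, the `ex:` identifiers, and the function name `filter_regex` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy triples (subject, predicate, object); not real UniProt or Bio2RDF data.
TRIPLES = [
    ("ex:P1", "rdfs:label", "Alcohol dehydrogenase 1"),
    ("ex:P2", "rdfs:label", "Hexokinase-2"),
    ("ex:P3", "rdfs:label", "Aldehyde dehydrogenase"),
    ("ex:P1", "ex:organism", "ex:Yeast"),
]


def filter_regex(triples, predicate, pattern, flags=""):
    """Naive scan roughly corresponding to the SPARQL query

        SELECT ?s ?o WHERE { ?s <predicate> ?o . FILTER regex(?o, pattern, flags) }

    Like SPARQL's regex(), the pattern may match anywhere in the string
    unless it is anchored; only the "i" (case-insensitive) flag is handled here.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    compiled = re.compile(pattern, re_flags)
    return [
        (s, o)
        for s, p, o in triples
        if p == predicate and compiled.search(o)
    ]


# Find every labelled resource whose label mentions "dehydrogenase".
matches = filter_regex(TRIPLES, "rdfs:label", "dehydrogenase", "i")
print(matches)
```

    Every candidate object string is tested against the pattern here, so the cost grows with the number of triples; the abstract's claim of a speed-up of up to two orders of magnitude is measured against existing engines, not against this toy scan.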

  18. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with a largely bioinformatics focus, or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  19. The secondary metabolite bioinformatics portal: Computational tools to facilitate synthetic biology of secondary metabolite production

    Directory of Open Access Journals (Sweden)

    Tilmann Weber

    2016-06-01

    Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ’omics tools, the field of natural products research is currently undergoing a paradigm shift. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift has also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that are currently available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ’omics data. A web portal called the Secondary Metabolite Bioinformatics Portal (SMBP, at http://www.secondarymetabolites.org) is introduced to provide a one-stop catalog of, and links to, these bioinformatics resources. In addition, an outlook is presented on how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field.

  20. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.