Crasto, Chiquito J.
Abstract Background Gene expression patterns of olfactory receptors (ORs) are an important component of the signal-encoding mechanism in the olfactory system, since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD) to house OR gene expression data. ORMD is integrated with the Olfactory Receptor Database (ORDB), which is a key repository of OR gene information. Both databases aim to aid experimental research related to olfaction. Description ORMD is a Web-accessible database that provides a secure data repository for OR microarray experiments. It contains both publicly available and private data; accessing the latter requires authenticated login. ORMD is designed to allow users not only to deposit gene expression data but also to manage their projects/experiments. For example, contributors can choose whether to make their datasets public. For each experiment, users can download the raw data files and view and export the gene expression data. For each OR gene probed in a microarray experiment, a hyperlink to that gene in ORDB provides access to genomic and proteomic information related to the corresponding olfactory receptor. Individual ORs archived in ORDB are also linked to ORMD, giving users access to the related microarray gene expression data. Conclusion ORMD serves as a data repository and project management system. It facilitates the study of microarray experiments of gene expression in the olfactory system. In conjunction with ORDB, ORMD integrates gene expression data with the genomic and functional data of ORs, and is thus a useful resource for both olfactory researchers and the public.
Bernstein, P.A.; DeWitt, D.; Heuer, A.
There has been a growing interest in improving the publication processes for database research papers. This panel reports on recent changes in those processes and presents an initial cut at historical data for the VLDB Journal and ACM Transactions on Database Systems.
Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.
Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue-specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is as diverse as the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in
Tucker, James Cory
This study examines the extent to which databases support student and faculty research in the area of public administration. A list of journals in public administration, public policy, political science, public budgeting and finance, and other related areas was compared to the journal content list of six business databases. These databases…
Wilson, Concepcion S.; Boell, Sebastian K.; Kennan, Mary Anne; Willard, Patricia
This paper examines aspects of journal articles published from 1967 to 2008, located in eight databases, and authored or co-authored by academics serving for at least two years in Australian LIS programs from 1959 to 2008. These aspects are: inclusion of publications in databases, publications in journals, authorship characteristics of…
Discusses results of a survey of factors influencing database use in public libraries. Highlights the importance of content; ease of use; and importance of instruction. Tabulates importance indications for number and location of workstations, library hours, availability of remote login, usefulness and quality of content, lack of other databases,…
Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas
Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
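The partitioning idea described above can be illustrated with a minimal sketch: samples are grouped by biological category (e.g. tissue), and near-constant genes are filtered within each partition rather than across the whole compendium. The function name, variance threshold, and toy data are illustrative assumptions, not part of the published pipeline:

```python
import numpy as np

def partition_and_filter(expr, sample_tissues, min_std=0.25):
    """Split a genes x samples expression matrix into tissue-specific
    sub-matrices and keep only genes that vary within each partition.
    A simplified sketch; the actual pipeline applies additional
    normalization and quality-control steps."""
    partitions = {}
    tissues = np.asarray(sample_tissues)
    for tissue in np.unique(tissues):
        sub = expr[:, tissues == tissue]        # samples for this tissue
        keep = sub.std(axis=1) >= min_std       # drop near-constant genes
        partitions[tissue] = (keep, sub[keep])
    return partitions

# Toy data: 4 genes x 6 samples, two tissue categories.
expr = np.array([[1.0, 1.1, 0.9, 5.0, 0.1, 4.8],
                 [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],   # constant gene
                 [0.5, 3.5, 1.5, 0.2, 3.9, 0.4],
                 [1.2, 0.8, 1.0, 2.2, 2.0, 2.1]])
tissues = ["leaf", "leaf", "leaf", "root", "root", "root"]
parts = partition_and_filter(expr, tissues)
print({t: m.shape for t, (k, m) in parts.items()})
```

Note how a gene that is flat in one tissue but variable in another survives in the partition where it carries signal, which is the point of partitioning before network inference.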
Tiikkainen, Pekka; Franke, Lutz
Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.
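The vendor-overlap and inconsistency comparison can be sketched in a few lines: records are grouped by (article, compound, target), keys present in more than one source measure overlap, and the spread of values within a shared key flags inconsistency. The records, field names, and tolerance are hypothetical; the actual metabase first standardizes structures, target identifiers, and units:

```python
from collections import defaultdict

# Hypothetical activity records: (source, article_doi, compound, target, pIC50).
records = [
    ("vendorA", "10.1/abc", "CHEM1", "P00533", 7.2),
    ("vendorB", "10.1/abc", "CHEM1", "P00533", 7.2),   # consistent duplicate
    ("vendorA", "10.1/abc", "CHEM2", "P00533", 6.0),
    ("vendorB", "10.1/abc", "CHEM2", "P00533", 6.9),   # inconsistent pair
    ("vendorB", "10.1/xyz", "CHEM3", "P04637", 5.5),   # unique to vendorB
]

# Group values from all sources under one (article, compound, target) key.
by_key = defaultdict(dict)
for source, doi, compound, target, value in records:
    by_key[(doi, compound, target)][source] = value

shared = {k: v for k, v in by_key.items() if len(v) > 1}
inconsistent = {k: v for k, v in shared.items()
                if max(v.values()) - min(v.values()) > 0.3}  # tolerance in log units
print(len(by_key), len(shared), len(inconsistent))
```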
Stokes, Todd H; Torrance, JT; Li, Henry; Wang, May D
Background A survey of microarray databases reveals that most of the repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers), and that the repositories provide only basic biological keywords linking to PubMed. As a result, it is difficult to find datasets using research context or analysis parameters information beyond a few keywords. For example, to reduce the "curse-of-dimension" problem in microarray analysis, the number of samples is often increased by merging array data from different datasets. Knowing chip data parameters such as pre-processing steps (e.g., normalization, artefact removal, etc.), and knowing any previous biological validation of the dataset is essential due to the heterogeneity of the data. However, most of the microarray repositories do not have meta-data information in the first place, and do not have a mechanism to add or insert this information. Thus, there is a critical need to create "intelligent" microarray repositories that (1) enable update of meta-data with the raw array data, and (2) provide standardized archiving protocols to minimize bias from the raw data sources. Results To address the problems discussed, we have developed a community-maintained system called ArrayWiki that unites disparate meta-data of microarray meta-experiments from multiple primary sources with four key features. First, ArrayWiki provides a user-friendly knowledge management interface in addition to a programmable interface using standards developed by Wikipedia. Second, ArrayWiki includes automated quality control processes (caCORRECT) and novel visualization methods (BioPNG, Gel Plots), which provide extra information about data quality unavailable in other microarray repositories. Third, it provides a user-curation capability through the familiar Wiki interface. Fourth, ArrayWiki provides users with simple text-based searches across all experiment meta-data, and exposes data to search engine crawlers
LHCb is one of the main detectors at the Large Hadron Collider, where physicists and scientists work together on high-precision measurements of matter-antimatter asymmetries and on searches for rare and forbidden decays, with the aim of discovering new and unexpected forces. The work consists not only of analyzing data collected from experiments but also of publishing the results of those analyses. The LHCb publications are gathered on the LHCb publications page to maximize their availability both to LHCb members and to the high energy community. In this project a new database system was implemented for the LHCb publications page. This will help improve access to research papers for scientists and provide better integration with the current CERN library website and others.
Smith, Vincent S.
Abstract The fabric of science is changing, driven by a revolution in digital technologies that facilitate the acquisition and communication of massive amounts of data. This is changing the nature of collaboration and expanding opportunities to participate in science. If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. The publication of primary data is not a universal or mandatory part of science, and despite policies and proclamations to the contrary, calls to make data publicly available have largely gone unheeded. In this short essay I consider why, and explore some of the challenges that lie ahead, as we work toward a database of everything.
24 Housing and Urban Development (2010-04-01): Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database containing...
Grabowski, Marek; Langner, Karol M; Cymborowski, Marcin; Porebski, Przemyslaw J; Sroka, Piotr; Zheng, Heping; Cooper, David R; Zimmerman, Matthew D; Elsliger, Marc André; Burley, Stephen K; Minor, Wladek
The low reproducibility of published experimental results in many scientific disciplines has recently garnered negative attention in scientific journals and the general media. Public transparency, including the availability of 'raw' experimental data, will help to address growing concerns regarding scientific integrity. Macromolecular X-ray crystallography has led the way in requiring the public dissemination of atomic coordinates and a wealth of experimental data, making the field one of the most reproducible in the biological sciences. However, there remains no mandate for public disclosure of the original diffraction data. The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) has been developed to archive raw data from diffraction experiments and, equally importantly, to provide related metadata. Currently, the database of our resource contains data from 2920 macromolecular diffraction experiments (5767 data sets), accounting for around 3% of all depositions in the Protein Data Bank (PDB), with their corresponding partially curated metadata. IRRMC utilizes distributed storage implemented using a federated architecture of many independent storage servers, which provides both scalability and sustainability. The resource, which is accessible via the web portal at http://www.proteindiffraction.org, can be searched using various criteria. All data are available for unrestricted access and download. The resource serves as a proof of concept and demonstrates the feasibility of archiving raw diffraction data and associated metadata from X-ray crystallographic studies of biological macromolecules. The goal is to expand this resource and include data sets that failed to yield X-ray structures in order to facilitate collaborative efforts that will improve protein structure-determination methods and to ensure the availability of 'orphan' data left behind for various reasons by individual investigators and/or extinct structural genomics
The study investigated awareness, access and use of electronic database by public library users in Ibadan Oyo State in Nigeria. The purpose of this study was to determine awareness of public library users' electronic databases, find out what these users used electronic databases to do and to identify problems associated ...
Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres
DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contributions to some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.
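As an illustration of the technique, a tiny one-dimensional SOM can be trained on synthetic expression profiles. This is a minimal sketch with assumed grid size, learning schedule, and toy data, not the configuration used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, n_units=4, epochs=200, lr0=0.5, sigma0=2.0):
    """Train a 1-D Self-Organizing Map on the rows of `data` (one
    expression profile per gene) and return the unit weight vectors.
    Published analyses typically use 2-D grids and tuned schedules."""
    weights = rng.normal(size=(n_units, data.shape[1]))
    grid = np.arange(n_units)
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)              # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 0.1  # shrinking neighborhood
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            h = np.exp(-((grid - bmu) ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

# Toy profiles: two phase-shifted, noisy "cell-cycle" expression patterns.
t = np.linspace(0, 2 * np.pi, 8)
data = np.vstack([np.sin(t + rng.normal(0, 0.1, 8)) for _ in range(10)] +
                 [np.cos(t + rng.normal(0, 0.1, 8)) for _ in range(10)])
weights = train_som(data)
labels = np.array([np.argmin(((weights - x) ** 2).sum(axis=1)) for x in data])
print(labels)
```

Genes whose profiles follow the same temporal pattern end up assigned to nearby map units, which is how SOM clusters suggest candidate regulatory relationships.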
Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino
The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economic and ecological importance of the flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in
US Agency for International Development — This dataset brings together information collected since 2001 on PPPs that have been supported by USAID. For the purposes of this dataset a Public-Private...
Abstract Background Microarray analysis has become a widely used technique for the study of gene-expression patterns on a genomic scale. As more and more laboratories are adopting microarray technology, there is a need for powerful and easy to use microarray databases facilitating array fabrication, labeling, hybridization, and data analysis. The wealth of data generated by this high throughput approach renders adequate database and analysis tools crucial for the pursuit of insights into the transcriptomic behavior of cells. Results MARS (Microarray Analysis and Retrieval System) provides a comprehensive MIAME-supportive suite for storing, retrieving, and analyzing multicolor microarray data. The system comprises a laboratory information management system (LIMS), quality control management, as well as a sophisticated user management system. MARS is fully integrated into an analytical pipeline of microarray image analysis, normalization, gene expression clustering, and mapping of gene expression data onto biological pathways. The incorporation of ontologies and the use of MAGE-ML enables an export of studies stored in MARS to public repositories and other databases accepting these documents. Conclusion We have developed an integrated system tailored to serve the specific needs of microarray based research projects using a unique fusion of Web based and standalone applications connected to the latest J2EE application server technology. The presented system is freely available for academic and non-profit institutions. More information can be found at http://genome.tugraz.at.
Abstract Background Inflammation is a hallmark of many human diseases. Elucidating the mechanisms underlying systemic inflammation has long been an important topic in basic and clinical research. When the primary pathogenetic events remain unclear due to their immense complexity, construction and analysis of the gene regulatory network of inflammation at times becomes the best way to understand the detrimental effects of disease. However, it is difficult to recognize and evaluate relevant biological processes from the huge quantities of experimental data. It is hence appealing to find an algorithm which can generate a gene regulatory network of systemic inflammation from high-throughput genomic studies of human diseases. Such a network will be essential for us to extract valuable information from the complex and chaotic network under diseased conditions. Results In this study, we construct a gene regulatory network of inflammation using data extracted from the Ensembl and JASPAR databases. We also integrate and apply a number of systematic algorithms like cross correlation threshold, maximum likelihood estimation method and Akaike Information Criterion (AIC) on time-lapsed microarray data to refine the genome-wide transcriptional regulatory network in response to bacterial endotoxins in the context of dynamically activated genes, which are regulated by transcription factors (TFs) such as NF-κB. This systematic approach is used to investigate the stochastic interaction represented by the dynamic leukocyte gene expression profiles of human subjects exposed to an inflammatory stimulus (bacterial endotoxin). Based on the kinetic parameters of the dynamic gene regulatory network, we identify important properties (such as susceptibility to infection) of the immune system, which may be useful for translational research. Finally, robustness of the inflammatory gene network is also inferred by analyzing the hubs and "weak ties" structures of the gene network
The Prototype Food and Nutrient Database for Dietary Studies (Prototype FNDDS) Branded Food Products Database for Public Health is a proof of concept database. The database contains a small selection of food products which is being used to exhibit the approach for incorporation of the Branded Food ...
Malbet, Fabien; Mella, Guillaume; Lawson, Peter; Taillifet, Esther; Lafrasse, Sylvain
Optical long baseline interferometry is a technique that has generated almost 850 refereed papers to date. The targets span a large variety of objects from planetary systems to extragalactic studies and all branches of stellar physics. We have created a database hosted by the JMMC and connected to the Optical Long Baseline Interferometry Newsletter (OLBIN) web site using MySQL and a collection of XML or PHP scripts in order to store and classify these publications. Each entry is defined by its ADS bibcode, and includes basic ADS information and metadata. The metadata are specified by tags sorted in categories: interferometric facilities, instrumentation, wavelength of operation, spectral resolution, type of measurement, target type, and paper category, for example. The whole OLBIN publication list has been processed and we present how the database is organized and can be accessed. We use this tool to generate statistical plots of interest for the community in optical long baseline interferometry.
Park, Sungjin; Gildersleeve, Jeffrey C; Blixt, Klas Ola
In the last decade, carbohydrate microarrays have been core technologies for analyzing carbohydrate-mediated recognition events in a high-throughput fashion. A number of methods have been exploited for immobilizing glycans on the solid surface in a microarray format. This microarray… of substrate specificities of glycosyltransferases. This review covers the construction of carbohydrate microarrays, detection methods of carbohydrate microarrays and their applications in biological and biomedical research.
Gaitan, Santiago; ten Veldhuis, Marie-claire; van de Giesen, Nick
Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to decide upon investment to reduce their impacts. Obvious flooding factors affecting flood risk include sewer systems performance and urban topography. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors such as spatially distributed rainfall and socioeconomic characteristics may help to explain probability and impacts of urban flooding. Several public databases were analyzed: complaints about flooding made by citizens, rainfall depths (15 min and 100 Ha spatio-temporal resolution), grids describing number of inhabitants, income, and housing price (1 Ha and 25 Ha resolution); and building age. Data analysis was done using Python and GIS programming, and included spatial indexing of data, cluster analysis, and multivariate regression on the complaints. Complaints were used as a proxy to characterize flooding impacts. The cluster analysis, run for all the variables except the complaints, grouped part of the grid-cells of central Amsterdam into a highly differentiated group, covering 10% of the analyzed area, and accounting for 25% of registered complaints. The configuration of the analyzed variables in central Amsterdam coincides with a high complaint count. Remaining complaints were evenly dispersed along other groups. An adjusted R2 of 0.38 in the multivariate regression suggests that explanatory power can improve if additional variables are considered. While rainfall intensity explained 4% of the incidence of complaints, population density and building age significantly explained around 20% each. Data mining of public databases proved to be a valuable tool to identify factors explaining variability in occurrence of urban pluvial flooding, though additional variables must be considered to fully explain flood risk variability.
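The multivariate regression step can be sketched with synthetic grid-cell data; the predictor names, coefficients, and sample size below are assumptions for illustration, not the Amsterdam data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical grid-cell data: rainfall intensity, population density and
# building age as predictors of complaint counts.
n = 200
rain = rng.gamma(2.0, 5.0, n)
pop = rng.uniform(10, 500, n)
age = rng.uniform(0, 100, n)
complaints = 0.02 * rain + 0.01 * pop + 0.05 * age + rng.normal(0, 2, n)

X = np.column_stack([np.ones(n), rain, pop, age])   # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, complaints, rcond=None)

resid = complaints - X @ beta
ss_res = (resid ** 2).sum()
ss_tot = ((complaints - complaints.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - X.shape[1])  # penalize number of predictors
print(round(r2, 3), round(adj_r2, 3))
```

The adjusted R² reported in the study plays the role of `adj_r2` here: it discounts the fit for the number of predictors, so adding weak variables does not spuriously inflate the score.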
The importance of comprehensive food composition databases is more critical than ever in helping to address global food security. The USDA National Nutrient Database for Standard Reference is the “gold standard” for food composition databases. The presentation will include new developments in stren...
Voltage-gated calcium channels (VGCCs) are well documented to play roles in cell proliferation, migration, and apoptosis; however, whether VGCCs regulate the onset and progression of cancer is still under investigation. The VGCC family consists of five members, which are L-type, N-type, T-type, R-type and P/Q-type. To date, no holistic approach has been used to screen VGCC family genes in different types of cancer. We analyzed the transcript expression of VGCCs in clinical cancer tissue samples by accessing ONCOMINE (www.oncomine.org), a web-based microarray database, to perform a systematic analysis. Every member of the VGCCs was examined across 21 different types of cancer by comparing mRNA expression in cancer to that in normal tissue. A previous study showed that altered expression of mRNA in cancer tissue may play an oncogenic role and promote tumor development; therefore, in the present findings, we focus only on the overexpression of VGCCs in different types of cancer. This bioinformatics analysis revealed that different subtypes of VGCCs (CACNA1C, CACNA1D, CACNA1B, CACNA1G, and CACNA1I) are implicated in the development and progression of diverse types of cancer and show dramatic up-regulation in breast cancer. CACNA1F only showed high expression in testis cancer, whereas CACNA1A, CACNA1C, and CACNA1D were highly expressed in most types of cancer. The current analysis revealed that specific VGCCs likely play essential roles in specific types of cancer. Collectively, we identified several VGCC targets and classified them according to different cancer subtypes for prospective studies on the underlying carcinogenic mechanisms. The present findings suggest that VGCCs are possible targets for prospective investigation in cancer treatment.
Chen, Yu-Chun; Wu, Jau-Ching; Haschler, Ingo; Majeed, Azeem; Chen, Tzeng-Ji; Wetter, Thomas
Background Studies that use electronic health databases as research material are becoming popular, but the influence of a single electronic health database has not yet been well investigated. The United Kingdom's General Practice Research Database (GPRD) is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact by a single public health database. Methodology and Findings A total of 749 studies published between 1995 and 2009 with ‘General Practice Research Database’ as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors) contributed nearly half (47.9%) of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of “Pharmacology and Pharmacy”, “General and Internal Medicine”, and “Public, Environmental and Occupational Health”. The UK and United States were the two most active regions of GPRD studies. One-third of GPRD studies were internationally co-authored. Conclusions A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research. PMID:21731733
Brandhorst, Ted, Ed.
The purpose of this section is to specify the procedure for making changes to the ERIC database after the data involved have been announced in the abstract journals RIE or CIJE. As a matter of general ERIC policy, a document or journal article is not re-announced or re-entered into the database as a new accession for the purpose of accomplishing a…
Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng
With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
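The workflow described above — importing tab-delimited annotation files into a relational engine and querying them together with a user's own data — can be sketched in a few lines of Python. This is an illustration only: it uses the built-in sqlite3 module rather than MySQL (on which TabSQL is based), and the file contents and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited inputs: a public annotation table and a user's result file.
annotation_tsv = "gene_id\tsymbol\tgo_term\ng1\tABC1\tGO:0008150\ng2\tXYZ2\tGO:0003674\n"
results_tsv = "gene_id\tfold_change\ng1\t2.5\ng2\t0.4\n"

def load_table(con, name, tsv_text):
    """Create a table from tab-delimited text, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header = rows[0]
    con.execute(f"CREATE TABLE {name} ({', '.join(header)})")
    con.executemany(
        f"INSERT INTO {name} VALUES ({', '.join('?' * len(header))})", rows[1:]
    )

con = sqlite3.connect(":memory:")
load_table(con, "annotation", annotation_tsv)
load_table(con, "results", results_tsv)

# Join the user's data against the annotation table with plain SQL, no programming.
joined = con.execute(
    "SELECT r.gene_id, a.symbol, a.go_term, r.fold_change "
    "FROM results r JOIN annotation a ON r.gene_id = a.gene_id "
    "WHERE CAST(r.fold_change AS REAL) > 2.0"
).fetchall()
print(joined)  # [('g1', 'ABC1', 'GO:0008150', '2.5')]
```

The same pattern scales to downloaded GO or UCSC tables: load each flat file as a table, then annotate the user's rows with a join.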
Full Text Available Abstract Background High throughput gene expression profiling (GEP) is becoming a routine technique in life science laboratories. With experimental designs that repeatedly span thousands of genes and hundreds of samples, relying on a dedicated database infrastructure is no longer an option. GEP technology is a fast moving target, with new approaches constantly broadening the field diversity. This technology heterogeneity, compounded by the informatics complexity of GEP databases, means that software developments have so far focused on mainstream techniques, leaving less typical yet established techniques such as Nylon microarrays at best partially supported. Results MAF (MicroArray Facility) is the laboratory database system we have developed for managing the design, production and hybridization of spotted microarrays. Although it can support the widely used glass microarrays and oligo-chips, MAF was designed with the specific idiosyncrasies of Nylon based microarrays in mind. Notably single channel radioactive probes, microarray stripping and reuse, vector control hybridizations and spike-in controls are all natively supported by the software suite. MicroArray Facility is MIAME supportive and dynamically provides feedback on missing annotations to help users estimate effective MIAME compliance. Genomic data such as clone identifiers and gene symbols are also directly annotated by MAF software using standard public resources. The MAGE-ML data format is implemented for full data export. Journalized database operations (audit tracking), data anonymization, material traceability and user/project level confidentiality policies are also managed by MAF. Conclusion MicroArray Facility is a complete data management system for microarray producers and end-users. Particular care has been devoted to adequately model Nylon based microarrays. The MAF system, developed and implemented in both private and academic environments, has proved a robust solution for
.... This responsibility to maintain a public use database (PUDB) for such mortgage data was transferred to... FEDERAL HOUSING FINANCE AGENCY [No. 2010-N-10] Notice of Order: Revisions to Enterprise Public Use Database AGENCY: Federal Housing Finance Agency. ACTION: Notice of order. SUMMARY: Section 1323(a)(1) of...
Information on bibliographic as well as numeric/textual databases relevant to coastal geomorphology has been included in a tabular form. Databases cover a broad spectrum of related subjects like coastal environment and population aspects, coastline...
Divina, Petr; Forejt, Jiří
Roč. 32, - (2004), s. D482-D483 ISSN 0305-1048 R&D Projects: GA MŠk LN00A079; GA ČR GV204/98/K015 Grant - others:HHMI(US) 555000306 Institutional research plan: CEZ:AV0Z5052915 Keywords : mouse SAGE libraries * web-based database Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 7.260, year: 2004
Full Text Available Databases are deeply embedded in archaeology, underpinning and supporting many aspects of the subject. However, as well as providing a means for storing, retrieving and modifying data, databases themselves must be a result of a detailed analysis and design process. This article looks at this process, and shows how the characteristics of data models affect the process of database design and implementation. The impact of the Internet on the development of databases is examined, and the article concludes with a discussion of a range of issues associated with the recording and management of archaeological data.
Vesely, Martin; Le Meur, Jean-Yves
Document ranking for scientific publications involves a variety of specialized resources (e.g. author or citation indexes) that are difficult to use within standard general-purpose search engines, which typically operate on large-scale heterogeneous document collections for which the required specialized resources are not always available. Integrating such resources into specialized information retrieval engines is therefore important to cope with community-specific user expectations that strongly influence the perception of relevance within the considered community. In this perspective, this paper extends the notion of ranking with various methods exploiting different types of bibliographic knowledge that represent a crucial resource for measuring the relevance of scientific publications. In our work, we experimentally evaluated the adequacy of two such ranking methods (one based on freshness, i.e. the publication date, and the other on a novel index, the ...
Meng, Da; Broschat, Shira L; Call, Douglas R
analysis of subsequent experimental data. Additionally, PLASMID can be used to construct virtual microarrays with genomes from public databases, which can then be used to identify an optimal set of probes.
... information as evidenced by their verification that they have done so. We also note that reports of harm... ``Others'' file reports of harm with us using our online incident reporting form by self-reporting as... requirements for verification of information it intends to make public. Response--Congress provided a clear...
Machado, Helena; Silva, Susana
The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of 'solidarity', traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system.
Medrano Juan F
Full Text Available Abstract Background The increasing use of DNA microarrays for genetical genomics studies generates a need for platforms with complete coverage of the genome. We have compared the effective gene coverage in the mouse genome of different commercial and noncommercial oligonucleotide microarray platforms by performing an in-house gene annotation of probes. We only used information about probes that is available from vendors and followed a process that any researcher may take to find the gene targeted by a given probe. In order to make consistent comparisons between platforms, probes in each microarray were annotated with an Entrez Gene id and the chromosomal position for each gene was obtained from the UCSC Genome Browser Database. Gene coverage was estimated as the percentage of Entrez Genes with a unique position in the UCSC Genome database that is tested by a given microarray platform. Results A MySQL relational database was created to store the mapping information for 25,416 mouse genes and for the probes in five microarray platforms (gene coverage level in parentheses): Affymetrix430 2.0 (75.6%), ABI Genome Survey (81.24%), Agilent (79.33%), Codelink (78.09%), Sentrix (90.47%); and four array-ready oligosets: Sigma (47.95%), Operon v.3 (69.89%), Operon v.4 (84.03%), and MEEBO (84.03%). The differences in coverage between platforms were highly conserved across chromosomes. Differences in the number of redundant and unspecific probes were also found among arrays. The database can be queried to compare specific genomic regions using a web interface. The software used to create, update and query the database is freely available as a toolbox named ArrayGene. Conclusion The software developed here allows researchers to create updated custom databases by using public or proprietary information on genes for any organism. ArrayGene allows easy comparisons of gene coverage between microarray platforms for any region of the genome. The comparison presented here
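The coverage metric defined above — the percentage of uniquely positioned Entrez Genes targeted by at least one probe on a platform — is straightforward to compute. The following sketch uses hypothetical gene identifiers and probe annotations, not the actual ArrayGene data.

```python
# Gene coverage as defined above: the percentage of genes with a unique genomic
# position that are targeted by at least one probe on a platform. All gene IDs
# and probe-to-gene annotations below are hypothetical.
unique_genes = {"11418", "11421", "11423", "11424", "11425"}

platform_probes = {
    "platform_A": {"p1": "11418", "p2": "11421", "p3": "11423", "p4": "99999"},
    "platform_B": {"q1": "11418", "q2": "11425"},
}

def gene_coverage(probe_to_gene, genes):
    """Percent of uniquely mapped genes hit by at least one probe."""
    covered = set(probe_to_gene.values()) & genes
    return 100.0 * len(covered) / len(genes)

for name in sorted(platform_probes):
    print(name, gene_coverage(platform_probes[name], unique_genes))
# platform_A 60.0
# platform_B 40.0
```

Probe "p4" illustrates why the metric uses the intersection: a probe whose target lacks a unique position does not add to coverage.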
Gresham Cathy R
Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to little functional annotation associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix is incomplete; for example, it does not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a considerable amount of time in searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their datasets. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray datasets into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via the AgBase website and
Ronald Pamela C
Full Text Available Abstract Background Few microarrays have been quantitatively calibrated to identify optimal hybridization conditions because it is difficult to precisely determine the hybridization characteristics of a microarray using biologically variable cDNA samples. Results Using synthesized samples with known concentrations of specific oligonucleotides, a series of microarray experiments was conducted to evaluate microarrays designed by PICKY, an oligo microarray design software tool, and to test a direct microarray calibration method based on the PICKY-predicted, thermodynamically closest nontarget information. The complete set of microarray experiment results is archived in the GEO database with series accession number GSE14717. Additional data files and Perl programs described in this paper can be obtained from the website http://www.complex.iastate.edu under the PICKY Download area. Conclusion PICKY-designed microarray probes are highly reliable over a wide range of hybridization temperatures and sample concentrations. The microarray calibration method reported here allows researchers to experimentally optimize their hybridization conditions. Because this method is straightforward, uses existing microarrays and relatively inexpensive synthesized samples, it can be used by any lab that uses microarrays designed by PICKY. In addition, other microarrays can be reanalyzed by PICKY to obtain the thermodynamically closest nontarget information for calibration.
Sánchez-Peña, Matilde L; Isaza, Clara E; Pérez-Morales, Jaileene; Rodríguez-Padilla, Cristina; Castro, José M; Cabrera-Ríos, Mauricio
Microarray experiments are capable of determining the relative expression of tens of thousands of genes simultaneously, thus resulting in very large databases. The analysis of these databases and the extraction of biologically relevant knowledge from them are challenging tasks. The identification of potential cancer biomarker genes is one of the most important aims of microarray analysis and, as such, has been widely targeted in the literature. However, identifying a set of these genes consistently across different experiments, research groups, microarray platforms, or cancer types is still an elusive endeavor. Besides the inherent difficulty of the large and nonconstant variability in these experiments and the incommensurability between different microarray technologies, there is the issue of users having to adjust a series of parameters that significantly affect the outcome of the analyses and that do not have a biological or medical meaning. In this study, the identification of potential cancer biomarkers from microarray data is cast as a multiple criteria optimization (MCO) problem. The efficient solutions to this problem, found here through data envelopment analysis (DEA), are associated with genes that are proposed as potential cancer biomarkers. The method does not require any parameter adjustment by the user, and thus fosters repeatability. The approach also allows the analysis of different microarray experiments, microarray platforms, and cancer types simultaneously. The results include the analysis of three publicly available microarray databases related to cervix cancer. This study points to the feasibility of modeling the selection of potential cancer biomarkers from microarray data as an MCO problem and solving it using DEA. Using MCO offers a new perspective on the identification of potential cancer biomarkers, as it does not require the definition of a threshold value to establish significance for a particular gene and the selection of a normalization
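The MCO formulation above can be illustrated with a minimal Pareto filter: a gene is an efficient solution if no other gene is at least as good in every criterion and strictly better in at least one. This is a simplification (not full data envelopment analysis), and the gene names and criteria values are hypothetical.

```python
# Minimal sketch of the multiple-criteria idea: a gene is "efficient" if no other
# gene dominates it, i.e. is at least as good in every criterion and strictly
# better in at least one. Both criteria here (e.g. effect size and a
# discrimination score, both maximized) and all values are hypothetical.
genes = {
    "geneA": (2.0, 0.90),
    "geneB": (1.5, 0.95),
    "geneC": (1.0, 0.50),  # dominated by geneA (and geneB)
    "geneD": (2.5, 0.40),
}

def dominates(u, v):
    """True if u is at least as good as v everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

efficient = sorted(
    g for g in genes
    if not any(dominates(genes[h], genes[g]) for h in genes if h != g)
)
print(efficient)  # ['geneA', 'geneB', 'geneD']
```

Note that no significance threshold appears anywhere: the efficient set falls out of the dominance relation alone, which is the parameter-free property the abstract emphasizes.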
Cleton, N B; van Maanen, K; Bergervoet, S A; Bon, N; Beck, C; Godeke, G-J; Lecollinet, S; Bowen, R; Lelli, D; Nowotny, N; Koopmans, M P G; Reusken, C B E M
The genus Flavivirus in the family Flaviviridae includes some of the most important examples of emerging zoonotic arboviruses that are rapidly spreading across the globe. Japanese encephalitis virus (JEV), West Nile virus (WNV), St. Louis encephalitis virus (SLEV) and Usutu virus (USUV) are mosquito-borne members of the JEV serological group. Although most infections in humans are asymptomatic or present with mild flu-like symptoms, clinical manifestations of JEV, WNV, SLEV, USUV and tick-borne encephalitis virus (TBEV) can include severe neurological disease and death. In horses, infection with WNV and JEV can lead to severe neurological disease and death, while USUV, SLEV and TBEV infections are mainly asymptomatic but nevertheless induce antibody responses. Horses therefore often serve as sentinels to monitor active virus circulation in serological surveillance programmes, specifically for WNV, USUV and JEV. Here, we developed and validated an NS1-antigen protein microarray for the serological differential diagnosis of flavivirus infections in horses using sera of experimentally and naturally infected symptomatic as well as asymptomatic horses. Using samples from experimentally infected horses, an IgG and IgM specificity of 100% and a sensitivity of 95% for WNV and 100% for JEV were achieved with a cut-off titre of 1:20 based on ROC calculation. In field settings, the microarray identified 93-100% of IgG-positive horses with recent WNV infections and 87% of TBEV IgG-positive horses. WNV IgM sensitivity was 80%. Differentiation between closely related flaviviruses by the NS1-antigen protein microarray is possible, even though we identified some instances of cross-reactivity among antibodies. However, the assay is not able to differentiate between naturally infected horses and animals vaccinated with an inactivated WNV whole-virus vaccine. We showed that the NS1-microarray can potentially be used for diagnosing and distinguishing flavivirus infections in horses and for public
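The sensitivity and specificity figures reported at the 1:20 cut-off follow standard confusion-matrix arithmetic, which can be sketched as follows. The titres and infection labels below are hypothetical, not data from the study.

```python
# Sketch of the sensitivity/specificity calculation behind a titre cut-off:
# a sample is called positive when its antibody titre reaches the cut-off
# (1:20, represented here by its denominator). All data are hypothetical.
CUTOFF = 20

samples = [  # (titre denominator, truly infected?)
    (40, True), (20, True), (80, True), (10, True),    # one infected horse below cut-off
    (10, False), (0, False), (20, False), (5, False),  # one uninfected horse at cut-off
]

tp = sum(1 for t, truth in samples if t >= CUTOFF and truth)
fn = sum(1 for t, truth in samples if t < CUTOFF and truth)
tn = sum(1 for t, truth in samples if t < CUTOFF and not truth)
fp = sum(1 for t, truth in samples if t >= CUTOFF and not truth)

sensitivity = tp / (tp + fn)  # fraction of infected horses called positive
specificity = tn / (tn + fp)  # fraction of uninfected horses called negative
print(sensitivity, specificity)  # 0.75 0.75
```

Choosing the cut-off by ROC analysis, as in the study, amounts to repeating this calculation over a range of cut-offs and picking the one with the best sensitivity/specificity trade-off.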
Jen, C. H.; Manfield, I. W.; Michalopoulos, D. W.
We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (ACT), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression can be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, and refinement of these lists using two-dimensional scatter plots
Saccone, Scott F; Quan, Jiaxi; Jones, Peter L
Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
... single-family matrix in FHFA's Public Use Database (PUDB) to include data fields for the high-cost single... Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data... amended, it is necessary to revise the single-family matrix of FHFA's Public Use Database (PUDB) by adding...
Schnoes, Alexandra M; Brown, Shoshana D; Dodevski, Igor; Babbitt, Patricia C
Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
Full Text Available Abstract Microarrays allow researchers to measure the expression of thousands of genes in a single experiment. Before statistical comparisons can be made, the data must be assessed for quality and normalisation procedures must be applied, of which many have been proposed. Methods of comparing the normalised data are also abundant, and no clear consensus has yet been reached. The purpose of this paper was to compare those methods used by the EADGENE network on a very noisy simulated data set. With the a priori knowledge of which genes are differentially expressed, it is possible to compare the success of each approach quantitatively. Use of an intensity-dependent normalisation procedure was common, as was correction for multiple testing. Most variety in performance resulted from differing approaches to data quality and the use of different statistical tests. Very few of the methods used any kind of background correction. A number of approaches achieved a success rate of 95% or above, with relatively small numbers of false positives and negatives. Applying stringent spot selection criteria and elimination of data did not improve the false positive rate and greatly increased the false negative rate. However, most approaches performed well, and it is encouraging that widely available techniques can achieve such good results on a very noisy data set.
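Because the truly differentially expressed genes are known a priori in this simulated data set, each method's output can be scored directly for false positives and false negatives. A minimal sketch of that scoring, with hypothetical gene sets:

```python
# With the truly differentially expressed (DE) genes known a priori, each
# analysis approach can be scored directly. All gene sets are hypothetical.
truly_de = {"g1", "g2", "g3", "g4"}
called_de = {"g1", "g2", "g3", "g7"}  # one method's list of significant genes

false_positives = called_de - truly_de  # called DE but actually not
false_negatives = truly_de - called_de  # truly DE but missed
true_positives = called_de & truly_de
success_rate = 100.0 * len(true_positives) / len(truly_de)

print(sorted(false_positives), sorted(false_negatives), success_rate)
# ['g7'] ['g4'] 75.0
```

The abstract's observation about stringent spot filtering corresponds to the trade-off visible here: discarding data can leave the false positive count unchanged while inflating the false negatives.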
Describes a study that assessed the availability and use of microcomputer database management interfaces to online public access catalogs. The software capabilities needed to effect such an interface are identified, and available software packages are evaluated by these criteria. A directory of software vendors is provided. (4 notes with…
Dumont, B.; Fuks, B.; Kraml, S.; Bein, S.; Chalons, G.; Conte, E.; Kulkarni, S.; Sengupta, D.; Wymant, C.
We present the implementation, in the MadAnalysis 5 framework, of several ATLAS and CMS searches for supersymmetry in data recorded during the first run of the LHC. We provide extensive details on the validation of our implementations and propose to create a public analysis database within this framework.
Giselsson, Thomas Mosgaard; Nyholm Jørgensen, Rasmus; Jensen, Peter Kryger
A database of images of approximately 960 unique plants belonging to 12 species at several growth stages is made publicly available. It comprises annotated RGB images with a physical resolution of roughly 10 pixels per mm. To standardise the evaluation of classification results obtained...
Wanke, Dierk; Kilian, Joachim; Bloss, Ulrich; Mangelsen, Elke; Supper, Jochen; Harter, Klaus; Berendzen, Kenneth W.
Biologists and bioinformatic scientists cope with the analysis of transcript abundance and the extraction of meaningful information from microarray expression data. By exploiting biological information accessible in public databases, we try to extend our current knowledge of the plant model organism Arabidopsis thaliana. Here, we give two examples of increasing the quality of information gained from large-scale expression experiments through the integration of microarray-unrelated biological information. First, we utilize Arabidopsis microarray data to demonstrate that expression profiles are usually conserved between orthologous genes of different organisms. In an initial step of the analysis, orthology has to be inferred unambiguously, which then allows comparison of expression profiles between orthologs. We make use of the publicly available microarray expression data of Arabidopsis and barley, Hordeum vulgare. We found a generally positive correlation in expression trajectories between true orthologs, although the two organisms are only distantly related on an evolutionary time scale. Second, extracting clusters of co-regulated genes implies similarities in transcriptional regulation via similar cis-regulatory elements (CREs). Conversely, approaches in which co-regulated gene clusters are sought by investigating CREs have in general not been successful. Nonetheless, in some cases the presence of CREs in a defined position, orientation or in particular CRE combinations is positively correlated with co-regulated gene clusters. Here, we make use of genes involved in the phenylpropanoid biosynthetic pathway to give one positive example of this approach.
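The ortholog comparison described above reduces to computing a correlation between two expression trajectories across matched conditions. A minimal Pearson-correlation sketch, with hypothetical profiles standing in for the Arabidopsis and barley data:

```python
from math import sqrt

# Pearson correlation between the expression trajectories of a gene and its
# ortholog across matched conditions. The profiles below are hypothetical.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

arabidopsis_profile = [1.0, 2.0, 4.0, 3.0, 5.0]
barley_ortholog_profile = [0.8, 2.2, 3.9, 2.5, 5.1]

r = pearson(arabidopsis_profile, barley_ortholog_profile)
print(round(r, 3))  # a strongly positive correlation, close to 1
```

A "generally positive correlation between true orthologs" then means that r computed this way tends to be well above zero across ortholog pairs, despite the evolutionary distance between the species.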
Kara A Livingston
Full Text Available Dietary fiber is a broad category of compounds historically defined as partially or completely indigestible plant-based carbohydrates and lignin with, more recently, the additional criterion that fibers incorporated into foods as additives should demonstrate functional human health outcomes to receive a fiber classification. Thousands of research studies have been published examining fibers and health outcomes. (1) Develop a database listing studies testing fiber and physiological health outcomes identified by experts at the Ninth Vahouny Conference; (2) use evidence mapping methodology to summarize this body of literature. This paper summarizes the rationale, methodology, and resulting database. The database will help both scientists and policy-makers to evaluate evidence linking specific fibers with physiological health outcomes, and identify missing information. To build this database, we conducted a systematic literature search for human intervention studies published in English from 1946 to May 2015. Our search strategy included a broad definition of fiber search terms, as well as search terms for nine physiological health outcomes identified at the Ninth Vahouny Fiber Symposium. Abstracts were screened using a priori defined eligibility criteria and a low threshold for inclusion to minimize the likelihood of rejecting articles of interest. Publications then were reviewed in full text, applying additional a priori defined exclusion criteria. The database was built and published on the Systematic Review Data Repository (SRDR™), a web-based, publicly available application. A fiber database was created. This resource will reduce the unnecessary replication of effort in conducting systematic reviews by serving as both a central database archiving PICO (population, intervention, comparator, outcome) data on published studies and as a searchable tool through which this data can be extracted and updated.
Murukarthick, Jayakodi; Sampath, Perumal; Lee, Sang Choon; Choi, Beom-Soon; Senthil, Natesan; Liu, Shengyi; Yang, Tae-Jin
MITE, TRIM and SINEs are miniature form transposable elements (mTEs) that are ubiquitous and dispersed throughout entire plant genomes. Tens of thousands of members cause insertion polymorphism at both the inter- and intra- species level. Therefore, mTEs are valuable targets and resources for development of markers that can be utilized for breeding, genetic diversity and genome evolution studies. Taking advantage of the completely sequenced genomes of Brassica rapa and B. oleracea, characterization of mTEs and building a curated database are prerequisite to extending their utilization for genomics and applied fields in Brassica crops. We have developed BrassicaTED as a unique web portal containing detailed characterization information for mTEs of Brassica species. At present, BrassicaTED has datasets for 41 mTE families, including 5894 and 6026 members from 20 MITE families, 1393 and 1639 members from 5 TRIM families, 1270 and 2364 members from 16 SINE families in B. rapa and B. oleracea, respectively. BrassicaTED offers different sections to browse structural and positional characteristics for every mTE family. In addition, we have added data on 289 MITE insertion polymorphisms from a survey of seven Brassica relatives. Genes with internal mTE insertions are shown with detailed gene annotation and microarray-based comparative gene expression data in comparison with their paralogs in the triplicated B. rapa genome. This database also includes a novel tool, K BLAST (Karyotype BLAST), for clear visualization of the locations for each member in the B. rapa and B. oleracea pseudo-genome sequences. BrassicaTED is a newly developed database of information regarding the characteristics and potential utility of mTEs including MITE, TRIM and SINEs in B. rapa and B. oleracea. The database will promote the development of desirable mTE-based markers, which can be utilized for genomics and breeding in Brassica species. BrassicaTED will be a valuable repository for scientists
Price, Curtis V.; Maupin, Molly A.
The U.S. Geological Survey (USGS) has developed a database containing information about wells, surface-water intakes, and distribution systems that are part of public water systems across the United States, its territories, and possessions. Programs of the USGS such as the National Water Census, the National Water Use Information Program, and the National Water-Quality Assessment Program all require a complete and current inventory of public water systems, the sources of water used by those systems, and the size of populations served by the systems across the Nation. Although the U.S. Environmental Protection Agency’s Safe Drinking Water Information System (SDWIS) database already exists as the primary national Federal database for information on public water systems, the Public-Supply Database (PSDB) was developed to add value to SDWIS data with enhanced location and ancillary information, and to provide links to other databases, including the USGS’s National Water Information System (NWIS) database.
Scott, Daniel J; Lee, Joon; Silva, Ikaro; Park, Shinhyuk; Moody, George B; Celi, Leo A; Mark, Roger G
The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge "Predicting mortality of ICU Patients". QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database.
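To give a concrete flavor of the kind of SQL cohort query that QueryBuilder or a local VM installation supports, here is a minimal, self-contained sketch against an in-memory SQLite stand-in; the `icustay` table and its columns are invented for this illustration and are not the actual MIMIC-II schema.

```python
import sqlite3

# In-memory stand-in for a relational ICU database. The `icustay`
# table and its columns are invented for this example; they are not
# the real MIMIC-II schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE icustay (subject_id INTEGER, los_hours REAL, expired INTEGER)"
)
conn.executemany(
    "INSERT INTO icustay VALUES (?, ?, ?)",
    [(1, 52.0, 0), (2, 110.5, 1), (3, 36.2, 0), (4, 200.0, 1)],
)

# A typical cohort-style query: how many stays exceeded 48 hours,
# and what fraction of those patients died?
n_stays, mortality = conn.execute(
    "SELECT COUNT(*), AVG(expired) FROM icustay WHERE los_hours > 48"
).fetchone()
print(n_stays, round(mortality, 3))  # 3 0.667
```

For small queries like this, a web interface suffices; the local VM route matters once queries scan large fractions of the database.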
FEDERAL HOUSING FINANCE AGENCY [No. 2011-N-13] Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single-Family Securitized Loan Data Fields and Technical Data Field..., regarding FHFA's adoption of an Order revising FHFA's Public Use Database matrices to include certain data...
Fletcher, Alex; Yoo, Terry S.
Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.
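As a rough sketch of how a set-based index like an Image Content Group might be expressed and consumed as XML, consider the following toy example; the element and attribute names are invented for illustration and are not NOVA's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical ICG index file: a named set grouping related volumes.
icg_xml = """
<icg id="thorax-ct">
  <title>Thoracic CT teaching set</title>
  <member volume="case041" modality="CT"/>
  <member volume="case077" modality="CT"/>
</icg>
"""

root = ET.fromstring(icg_xml)
members = [m.get("volume") for m in root.findall("member")]
print(root.get("id"), members)
```

Because such an index is plain XML that can be mirrored into HTML, a search engine can crawl the same structure that this script parses.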
David C. Wheeler
Full Text Available In studies of disease with potential environmental risk factors, residential location is often used as a surrogate for unknown environmental exposures or as a basis for assigning environmental exposures. These studies most typically use the residential location at the time of diagnosis due to ease of collection. However, previous residential locations may be more useful for risk analysis because of population mobility and disease latency. When residential histories have not been collected in a study, it may be possible to generate them through public-record databases. In this study, we evaluated the ability of a public-records database from LexisNexis to provide residential histories for subjects in a geographically diverse cohort study. We calculated 11 performance metrics comparing study-collected addresses and two address retrieval services from LexisNexis. We found 77% and 90% match rates for city and state and 72% and 87% detailed address match rates with the basic and enhanced services, respectively. The enhanced LexisNexis service covered 86% of the time at residential addresses recorded in the study. The mean match rate for detailed address matches varied spatially over states. The results suggest that public record databases can be useful for reconstructing residential histories for subjects in epidemiologic studies.
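The match-rate metrics can be illustrated with a deliberately simple version: exact matching after case and whitespace normalization. The addresses below are made up, and the paper's 11 performance metrics are more nuanced; this only shows the basic calculation.

```python
def match_rate(study_addresses, retrieved_addresses):
    """Share of study-collected addresses found in the retrieved set.

    Deliberately simple: exact match after lowercasing and collapsing
    whitespace. Real address matching needs fuzzier comparison.
    """
    def norm(a):
        return " ".join(a.lower().split())

    retrieved = {norm(a) for a in retrieved_addresses}
    matched = sum(1 for a in study_addresses if norm(a) in retrieved)
    return matched / len(study_addresses)

# Made-up example addresses.
study = ["12 Oak St, Salem, OR", "9 Elm Ave, Boise, ID", "4 Pine Rd, Reno, NV"]
retrieved = ["12 oak   st, salem, or", "4 Pine Rd, Reno, NV"]
print(round(match_rate(study, retrieved), 2))  # 0.67
```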
Zieger, Martin; Utz, Silvia
During the last decade, DNA profiling and the use of DNA databases have become two of the most widely employed instruments of police investigation. Yet this rapid establishment of forensic genetics is far from complete. In the last few years, novel types of analyses have been presented that describe a possible perpetrator phenotypically. We conducted the present study among German-speaking Swiss residents for two main reasons: first, we aimed to get an impression of public awareness and acceptance of the Swiss DNA database, and of the perception of a hypothetical DNA database containing all Swiss residents; second, we wanted a broader picture of how people not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally, our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, acceptance of a hypothetical universal database is low, at about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy, or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we found a high degree of acceptance. Acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would favor the moderate use of physical trait analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study participants.
Kosseim, Patricia; Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton
To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research.
Noor, Abdisalan M; Alegana, Victor A; Gething, Peter W; Snow, Robert W
Full Text Available Abstract Background Efforts to tackle the enormous burden of ill-health in low-income countries are hampered by weak health information infrastructures that do not support appropriate planning and resource allocation. For health information systems to function well, a reliable inventory of health service providers is critical. The spatial referencing of service providers to allow their representation in a geographic information system is vital if the full planning potential of such data is to be realized. Methods A disparate series of contemporary lists of health service providers were used to update a public health facility database of Kenya last compiled in 2003. These new lists were derived primarily through the national distribution of antimalarial and antiretroviral commodities since 2006. A combination of methods, including global positioning systems, was used to map service providers. These spatially-referenced data were combined with high-resolution population maps to analyze disparity in geographic access to public health care. Findings The updated 2008 database contained 5,334 public health facilities (67% ministry of health; 28% mission and nongovernmental organizations; 2% local authorities; and 3% employers and other ministries). This represented an overall increase of 1,862 facilities compared to 2003. Most of the additional facilities belonged to the ministry of health (79%) and the majority were dispensaries (91%). 93% of the health facilities were spatially referenced, 38% using global positioning systems, compared to 21% in 2003. 89% of the population was within 5 km Euclidean distance of a public health facility in 2008, compared to 71% in 2003. Over 80% of the population outside 5 km of public health service providers was in the sparsely settled pastoralist areas of the country. Conclusion We have shown that, with concerted effort, a relatively complete inventory of mapped health services is possible, with enormous potential for improving planning.
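The population-coverage figures (e.g. 89% within 5 km) come from combining facility coordinates with gridded population counts. A minimal sketch of that calculation, assuming coordinates already projected into kilometre units and using made-up data:

```python
import math

def fraction_within(pop_cells, facilities, radius_km=5.0):
    """Population share within `radius_km` (Euclidean) of the nearest
    facility. Assumes coordinates already projected into km units;
    pop_cells are (x_km, y_km, population) tuples, facilities are
    (x_km, y_km). Toy illustration only.
    """
    covered = total = 0.0
    for x, y, pop in pop_cells:
        nearest = min(math.hypot(x - fx, y - fy) for fx, fy in facilities)
        total += pop
        if nearest <= radius_km:
            covered += pop
    return covered / total

# Made-up population cells and facility locations.
cells = [(0.0, 0.0, 100), (3.0, 4.0, 50), (10.0, 0.0, 50)]
posts = [(0.0, 0.0), (9.0, 9.0)]
print(fraction_within(cells, posts))  # 0.75
```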
DeWitt, William S; Lindau, Paul; Snyder, Thomas M; Sherwood, Anna M; Vignali, Marissa; Carlson, Christopher S; Greenberg, Philip D; Duerkopp, Natalie; Emerson, Ryan O; Robins, Harlan S
Full Text Available The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics.
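As an illustration of the kind of clonal diversity estimation such a repertoire database enables, here is the classic Chao1 lower-bound richness estimator applied to toy clone counts; this is not the estimator used in the paper.

```python
def chao1(clone_sizes):
    """Chao1 lower-bound estimate of total clonal richness.

    `clone_sizes` gives, for each observed clone, the number of
    sequences seen for it. Illustrative toy estimator, not the
    method used in the paper.
    """
    s_obs = len(clone_sizes)                    # clones observed
    f1 = sum(1 for c in clone_sizes if c == 1)  # singleton clones
    f2 = sum(1 for c in clone_sizes if c == 2)  # doubleton clones
    if f2:
        return s_obs + (f1 * f1) / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2            # bias-corrected variant

sizes = [1, 1, 1, 2, 2, 5, 9]  # made-up clone sizes: 7 clones observed
print(chao1(sizes))  # 9.25
```

Many rare (singleton) clones relative to doubletons push the estimate well above the observed count, reflecting unseen diversity.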
Full Text Available We describe the main principles of the formation of databases (DBs) with information about astronomical objects and their physical characteristics derived from observations obtained at the Crimean Astrophysical Observatory (CrAO) and published in the “Izvestiya of the CrAO” and elsewhere. Emphasis is placed on the DBs missing from the most complete global library of catalogs and data tables, VizieR (supported by the Center of Astronomical Data, Strasbourg). We give special consideration to the problem of forming a digital archive of observational data obtained at the CrAO as an interactive DB linked to database objects and publications. We present examples of all our DBs as elements integrated into the Crimean Astronomical Virtual Observatory. We illustrate work with the CrAO DBs using tools of the International Virtual Observatory (Aladin, VOPlot, VOSpec) in conjunction with the VizieR and Simbad DBs.
Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore
Identifying, amongst the millions of publications available in MEDLINE, those relevant to specific microRNAs (miRNAs) of interest by keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules, or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface that facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that intuitively illustrates the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues, and access to TarBase 6.0 data to oversee genes related to miRNA publications. mirPub is freely available at http://www.microrna.gr/mirpub/. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Thorson, James T; Cope, Jason M; Patrick, Wesley S
Single-species life history parameters are central to ecological research and management, including the fields of macro-ecology, fisheries science, and ecosystem modeling. However, there has been little independent evaluation of the precision and accuracy of the life history values in global and publicly available databases. We therefore develop a novel method based on a Bayesian errors-in-variables model that compares database entries with estimates from local experts, and we illustrate this process by assessing the accuracy and precision of entries in FishBase, one of the largest and oldest life history databases. This model distinguishes biases among seven life history parameters, two types of information available in FishBase (i.e., published values and those estimated from other parameters), and two taxa (i.e., bony and cartilaginous fishes) relative to values from regional experts in the United States, while accounting for additional variance caused by sex- and region-specific life history traits. For published values in FishBase, the model identifies a small positive bias in natural mortality and negative bias in maximum age, perhaps caused by unacknowledged mortality caused by fishing. For life history values calculated by FishBase, the model identified large and inconsistent biases. The model also demonstrates greatest precision for body size parameters, decreased precision for values derived from geographically distant populations, and greatest between-sex differences in age at maturity. We recommend that our bias and precision estimates be used in future errors-in-variables models as a prior on measurement errors. This approach is broadly applicable to global databases of life history traits and, if used, will encourage further development and improvements in these databases.
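The core comparison can be caricatured in a few lines: treat the log-ratio of a database entry to the corresponding expert estimate as bias plus noise, then estimate both. The paper fits a full Bayesian errors-in-variables model; the simulated data and method-of-moments estimates below are only a sketch of the idea, with made-up numbers.

```python
import math
import random

# Simulate 500 parameter comparisons: database value vs. expert value,
# on the log scale, with a true positive bias of 0.10 (invented values).
random.seed(1)
true_bias, noise_sd = 0.10, 0.20
log_ratios = [true_bias + random.gauss(0, noise_sd) for _ in range(500)]

# Method-of-moments estimates of bias (mean) and precision (spread).
est_bias = sum(log_ratios) / len(log_ratios)
est_sd = math.sqrt(
    sum((r - est_bias) ** 2 for r in log_ratios) / (len(log_ratios) - 1)
)
print(round(est_bias, 3), round(est_sd, 3))
```

An errors-in-variables model differs from this sketch mainly in also treating the expert values as noisy rather than as ground truth.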
Pierson, Kawika; Hand, Michael L; Thompson, Fred
Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has made its adoption costly and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967 to 2012, uses easy-to-understand natural-language variable names, and will be extended as new data become available.
Høstgaard, Anna Marie; Pape-Haugaard, Louise
Denmark has unique health informatics databases, e.g. "The Children's Database", which since 2009 has held data on all Danish children from birth until 17 years of age. In the current set-up a number of potential sources of error exist, both technical and human, which means that the data are flawed. This gives rise to erroneous statistics and makes the data unsuitable for research purposes. In order to make the data usable, it is necessary to develop new methods for validating the data generation process at the municipal, regional and national levels. In the present ongoing research project, two research areas are combined, Public Health Informatics and Computer Science, and both ethnographic and system engineering research methods are used. The project is expected to generate new generic methods and knowledge about electronic data collection and transmission in different social contexts and by different social groups, and thus to be of international importance, since this is sparsely documented from the Public Health Informatics perspective. This paper presents the preliminary results, which indicate that the health information technology in use ought to be redesigned, with a thorough insight into work practices as the point of departure.
Full Text Available Database Description General information: Database name: RED; alternative name: Rice Expression Database. Contact: Shoshi Kikuchi, Genome Research Unit (e-mail elided in source). Database classification: Plant databases - Rice; Microarray, Gene Expression. Organism: Oryza sativa (Taxonomy ID: 4530). Reference: "Rice Expression Database: the gateway to rice functional genomics", Trends in Plant Science (2002) Dec 7(12):563-564. External links: original website information; database maintenance site.
Marcella Costa RADAEL
Full Text Available Reproduction is a fundamental part of life, and studies related to fish reproduction are widely accessed. The aim of this study was to perform a bibliometric analysis intended to identify trends in this kind of publication. During June 2013, searches were performed on the Scopus database using the term "fish reproduction", and information was compiled and presented on the number of publications per year, publications by country, by author, by journal, by institution, and the most used keywords. Based on the study, the following results were obtained: Brazil occupies a prominent position in number of papers, and Brazilian participation in worldwide publishing output is increasing exponentially; within Brazil, articles are highly concentrated among the top 10 authors and institutions. The present study shows that the term "fish reproduction" has been the focus of many scientific papers, and that in Brazil there is a particular research effort on this subject, especially in the last few years. The main contribution concerns the use of bibliometric methods to describe the growth and concentration of research in the area of fish farming and reproduction.
Full Text Available Abstract Background Microarray core facilities are commonplace in biological research organizations, and need systems for accurately tracking various logistical aspects of their operation. Although these different needs could be handled separately, an integrated management system provides benefits in organization, automation and reduction of errors. Results We present SLIMarray (System for Lab Information Management of Microarrays), an open source, modular database web application capable of managing microarray inventories, sample processing and usage charges. The software allows modular configuration and is well suited for further development, giving users the flexibility to adapt it to their needs. SLIMarray Lite, a version of the software that is especially easy to install and run, is also available. Conclusion SLIMarray addresses the previously unmet need for free and open source software for managing the logistics of a microarray core facility.
Broschat Shira L
generated using stepwise discriminant analysis can be stored for analysis of subsequent experimental data. Additionally, PLASMID can be used to construct virtual microarrays with genomes from public databases, which can then be used to identify an optimal set of probes.
Amador Durán Sánchez
Full Text Available The aim of this study was to show the current state of scientific research on wine tourism by comparing the scientific information platforms WoS and Scopus and applying quantitative methods. For this purpose, a bibliometric study of the publications indexed in WoS and Scopus was conducted, analyzing the correlation between growth, coverage, overlap, dispersion and concentration of documents. The search process yielded a set of 238 articles across 122 different journals. Based on the results of the comparative study, we conclude that the WoS and Scopus databases differ in scope, data volume and coverage policies, with a high proportion of unique sources and articles, making the two complementary rather than mutually exclusive. Scopus covers the area of wine tourism better, as it includes a greater number of journals, papers and authors.
Eames, Evan; Semelin, Benoît
With current efforts inching closer to detecting the 21-cm signal from the Epoch of Reionization (EoR), proper preparation will require publicly available simulated models of the various forms the signal could take. In this work we present a database of such models, available at 21ssd.obspm.fr. The models are created with a fully-coupled radiative hydrodynamic simulation (LICORICE) at high resolution (1024³). We also begin to analyse and explore the possible 21-cm EoR signals (with power spectra and pixel distribution functions), and study the effects of thermal noise on our ability to recover the signal out to high redshifts. Finally, we begin to explore the concept of 'distance' between different models, which represents a crucial step towards optimising parameter-space sampling, training neural networks, and finally extracting parameter values from observations.
Full Text Available Abstract Background Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach, which may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries, based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Results Analyses of predicted EnsEMBL protein sequences of nine deuterostome species (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of misprediction are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON datasets.
Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert
As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advances in high-throughput gene expression microarray technology have inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and the public availability of data from many experiments have caused inconsistencies, which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard Microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples, from the NCI Thesaurus. Our research showed that the coverage of the NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to index breast cancer microarray samples from GEO using BCM-CO and the MGED ontology, and we developed a prototype system with a web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108
Full Text Available Database Description General information: Database name: RMOS; alternative name... Contact: Shoshi Kikuchi, ...arch Unit. Database classification: Plant databases - Rice Microarray Data and other Gene Expression Databases. Organism: Oryza sativa (Taxonomy ID: 4530). Database description: The Ric... Whole data download available. Referenced databases: Rice Expression Database (RED), Rice full-length cDNA Database (KOME), Rice Genome Integrated Map Database (INE), Rice Mutant Panel Database (Tos17), Rice Genome Annotation Database.
Nataša Logar Berginc
Full Text Available The article describes an analysis of automatic term recognition results performed for single- and multi-word terms with the LUIZ term extraction system. The target application of the results is a terminology database of Public Relations, and the main resource is the KoRP Public Relations Corpus. Our analysis is focused on two segments: (a) single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and whose termhood we evaluate on the basis of the judgements of two domain experts, and (b) multi-word term candidates with a verb or a noun as the headword. In order to better assess the performance of the system and the soundness of our approach, we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The most productive patterns of multi-word terms with a noun as the headword have the following structure: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.
Horler, R S P; Turner, A S; Fretter, P; Ambrose, M
SeedStor (https://www.seedstor.ac.uk) acts as the publicly available database for the seed collections held by the Germplasm Resources Unit (GRU) based at the John Innes Centre, Norwich, UK. The GRU is a national capability supported by the Biotechnology and Biological Sciences Research Council (BBSRC). The GRU curates germplasm collections of a range of temperate cereal, legume and Brassica crops and their associated wild relatives, as well as precise genetic stocks, near-isogenic lines and mapping populations. With >35,000 accessions, the GRU forms part of the UK's plant conservation contribution to the Multilateral System (MLS) of the International Treaty for Plant Genetic Resources for Food and Agriculture (ITPGRFA) for wheat, barley, oat and pea. SeedStor is a fully searchable system that allows our various collections to be browsed species by species through to complicated multipart phenotype criteria-driven queries. The results from these searches can be downloaded for later analysis or used to order germplasm via our shopping cart. The user community for SeedStor is the plant science research community, plant breeders, specialist growers, hobby farmers and amateur gardeners, and educationalists. Furthermore, SeedStor is much more than a database; it has been developed to act internally as a Germplasm Information Management System that allows team members to track and process germplasm requests, determine regeneration priorities, handle cost recovery and Material Transfer Agreement paperwork, manage the Seed Store holdings and easily report on a wide range of the aforementioned tasks. © The Author(s) 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Full Text Available A Review of: Hilbert, F., Barth, J., Gremm, J., Gros, D., Haiter, J., Henkel, M., Reinhardt, W., & Stock, W.G. (2015). Coverage of academic citation databases compared with coverage of scientific social media: personal publication lists as calibration parameters. Online Information Review 39(2): 255-264. http://dx.doi.org/10.1108/OIR-07-2014-0159 Objective – The purpose of this study was to explore coverage rates of information science publications in academic citation databases and scientific social media using a new method of personal publication lists as a calibration parameter. The research questions were: How many publications are covered in different databases, which database has the best coverage, which institutions are represented, and how does the language of publication play a role? Design – Bibliometric analysis. Setting – Academic citation databases (Web of Science, Scopus, Google Scholar) and scientific social media (Mendeley, CiteULike, BibSonomy). Subjects – 1,017 library and information science publications produced by 76 information scientists at 5 German-speaking universities in Germany and Austria. Methods – Only documents published between 1 January 2003 and 31 December 2012 were included. In that time the 76 information scientists had produced 1,017 documents. The information scientists confirmed that their publication lists were complete, and these served as the calibration parameter for the study. The citations from the publication lists were searched in three academic databases (Google Scholar, Web of Science (WoS), and Scopus) as well as three social media citation sites (Mendeley, CiteULike, and BibSonomy), and the results were compared. The publications were searched for by author name and words from the title. Main results – None of the databases investigated had 100% coverage. Among the academic databases, Google Scholar had the highest coverage with an average of 63%, Scopus an average of 31%, and
Full Text Available This study focuses on databases as they are regulated by Directive no. 96/9/EC on the legal protection of databases. There are also several references to Romanian Law no. 8/1996 on copyright and neighbouring rights, which implements the mentioned European Directive. The study analyses certain effects that the sui-generis protection has on the public domain, and it tries to demonstrate that the regulation specific to databases neglects the interests associated with the public domain. The effect of such a regulation is the abusive creation of databases in which the public domain (meaning information not protected by copyright, such as news, ideas, procedures, methods, systems, processes, concepts, principles and discoveries) ends up being encapsulated and made available only to some private interests, access to the public domain thus being regulated indirectly. The study begins by explaining the sui-generis right and its origin. The first mention of databases can be found in the "Green Paper on Copyright" (1988), a document that clearly shows that database protection was thought to cover a sphere of non-protectable information from the scientific and industrial fields. Several arguments are made by the author, most of them based on the report of the Public Consultation held in 2014 regarding the necessity of the sui-generis right. There are also references to specific case law, namely British Horseracing Board v William Hill and Fixtures Marketing Ltd. The ECJ's decision in that case is of great importance in supporting the public interest in accessing information in certain restricted fields that is derived from the maker's activities, because in the absence of the sui-generis right all this information could be freely accessed and used.
Walt, David R
This tutorial review describes how fibre optic microarrays can be used to create a variety of sensing and measurement systems. This review covers the basics of optical fibres and arrays, the different microarray architectures, and describes a multitude of applications. Such arrays enable multiplexed sensing for a variety of analytes including nucleic acids, vapours, and biomolecules. Polymer-coated fibre arrays can be used for measuring microscopic chemical phenomena, such as corrosion and localized release of biochemicals from cells. In addition, these microarrays can serve as a substrate for fundamental studies of single molecules and single cells. The review covers topics of interest to chemists, biologists, materials scientists, and engineers.
Full Text Available Abstract Background The criteria for choosing relevant cell lines among a vast panel of available intestinal-derived lines exhibiting a wide range of functional properties are still ill-defined. The objective of this study was, therefore, to establish objective criteria for choosing relevant cell lines to assess their appropriateness as tumor models as well as for drug absorption studies. Results We made use of publicly available expression signatures and cell-based functional assays to delineate differences between various intestinal colon carcinoma cell lines and normal intestinal epithelium. We compared a panel of intestinal cell lines with patient-derived normal and tumor epithelium and classified them according to traits relating to oncogenic pathway activity, epithelial-mesenchymal transition (EMT) and stemness, migratory properties, proliferative activity, transporter expression profiles and chemosensitivity. For example, SW480 represents an EMT-high, migratory phenotype and scored highest in terms of signatures associated with worse overall survival and higher risk of recurrence based on patient-derived databases. On the other hand, differentiated HT29 and T84 cells showed gene expression patterns closest to tumor bulk derived cells. Regarding drug absorption, we confirmed that differentiated Caco-2 cells are the model of choice for active uptake studies in the small intestine. Regarding chemosensitivity, we were unable to confirm a recently proposed association of chemo-resistance with EMT traits. However, a novel signature was identified through mining of NCI60 GI50 values that allowed us to rank the panel of intestinal cell lines according to their responsiveness to commonly used chemotherapeutics. Conclusions This study presents a straightforward strategy to exploit publicly available gene expression data to guide the choice of cell-based models. While this approach does not overcome the major limitations of such models
DNA Microarray Technology
Sumoza-Toledo, Adriana; Espinoza-Gabriel, Mario Iván; Montiel-Condado, Dvorak
Breast cancer is one of the most common malignancies affecting women. Recent investigations have revealed a major role of ion channels in cancer. The transient receptor potential melastatin-2 (TRPM2) is a plasma membrane and lysosomal channel with important roles in cell migration and cell death in immune cells and tumor cells. In this study, we investigated the prognostic value of the TRPM2 channel in breast cancer, analyzing public databases compiled in Oncomine™ (Thermo Fisher, Ann Arbor, MI) and the online Kaplan-Meier Plotter platform. The results revealed that TRPM2 mRNA overexpression is significant in in situ and invasive breast carcinoma compared to normal breast tissue. Furthermore, multi-gene validation using Oncomine™ showed that this channel is coexpressed with proteins related to cellular migration, transformation, and apoptosis. On the other hand, Kaplan-Meier analysis showed that low expression of TRPM2 could be used to predict poor outcome in ER- and HER2+ breast carcinoma patients. TRPM2 is a promising biomarker for the aggressiveness of breast cancer, and a potential target for the development of new therapies. Copyright © 2016 Hospital Infantil de México Federico Gómez. Published by Masson Doyma México S.A. All rights reserved.
Roth, Andrew; Kyzar, Evan J; Cachat, Jonathan; Stewart, Adam Michael; Green, Jeremy; Gaikwad, Siddharth; O'Leary, Timothy P; Tabakoff, Boris; Brown, Richard E; Kalueff, Allan V
Rodent self-grooming is an important, evolutionarily conserved behavior, highly sensitive to pharmacological and genetic manipulations. Mice with aberrant grooming phenotypes are currently used to model various human disorders. Therefore, it is critical to understand the biology of grooming behavior and to assess its translational validity to humans. The present in-silico study used publicly available gene expression and behavioral data obtained from several inbred mouse strains in the open-field, light-dark box, elevated plus- and elevated zero-maze tests. As grooming duration differed between strains, our analysis revealed several candidate genes with significant correlations between gene expression in the brain and grooming duration. The Allen Brain Atlas, STRING, GoMiner and Mouse Genome Informatics databases were used to functionally map and analyze these candidate mouse genes against their human orthologs, assessing the strain ranking of their expression and the regional distribution of expression in the mouse brain. This allowed us to identify an interconnected network of candidate genes whose expression levels correlate with grooming behavior, which display altered patterns of expression in key brain areas related to grooming, and which underlie important functions in the brain. Collectively, our results demonstrate the utility of large-scale, high-throughput data-mining and in-silico modeling for linking genomic and behavioral data, as well as their potential to identify novel neural targets for complex neurobehavioral phenotypes, including grooming. Copyright © 2012 Elsevier Inc. All rights reserved.
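A strain-level screen of this kind, ranking genes by the correlation between their brain expression and grooming duration across strains, can be sketched as follows (a minimal sketch; the gene names, expression values and grooming durations below are hypothetical, not data from the study):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical data: mean grooming duration (s) for five inbred strains,
# and brain expression of two candidate genes in the same strain order.
grooming = [12.0, 35.0, 20.0, 50.0, 8.0]
expr = {
    "geneA": [1.1, 2.9, 1.8, 4.2, 0.7],  # tracks grooming closely
    "geneB": [3.0, 2.0, 3.5, 1.0, 2.8],  # roughly anti-correlated
}

# Rank candidates by the strength of the expression/behavior correlation.
ranked = sorted(expr, key=lambda g: abs(pearson(expr[g], grooming)), reverse=True)
for gene in ranked:
    print(gene, round(pearson(expr[gene], grooming), 3))
```

In the study itself, candidates surviving such a screen were then mapped against brain-region and functional databases rather than taken at face value.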
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies of the species Canis lupus. Of these entries, 3646 were haplotypes, grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacking haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com . It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and in reconciling the existing annotation of HV1 582 bp sequences.
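The core haplotyping operation such a database supports, describing a sequence by its mutation profile relative to a reference and matching that profile against known haplotypes, can be sketched as follows (a minimal sketch; the 8 bp reference fragment, haplotype names and profiles are invented for illustration, whereas the real HV1 fragment is 582 bp):

```python
REFERENCE = "ACGTACGT"  # hypothetical reference fragment

def mutation_profile(seq, reference=REFERENCE):
    """Describe a sequence as its differences from the reference,
    e.g. ['3G>A'] for position 3 changed from G to A."""
    return ["%d%s>%s" % (i + 1, r, s)
            for i, (r, s) in enumerate(zip(reference, seq)) if r != s]

def assign_haplotype(seq, known):
    """Return the haplotype name whose mutation profile matches, or None
    for a new-haplotype candidate awaiting assignment."""
    return known.get(tuple(mutation_profile(seq)))

# Hypothetical table of previously assigned haplotypes.
known = {
    (): "H1",                # identical to the reference
    ("3G>A",): "H2",
    ("3G>A", "7G>C"): "H3",
}
print(assign_haplotype("ACATACGT", known))  # H2
print(assign_haplotype("ACGTACTT", known))  # None -> new candidate
```

The real workflow additionally has to reconcile conflicting GenBank annotations before such a lookup table can be trusted, which is what the CHD curation step does.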
Werner-Washburne, Margaret; Davidson, George S.
Collaboration between Sandia National Laboratories and the University of New Mexico Biology Department resulted in the capability to train students in microarray techniques and the interpretation of data from microarray experiments. These studies provide a better understanding of the role of stationary phase and the gene regulation involved in exit from stationary phase, which may eventually have important clinical implications. Importantly, this research trained numerous students and is the basis for three new Ph.D. projects.
Powell, Kimberly R; Peterson, Shenita R
Web of Science and Scopus are the leading databases of scholarly impact. Recent studies outside the field of nursing report differences between them in journal coverage and quality. A comparative analysis of nursing publications and their reported impact was performed. Journal coverage by each database for the field of nursing was compared. Additionally, publications by 2014 nursing faculty were collected from both databases and compared for overall coverage and reported quality, as modeled by SCImago Journal Rank, peer review status, and MEDLINE inclusion. Individual author impact, modeled by the h-index, was calculated in each database for comparison. Scopus offered significantly higher journal coverage. For 2014 faculty publications, 100% of journals were found in Scopus; Web of Science offered 82%. No significant difference was found in the quality of the covered journals. Author h-indices were found to be higher in Scopus. When reporting faculty publications and scholarly impact, academic nursing programs may be better represented by Scopus, without compromising journal quality. Programs with strong interdisciplinary work should examine all areas of strength to ensure appropriate coverage. Copyright © 2017 Elsevier Inc. All rights reserved.
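The h-index used here as the author-impact model is straightforward to compute from a list of per-publication citation counts; a minimal sketch (the citation counts below are hypothetical):

```python
def h_index(citations):
    """Return the h-index: the largest h such that h publications
    have at least h citations each."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Hypothetical citation counts for the same author as seen by each
# database; the counts differ because each indexes different citing works.
scopus_counts = [25, 18, 12, 7, 6, 5, 2, 1]
wos_counts = [20, 15, 9, 5, 4, 3, 1, 0]
print(h_index(scopus_counts))  # 5
print(h_index(wos_counts))     # 4
```

Because the metric depends only on citation counts the database can see, broader coverage mechanically raises the h-index, which is consistent with the Scopus-versus-Web-of-Science difference reported above.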
van Ginneken, Bram; Stegmann, Mikkel Bille; Loog, Marco
classification method that employs a multi-scale filter bank of Gaussian derivatives and a k-nearest-neighbors classifier. The methods have been tested on a publicly available database of 247 chest radiographs, in which all objects have been manually segmented by two human observers. A parameter optimization...
Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…
Sîrbu, Alina; Crane, Martin; Ruskin, Heather J
Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine
Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm by up to 15% in our benchmark tests. We demonstrated that the order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS and are especially good for imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.
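Neighbor-based imputation of the kind that LLS and iMISS refine can be sketched with a plain k-nearest-neighbors average over a single dataset (a simplified sketch; iMISS additionally derives the neighbor list consistently across multiple reference datasets, and the matrix below is invented):

```python
import math

def knn_impute(data, k=2):
    """Impute missing entries (None) in a genes x samples matrix by
    averaging, per missing cell, the values of the k genes whose
    expression profiles are closest over the mutually observed samples."""
    def dist(a, b):
        pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not pairs:
            return float('inf')
        return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

    imputed = [row[:] for row in data]
    for gi, row in enumerate(data):
        for si, val in enumerate(row):
            if val is None:
                # candidate neighbor genes that observed this sample
                cands = [(dist(row, other), other[si])
                         for gj, other in enumerate(data)
                         if gj != gi and other[si] is not None]
                cands.sort(key=lambda t: t[0])
                nearest = [v for _, v in cands[:k]]
                if nearest:
                    imputed[gi][si] = sum(nearest) / len(nearest)
    return imputed

# Invented 4-gene x 3-sample matrix with one missing value.
expr = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [5.0, 1.0, 0.5],
]
filled = knn_impute(expr, k=2)
print(round(filled[0][2], 2))  # 3.0, the average of the two closest profiles
```

LLS replaces the plain average with a least-squares regression on the neighbors, and iMISS in turn vets the neighbor list against external reference datasets; the selection step above is the part both build on.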
Full Text Available Abstract Background There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate the various reports of peanut DNA polymorphism into a single platform. Further, because of a lack of uniformity in the labeling of these markers across publications, there is some confusion about the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings We compiled 1,343 SSR markers detecting polymorphism (14.5%) within a total of 9,274 markers. Among all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed a higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. Correlating the length of SSRs with the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, could serve as a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement, and would thus be of value to breeders.
Gauthier, Nicholas Paul; Larsen, Malene Erup; Wernersson, Rasmus
The past decade has seen the publication of a large number of cell-cycle microarray studies and many more are in the pipeline. However, data from these experiments are not easy to access, combine and evaluate. We have developed a centralized database with an easy-to-use interface, Cyclebase...
Sochacki, Kyle R; Jack, Robert A; Safran, Marc R; Nho, Shane J; Harris, Joshua D
The purpose of this study was to compare (1) major complication, (2) revision, and (3) conversion to arthroplasty rates following hip arthroscopy between database studies and original research peer-reviewed publications. A systematic review was performed using PRISMA guidelines. PubMed, SCOPUS, SportDiscus, and the Cochrane Central Register of Controlled Trials were searched for studies that investigated major complication (dislocation, femoral neck fracture, avascular necrosis, fluid extravasation, septic arthritis, death), revision, and hip arthroplasty conversion rates following hip arthroscopy. Major complication, revision, and conversion to hip arthroplasty rates were compared between original research (single- or multicenter therapeutic studies) and database (insurance databases using ICD-9/10 and/or Current Procedural Terminology coding) studies. Two hundred seven studies (201 original research publications [15,780 subjects; 54% female] and 6 database studies [20,825 subjects; 60% female]) were analyzed (mean age, 38.2 ± 11.6 years old; mean follow-up, 2.7 ± 2.9 years). The database studies had a significantly higher age (40.6 ± 2.8 vs 35.4 ± 11.6), body mass index (27.4 ± 5.6 vs 24.9 ± 3.1), percentage of females (60.1% vs 53.8%), and longer follow-up (3.1 ± 1.6 vs 2.7 ± 3.0) compared with original research (P database studies (P = .029; relative risk [RR], 1.3). There was a significantly higher rate of femoral neck fracture (0.24% vs 0.03%; P database studies. Reoperations occurred at a significantly higher rate in the database studies (11.1% vs 7.3%; P database studies (8.0% vs 3.7%; P Database studies report significantly increased major complication, revision, and conversion to hip arthroplasty rates compared with original research investigations of hip arthroscopy outcomes. Level IV, systematic review of Level I-IV studies. Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
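The relative risk (RR) reported above is the ratio of event rates in the two groups; a minimal sketch (the counts below are hypothetical, chosen to mirror the quoted reoperation percentages of 11.1% vs 7.3%):

```python
def relative_risk(events_a, n_a, events_b, n_b):
    """Relative risk of an event in group A versus group B:
    the ratio of the two event rates."""
    return (events_a / n_a) / (events_b / n_b)

# Hypothetical counts: 111 reoperations per 1000 subjects in database
# studies vs 73 per 1000 in original research publications.
rr = relative_risk(111, 1000, 73, 1000)
print(round(rr, 2))  # 1.52
```

An RR above 1.0 means the event is more frequent in the first group, which is how the review expresses the excess revision rate seen in database studies.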
White, Amanda M.; Collett, James R.; Seurynck-Servoss, Shannon L.; Daly, Don S.; Zangar, Richard C.
Summary: ELISA-BASE is an open-source database for capturing, organizing and analyzing enzyme-linked immunosorbent assay (ELISA) microarray data. ELISA-BASE is an extension of the BioArray Software Environment (BASE) database system.
Wartzek, Tobias; Czaplik, Michael; Antink, Christoph Hoog; Eilebrecht, Benjamin; Walocha, Rafael; Leonhardt, Steffen
While PhysioNet is a large database of standard clinical vital signs measurements, no such database exists for unobtrusively measured signals. This inhibits progress in the vital area of signal processing for unobtrusive medical monitoring, as not everybody owns the specific measurement systems needed to acquire the signals. Furthermore, without a common database, a comparison between different signal processing approaches is not possible. This gap is closed by our UnoViS database. It contains different recordings in various scenarios, ranging from a clinical study to measurements obtained while driving a car. Currently, 145 records with a total of 16.2 h of measurement data are available, provided as MATLAB files or in the PhysioNet WFDB file format. In its initial state, only (multichannel) capacitive ECG and unobtrusive PPG signals are included, together with a reference ECG. All ECG signals contain annotations by a peak detector and by a medical expert. The dataset from the clinical study contains further clinical annotations. Additionally, supplementary functions are provided, which simplify the usage of the database and thus the development and evaluation of new algorithms. The development of urgently needed methods for very robust parameter extraction or robust signal fusion in view of frequent severe motion artifacts in unobtrusive monitoring is now possible with the database.
Rojas-Sola, J. I.; de San-Antonio-Gómez, C.
In this paper, the publications from Spanish institutions listed in journals of the Construction & Building Technology subject category of the Web of Science database for the period 1997-2008 are analyzed. The number of journals in which they were published is 35, and the number of articles (Article or Review) was 760. A bibliometric assessment was also carried out, and we propose two new parameters: the Weighted Impact Factor and the Relative Impact Factor; the study also includes the number of citations and the number of documents ...
Richard S. Segall
Full Text Available This paper provides a continuation and extension of previous research by Segall and Pierce (2009a), which discussed data mining of microarray databases of Leukemia cells, primarily with self-organized maps (SOM). As in Segall and Pierce (2009a) and Segall and Pierce (2009b), the results of applying data mining are shown and discussed for the microarray databases of HL60, Jurkat, NB4 and U937 Leukemia cells, which are also described in this article. First, a background section is provided on the work of others pertaining to the application of data mining to microarray databases of Leukemia cells and microarray databases in general. As noted in the predecessor article by Segall and Pierce (2009a), microarray databases are one of the most popular functional genomics tools in use today. The research in this paper is intended to use advanced data mining technologies for better interpretation and knowledge discovery from the patterns of gene expression of HL60, Jurkat, NB4 and U937 Leukemia cells. The advanced data mining entailed using other tools such as the cubic clustering criterion, variable importance rankings, decision trees, more detailed examinations of data mining statistics, and study of other self-organized map (SOM) clustering regions of the workspace as generated by SAS Enterprise Miner version 4. Conclusions and future directions of the research are also presented.
Torcellini, P. A.; Crawley, D. B.
To help capture valuable information on "green building" case studies, the U.S. Department of Energy has created an online database for collecting, standardizing, and disseminating information about high-performance green projects. The types of information collected include green features, design processes, energy performance, and comparisons to other high-performance green buildings.
Jacobs, Colin; Prokop, Mathias; Rikxoort, Eva M. van; Ginneken, Bram van; Murphy, Keelin; Schaefer-Prokop, Cornelia M.
To benchmark the performance of state-of-the-art computer-aided detection (CAD) of pulmonary nodules using the largest publicly available annotated CT database (LIDC/IDRI), and to show that CAD finds lesions not identified by the LIDC's four-fold double reading process. The LIDC/IDRI database contains 888 thoracic CT scans with a section thickness of 2.5 mm or lower. We report performance of two commercial and one academic CAD system. The influence of presence of contrast, section thickness, and reconstruction kernel on CAD performance was assessed. Four radiologists independently analyzed the false positive CAD marks of the best CAD system. The updated commercial CAD system showed the best performance with a sensitivity of 82 % at an average of 3.1 false positive detections per scan. Forty-five false positive CAD marks were scored as nodules by all four radiologists in our study. On the largest publicly available reference database for lung nodule detection in chest CT, the updated commercial CAD system locates the vast majority of pulmonary nodules at a low false positive rate. Potential for CAD is substantiated by the fact that it identifies pulmonary nodules that were not marked during the extensive four-fold LIDC annotation process. (orig.)
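A figure such as "82% sensitivity at 3.1 false positives per scan" describes one operating point of a CAD system; deriving such a point from scored CAD marks can be sketched as follows (a simplified sketch with invented marks; real evaluations use FROC analysis and match each mark to the annotated nodules):

```python
def operating_point(marks, threshold, n_nodules, n_scans):
    """marks: list of (score, hit) pairs, where hit is True if the mark
    detects a true nodule. Returns (sensitivity, false positives per
    scan) when only marks with score >= threshold are kept. Assumes each
    hit mark detects a distinct nodule."""
    kept = [(s, hit) for s, hit in marks if s >= threshold]
    tp = sum(1 for _, hit in kept if hit)
    fp = sum(1 for _, hit in kept if not hit)
    return tp / n_nodules, fp / n_scans

# Invented marks from 2 scans containing 5 annotated nodules in total.
marks = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
         (0.5, False), (0.4, True), (0.3, False), (0.2, False)]
sens, fp_rate = operating_point(marks, 0.4, n_nodules=5, n_scans=2)
print(sens, fp_rate)  # 0.8 sensitivity at 1.0 false positives per scan
```

Sweeping the threshold over all mark scores traces out the full FROC curve from which single operating points like the one quoted above are read off.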
Shafer, Robert W
Knowledge regarding the drug resistance of human immunodeficiency virus (HIV) is critical for surveillance of drug resistance, development of antiretroviral drugs, and management of infections with drug-resistant viruses. Such knowledge is derived from studies that correlate genetic variation in the targets of therapy with the antiretroviral treatments received by persons from whom the variant was obtained (genotype-treatment), with drug-susceptibility data on genetic variants (genotype-phenotype), and with virological and clinical response to a new treatment regimen (genotype-outcome). An HIV drug-resistance database is required to represent, store, and analyze the diverse forms of data underlying our knowledge of drug resistance and to make these data available to the broad community of researchers studying drug resistance in HIV and clinicians using HIV drug-resistance tests. Such genotype-treatment, genotype-phenotype, and genotype-outcome correlations are contained in the Stanford HIV RT and Protease Sequence Database, each serving a specific purpose.
librarians during one on one instruction, and the ability of users to browse the database. Correlation of the James A. Haley Veterans Hospital study findings...library to another, librarians must collect and study data about information gathering characteristics of their own users . (Harter and Jackson 1988...based training: improving the quality of end- user searching. The Journal of Academic Librarianship 17, no. 3: 152-56. Ciuffetti, Peter D. 1991a. A plea
Kumar, V.; Kalyane, V.L.; Prakasan, E.R.; Kumar, A.; Sagar, A.; Mohan, L.
Digital databases INIS (1970-2002), INSPEC (1969-2002), Chemical Abstracts (1977-2002), ISMEC (1973-June 2002), Web of Sciences (1974-2002), and Science Citation Index (1982-2002) were used for comprehensive retrieval of bibliographic details of research publications on Pressurized Heavy Water Reactor (PHWR) research. Among the countries contributing to PHWR research, India (with 1737 papers) is the forerunner, followed by Canada (1492), Romania (508) and Argentina (334). Collaboration of Canadian researchers with researchers of other countries resulted in 75 publications. Among the most productive researchers in this field, the first 15 are from India. The top three contributors to PHWR publications, with their respective authorship credits, are: H.S. Kushwaha (106), Anil Kakodkar (100) and V. Venkat Raj (76). Prominent interdomain interactions in PHWR subfields are: Specific nuclear reactors and associated plants with General studies of nuclear reactors (481), followed by Environmental sciences (185), and Materials science (154). The number of publications dealing with the geosciences aspect of environmental sciences is 141. Romania, Argentina, India and the Republic of Korea have mostly (≥75%) used non-conventional media for publications. Out of the 4851 publications, 1228 have been published in 292 distinct journals. The top journals publishing PHWR papers are: Radiation Protection and Environment (continued from: Bulletin of Radiation Protection since 1997), India (115); Nuclear Engineering International, UK (84); and Transactions of the American Nuclear Society, USA (68). (author)
Meneveau, Charles; Yang, Yunke; Perlman, Eric; Wan, Minpin; Burns, Randal; Szalay, Alex; Chen, Shiyi; Eyink, Gregory
A public database system archiving a direct numerical simulation (DNS) data set of isotropic, forced turbulence is used for studying basic turbulence dynamics. The data set consists of the DNS output on 1024-cubed spatial points and 1024 time-samples spanning about one large-scale turn-over timescale. This complete space-time history of turbulence is accessible to users remotely through an interface that is based on the Web-services model (see http://turbulence.pha.jhu.edu). Users may write and execute analysis programs on their host computers, while the programs make subroutine-like calls that request desired parts of the data over the network. The architecture of the database is briefly explained, as are some of the new functions such as Lagrangian particle tracking and spatial box-filtering. These tools are used to evaluate and compare subgrid stresses and models.
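The spatial box-filtering utility mentioned above can be sketched in outline. This is a minimal illustration of the filtering operation itself (a uniform average over a window, reduced to one dimension for brevity); the function name and data are our own and do not reflect the database's actual interface:

```python
# Sketch of spatial box filtering as used in subgrid-scale analysis:
# each filtered value is the average of the field over a window of
# width `width` centered on the point, with periodic boundaries.
def box_filter_1d(field, width):
    """Return the box-filtered field (width should be odd)."""
    n = len(field)
    half = width // 2
    filtered = []
    for i in range(n):
        window = [field[(i + j) % n] for j in range(-half, half + 1)]
        filtered.append(sum(window) / len(window))
    return filtered

velocity = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
print(box_filter_1d(velocity, 3))
```

Comparing the filtered field against the raw one is the starting point for evaluating subgrid stresses, since the subgrid stress is built from differences between filtered products and products of filtered quantities.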
U.S. Environmental Protection Agency — THIS DATA ASSET NO LONGER ACTIVE: This is metadata documentation for the National Priorities List (NPL) Publication Assistance Database (PAD), a Lotus Notes...
Full Text Available base Description General information of database Database name DGBY Alternative name Database...EL: +81-29-838-8066 E-mail: Database classification Microarray Data and other Gene Expression Databases Organism Taxonomy Name: Saccharomyces cerevisiae Taxonomy ID: 4932 Database descripti...-called phenomics). We uploaded these data on this website, which is designated DGBY (Database for Gene expres...ma J, Ando A, Takagi H. Journal: Yeast. 2008 Mar;25(3):179-90. External Links: Original website information Database
Chang, Hsiao-Ting; Lin, Ming-Hwai; Chen, Chun-Ku; Hwang, Shinn-Jang; Hwang, I-Hsuan; Chen, Yu-Chun
Academic publications are important for developing a medical specialty or discipline and improving the quality of care. As hospice palliative care medicine is a rapidly growing medical specialty in Taiwan, this study aimed to analyze the hospice palliative care-related publications from 1993 through 2013, both worldwide and in Taiwan, by using the Web of Science database. Academic articles published with topics including "hospice", "palliative care", "end of life care", and "terminal care" were retrieved and analyzed from the Web of Science database, which includes documents published in Science Citation Index-Expanded and Social Science Citation Indexed journals from 1993 to 2013. Compound annual growth rates (CAGRs) were calculated to evaluate the trends of publications. There were a total of 27,788 documents published worldwide during the years 1993 to 2013. The top five most prolific countries/areas with published documents were the United States (11,419 documents, 41.09%), England (3620 documents, 13.03%), Canada (2428 documents, 8.74%), Germany (1598 documents, 5.75%), and Australia (1580 documents, 5.69%). Three hundred and ten documents (1.12%) were published from Taiwan, which ranks second among Asian countries (after Japan, with 594 documents, 2.14%) and 16th in the world. During this 21-year period, the number of hospice palliative care-related article publications increased rapidly. The worldwide CAGR for hospice palliative care publications during 1993 through 2013 was 12.9%. As for Taiwan, the CAGR for publications during 1999 through 2013 was 19.4%. The majority of these documents were submitted from universities or hospitals affiliated with universities. The number of hospice palliative care-related publications increased rapidly from 1993 to 2013, both worldwide and in Taiwan; however, the number of publications from Taiwan is still far below those of several other countries. Further research is needed to identify and try to reduce the
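The compound annual growth rates (CAGRs) reported in bibliometric studies like this one follow a standard formula; a minimal sketch, with illustrative publication counts rather than the study's actual yearly totals:

```python
# Compound annual growth rate: CAGR = (end / start) ** (1 / years) - 1.
# The counts below are illustrative, not the study's actual data.
def cagr(start_value, end_value, years):
    """Annualized growth rate over the given number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# e.g. growth from 500 to 5660 publications over 20 years
growth = cagr(500, 5660, 20)
print(f"CAGR: {growth:.1%}")
```

A CAGR of 12.9% over 21 years, as reported worldwide here, corresponds to the publication count roughly multiplying by 1.129 each year.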
Kung, Yen-Ying; Hwang, Shinn-Jang; Li, Tsai-Feng; Ko, Seong-Gyu; Huang, Ching-Wen; Chen, Fang-Pey
Acupuncture is a rapidly growing medical specialty worldwide. This study aimed to analyze the acupuncture publications from 1988 to 2015 by using the Web of Science (WoS) database. Familiarity with the trend of acupuncture publications will facilitate a better understanding of existing academic research in acupuncture and its applications. Academic articles focusing on acupuncture were retrieved and analyzed from the WoS database, which includes articles published in Science Citation Index-Expanded and Social Science Citation Indexed journals from 1988 to 2015. A total of 7450 articles were published in the field of acupuncture during the period 1988-2015. Annual article publications increased from 109 in 1988 to 670 in 2015. The People's Republic of China (2076 articles, 27.9%), the USA (1638 articles, 22.0%) and South Korea (707 articles, 9.5%) were the most prolific countries. According to the WoS subject categories, 2591 articles (34.8%) were published in the category of Integrative and Complementary Medicine, followed by Neurosciences (1147 articles, 15.4%) and General Internal Medicine (918 articles, 12.3%). Kyung Hee University (South Korea) is the most prolific source organization of acupuncture publications (365 articles, 4.9%). The fields within acupuncture with the most cited articles included mechanisms, clinical trials, epidemiology, and new research methods of acupuncture. Publications associated with acupuncture increased rapidly from 1988 to 2015, and the applications of acupuncture extended across multiple fields of medicine. It is important to maintain and even nourish a certain quantity and quality of published acupuncture papers, which can play an important role in developing a medical discipline for acupuncture. Copyright © 2017. Published by Elsevier Taiwan LLC.
Richard S. Segall; Ryan M. Pierce
This paper provides continuation and extensions of previous research by Segall and Pierce (2009a) that discussed data mining for micro-array databases of Leukemia cells for primarily self-organized maps (SOM). As Segall and Pierce (2009a) and Segall and Pierce (2009b) the results of applying data mining are shown and discussed for the data categories of microarray databases of HL60, Jurkat, NB4 and U937 Leukemia cells that are also described in this article. First, a background section is pro...
Full Text Available There is clear demand for a global spatial public domain roads data set with improved geographic and temporal coverage, consistent coding of road types, and clear documentation of sources. The currently best available global public domain product covers only one-quarter to one-third of the existing road networks, and this varies considerably by region. Applications for such a data set span multiple sectors and would be particularly valuable for the international economic development, disaster relief, and biodiversity conservation communities, not to mention national and regional agencies and organizations around the world. The building blocks for such a global product are available for many countries and regions, yet thus far there has been neither strategy nor leadership for developing it. This paper evaluates the best available public domain and commercial data sets, assesses the gaps in global coverage, and proposes a number of strategies for filling them. It also identifies stakeholder organizations with an interest in such a data set that might either provide leadership or funding for its development. It closes with a proposed set of actions to begin the process.
Full Text Available Abstract Background Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most widely studied groups of fish. Results 298,304 expressed sequence tags (ESTs) from Atlantic salmon (69% of the total), 11,664 chinook, 10,813 sockeye, 10,051 brook trout, 10,975 grayling, 8,630 lake whitefish, and 3,624 northern pike ESTs were obtained in this study and have been deposited into the public databases. Contigs were built and putative full-length Atlantic salmon clones have been identified. A database containing ESTs, assemblies, consensus sequences, open reading frames, gene predictions and putative annotation is available. The overall similarity between Atlantic salmon ESTs and those of rainbow trout, chinook, sockeye, brook trout, grayling, lake whitefish, northern pike and rainbow smelt is 93.4, 94.2, 94.6, 94.4, 92.5, 91.7, 89.6, and 86.2% respectively. An analysis of 78 transcript sets shows Salmo as a sister group to Oncorhynchus and Salvelinus within Salmoninae, and Thymallinae as a sister group to Salmoninae and Coregoninae within Salmonidae. Extensive gene duplication is consistent with a genome duplication in the common ancestor of salmonids. Using all of the available EST data, a new expanded salmonid cDNA microarray of 32,000 features was created. Cross-species hybridizations to this cDNA microarray indicate that this resource will be useful for studies of all 68 salmonid species. Conclusion An extensive collection and analysis of salmonid RNA putative transcripts indicate that Pacific salmon, Atlantic salmon and charr are 94-96% similar while the more distant whitefish, grayling, pike and smelt are 93, 92, 89 and 86% similar to salmon. The salmonid transcriptome reveals a complex history of gene duplication that is
Eduardo Luís Hepper
Full Text Available Brazil is going through a period of reflection on the preservation of natural resources, an issue that figures increasingly on its agenda. The search for balance among environmental, social and economic aspects has been a challenge for business survival over the years and has led companies to adopt initiatives focused on sustainability. The objective of this article is to analyse how the international scientific production addresses sustainable practices and initiatives and their relationship with organizational performance. Within this scope, a bibliometric study was conducted of the publications indexed in the Web of Science - Social Sciences Citation Index (WoS-SSCI). Thirty-three articles on the subject were identified and selected. The journals that stand out in number of articles and number of citations are the Journal of Cleaner Production and the Strategic Management Journal, respectively. Analysis of the results reveals a growing concern with this issue and an increase in publications after the 2000s. In general, the results associate sustainable practices with positive organizational performance, such as increased profit on the product sold, quality improvement, improved reputation, and waste reduction, among other gains identified.
Skripcak, Tomas; Belka, Claus; Bosch, Walter; Brink, Carsten; Brunner, Thomas; Budach, Volker; Büttner, Daniel; Debus, Jürgen; Dekker, Andre; Grau, Cai; Gulliford, Sarah; Hurkmans, Coen; Just, Uwe
Disconnected cancer research data management and lack of information exchange about planned and ongoing research are complicating the utilisation of internationally collected medical information for improving cancer patient care. Rapidly collecting/pooling data can accelerate translational research in radiation therapy and oncology. The exchange of study data is one of the fundamental principles behind data aggregation and data mining. The possibilities of reproducing the original study results, performing further analyses on existing research data to generate new hypotheses, or developing computational models to support medical decisions (e.g. risk/benefit analysis of treatment options) represent just a fraction of the potential benefits of medical data-pooling. Distributed machine learning and knowledge exchange from federated databases can be considered one among other attractive approaches for knowledge generation within “Big Data”. Data interoperability between research institutions should be the major concern behind a wider collaboration. Information captured in electronic patient records (EPRs) and study case report forms (eCRFs), linked together with medical imaging and treatment planning data, is deemed to be fundamental for large multi-centre studies in the field of radiation therapy and oncology. To fully utilise the captured medical information, the study data have to be more than just an electronic version of a traditional (un-modifiable) paper CRF. Challenges that have to be addressed are data interoperability, utilisation of standards, data quality and privacy concerns, data ownership, rights to publish, data pooling architecture and storage. This paper discusses a framework for conceptual packages of ideas focused on a strategic development for international research data exchange in the field of radiation therapy and oncology.
Skripcak, Tomas; Belka, Claus; Bosch, Walter; Brink, Carsten; Brunner, Thomas; Budach, Volker; Büttner, Daniel; Debus, Jürgen; Dekker, Andre; Grau, Cai; Gulliford, Sarah; Hurkmans, Coen; Just, Uwe; Krause, Mechthild; Lambin, Philippe; Langendijk, Johannes A; Lewensohn, Rolf; Lühr, Armin; Maingon, Philippe; Masucci, Michele; Niyazi, Maximilian; Poortmans, Philip; Simon, Monique; Schmidberger, Heinz; Spezi, Emiliano; Stuschke, Martin; Valentini, Vincenzo; Verheij, Marcel; Whitfield, Gillian; Zackrisson, Björn; Zips, Daniel; Baumann, Michael
Disconnected cancer research data management and lack of information exchange about planned and ongoing research are complicating the utilisation of internationally collected medical information for improving cancer patient care. Rapidly collecting/pooling data can accelerate translational research in radiation therapy and oncology. The exchange of study data is one of the fundamental principles behind data aggregation and data mining. The possibilities of reproducing the original study results, performing further analyses on existing research data to generate new hypotheses, or developing computational models to support medical decisions (e.g. risk/benefit analysis of treatment options) represent just a fraction of the potential benefits of medical data-pooling. Distributed machine learning and knowledge exchange from federated databases can be considered one among other attractive approaches for knowledge generation within "Big Data". Data interoperability between research institutions should be the major concern behind a wider collaboration. Information captured in electronic patient records (EPRs) and study case report forms (eCRFs), linked together with medical imaging and treatment planning data, is deemed to be fundamental for large multi-centre studies in the field of radiation therapy and oncology. To fully utilise the captured medical information, the study data have to be more than just an electronic version of a traditional (un-modifiable) paper CRF. Challenges that have to be addressed are data interoperability, utilisation of standards, data quality and privacy concerns, data ownership, rights to publish, data pooling architecture and storage. This paper discusses a framework for conceptual packages of ideas focused on a strategic development for international research data exchange in the field of radiation therapy and oncology. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Full Text Available An important stage in microarray image analysis is gridding. Microarray image gridding is performed to locate the subarrays in a microarray image and to find the co-ordinates of the spots within each subarray. For accurate identification of spots, most of the proposed gridding methods require human intervention. In this paper a fully automatic gridding method is used, which enhances spot intensity in the preprocessing step by means of a histogram-based threshold method. The gridding step finds the co-ordinates of spots from the horizontal and vertical profiles of the image. To correct errors due to grid line placement, a grid line refinement technique is proposed. The algorithm is applied to different image databases and the results are compared based on spot detection accuracy and time. An average spot detection accuracy of 95.06% demonstrates the proposed method's flexibility and accuracy in finding the spot co-ordinates for different database images.
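The profile-based gridding idea can be sketched as follows. This is a simplified illustration under our own assumptions (a tiny synthetic image and an arbitrary threshold), not the paper's actual algorithm or its refinement step:

```python
# Sketch of profile-based microarray gridding: sum intensities along
# rows and columns, then place grid lines at low-intensity valleys
# between spot peaks. The tiny synthetic "image" is illustrative only.
def profile(image, axis):
    """Sum pixel intensities along rows (axis=0) or columns (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]

def grid_lines(prof, threshold):
    """Indices where the profile dips below threshold (gaps between spots)."""
    return [i for i, v in enumerate(prof) if v < threshold]

image = [
    [9, 9, 0, 9, 9],
    [9, 9, 0, 9, 9],
    [0, 0, 0, 0, 0],
    [9, 9, 0, 9, 9],
]
print(grid_lines(profile(image, 1), 5))  # vertical grid line positions
print(grid_lines(profile(image, 0), 5))  # horizontal grid line positions
```

In a real image the valleys are noisy rather than exactly zero, which is why a preprocessing threshold and a grid line refinement step, as the paper proposes, are needed on top of this basic projection idea.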
Full Text Available Abstract Background Phytohormones organize plant development and environmental adaptation through cell-to-cell signal transduction, and their action involves transcriptional activation. Recent international efforts to establish and maintain public databases of Arabidopsis microarray data have enabled the utilization of this data in the analysis of various phytohormone responses, providing genome-wide identification of promoters targeted by phytohormones. Results We utilized such microarray data for prediction of cis-regulatory elements with an octamer-based approach. Our test prediction of a drought-responsive RD29A promoter with the aid of microarray data for response to drought, ABA and overexpression of DREB1A, a key regulator of cold and drought response, provided reasonable results that fit with the experimentally identified regulatory elements. With this succession, we expanded the prediction to various phytohormone responses, including those for abscisic acid, auxin, cytokinin, ethylene, brassinosteroid, jasmonic acid, and salicylic acid, as well as for hydrogen peroxide, drought and DREB1A overexpression. Totally 622 promoters that are activated by phytohormones were subjected to the prediction. In addition, we have assigned putative functions to 53 octamers of the Regulatory Element Group (REG that have been extracted as position-dependent cis-regulatory elements with the aid of their feature of preferential appearance in the promoter region. Conclusions Our prediction of Arabidopsis cis-regulatory elements for phytohormone responses provides guidance for experimental analysis of promoters to reveal the basis of the transcriptional network of phytohormone responses.
Redi, Judith; Liu, Hantao; Alers, Hani; Zunino, Rodolfo; Heynderickx, Ingrid
The Single Stimulus (SS) method is often chosen to collect subjective data for testing no-reference objective metrics, as it is straightforward to implement and well standardized. At the same time, it exhibits some drawbacks: the spread between different assessors is relatively large, and the measured ratings depend on the quality range spanned by the test samples, hence the results from different experiments cannot easily be merged. The Quality Ruler (QR) method has been proposed to overcome these inconveniences. This paper compares the performance of the SS and QR methods for pictures impaired by Gaussian blur. The research goal is, on one hand, to analyze the advantages and disadvantages of both methods for quality assessment and, on the other, to make quality data of blur-impaired images publicly available. The obtained results show that the confidence intervals of the QR scores are narrower than those of the SS scores. This indicates that the QR method enhances consistency across assessors. Moreover, QR scores exhibit a higher linear correlation with the distortion applied. In summary, for the purpose of building datasets of subjective quality, the QR approach seems promising from the viewpoint of both consistency and repeatability.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
more than 10% over the standard classification models, which can be translated to the correct labeling of an additional 400-500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise awareness of collecting data on additional markers and developing the necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
Zhang, Zhe; Fenstermacher, David
Analyzing microarray data across multiple experiments has been proven advantageous. To support this kind of analysis, we are developing a software system called MAMA (Meta-Analysis of MicroArray data). MAMA utilizes a client-server architecture with a relational database on the server-side for the storage of microarray datasets collected from various resources. The client-side is an application running on the end user's computer that allows the user to manipulate microarray data and analytical results locally. MAMA implementation will integrate several analytical methods, including meta-analysis within an open-source framework offering other developers the flexibility to plug in additional statistical algorithms.
Guz, A. N.; Rushchitsky, J. J.
The paper analyzes the level of coverage and citation of publications by mechanicians of the National Academy of Sciences of Ukraine (NASU) in the Scopus database. Two groups of mechanicians are considered. One group includes 66 doctors of sciences of the S. P. Timoshenko Institute of Mechanics as representatives of the oldest institute of the NASU. The other group includes 34 members (academicians and corresponding members) of the Division of Mechanics of the NASU as representatives of the authoritative community of mechanicians in Ukraine. The results are presented for each scientist in the form of two indices: the total number of publications accessible in the database, as the level of coverage of the scientist's publications in this database, and the h-index, as the citation level of these publications. This paper may be considered a continuation of the papers [6-12] published in Prikladnaya Mekhanika (International Applied Mechanics) in 2005-2009.
Full Text Available Abstract Background The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include: confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serving as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records. Methods Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set at between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy. Results Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression.
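In de-identification work of this kind, the probability of correctly re-identifying a record is commonly taken as one over the size of the record's equivalence class on the quasi-identifiers. A minimal sketch with toy records; the field names, values, and flagging logic are our own assumptions, not CIHI's actual schema or suppression algorithm:

```python
from collections import Counter

# Sketch: re-identification risk for a record = 1 / (size of its
# equivalence class on the quasi-identifiers). Records whose risk
# exceeds the threshold would need some value suppressed or generalized.
def reid_risk(records, quasi_ids):
    keys = [tuple(r[q] for q in quasi_ids) for r in records]
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]

records = [
    {"region": "A", "age_group": "60-69", "diagnosis": "I21"},
    {"region": "A", "age_group": "60-69", "diagnosis": "J18"},
    {"region": "B", "age_group": "20-29", "diagnosis": "S72"},
]
risks = reid_risk(records, ["region", "age_group"])
flagged = [r for r, p in zip(records, risks) if p > 0.05]
print(risks)
```

At the study's threshold of 0.05, a record is acceptable only when at least 20 records share its quasi-identifier combination, which is why small regions and rare age groups see the highest suppression rates.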
Sambrook, Joseph; Bowtell, David
.... DNA Microarrays provides authoritative, detailed instruction on the design, construction, and applications of microarrays, as well as comprehensive descriptions of the software tools and strategies...
Saccone, Scott F [Washington University, St. Louis; Chesler, Elissa J [ORNL; Bierut, Laura J [Washington University, St. Louis; Kalivas, Peter J [Medical College of South Carolina, Charleston; Lerman, Caryn [University of Pennsylvania; Saccone, Nancy L [Washington University, St. Louis; Uhl, George R [Johns Hopkins University; Li, Chuan-Yun [Peking University; Philip, Vivek M [ORNL; Edenberg, Howard [Indiana University; Sherry, Steven [National Center for Biotechnology Information; Feolo, Michael [National Center for Biotechnology Information; Moyzis, Robert K [Johns Hopkins University; Rutter, Joni L [National Institute of Drug Abuse
Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
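The LD-tagging annotation described above rests on the squared correlation (r²) between genotype vectors: a SNP is considered tagged when its r² with some SNP on the array meets a threshold (0.8 is a common choice, though the database's exact cutoff is not stated here). A hedged sketch with made-up genotype vectors coded as minor-allele counts (0/1/2):

```python
# Sketch of LD-based tag assessment: a candidate SNP counts as tagged
# by a microarray if its squared correlation (r^2) with any array SNP
# meets a threshold. Genotypes are minor-allele counts; data is made up.
def r_squared(g1, g2):
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2)) / n
    v1 = sum((a - m1) ** 2 for a in g1) / n
    v2 = sum((b - m2) ** 2 for b in g2) / n
    return cov ** 2 / (v1 * v2)

def is_tagged(candidate, array_snps, threshold=0.8):
    return any(r_squared(candidate, g) >= threshold for g in array_snps)

candidate = [0, 1, 2, 0, 1, 2]
array_snps = [[0, 1, 2, 0, 1, 2], [2, 1, 0, 2, 1, 0]]
print(is_tagged(candidate, array_snps))
```

SNPs in biologically prioritized genes that fail this test for every array SNP are exactly the ones the database flags as requiring supplemental genotyping.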
Grimm, E.C.; Bradshaw, R.H.W; Brewer, S.; Flantua, S.; Giesecke, T.; Lézine, A.M.; Takahara, H.; Williams, J.W.,Jr; Elias, S.A.; Mock, C.J.
During the past 20 years, several pollen database cooperatives have been established. These databases are now constituent databases of the Neotoma Paleoecology Database, a public domain, multiproxy, relational database designed for Quaternary-Pliocene fossil data and modern surface samples. The
Moo-Young, Tricia A; Panergo, Jessel; Wang, Chih E; Patel, Subhash; Duh, Hong Yan; Winchester, David J; Prinz, Richard A; Fogelfeld, Leon
Clinicopathologic variables influence the treatment and prognosis of patients with thyroid cancer. A retrospective analysis of public hospital thyroid cancer database and the Surveillance, Epidemiology and End Results 17 database was conducted. Demographic, clinical, and pathologic data were compared across ethnic groups. Within the public hospital database, Hispanics versus non-Hispanic whites were younger and had more lymph node involvement (34% vs 17%, P ethnic groups. Similar findings were demonstrated within the Surveillance, Epidemiology and End Results database. African Americans aged ethnic groups. Such disparities persist within an equal-access health care system. These findings suggest that factors beyond socioeconomics may contribute to such differences. Copyright © 2013 Elsevier Inc. All rights reserved.
Full Text Available β-lactam is the most used antibiotic class in the clinical area; it acts by blocking bacterial cell wall synthesis, causing cell death. However, some bacteria have evolved resistance to these antibiotics, mainly due to the production of enzymes known as β-lactamases. Hospital sewage is an important source of dispersion of multidrug-resistant bacteria in rivers and oceans. In this work, we used next-generation DNA sequencing to explore the diversity and dissemination of serine β-lactamases in two hospital sewage samples from Rio de Janeiro, Brazil (South Zone, SZ, and North Zone, NZ), presenting different profiles, and to compare them with publicly available environmental data. We also propose a Hidden-Markov-Model approach to screen potential serine β-lactamase genes (in public environmental samples and the generated hospital sewage data), exploring their evolutionary relationships. Due to the high variability of β-lactamases, we used a position-specific scoring matrix search method (RPS-BLAST) against conserved domain database profiles (CDD, Pfam, and COG), followed by visual inspection to detect conserved motifs, to increase the reliability of the results and remove possible false positives. We were able to identify novel β-lactamases from Brazilian hospital sewage and to estimate the relative abundance of their types. The highest relative abundance found in SZ was Class A (50%), while Class D is predominant in NZ (55%). The CfxA (65%) and ACC (47%) types were the most abundant genes detected in SZ, while in NZ the most frequent were OXA-10 (32%), CfxA (28%), ACC (21%), CEPA (20%), and FOX (19%). Phylogenetic analysis revealed that β-lactamases from Brazilian hospital sewage grouped in the same clade, close to sequences belonging to the Firmicutes and Bacteroidetes groups but distant from potential β-lactamases screened from public environmental data, which grouped closer to β-lactamases of Proteobacteria. Our results demonstrated that the HMM-based approach identified homologs of
Fróes, Adriana M; da Mota, Fábio F; Cuadrat, Rafael R C; Dávila, Alberto M R
β-lactams are the most widely used antibiotic class in clinical practice; they act by blocking bacterial cell wall synthesis, causing cell death. However, some bacteria have evolved resistance to these antibiotics, mainly through the production of enzymes known as β-lactamases. Hospital sewage is an important source of dispersion of multidrug-resistant bacteria into rivers and oceans. In this work, we used next-generation DNA sequencing to explore the diversity and dissemination of serine β-lactamases in sewage from two hospitals in Rio de Janeiro, Brazil (South Zone, SZ and North Zone, NZ), which present different profiles, and to compare them with publicly available environmental data. We also propose a Hidden Markov Model approach to screen for potential serine β-lactamase genes (in public environmental samples and the generated hospital sewage data), exploring their evolutionary relationships. Because of the high variability of β-lactamases, we used a position-specific scoring matrix search method (RPS-BLAST) against conserved domain database profiles (CDD, Pfam, and COG), followed by visual inspection to detect conserved motifs, to increase the reliability of the results and remove possible false positives. We were able to identify novel β-lactamases from Brazilian hospital sewage and to estimate the relative abundance of their types. The highest relative abundance found in SZ was Class A (50%), while Class D predominated in NZ (55%). CfxA (65%) and ACC (47%) were the most abundant gene types detected in SZ, while in NZ the most frequent were OXA-10 (32%), CfxA (28%), ACC (21%), CEPA (20%), and FOX (19%). Phylogenetic analysis revealed that β-lactamases from Brazilian hospital sewage grouped in the same clade, close to sequences belonging to the Firmicutes and Bacteroidetes groups but distant from potential β-lactamases screened from public environmental data, which grouped closer to β-lactamases of Proteobacteria. Our results demonstrated that the HMM-based approach identified homologs of
Wernersson, Rasmus; Juncker, Agnieszka; Nielsen, Henrik Bjørn
Nucleotide abundance measurements using DNA microarray technology are possible only if appropriate probes complementary to the target nucleotides can be identified. Here we present a protocol for selecting DNA probes for microarrays using the OligoWiz application. OligoWiz is a client-server application that offers a detailed graphical interface and real-time user interaction on the client side, and massive computer power and a large collection of species databases (400, summer 2007) on the server side. Probes are selected according to five weighted scores: cross-hybridization, deltaT(m), folding, … computer skills and can be executed from any Internet-connected computer. The probe selection procedure for a standard microarray design targeting all yeast transcripts can be completed in 1 h.
Juri P. Kurhinen
Full Text Available Provides information about the results of the international scientific seminar «Chronicle of Nature – a common database for scientific analysis and joint planning of scientific publications», held within the Finnish-Russian project «Linking environmental change to biodiversity change: large-scale analysis of Eurasian ecosystems».
Dehghan Khalilabad, Nastaran; Hassanpour, Hamid
Microarray technology is a powerful genomic tool for simultaneously studying and analyzing the behavior of thousands of genes. The analysis of images obtained from this technology plays a critical role in the detection and treatment of diseases. The aim of the current study is to develop an automated system for analyzing data from microarray images in order to detect cancerous cases. The proposed system consists of three main phases, namely image processing, data mining, and detection of the disease. The image processing phase performs operations such as refining image rotation, gridding (locating genes), and extracting raw data from images; the data mining phase includes normalizing the extracted data and selecting the more effective genes. Finally, cancerous cells are recognized from the extracted data. To evaluate the performance of the proposed system, microarray databases of breast cancer, myeloid leukemia, and lymphoma from the Stanford Microarray Database are employed. The results indicate that the proposed system is able to identify the type of cancer from these data sets with accuracies of 95.45%, 94.11%, and 100%, respectively. Copyright © 2017 Elsevier Ltd. All rights reserved.
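The abstract above leaves the final detection phase unspecified. As a purely illustrative sketch (not the paper's actual classifier), a minimal nearest-centroid rule over selected gene-expression features could look like this; the class labels and vectors are hypothetical:

```python
from collections import defaultdict

def train_centroids(samples, labels):
    """Average the expression vectors of each class into a per-class centroid."""
    grouped = defaultdict(list)
    for vec, lab in zip(samples, labels):
        grouped[lab].append(vec)
    return {lab: [sum(col) / len(col) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}

def classify(sample, centroids):
    """Assign the sample to the class whose centroid is nearest (squared Euclidean)."""
    def dist(lab):
        return sum((a - b) ** 2 for a, b in zip(sample, centroids[lab]))
    return min(centroids, key=dist)
```

A sample is labeled with whichever class centroid lies closest in expression space; real systems would typically use a stronger model trained on many more genes.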
The amount of microarray gene expression data in public repositories has been increasing exponentially for the last couple of decades. High-throughput microarray data integration and analysis has become a critical step in exploring the large amount of expression data for biological discovery. However…
Beilharz, Traude H; Preiss, Thomas
Nearly all eukaryotic mRNAs terminate in a poly(A) tail that serves important roles in mRNA utilization. In the cytoplasm, the poly(A) tail promotes both mRNA stability and translation, and these functions are frequently regulated through changes in tail length. To identify the scope of poly(A) tail length control in a transcriptome, we developed the polyadenylation state microarray (PASTA) method. It involves the purification of mRNA based on poly(A) tail length using thermal elution from poly(U) sepharose, followed by microarray analysis of the resulting fractions. In this chapter we detail our PASTA approach and describe some methods for bulk and mRNA-specific poly(A) tail length measurements of use to monitor the procedure and independently verify the microarray data.
Friedman, Debra; Hoffman, Phillip
Describes creation of a relational database at the University of Washington supporting ongoing academic planning at several levels and affecting the culture of decision making. Addresses getting started; sharing the database; questions, worries, and issues; improving access to high-demand courses; the advising function; management of instructional…
Sihtmäe, Mariliis; Blinova, Irina; Aruoja, Villem; Dubourguier, Henri-Charles; Legrand, Nicolas; Kahru, Anne
A new open-access online database, E-SovTox, is presented. E-SovTox provides toxicological data for substances relevant to the EU Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system, from publicly-available Russian language data sources. The database contains information selected mainly from scientific journals published during the Soviet Union era. The main information source for this database - the journal, Gigiena Truda i Professional'nye Zabolevania [Industrial Hygiene and Occupational Diseases], published between 1957 and 1992 - features acute, but also chronic, toxicity data for numerous industrial chemicals, e.g. for rats, mice, guinea-pigs and rabbits. The main goal of the abovementioned toxicity studies was to derive the maximum allowable concentration limits for industrial chemicals in the occupational health settings of the former Soviet Union. Thus, articles featured in the database include mostly data on LD50 values, skin and eye irritation, skin sensitisation and cumulative properties. Currently, the E-SovTox database contains toxicity data selected from more than 500 papers covering more than 600 chemicals. The user is provided with the main toxicity information, as well as abstracts of these papers in Russian and in English (given as provided in the original publication). The search engine allows cross-searching of the database by the name or CAS number of the compound, and the author of the paper. The E-SovTox database can be used as a decision-support tool by researchers and regulators for the hazard assessment of chemical substances. 2010 FRAME.
Frijters, Raoul; Heupers, Bart; van Beek, Pieter; Bouwhuis, Maurice; van Schaik, René; de Vlieg, Jacob; Polman, Jan; Alkema, Wynand
Medline is a rich information source, from which links between genes and keywords describing biological processes, pathways, drugs, pathologies and diseases can be extracted. We developed a publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of keywords from several biomedical thesauri that are significantly correlated with the set of input genes. These lists link to Medline abstracts in which the co-occurring input genes and correlated keywords are highlighted. Furthermore, CoPub can graphically visualize differentially expressed genes and over-represented keywords in a network, providing detailed insight in the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs. CoPub is freely accessible at http://services.nbic.nl/cgi-bin/copub/CoPub.pl.
Full Text Available Abstract Background Microarrays are routinely used to assess mRNA transcript levels on a genome-wide scale. Large amounts of microarray data are now available in several databases, and new experiments are constantly being performed. In spite of this fact, few and limited tools exist for quickly and easily analyzing the results. Microarray analysis can be challenging for researchers without the necessary training, and it can be time-consuming for service providers with many users. Results To address these problems we have developed an automated microarray data analysis (AMDA) software package, which provides scientists with an easy and integrated system for the analysis of Affymetrix microarray experiments. AMDA is free and is available as an R package. It is based on the Bioconductor project, which provides a number of powerful bioinformatics and microarray analysis tools. This automated pipeline integrates different functions available in the R and Bioconductor projects with newly developed functions. AMDA covers all of the steps of a full data analysis, including image analysis, quality control, normalization, selection of differentially expressed genes, clustering, correspondence analysis and functional evaluation. Finally, a LaTeX document is dynamically generated depending on the analysis steps performed. The generated report contains comments and analysis results as well as references to several files for deeper investigation. Conclusion AMDA is freely available as an R package under the GPL license. The package as well as an example analysis report can be downloaded in the Services/Bioinformatics section of the Genopolis http://www.genopolis.it/
Full Text Available Abstract Background DNA microarrays are used to produce large sets of expression measurements from which specific biological information is sought. Their analysis requires efficient and reliable algorithms for dimensional reduction, classification and annotation. Results We study networks of co-expressed genes obtained from DNA microarray experiments. The mathematical concept of curvature on graphs is used to group genes or samples into clusters to which relevant gene or sample annotations are automatically assigned. Application to publicly available yeast and human lymphoma data demonstrates the reliability of the method in spite of its simplicity, especially with respect to the small number of parameters involved. Conclusions We provide a method for automatically determining relevant gene clusters among the many genes monitored with microarrays. The automatic annotations and the graphical interface improve the readability of the data. A C++ implementation, called Trixy, is available from http://tagc.univ-mrs.fr/bioinformatics/trixy.html.
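The abstract above does not reproduce its exact notion of curvature on graphs, so the following is only a sketch of the general idea, using the combinatorial Forman curvature of an unweighted edge (4 − deg(u) − deg(v)): edges with very negative curvature tend to be bridges between dense regions, and dropping them leaves the clusters as connected components. Node names and the threshold are illustrative assumptions:

```python
from collections import defaultdict

def forman_curvature(edges):
    """Combinatorial Forman curvature of each unweighted edge: 4 - deg(u) - deg(v)."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 4 - deg[u] - deg[v] for u, v in edges}

def curvature_clusters(nodes, edges, min_curv):
    """Drop low-curvature 'bridge' edges, then return the connected components."""
    kept = [e for e, c in forman_curvature(edges).items() if c >= min_curv]
    parent = {n: n for n in nodes}
    def find(x):
        # path-halving union-find
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in kept:
        parent[find(u)] = find(v)
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(groups.values(), key=lambda g: min(g))
```

On two triangles joined by a single bridge edge, the bridge has the lowest curvature and is removed first, splitting the graph into the two triangle clusters.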
Rojas-Sola, J. I.
Full Text Available In this paper, the publications from Spanish institutions listed in the journals of the Construction & Building Technology subject category of the Web of Science database for the period 1997–2008 are analyzed. The number of journals in which they were published is 35, and the number of articles was 760 (Article or Review). A bibliometric assessment has also been made, and we propose two new parameters: Weighted Impact Factor and Relative Impact Factor; the number of citations and the number of documents at the institutional level are also included. Among the institutions with the greatest scientific production is, as expected, the Eduardo Torroja Institute of Construction Science (CSIC), while taking into account the Weighted Impact Factor, the University of Vigo ranks first. On the other hand, only two journals, Cement and Concrete Materials and Materiales de Construcción, account for 45.26% of the Spanish scientific production published in the Construction & Building Technology category, with 172 papers each. Regarding international cooperation, collaborating countries include England, Mexico, the United States, Italy, Argentina and France.
Schneeberg, Alexander; Ehricht, Ralf; Slickers, Peter; Baier, Vico; Neubauer, Heinrich; Zimmermann, Stefan; Rabold, Denise; Lübke-Becker, Antina; Seyboldt, Christian
This study presents a DNA microarray-based assay for fast and simple PCR ribotyping of Clostridium difficile strains. Hybridization probes were designed to query the modularly structured intergenic spacer region (ISR), which is also the template for conventional and PCR ribotyping with subsequent capillary gel electrophoresis (seq-PCR) ribotyping. The probes were derived from sequences available in GenBank as well as from theoretical ISR module combinations. A database of reference hybridization patterns was set up from a collection of 142 well-characterized C. difficile isolates representing 48 seq-PCR ribotypes. The reference hybridization patterns calculated by the arithmetic mean were compared using a similarity matrix analysis. The 48 investigated seq-PCR ribotypes revealed 27 array profiles that were clearly distinguishable. The most frequent human-pathogenic ribotypes 001, 014/020, 027, and 078/126 were discriminated by the microarray. C. difficile strains related to 078/126 (033, 045/FLI01, 078, 126, 126/FLI01, 413, 413/FLI01, 598, 620, 652, and 660) and 014/020 (014, 020, and 449) showed similar hybridization patterns, confirming their genetic relatedness, which was previously reported. A panel of 50 C. difficile field isolates was tested by seq-PCR ribotyping and the DNA microarray-based assay in parallel. Taking into account that the current version of the microarray does not discriminate some closely related seq-PCR ribotypes, all isolates were typed correctly. Moreover, seq-PCR ribotypes without reference profiles available in the database (ribotype 009 and 5 new types) were correctly recognized as new ribotypes, confirming the performance and expansion potential of the microarray. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
At the beginning of the IAEA Extrabudgetary Programme on the safety of WWER reactors a great number of findings and recommendations (safety items) were collected as a result of design review and safety review missions of the WWER-440/230 type reactors. On the basis of these findings a technical database containing more than 1300 records was established to support the consolidation of the information obtained and to help in identification of safety issues. After the scope of the WWER extrabudgetary programme was extended similar data sets were prepared for the WWER-440/213, WWER-1000 and RBMK nuclear power plants. This publication describes the structure of the databases on safety issues of WWER and RBMK NPPs, the information sources used in the databases and interrogation capabilities for users to obtain the necessary information. 14 refs, 9 figs, 5 tabs
Full Text Available Abstract Background The generation of large amounts of microarray data presents challenges for data collection, annotation, exchange and analysis. Although there are now widely accepted formats, minimum standards for data content and ontologies for microarray data, only a few groups are using them together to build and populate large-scale databases. Structured environments for data management are crucial for making full use of these data. Description The MiMiR database provides a comprehensive infrastructure for microarray data annotation, storage and exchange and is based on the MAGE format. MiMiR is MIAME-supportive, customised for use with data generated on the Affymetrix platform and includes a tool for data annotation using ontologies. Detailed information on the experiment, methods, reagents and signal intensity data can be captured in a systematic format. Reports screens permit the user to query the database, to view annotation on individual experiments and provide summary statistics. MiMiR has tools for automatic upload of the data from the microarray scanner and export to databases using MAGE-ML. Conclusion MiMiR facilitates microarray data management, annotation and exchange, in line with international guidelines. The database is valuable for underpinning research activities and promotes a systematic approach to data handling. Copies of MiMiR are freely available to academic groups under licence.
Sirisinha, Stitaya; Koontongkaew, Sittichai; Phantumvanit, Prathip; Wittayawuttikul, Ruchareka
This communication analyzed research publications in dentistry in the Institute of Scientific Information Web of Science databases of 10 dental faculties in the Association of South-East Asian Nations (ASEAN) from 2000 to 2009. The term used for the "all-document types" search was "Faculty of Dentistry/College of Dentistry." Abstracts presented at regional meetings were also included in the analysis. The Times Higher Education System QS World University Rankings showed that universities in the region fare poorly in world university rankings. Only the National University of Singapore and Nanyang Technological University appeared in the top 100 in 2009; 19 universities in the region, including Indonesia, Malaysia, the Philippines, Singapore, and Thailand, appeared in the top 500. Data from the databases showed that research publications by dental institutes in the region fall short of their Asian counterparts. Singapore and Thailand are the most active in dental research of the ASEAN countries. © 2011 Blackwell Publishing Asia Pty Ltd.
This paper reviews basics and updates of each microarray technology and serves to .... through protein microarrays. Protein microarrays also known as protein chips are nothing but grids that ... conditioned media, patient sera, plasma and urine. Clontech ... based antibody arrays) is similar to membrane-based antibody ...
Dufva, Hans Martin; Christensen, C.B.V.
DNA microarrays have changed the field of biomedical sciences over the past 10 years. For several reasons, antibody and other protein microarrays have not developed at the same rate. However, protein and antibody arrays have emerged as a powerful tool to complement DNA microarrays during the post...
Nichols, A W
To identify sports medicine-related clinical trial research articles in the PubMed MEDLINE database published between 1996 and 2005 and conduct a review and analysis of topics of research, experimental designs, journals of publication and the internationality of authorships. Sports medicine research is international in scope with improving study methodology and an evolution of topics. Structured review of articles identified in a search of a large electronic medical database. PubMed MEDLINE database. Sports medicine-related clinical research trials published between 1996 and 2005. Review and analysis of articles that meet inclusion criteria. Articles were examined for study topics, research methods, experimental subject characteristics, journal of publication, lead authors and journal countries of origin and language of publication. The search retrieved 414 articles, of which 379 (345 English language and 34 non-English language) met the inclusion criteria. The number of publications increased steadily during the study period. Randomised clinical trials were the most common study type and the "diagnosis, management and treatment of sports-related injuries and conditions" was the most popular study topic. The knee, ankle/foot and shoulder were the most frequent anatomical sites of study. Soccer players and runners were the favourite study subjects. The American Journal of Sports Medicine had the highest number of publications and shared the greatest international diversity of authorships with the British Journal of Sports Medicine. The USA, Australia, Germany and the UK produced a good number of the lead authorships. In all, 91% of articles and 88% of journals were published in English. Sports medicine-related research is internationally diverse, clinical trial publications are increasing and the sophistication of research design may be improving.
Conclusion: The number of hospice palliative care-related publications increased rapidly from 1993 to 2013 in the world and in Taiwan; however, the number of publications from Taiwan is still far below those published in several other countries. Further research is needed to identify and try to reduce the barriers to hospice palliative care research and publication in Taiwan.
Frey Jürg E
Full Text Available Abstract Background Microarrays are powerful tools for DNA-based molecular diagnostics and identification of pathogens. Most target a limited range of organisms and are based on only one or a very few genes for specific identification. Such microarrays are limited to organisms for which specific probes are available, and often have difficulty discriminating closely related taxa. We have developed an alternative broad-spectrum microarray that employs hybridisation fingerprints generated by high-density anonymous markers distributed over the entire genome for identification based on comparison to a reference database. Results A high-density microarray carrying 95,000 unique 13-mer probes was designed. Optimized methods were developed to deliver reproducible hybridisation patterns that enabled confident discrimination of bacteria at the species, subspecies, and strain levels. High correlation coefficients were achieved between replicates. A sub-selection of 12,071 probes, determined by ANOVA and class prediction analysis, enabled the discrimination of all samples in our panel. Mismatch probe hybridisation was observed but was found to have no effect on the discriminatory capacity of our system. Conclusions These results indicate the potential of our genome chip for reliable identification of a wide range of bacterial taxa at the subspecies level without laborious prior sequencing and probe design. With its high resolution capacity, our proof-of-principle chip demonstrates great potential as a tool for molecular diagnostics of broad taxonomic groups.
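The identification step described above compares a hybridisation fingerprint against a reference database of patterns. A hedged sketch of that idea, assuming fingerprints are simple intensity vectors and using Pearson correlation as the similarity measure (the study's actual scoring may differ, and the strain names are hypothetical):

```python
def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def identify(fingerprint, reference_db):
    """Return the reference strain whose stored pattern correlates best
    with the observed hybridisation fingerprint."""
    return max(reference_db, key=lambda strain: pearson(fingerprint, reference_db[strain]))
```

Correlation-based matching is robust to uniform intensity scaling between hybridisations, which is one reason it is a common choice for fingerprint comparison.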
Matthews Benjamin F
Full Text Available Abstract Background Blueberry is a member of the Ericaceae family, which also includes closely related cranberry and more distantly related rhododendron, azalea, and mountain laurel. Blueberry is a major berry crop in the United States, and one that has great nutritional and economical value. Extreme low temperatures, however, reduce crop yield and cause major losses to US farmers. A better understanding of the genes and biochemical pathways that are up- or down-regulated during cold acclimation is needed to produce blueberry cultivars with enhanced cold hardiness. To that end, the blueberry genomics database (BBGD) was developed. Along with the analysis tools and web-based query interfaces, the database serves both the broader Ericaceae research community and the blueberry research community specifically by making available ESTs and gene expression data in searchable formats and in elucidating the underlying mechanisms of cold acclimation and freeze tolerance in blueberry. Description BBGD is the world's first database for blueberry genomics. BBGD is both a sequence and gene expression database. It stores both EST and microarray data and allows scientists to correlate expression profiles with gene function. BBGD is a public online database. Presently, the main focus of the database is the identification of genes in blueberry that are significantly induced or suppressed after low temperature exposure. Conclusion By using the database, researchers have developed EST-based markers for mapping and have identified a number of "candidate" cold tolerance genes that are highly expressed in blueberry flower buds after exposure to low temperatures.
Mihaleva, V.V.; Beek, te T.A.; Zimmeren, van F.; Moco, S.I.A.; Laatikainen, R.; Niemitz, M.; Korhonen, S.P.; Driel, van M.A.; Vervoort, J.
Identification of natural compounds, especially secondary metabolites, has been hampered by the lack of easy to use and accessible reference databases. Nuclear magnetic resonance (NMR) spectroscopy is the most selective technique for identification of unknown metabolites. High quality 1H NMR (proton
Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach, where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used to train classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
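The two transformations named above can be illustrated in simplified form: a within-array rank transform makes values from different platforms comparable on an ordinal scale, and quantile discretization maps them onto a small number of equal-frequency levels. This is only a sketch; the published median-rank-score procedure involves additional steps not shown here:

```python
from bisect import bisect_right

def rank_transform(sample):
    """Replace each expression value by its 1-based rank within the array."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    for r, i in enumerate(order):
        ranks[i] = r + 1
    return ranks

def quantile_discretize(values, n_bins):
    """Map values onto n_bins discrete levels using equal-frequency cut points."""
    srt = sorted(values)
    cuts = [srt[len(srt) * k // n_bins] for k in range(1, n_bins)]
    return [bisect_right(cuts, v) for v in values]
```

After both steps, expression matrices from a cDNA study and an oligonucleotide study live on the same discrete scale and can be pooled to train a single classifier.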
Full Text Available In many cases, crucial genes show relatively slight changes between groups of samples (e.g. normal vs. disease), and many genes selected from microarray differential analysis by statistically measuring expression levels are poorly annotated and lack biological significance. In this paper, we present an innovative approach, network expansion and pathway enrichment analysis (NEPEA), for integrative microarray analysis. We assume that organized knowledge can help microarray data analysis in significant ways, and that such knowledge can be represented as molecular interaction networks or biological pathways. Based on this hypothesis, we developed the NEPEA framework using network expansion from the human annotated and predicted protein interaction (HAPPI) database and pathway enrichment from the human pathway database (HPD). We use a recently published microarray dataset (GSE24215) related to insulin resistance and type 2 diabetes (T2D) as a case study, since this study provided thorough experimental validation for both genes and pathways identified computationally from classical microarray analysis and pathway analysis. We performed our NEPEA analysis for this dataset based on the results from the classical microarray analysis to identify biologically significant genes and pathways. Our findings are not only mostly consistent with the original findings, but also received additional support from other literature.
Jahandeh, Nadia; Ranjbar, Reza; Behzadi, Payam; Behzadi, Elham
The pathotypes of uropathogenic Escherichia coli (UPEC) cause different types of urinary tract infections (UTIs). The presence of a wide range of virulence genes in UPEC enables us to design appropriate DNA microarray probes. These probes, which are used in DNA microarray technology, provide us with an accurate and rapid diagnosis and definitive treatment in association with UTIs caused by UPEC pathotypes. The main goal of this article is to introduce the UPEC virulence genes as invaluable approaches for designing DNA microarray probes. Main search engines such as Google Scholar and databases like NCBI were searched to find and study several original pieces of literature, review articles, and DNA gene sequences. In parallel with in silico studies, the experiences of the authors were helpful for selecting appropriate sources and writing this review article. There is a significant variety of virulence genes among UPEC strains. The DNA sequences of virulence genes are fabulous patterns for designing microarray probes. The location of virulence genes and their sequence lengths influence the quality of probes. The use of selected virulence genes for designing microarray probes gives us a wide range of choices from which the best probe candidates can be chosen. DNA microarray technology provides us with an accurate, rapid, cost-effective, sensitive, and specific molecular diagnostic method which is facilitated by designing microarray probes. Via these tools, we are able to have an accurate diagnosis and a definitive treatment regarding UTIs caused by UPEC pathotypes.
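As a toy illustration of probe screening of the kind discussed above (not the authors' procedure), candidate windows of a virulence-gene sequence can be filtered by GC content and the Wallace-rule melting temperature, Tm = 2·(A+T) + 4·(G+C), which is a rough estimate valid only for short oligonucleotides; all thresholds below are illustrative assumptions:

```python
def candidate_probes(seq, length=20, gc_range=(0.40, 0.60), tm_range=(50, 64)):
    """Slide a window over a gene sequence and keep windows whose GC fraction
    and Wallace-rule melting temperature fall inside the given ranges.
    Returns (start_index, probe_sequence, tm) tuples."""
    probes = []
    seq = seq.upper()
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        gc_count = window.count("G") + window.count("C")
        gc = gc_count / length
        tm = 2 * (length - gc_count) + 4 * gc_count  # Wallace rule, short oligos
        if gc_range[0] <= gc <= gc_range[1] and tm_range[0] <= tm <= tm_range[1]:
            probes.append((i, window, tm))
    return probes
```

Real probe design additionally checks cross-hybridization against the rest of the genome, secondary structure, and probe position within the transcript.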
Morello, Samuel A.; Ricks, Wendell R.
The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders, such as the Commercial Aviation Safety Team (CAST), have already used the database. This broader interest was the genesis for making the database publicly accessible and writing this report.
Ehrlich, Rodney; Montgomery, Alex; Akugizibwe, Paula; Gonsalves, Gregg
Wittkowski Knut M
Background: Microscopists are familiar with many blemishes that fluorescence images can have due to dust and debris, glass flaws, uneven distribution of fluids or surface coatings, etc. Microarray scans show similar artefacts, which affect the analysis, particularly when one tries to detect subtle changes. However, most blemishes are hard to find by the unaided eye, particularly in high-density oligonucleotide arrays (HDONAs). Results: We present a method that harnesses the statistical power provided by having several HDONAs available, which are obtained under similar conditions except for the experimental factor. This method "harshlights" blemishes and renders them evident. We find empirically that about 25% of our chips are blemished, and we analyze the impact of masking them on screening for differentially expressed genes. Conclusion: Experiments attempting to assess subtle expression changes should be carefully screened for blemishes on the chips. The proposed method provides investigators with a novel robust approach to improve the sensitivity of microarray analyses. By utilizing topological information to identify and mask blemishes prior to model-based analyses, the method prevents artefacts from confounding the process of background correction, normalization, and summarization.
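The masking idea can be sketched in a few lines: given replicate chips, flag probes whose intensity deviates sharply from the cross-chip consensus at the same physical location. This is a simplified stand-in (the function name and the MAD-based threshold are illustrative assumptions), not the published algorithm, which also exploits the spatial shape of blemishes.

```python
import numpy as np

def mask_blemishes(chips, z_thresh=4.0):
    """Flag probe locations on each chip whose intensity deviates
    strongly from the cross-chip consensus at the same position.
    chips: (n_chips, rows, cols) array of log intensities.
    Returns a boolean mask of the same shape (True = suspect)."""
    chips = np.asarray(chips, dtype=float)
    consensus = np.median(chips, axis=0)        # per-location consensus
    resid = chips - consensus                   # deviation of each chip
    # robust scale estimate (MAD scaled to approximate a std. dev.)
    sigma = 1.4826 * float(np.median(np.abs(resid)))
    sigma = sigma if sigma > 0 else 1.0
    return np.abs(resid) > z_thresh * sigma
```

A masked probe would then be excluded before background correction and normalization, so a single blemished chip cannot distort the consensus-based downstream analysis.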
Zebrafish (Danio rerio) is a well-recognized model for the study of vertebrate developmental genetics, yet at the same time little is known about the transcriptional events that underlie zebrafish embryogenesis. Here we have employed microarray analysis to study the temporal activity of developmentally regulated genes during zebrafish embryogenesis. Transcriptome analysis at 12 different embryonic time points covering five different developmental stages (maternal, blastula, gastrula, segmentation, and pharyngula) revealed a highly dynamic transcriptional profile. Hierarchical clustering, stage-specific clustering, and algorithms to detect onset and peak of gene expression revealed clearly demarcated transcript clusters with maximum gene activity at distinct developmental stages as well as co-regulated expression of gene groups involved in dedicated functions such as organogenesis. Our study also revealed a previously unidentified cohort of genes that are transcribed prior to the mid-blastula transition, a time point earlier than when the zygotic genome was traditionally thought to become active. Here we provide, for the first time to our knowledge, a comprehensive list of developmentally regulated zebrafish genes and their expression profiles during embryogenesis, including novel information on the temporal expression of several thousand previously uncharacterized genes. The expression data generated from this study are accessible to all interested scientists from our institute resource database (http://giscompute.gis.a-star.edu.sg/~govind/zebrafish/data_download.html.
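The onset and peak calls described above can be illustrated with a minimal rule: onset is the first time point at which a gene reaches a given fraction of its maximum expression, and peak is the time of the maximum. The threshold-based definition below is a hypothetical simplification, not the exact algorithm used in the study.

```python
def onset_and_peak(profile, times, frac=0.5):
    """Return (onset_time, peak_time) for one gene's expression profile
    over ordered time points: onset is the first time point at which
    expression reaches a fraction `frac` of its maximum; peak is the
    time point of the maximum."""
    peak_val = max(profile)
    peak_time = times[profile.index(peak_val)]
    onset_time = next(t for t, v in zip(times, profile)
                      if v >= frac * peak_val)
    return onset_time, peak_time
```

Applied across all genes, such calls let transcripts be grouped by the developmental stage at which they first become active versus the stage of maximal activity.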
Background: Medical spending on psychiatric hospitalization has been reported to impose a tremendous socio-economic burden on many developed countries with public health insurance programmes. However, there has been no in-depth study of the factors affecting psychiatric inpatient medical expenditures, or of how these factors differ across types of public health insurance programmes. In view of this, this study attempted to explore factors affecting medical expenditures for psychiatric inpatients under the two public health insurance programmes covering the entire South Korean population: National Health Insurance (NHI) and National Medical Care Aid (AID). Methods: This retrospective, cross-sectional study used a nationwide, population-based reimbursement claims dataset consisting of 1,131,346 claims of all 160,465 citizens institutionalized due to psychiatric diagnosis between January 2005 and June 2006 in South Korea. To adjust for possible correlation of patients' characteristics within the same medical institution and a non-linear structure, a Box-Cox transformed, multilevel regression analysis was performed. Results: Compared with inpatients 19 years old or younger, the medical expenditures of inpatients between 50 and 64 years old were 10% higher among NHI beneficiaries but 40% higher among AID beneficiaries. Males showed higher medical expenditures than did females. Expenditures on inpatients with schizophrenia as compared to expenditures on those with neurotic disorders were 120% higher among NHI beneficiaries but 83% higher among AID beneficiaries. Expenditures on inpatients of psychiatric hospitals were greater on average than expenditures on inpatients of general hospitals. Among AID beneficiaries, institutions owned by private groups treated inpatients with 32% higher costs than did government institutions. Among NHI beneficiaries, inpatients' medical expenditures were positively associated with the proportion of
According to the original wording of the Regulation on the register of land and buildings of 2001, in the real estate cadastre there was one attribute associated with the use of a building structure - its intended use, which was applicable until the amendment to the Regulation was introduced in 2013. Then, additional attributes were added, i.e. the type of the building according to the Classification of Fixed Assets (KST), the class of the building according to the Polish Classification of Types of Constructions (PKOB) and, at the same time, the main functional use and other functions of the building remained in the Regulation as well. The record data on buildings are captured for the real estate cadastre from other data sets, for example those maintained by architectural and construction authorities. At the same time, the data contained in the cadastre, after they have been entered or changed in the database, are transferred to other registers, such as tax records, or land and mortgage court registers. This study is the result of the analysis of the laws applicable to the specific units and registers. A list of discrepancies in the attributes occurring in the different registers was prepared. The practical part of the study paid particular attention to the legal bases and procedures for entering the function of a building in the real estate cadastre, which is extremely significant, as it is the attribute determining the property tax basis.
DNA microarrays become increasingly important in the field of clinical diagnostics. These microarrays, also called DNA chips, are small solid substrates, typically having a maximum surface area of a few cm2, onto which many spots are arrayed in a pre-determined pattern. Each of these spots contains
Fangel, Jonatan Ulrik; Pedersen, H.L.; Vidal-Melgosa, S.
Almost all plant cells are surrounded by glycan-rich cell walls, which form much of the plant body and collectively are the largest source of biomass on earth. Plants use polysaccharides for support, defense, signaling, cell adhesion, and as energy storage, and many plant glycans are also important...... industrially and nutritionally. Understanding the biological roles of plant glycans and the effective exploitation of their useful properties requires a detailed understanding of their structures, occurrence, and molecular interactions. Microarray technology has revolutionized the massively high...... for plant research and can be used to map glycan populations across large numbers of samples to screen antibodies, carbohydrate binding proteins, and carbohydrate binding modules and to investigate enzyme activities....
de Koning, Dirk-Jan; Jaffrézic, Florence; Lund, Mogens Sandø
Microarray analyses have become an important tool in animal genomics. While their use is becoming widespread, there is still a lot of ongoing research regarding the analysis of microarray data. In the context of a European Network of Excellence, 31 researchers representing 14 research groups from...... 10 countries performed and discussed the statistical analyses of real and simulated 2-colour microarray data that were distributed among participants. The real data consisted of 48 microarrays from a disease challenge experiment in dairy cattle, while the simulated data consisted of 10 microarrays...... statistical weights, to omitting a large number of spots or omitting entire slides. Surprisingly, these very different approaches gave quite similar results when applied to the simulated data, although not all participating groups analysed both real and simulated data. The workshop was very successful...
In the last decade microsatellites have become one of the most useful genetic markers used in a large number of organisms due to their abundance and high level of polymorphism. Microsatellites have been used for individual identification, paternity tests, forensic studies and population genetics. Data on microsatellite abundance come preferentially from microsatellite-enriched libraries and DNA sequence databases. We have conducted a search in GenBank of more than 16,000 Schistosoma mansoni ESTs and 42,000 BAC sequences. In addition, we obtained 300 sequences from CA and AT microsatellite-enriched genomic libraries. The sequences were searched for simple repeats using the RepeatMasker software. Of 16,022 ESTs, we detected 481 (3%) sequences that contained 622 microsatellites (434 perfect, 164 imperfect and 24 compound). Of the 481 ESTs, 194 were grouped in 63 clusters containing 2 to 15 ESTs per cluster. Polymorphisms were observed in 16 clusters. The 287 remaining ESTs were orphan sequences. Of the 42,017 BAC end sequences, 1,598 (3.8%) contained microsatellites (2,335 perfect, 287 imperfect and 79 compound). Of the 1,598 BAC end sequences, 80 were grouped into 17 clusters containing 3 to 17 BAC end sequences per cluster. Microsatellites were present in 67 out of 300 sequences from the microsatellite-enriched libraries (55 perfect, 38 imperfect and 15 compound). From all of the observed loci, 55 were selected for having the longest perfect repeats and flanking regions that allowed the design of primers for PCR amplification. Additionally we describe two new polymorphic microsatellite loci.
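A perfect-repeat scan of the kind performed here can be sketched with a regular expression. This is a simplified stand-in for RepeatMasker (which also detects imperfect and compound repeats); the unit sizes and minimum copy number are illustrative parameters.

```python
import re

def find_perfect_microsatellites(seq, unit_sizes=(2, 3), min_repeats=5):
    """Scan a DNA sequence for perfect tandem repeats (microsatellites).
    Returns (start, unit, copies) tuples for each hit."""
    hits = []
    for k in unit_sizes:
        # a k-mer followed by at least (min_repeats - 1) exact copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if len(set(unit)) > 1:              # skip homopolymer runs
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits
```

Flanking sequence around each reported start position could then be inspected for primer design, as was done for the 55 selected loci.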
Koia, Jonni H; Moyle, Richard L; Botella, Jose R
Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit
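The 1.5-fold cutoff between the two ripening stages reduces to a simple ratio filter. The sketch below is hypothetical (names and the ratio rule are illustrative; the study's normalization and statistical filtering may differ):

```python
def differentially_expressed(green, yellow, fold=1.5):
    """Call cDNAs whose expression differs by at least `fold` between
    the mature-green and mature-yellow stages.
    green, yellow: dicts mapping cDNA id -> mean normalized intensity.
    Returns {id: 'up' | 'down'} for cDNAs passing the cutoff."""
    calls = {}
    for cdna in green:
        ratio = yellow[cdna] / green[cdna]
        if ratio >= fold:
            calls[cdna] = "up"            # higher in mature yellow
        elif ratio <= 1.0 / fold:
            calls[cdna] = "down"          # lower in mature yellow
    return calls
```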
Guz, A. N.; Rushchitsky, J. J.
The paper performs a citation analysis of publications of mechanicians of the National Academy of Sciences of Ukraine (NASU) based on information tools developed by the Thomson Reuters Institute for Scientific Information. Two groups of mechanicians are considered: representatives of the S. P. Timoshenko Institute of Mechanics of the NASU (NASU members, heads of departments) and members (academicians) of the NASU Division of Mechanics. Three elements of the Citation Report (Results Found, Citation Index (Sum of the Times Cited), h-index) are presented for each scientist. This paper may be considered a follow-up on the papers [6-11] published by Prikladnaya Mekhanika (International Applied Mechanics) in 2005-2009.
Bingle, Lynne; Fonseca, Felipe P; Farthing, Paula M
Tissue microarrays were first constructed in the 1980s but were used by only a limited number of researchers for a considerable period of time. In the last 10 years there has been a dramatic increase in the number of publications describing the successful use of tissue microarrays in studies aimed at discovering and validating biomarkers. This, along with the increased availability of both manual and automated microarray builders on the market, has encouraged even greater use of this novel and powerful tool. This chapter describes the basic techniques required to build a tissue microarray using a manual method in order that the theory behind the practical steps can be fully explained. Guidance is given to ensure potential disadvantages of the technique are fully considered.
Jérôme, Marc; Martinsohn, Jann Thorsten; Ortega, Delphine; Carreau, Philippe; Verrez-Bagnis, Véronique; Mouchel, Olivier
Traceability in the fish food sector plays an increasingly important role for consumer protection and confidence building. This is reflected by the introduction of legislation and rules covering traceability on national and international levels. Although traceability through labeling is well established and supported by respective regulations, monitoring and enforcement of these rules are still hampered by the lack of efficient diagnostic tools. We describe protocols using a direct sequencing method based on 212-274-bp diagnostic sequences derived from species-specific mitochondria DNA cytochrome b, 16S rRNA, and cytochrome oxidase subunit I sequences which can efficiently be applied to unambiguously determine even closely related fish species in processed food products labeled "anchovy". Traceability of anchovy-labeled products is supported by the public online database AnchovyID ( http://anchovyid.jrc.ec.europa.eu), which provided data obtained during our study and tools for analytical purposes.
Background: Microarray technologies have become common tools in biological research. As a result, a need for effective computational methods for data analysis has emerged. Numerous different algorithms have been proposed for analyzing the data. However, an objective evaluation of the proposed algorithms is not possible due to the lack of biological ground truth information. To overcome this fundamental problem, the use of simulated microarray data for algorithm validation has been proposed. Results: We present a microarray simulation model which can be used to validate different kinds of data analysis algorithms. The proposed model is unique in the sense that it includes all the steps that affect the quality of real microarray data. These steps include the simulation of biological ground truth data, applying biological and measurement technology specific error models, and finally simulating the microarray slide manufacturing and hybridization. After all these steps are taken into account, the simulated data has realistic biological and statistical characteristics. The applicability of the proposed model is demonstrated by several examples. Conclusion: The proposed microarray simulation model is modular and can be used in different kinds of applications. It includes several error models that have been proposed earlier and it can be used with different types of input data. The model can be used to simulate both spotted two-channel and oligonucleotide based single-channel microarrays. All this makes the model a valuable tool, for example, in validation of data analysis algorithms.
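A stripped-down version of such a simulator for a spotted two-channel array might look as follows. The parameter choices and the additive-noise error model are illustrative assumptions, far simpler than the error models in the actual package; the point is that the ground truth is known, so any analysis algorithm can be scored against it.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_two_channel_array(n_genes=1000, n_diff=50, sigma=0.25):
    """Simulate a spotted two-channel microarray: a ground-truth vector
    of log2 ratios (mostly zero, with `n_diff` truly changed genes) and
    the observed log ratios after adding per-channel measurement noise."""
    truth = np.zeros(n_genes)
    changed = rng.choice(n_genes, size=n_diff, replace=False)
    truth[changed] = rng.choice([-2.0, 2.0], size=n_diff)       # true log2 fold changes
    base = rng.normal(10.0, 1.0, n_genes)                       # baseline log2 abundance
    red = base + truth / 2 + rng.normal(0.0, sigma, n_genes)    # channel 1 signal
    green = base - truth / 2 + rng.normal(0.0, sigma, n_genes)  # channel 2 signal
    observed = red - green                                      # measured log ratio
    return truth, observed
```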
The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use information is compiled from multiple sources while product information is gathered from publicly available Material Safety Data Sheets (MSDS). EPA researchers are evaluating the possibility of expanding the database with additional product and use information.
Choe, Jae Gol; Shin, Kyung Ho; Lee, Min Soo; Kim, Meyoung Kon [Korea University Medical School, Seoul (Korea, Republic of)]
Microarray technology allows the simultaneous analysis of gene expression patterns of thousands of genes, in a systematic fashion, under a similar set of experimental conditions, thus making the data highly comparable. In some cases arrays are used simply as a primary screen leading to downstream molecular characterization of individual gene candidates. In other cases, the goal of expression profiling is to begin to identify complex regulatory networks underlying developmental processes and disease states. Microarrays were originally used with cell lines or other simple model systems. More recently, microarrays have been used in the analysis of more complex biological tissues including neural systems and the brain. The application of cDNA arrays in neuropsychiatry has lagged behind other fields for a number of reasons. These include a requirement for a large amount of input probe RNA in fluorescent-glass based array systems and the cellular complexity introduced by multicellular brain and neural tissues. An additional factor that impacts the general use of microarrays in neuropsychiatry is the lack of availability of sequenced clone sets from model systems. While human cDNA clones have been widely available, high-quality rat, mouse, and Drosophila clone sets, among others, are just becoming widely available. A final factor in the application of cDNA microarrays in neuropsychiatry is the cost of commercial arrays. As academic microarray facilities become more commonplace, custom-made arrays will become more widely available at a lower cost, allowing more widespread applications. In summary, microarray technology is rapidly having an impact on many areas of biomedical research. Radioisotope-nylon based microarrays offer alternatives that may in some cases be more sensitive, flexible, inexpensive, and universal as compared to other array formats, such as fluorescent-glass arrays. In some situations of limited RNA or exotic species, radioactive membrane microarrays may be the most
National Oceanic and Atmospheric Administration, Department of Commerce — NGDC maintains a database of over 1,500 volcano locations obtained from the Smithsonian Institution Global Volcanism Program, Volcanoes of the World publication. The...
González-Alcaide, Gregorio; Castelló-Cogollos, Lourdes; Castellano-Gómez, Miguel; Agullo-Calatayud, Víctor; Aleixandre-Benavent, Rafael; Alvarez, Francisco Javier; Valderrama-Zurián, Juan Carlos
The research of alcohol consumption-related problems is a multidisciplinary field. The aim of this study is to analyze the worldwide scientific production in the area of alcohol-drinking and alcohol-related problems from 2005 to 2009. A MEDLINE and Scopus search on alcohol (alcohol-drinking and alcohol-related problems) published from 2005 to 2009 was carried out. Using bibliometric indicators, the distribution of the publications was determined within the journals that publish said articles, specialty of the journal (broad subject terms), article type, language of the publication, and country where the journal is published. Also, authorship characteristics were assessed (collaboration index and number of authors who have published more than 9 documents). The existing research groups were also determined. About 24,100 documents on alcohol, published in 3,862 journals, and authored by 69,640 authors were retrieved from MEDLINE and Scopus between the years 2005 and 2009. The collaboration index of the articles was 4.83 ± 3.7. The number of consolidated research groups in the field was identified as 383, with 1,933 authors. Documents on alcohol were published mainly in journals covering the field of "Substance-Related Disorders," 23.18%, followed by "Medicine," 8.7%, "Psychiatry," 6.17%, and "Gastroenterology," 5.25%. Research on alcohol is a consolidated field, with an average of 4,820 documents published each year between 2005 and 2009 in MEDLINE and Scopus. Alcohol-related publications have a marked multidisciplinary nature. Collaboration was common among alcohol researchers. There is an underrepresentation of alcohol-related publications in languages other than English and from developing countries, in MEDLINE and Scopus databases. Copyright © 2012 by the Research Society on Alcoholism.
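The two headline indicators reported above, the collaboration index (mean number of authors per document) and the per-category shares of documents, reduce to a few lines. The record values in the example are made up for illustration:

```python
from collections import Counter

def bibliometric_summary(records):
    """records: list of (journal_category, n_authors) pairs, one per
    document. Returns (collaboration_index, category_shares), where the
    collaboration index is the mean number of authors per document and
    shares are percentages of documents per journal category."""
    n = len(records)
    collab = sum(authors for _, authors in records) / n
    counts = Counter(category for category, _ in records)
    shares = {cat: round(100.0 * c / n, 2) for cat, c in counts.items()}
    return collab, shares
```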
Takeuchi, Ichiro; Nakagawa, Masao; Seto, Masao
In many microarray studies, gene set selection is an important preliminary step for subsequent main tasks such as tumor classification, cancer subtype identification, etc. In this paper, we investigate the possibility of using metric learning as an alternative to gene set selection. We develop a simple metric learning algorithm aiming to use it for microarray data analysis. Exploiting a property of the algorithm, we introduce a novel approach for extending the metric learning to be adaptive. We apply the algorithm to previously studied microarray data on malignant lymphoma subtype identification.
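One simple way to realize "metric learning instead of gene selection" is to learn a diagonal metric that up-weights discriminative genes, so that distances emphasize informative dimensions without discarding any gene outright. The sketch below is an illustrative scheme, not the authors' algorithm: each gene is weighted by its between-class separation over its within-class spread.

```python
import numpy as np

def diagonal_metric_weights(X, y, eps=1e-9):
    """Learn a diagonal metric from labelled expression data: weight
    each gene by its between-class separation over its within-class
    spread. X: (n_samples, n_genes); y: class labels."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    between = centroids.var(axis=0)       # class separation per gene
    within = np.mean([X[y == c].var(axis=0) for c in classes], axis=0)
    return between / (within + eps)

def metric_distance(a, b, w):
    """Weighted Euclidean distance under the learned diagonal metric."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum(w * d * d)))
```

A nearest-neighbour subtype classifier can then use `metric_distance` directly, and refitting the weights on a local neighbourhood is one way to make the metric adaptive.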
Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja
Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators have led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identification of novel or common nsSNVs that can be prioritized in validation studies. Database URL: BioMuta: http
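The core lookup in such a workflow, intersecting variants called from NGS reads with curated functional-site annotations, is conceptually a join on (gene, position). The gene names and annotations below are hypothetical examples, not BioMuta records:

```python
def flag_functional_nssnvs(observed_variants, curated_sites):
    """Intersect nsSNVs called from NGS reads with curated
    functional-site annotations.
    observed_variants: {(gene, position): amino_acid_change}
    curated_sites: {(gene, position): site_annotation}
    Returns the observed variants that hit annotated sites."""
    return {site: (change, curated_sites[site])
            for site, change in observed_variants.items()
            if site in curated_sites}
```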
blood glucose > 16.7 mmol/L were used as the model group and treated with Dendrobium mixture (DEN). Keywords: Diabetes, Gene expression, Dendrobium mixture, Microarray testing.
Anguita, Alberto; Martin, Luis; Garcia-Remesal, Miguel; Maojo, Victor
This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
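Conceptually, answering a SPARQL basic graph pattern amounts to matching triple patterns, with variables as wildcards, against the RDF graph. A tiny self-contained illustration follows; the subjects and predicate names are invented placeholders, not the actual MAGE-OM vocabulary that RDFBuilder exposes:

```python
# In-memory triples (subject, predicate, object), standing in for an
# RDF graph derived from microarray experiment metadata.
TRIPLES = [
    ("exp:E-1", "mage:hasBioAssay", "assay:A-1"),
    ("assay:A-1", "mage:measures", "gene:TP53"),
    ("assay:A-1", "mage:value", "7.2"),
    ("exp:E-2", "mage:hasBioAssay", "assay:B-1"),
]

def match(pattern, triples=TRIPLES):
    """Match one triple pattern against the graph, with None acting as
    a wildcard, like a single SPARQL basic graph pattern."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]
```

A real SPARQL engine joins the bindings of several such patterns; caching those intermediate bindings is the kind of reuse the paper describes for speeding up repeated queries.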
Albayrak, Ozgür; Föcker, Manuel; Wibker, Katrin; Hebebrand, Johannes
We aimed to determine the quantitative scientific publication output of child and adolescent psychiatric/psychological affiliations during 2005-2010 by country based on both "PubMed" and "Scopus", and performed a bibliometric qualitative evaluation for 2009 using "PubMed". We performed our search by affiliation related to child and adolescent psychiatric/psychological institutions using "PubMed". For the quantitative analysis for 2005-2010, we counted the number of abstracts. For the qualitative analysis for 2009, we derived the impact factor of each abstract's journal from "Journal Citation Reports". We related total impact factor scores to the gross domestic product (GDP) and population size of each country. Additionally, we used "Scopus" to determine the number of abstracts for each country that was identified via "PubMed" for 2009 and compared the ranking of countries between the two databases. 61% of the publications between 2005 and 2010 originated from European countries and 26% from the USA. After adjustment for GDP and population size, the ranking positions changed in favor of smaller European countries with a population size of less than 20 million inhabitants. The ranking of countries for the count of articles in 2009 as derived from "Scopus" was similar to that identified via the "PubMed" search. The performed search revealed only minor differences between "Scopus" and "PubMed" related to the ranking of countries. Our data indicate a sharp difference between countries with a high versus low GDP with regard to scientific publication output in child and adolescent psychiatry/psychology.
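The GDP adjustment that reshuffles the ranking can be sketched as a single normalization step. The figures in the example are invented for illustration, not the study's data:

```python
def gdp_adjusted_ranking(output):
    """output: {country: (total_impact_factor, gdp_billion_usd)}.
    Rank countries by impact-factor score per unit of GDP - the kind
    of normalization that favours smaller productive countries."""
    per_gdp = {country: score / gdp
               for country, (score, gdp) in output.items()}
    return sorted(per_gdp, key=per_gdp.get, reverse=True)
```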
Bell, D A
Relational Databases explores the major advances in relational databases and provides a balanced analysis of the state of the art in the field. Topics covered include capture and analysis of data placement requirements; distributed relational database systems; data dependency manipulation in database schemata; and relational database support for computer graphics and computer-aided design. This book is divided into three sections and begins with an overview of the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration. The
Tissue microarrays are commonly used in modern pathology for cancer tissue evaluation, as the technique is very powerful. Tissue microarray slides are often scanned to perform computer-aided histopathological analysis of the tissue cores. To process the image, the whole virtual slide must be split into images of individual cores. The only way to match cores to specimens in the tissue microarray is through their arrangement, and determining the correct order of cores is not a trivial task because they are not labelled directly on the slide. The main aim of this study was to create a procedure capable of automatically finding and extracting cores from archival images of tissue microarrays. This software supports the work of scientists who want to perform further image processing on single cores. The proposed method is an efficient and fast procedure, working in fully automatic or semi-automatic mode. A total of 89% of punches were correctly extracted with automatic selection. With the addition of manual correction, it is possible to fully prepare the whole slide image for extraction in 2 min per tissue microarray. The proposed technique requires minimal skill and time to parse a big array of cores from a tissue microarray whole slide image into individual core images.
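The core-splitting step described above can be illustrated with a projection-profile sketch: threshold the slide image, then split wherever a whole row or column of pixels is background. This is a hedged illustration of the general idea, not the authors' algorithm; the function name and threshold are invented:

```python
import numpy as np

def find_core_boxes(img, thresh=0.5):
    """Locate tissue cores in a grayscale slide image via projection profiles."""
    mask = img > thresh                      # foreground (tissue) pixels

    def runs(profile):
        # contiguous runs of True values in a 1-D boolean profile
        on = np.flatnonzero(profile)
        if on.size == 0:
            return []
        breaks = np.flatnonzero(np.diff(on) > 1)
        starts = np.r_[on[0], on[breaks + 1]]
        ends = np.r_[on[breaks], on[-1]]
        return list(zip(starts, ends + 1))

    rows = runs(mask.any(axis=1))            # horizontal bands containing tissue
    cols = runs(mask.any(axis=0))            # vertical bands containing tissue
    # cross the bands into candidate boxes; keep those that hold tissue
    return [(r0, r1, c0, c1) for r0, r1 in rows for c0, c1 in cols
            if mask[r0:r1, c0:c1].any()]
```

Cores would then be extracted as `img[r0:r1, c0:c1]` and ordered row by row, mirroring the arrangement-based identification the abstract describes.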
Davies Jonathan J
Background: The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross-platform integrative analysis of cancer genomes. Results: We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA array and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion: In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.
Hyunseok P Kang
Background: Tissue microarrays (TMAs) are enormously useful tools for translational research, but incompatibilities in database systems between various researchers and institutions prevent the efficient sharing of data that could help realize their full potential. Resource Description Framework (RDF) provides a flexible method to represent knowledge in triples, which take the form Subject-Predicate-Object. All data resources are described using Uniform Resource Identifiers (URIs), which are global in scope. We present an OWL (Web Ontology Language) schema that expands upon the TMA data exchange specification to address this issue and assist in data sharing and integration. Methods: A minimal OWL schema was designed containing only concepts specific to TMA experiments. More general data elements were incorporated from predefined ontologies such as the NCI thesaurus. URIs were assigned using the Linked Data format. Results: We present examples of files utilizing the schema and conversion of XML data (similar to the TMA DES) to OWL. Conclusion: By utilizing predefined ontologies and globally unique identifiers, this OWL schema provides a solution to the limitations of XML, which represents concepts defined in a localized setting. This will help increase the utilization of tissue resources, facilitating collaborative translational research efforts.
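The Subject-Predicate-Object triples described above can be sketched in a few lines. This toy example (the namespaces and terms are invented, and no RDF library is assumed) shows how URI-based triples serialize to N-Triples, one of RDF's plain-text formats:

```python
# Hypothetical namespaces standing in for a TMA schema and the NCI thesaurus
EX = "http://example.org/tma/"
NCI = "http://example.org/nci#"

# Each statement is one (subject, predicate, object) triple of URIs
triples = [
    (EX + "core_42", NCI + "hasDiagnosis", NCI + "Adenocarcinoma"),
    (EX + "core_42", EX + "partOfArray", EX + "array_7"),
]

def to_ntriples(triples):
    """Serialize URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

print(to_ntriples(triples))
```

Because the URIs are global, triples produced by different institutions can be merged by simple concatenation, which is the integration advantage over locally scoped XML that the conclusion points to.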
Benjamin J. Li
Virtual reality (VR) has been proposed as a methodological tool to study the basic science of psychology and other fields. One key advantage of VR is that sharing of virtual content can lead to more robust replication and representative sampling, and a database of standardized content will help fulfill this vision. This study has two objectives. First, we seek to establish and allow public access to a database of immersive VR video clips that can act as a potential resource for studies on emotion induction using virtual reality. Second, given the large sample of participants needed to obtain reliable valence and arousal ratings for our videos, we were able to explore possible links between the head movements of observers and the emotions they feel while viewing immersive VR. To accomplish these goals, we sourced and tested 73 immersive VR clips which participants rated on valence and arousal dimensions using self-assessment manikins. We also tracked participants' rotational head movements as they watched the clips, allowing us to correlate head movements and affect. Based on past research, we predicted relationships between the standard deviation of head yaw and valence and arousal ratings. Results showed that the stimuli varied reasonably well along the dimensions of valence and arousal, with a slight underrepresentation of clips that are of negative valence and highly arousing. The standard deviation of yaw correlated positively with valence, and a significant positive relationship was found between head pitch and arousal. The immersive VR clips tested are available online as supplemental material.
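The head-movement analysis described above reduces each clip to a summary statistic (the standard deviation of yaw) and correlates it with ratings. A minimal sketch with synthetic stand-in data (the real dataset is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-clip yaw traces (degrees, 500 samples) for 10 clips,
# and mean valence ratings on a 1-9 self-assessment-manikin scale
yaw_traces = [rng.standard_normal(500) * s for s in np.linspace(2, 20, 10)]
valence = np.linspace(3, 7, 10) + rng.normal(0, 0.3, 10)

yaw_sd = np.array([t.std() for t in yaw_traces])  # movement variability per clip
r = np.corrcoef(yaw_sd, valence)[0, 1]            # Pearson r, yaw SD vs. valence
```

The same per-clip reduction (trace to statistic, then correlation across clips) applies to pitch and arousal.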
Martinez-Ramirez, Daniel; Jimenez-Shahed, Joohi; Leckman, James Frederick; Porta, Mauro; Servello, Domenico; Meng, Fan-Gang; Kuhn, Jens; Huys, Daniel; Baldermann, Juan Carlos; Foltynie, Thomas; Hariz, Marwan I; Joyce, Eileen M; Zrinzo, Ludvic; Kefalopoulou, Zinovia; Silburn, Peter; Coyne, Terry; Mogilner, Alon Y; Pourfar, Michael H; Khandhar, Suketu M; Auyeung, Man; Ostrem, Jill Louise; Visser-Vandewalle, Veerle; Welter, Marie-Laure; Mallet, Luc; Karachi, Carine; Houeto, Jean Luc; Klassen, Bryan Timothy; Ackermans, Linda; Kaido, Takanobu; Temel, Yasin; Gross, Robert E; Walker, Harrison C; Lozano, Andres M; Walter, Benjamin L; Mari, Zoltan; Anderson, William S; Changizi, Barbara Kelly; Moro, Elena; Zauber, Sarah Elizabeth; Schrock, Lauren E; Zhang, Jian-Guo; Hu, Wei; Rizer, Kyle; Monari, Erin H; Foote, Kelly D; Malaty, Irene A; Deeb, Wissam; Gunduz, Aysegul; Okun, Michael S
Collective evidence has strongly suggested that deep brain stimulation (DBS) is a promising therapy for Tourette syndrome. To assess the efficacy and safety of DBS in a multinational cohort of patients with Tourette syndrome. The prospective International Deep Brain Stimulation Database and Registry included 185 patients with medically refractory Tourette syndrome who underwent DBS implantation from January 1, 2012, to December 31, 2016, at 31 institutions in 10 countries worldwide. Patients with medically refractory symptoms received DBS implantation in the centromedian thalamic region (93 of 163 [57.1%]), the anterior globus pallidus internus (41 of 163 [25.2%]), the posterior globus pallidus internus (25 of 163 [15.3%]), and the anterior limb of the internal capsule (4 of 163 [2.5%]). Outcomes were scores on the Yale Global Tic Severity Scale and adverse events. The International Deep Brain Stimulation Database and Registry enrolled 185 patients (of 171 with available data, 37 females and 134 males; mean [SD] age at surgery, 29.1 [10.8] years [range, 13-58 years]). Symptoms of obsessive-compulsive disorder were present in 97 of 151 patients (64.2%) and 32 of 148 (21.6%) had a history of self-injurious behavior. The mean (SD) total Yale Global Tic Severity Scale score improved from 75.01 (18.36) at baseline to 41.19 (20.00) at 1 year after DBS implantation; the motor tic subscore improved from 21.00 (3.72) at baseline to 12.91 (5.78) after 1 year, and the phonic tic subscore improved from 16.82 (6.56) at baseline to 9.63 (6.99) at 1 year. DBS was associated with symptomatic improvement in patients with Tourette syndrome but also with important adverse events. A publicly available website on outcomes of DBS in patients with Tourette syndrome has been provided.
Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui
With the rapid development of DNA microarray technology, large amounts of genomic data have been generated. Classification of these microarray data is a challenging task since gene expression datasets often contain thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data, with irrelevant and redundant genes removed. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high dimensional microarray data into a lower dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator for the different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better than other state-of-the-art methods in terms of microarray data classification.
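The idea of combining a projection matrix with Laplacian-graph regularization can be sketched as follows. This is a simplified stand-in, not the authors' iterative algorithm: it builds a kNN graph over samples, takes Laplacian eigenvectors as a manifold-preserving embedding, ridge-regresses the embedding onto the genes, and scores each gene by the row norm of the resulting projection matrix:

```python
import numpy as np

def gene_scores(X, n_components=3, k=5, ridge=1e-2):
    """Score genes by how well they reconstruct a manifold embedding.

    X: samples x genes expression matrix. Returns one score per gene;
    larger scores indicate genes more useful for preserving the
    sample manifold.
    """
    n, g = X.shape
    # squared pairwise distances between samples -> symmetric kNN graph
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(D2[i])[1:k + 1]:
            W[i, j] = W[j, i] = 1.0
    L = np.diag(W.sum(axis=1)) - W                 # graph Laplacian
    # smallest nontrivial eigenvectors give a low-dimensional embedding
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:n_components + 1]
    # ridge regression X @ P ~= Y; row norms of P rank the genes
    P = np.linalg.solve(X.T @ X + ridge * np.eye(g), X.T @ Y)
    return np.linalg.norm(P, axis=1)

scores = gene_scores(np.random.default_rng(0).standard_normal((40, 200)))
top_genes = np.argsort(scores)[::-1][:20]  # indices of the 20 best genes
```

Adding a sparsity (l2,1) penalty and alternating updates, as the paper's formulation does, would push whole rows of P toward zero and make the selection explicit.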
Publicity for preschool cooperatives is described. Publicity helps produce financial support for preschool cooperatives. It may take the form of posters, brochures, newsletters, open house, newspaper coverage, and radio and television. Word of mouth and general good will in the community are the best avenues of publicity that a cooperative nursery…
Biofuel Database (Web, free access) This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.
National Oceanic and Atmospheric Administration, Department of Commerce — This excel spreadsheet is the result of merging at the port level of several of the in-house fisheries databases in combination with other demographic databases such...
The EPA DSSTox website (http://www/epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...
The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…
Background: Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size, or number of replicate chips, needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies. Results: To address this challenge, we have developed the Microarray PowerAtlas. The atlas enables estimation of statistical power by allowing investigators to appropriately plan studies by building upon previous studies that have similar experimental characteristics. Currently, there are sample sizes and power estimates based on 632 experiments from Gene Expression Omnibus (GEO). The PowerAtlas also permits investigators to upload their own pilot data and derive power and sample size estimates from these data. This resource will be updated regularly with new datasets from GEO and other databases such as The Nottingham Arabidopsis Stock Center (NASC). Conclusion: This resource provides a valuable tool for investigators who are planning efficient microarray studies and estimating required sample sizes.
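As a back-of-the-envelope complement to such a resource, the effect of testing many hypotheses on required sample size can be seen with a normal-approximation calculation under a Bonferroni correction. This closed form is illustrative only; PowerAtlas itself estimates power from pilot and archived data, not from this formula:

```python
import math
from statistics import NormalDist

def samples_per_group(effect_size, n_genes, alpha=0.05, power=0.8):
    """Per-group replicates for a two-sample comparison of one gene,
    with the significance level Bonferroni-corrected for n_genes tests."""
    nd = NormalDist()
    a = alpha / n_genes                       # adjusted per-gene alpha
    z_a = nd.inv_cdf(1 - a / 2)               # two-sided critical value
    z_b = nd.inv_cdf(power)
    return math.ceil(2 * ((z_a + z_b) / effect_size) ** 2)

samples_per_group(1.0, 1)       # single hypothesis: 16 replicates per group
samples_per_group(1.0, 10000)   # 10,000 genes: far more replicates needed
```

The steep growth with the number of genes is exactly why multiple-hypothesis-aware planning tools are needed.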
Lee, Kyoung-Mu; Kim, Ju-Han; Kang, Daehee
The methods of toxicogenomics might be classified into omics studies (e.g., genomics, proteomics, and metabolomics) and population studies focusing on risk assessment and gene-environment interaction. In omics studies, the microarray is the most popular approach. Genes falling into several categories (e.g., xenobiotic metabolism, cell cycle control, DNA repair, etc.) can be selected, up to 20,000, according to an a priori hypothesis. The appropriate type of samples and species should be selected in advance. Multiple doses and varied exposure durations are suggested to identify those genes clearly linked to toxic response. Microarray experiments can be affected by numerous nuisance variables including experimental designs, sample extraction, type of scanners, etc. The number of slides might be determined from the magnitude and variance of expression change, false-positive rate, and desired power; alternatively, samples may be pooled. Online databases on chemicals with known exposure-disease outcomes and genetic information can aid the interpretation of the normalized results. Gene function can be inferred from microarray data analyzed by bioinformatics methods such as cluster analysis. The population study often adopts a hospital-based or nested case-control design. Biases in subject selection and exposure assessment should be minimized, and confounding should be controlled for in stratified or multiple regression analysis. Optimal sample sizes depend on the statistical test for gene-to-environment or gene-to-gene interaction. The design issues addressed in this mini-review are crucial in conducting a toxicogenomics study. In addition, an integrative approach of exposure assessment, epidemiology, and clinical trials is required.
Novak, Jaroslav P; Kim, Seon-Young; Xu, Jun
BACKGROUND: DNA microarrays are a powerful technology that can provide a wealth of gene expression data for disease studies, drug development, and a wide scope of other investigations. Because of the large volume and inherent variability of DNA microarray data, many new statistical methods have...
The present study analyzed the scientific publications indexed in the Web of Science database as information management records, and visualized the structure of science in this field during 1988-2009. The research method was scientometrics. During the study period, 1120 records in the field of information management were published. These records were extracted as plain text files and stored on a PC, then analyzed with the ISI.exe and HistCite software packages. The authors' collaboration coefficient (CC) grew from zero in 1988 to 0.33 in 2009; the average collaboration coefficient was 0.22, confirming low collaboration among authors in this area. The records were published in 63 languages, among which English, at 93.8%, had the highest proportion. City University London and the University of Sheffield in England had the most publications in the information management field. Based on the number of published records, T.D. Wilson, with 13 records and 13 citations, ranked first. The average number of global citations to 112 documents was 8.78. Despite the participation of different countries in the production of documents, more than 28.9% of records were produced in the United States, and 10 countries published more than 72.4% of the records. Fifteen journals published 564 records (50.4% of total production). Finally, using the HistCite mapping software, clustered maps of authors and articles were drawn and four influential subject areas were identified.
Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.
The publicly-funded effort to sequence the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has currently produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis resources (micro-arrays or gene chips). These resources are invaluable for researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000-40,000 human 'genes.' However, the bulk of the effort still remains: to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome with genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may hold data relevant to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Strait, Robert S.; Pearson, Peter K.; Sengupta, Sailes K.
A password system comprises a set of codewords spaced apart from one another by a Hamming distance (HD) that exceeds twice the variability that can be projected for a series of biometric measurements for a particular individual, and that is less than the HD that can be encountered between two individuals. To enroll an individual, a biometric measurement is taken and exclusive-ORed with a random codeword to produce a "reference value." To verify the individual later, a biometric measurement is taken and exclusive-ORed with the reference value to reproduce the original random codeword or its approximation. If the reproduced value is not a codeword, the nearest codeword to it is found, and the bits that were corrected to produce that codeword are also toggled in the biometric measurement taken and the codeword generated during enrollment. The correction scheme can be implemented by any conventional error correction code, such as a Reed-Muller code R(m,n); in the implementation using a hand geometry device, an R(2,5) code has been used in this invention. The codeword and biometric measurement can then be used to check whether the individual is an authorized user. Conventional Diffie-Hellman public key encryption schemes and hashing procedures can then be used to secure the communications lines carrying the biometric information and to secure the database of authorized users.
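The enroll/verify flow above is essentially a fuzzy commitment scheme. The sketch below substitutes a simple repetition code for the patent's Reed-Muller R(2,5) code, and a hash check stands in for the patent's codeword-membership test; all names and parameters are illustrative, but the XOR structure matches the description:

```python
import hashlib
import secrets

R = 5  # each code bit is repeated R times; corrects up to 2 flips per block

def rep_encode(code):
    return [b for b in code for _ in range(R)]

def rep_decode(bits):
    # majority vote over each block of R bits
    return [int(sum(bits[i:i + R]) > R // 2) for i in range(0, len(bits), R)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def enroll(biometric):
    """XOR the measurement with a random codeword; store only the offset."""
    code = [secrets.randbits(1) for _ in range(len(biometric) // R)]
    reference = xor(biometric, rep_encode(code))      # the "reference value"
    tag = hashlib.sha256(bytes(code)).hexdigest()     # recognizes the codeword
    return reference, tag

def verify(biometric, reference, tag):
    """A noisy re-measurement still decodes to the enrolled codeword."""
    code = rep_decode(xor(biometric, reference))
    return hashlib.sha256(bytes(code)).hexdigest() == tag
```

Neither the biometric nor the codeword is recoverable from the stored reference alone, while small measurement noise is absorbed by the error-correcting code.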
Peach, Megan L; Zakharov, Alexey V; Liu, Ruifeng; Pugliese, Angelo; Tawa, Gregory; Wallqvist, Anders; Nicklaus, Marc C
Metabolism has been identified as a defining factor in drug development success or failure because of its impact on many aspects of drug pharmacology, including bioavailability, half-life and toxicity. In this article, we provide an outline and descriptions of the resources for metabolism-related property predictions that are currently either freely or commercially available to the public. These resources include databases with data on, and software for prediction of, several end points: metabolite formation, sites of metabolic transformation, binding to metabolizing enzymes and metabolic stability. We attempt to place each tool in historical context and describe, wherever possible, the data it was based on. For predictions of interactions with metabolizing enzymes, we show a typical set of results for a small test set of compounds. Our aim is to give a clear overview of the areas and aspects of metabolism prediction in which the currently available resources are useful and accurate, and the areas in which they are inadequate or missing entirely.
Chen, Hua; Li, Jun
Microarrays are important tools for high-throughput analysis of biomolecules, and their use for parallel screening of nucleic acid and protein profiles has become an industry standard. Limitations of microarrays include the requirement for relatively large sample volumes and long incubation times, as well as the limit of detection. In addition, traditional microarrays rely on bulky detection instrumentation, and sample amplification and labeling are laborious, which increases analysis cost and delays results. These problems keep microarray techniques from point-of-care and field applications. One strategy for overcoming these problems is to develop nanoarrays, particularly electronics-based nanoarrays. With further miniaturization, higher sensitivity, and simplified sample preparation, nanoarrays could potentially be employed for biomolecular analysis in personal healthcare and monitoring of trace pathogens. This chapter introduces the concept and advantages of nanotechnology and then describes current methods and protocols for novel nanoarrays in three areas: (1) label-free nucleic acid analysis using nanoarrays, (2) nanoarrays for protein detection by conventional optical fluorescence microscopy as well as by novel label-free methods such as atomic force microscopy, and (3) nanoarrays for enzymatic assays. These nanoarrays will have significant applications in drug discovery, medical diagnosis, genetic testing, environmental monitoring, and food safety inspection.
Zena M Hira
Microarray databases are a large source of genetic data which, upon proper analysis, could enhance our understanding of biology and medicine. Many microarray experiments have been designed to investigate the genetic mechanisms of cancer, and analytical approaches have been applied to classify different types of cancer or distinguish between cancerous and non-cancerous tissue. However, microarrays are high-dimensional datasets with high levels of noise, which causes problems when using machine learning methods. A popular approach to this problem is to search for a set of features that will simplify the structure and to some degree remove the noise from the data. The most widely used approach to feature extraction is principal component analysis (PCA), which assumes a multivariate Gaussian model of the data. More recently, non-linear methods have been investigated. Among these, manifold learning algorithms such as Isomap aim to project the data from a higher dimensional space onto a lower dimensional one. We have proposed a priori manifold learning, which finds a manifold in which a representative set of microarray data is fused with relevant data taken from the KEGG pathway database. Once the manifold has been constructed, the raw microarray data are projected onto it and clustering and classification can take place. In contrast to earlier fusion based methods, the prior knowledge from the KEGG database is not used in, and does not bias, the classification process; it merely acts as an aid to find the best space in which to search the data. In our experiments we have found that using our new manifold method gives better classification results than using either PCA or conventional Isomap.
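For reference, the PCA baseline mentioned above fits in a few lines: center the expression matrix and project onto the top right singular vectors. (Isomap and the a priori manifold construction are substantially more involved and are not sketched here.)

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)                        # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T                # low-dimensional features

X = np.random.default_rng(0).standard_normal((30, 50))  # 30 samples, 50 genes
Z = pca_project(X, n_components=3)                      # features for a classifier
```

The projected features `Z` would then feed a downstream clustering or classification step, exactly where the manifold methods substitute a non-linear embedding.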
Sadi, Al Muktafi; Wang, Dong-Yu; Youngson, Bruce J; Miller, Naomi; Boerner, Scott; Done, Susan J; Leong, Wey L
The ability of gene profiling to predict treatment response and prognosis in breast cancers has been demonstrated in many studies using DNA microarray analyses on RNA from fresh frozen tumor specimens. In certain clinical and research situations, performing such analyses on archival formalin fixed paraffin-embedded (FFPE) surgical specimens would be advantageous as large libraries of such specimens with long-term follow-up data are widely available. However, FFPE tissue processing can cause fragmentation and chemical modifications of the RNA. A number of recent technical advances have been reported to overcome these issues. Our current study evaluates whether or not the technology is ready for clinical applications. A modified RNA extraction method and a recent DNA microarray technique, cDNA-mediated annealing, selection, extension and ligation (DASL, Illumina Inc) were evaluated. The gene profiles generated from FFPE specimens were compared to those obtained from paired fresh fine needle aspiration biopsies (FNAB) of 25 breast cancers of different clinical subtypes (based on ER and Her2/neu status). Selected RNA levels were validated using RT-qPCR, and two public databases were used to demonstrate the prognostic significance of the gene profiles generated from FFPE specimens. Compared to FNAB, RNA isolated from FFPE samples was relatively more degraded, nonetheless, over 80% of the RNA samples were deemed suitable for subsequent DASL assay. Despite a higher noise level, a set of genes from FFPE specimens correlated very well with the gene profiles obtained from FNAB, and could differentiate breast cancer subtypes. Expression levels of these genes were validated using RT-qPCR. Finally, for the first time we correlated gene expression profiles from FFPE samples to survival using two independent microarray databases. Specifically, over-expression of ANLN and KIF2C, and under-expression of MAPT strongly correlated with poor outcomes in breast cancer patients. We
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that the individual samples of a microarray dataset are related by an unknown biological process (i.e., differentiation, development, cell cycle, or disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information about the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of the samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells and their cell of origin, preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage-specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages. SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the
Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja
The Mycobacteriophage Genome Database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version 1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way; a further objective is to allow browsing of existing and new genomes and to describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
Full Text Available Cohen syndrome (CS) is an uncommon autosomal recessive genetic disorder attributed to defects in the VPS13B gene at locus 8q22-q23. The characteristic phenotype consists of intellectual disability, microcephaly, facial dysmorphism, ophthalmic abnormalities, truncal obesity and hypotonia. Worldwide, around 150 cases have been published, mostly in Finnish patients. We report the case of a 3-year-old male with short stature, craniosynostosis, facial dysmorphism, hypotonia, and developmental delay. He was diagnosed with Cohen syndrome using Microarray Comparative Genomic Hybridization (aCGH), which showed a homozygous deletion of 0.153 Mb on 8q22.2 including the VPS13B gene (OMIM #216550). With this report we contribute to enlarging the epidemiological databases on an uncommon genetic disorder and illustrate the contribution of aCGH to the etiological diagnosis of patients with unexplained intellectual disability, delayed psychomotor development, language difficulties, autism and multiple congenital anomalies.
Andersen, G.L.; He, Z.; DeSantis, T.Z.; Brodie, E.L.; Zhou, J.
Microarrays have proven to be a useful and high-throughput method to provide targeted DNA sequence information for up to many thousands of specific genetic regions in a single test. A microarray consists of multiple DNA oligonucleotide probes that, under high stringency conditions, hybridize only to specific complementary nucleic acid sequences (targets). A fluorescent signal indicates the presence and, in many cases, the abundance of genetic regions of interest. In this chapter we will look at how microarrays are used in microbial ecology, especially with the recent increase in microbial community DNA sequence data. Of particular interest to microbial ecologists, phylogenetic microarrays are used for the analysis of phylotypes in a community and functional gene arrays are used for the analysis of functional genes, and, by inference, phylotypes in environmental samples. A phylogenetic microarray that has been developed by the Andersen laboratory, the PhyloChip, will be discussed as an example of a microarray that targets the known diversity within the 16S rRNA gene to determine microbial community composition. Using multiple, confirmatory probes to increase the confidence of detection and a mismatch probe for every perfect match probe to minimize the effect of cross-hybridization by non-target regions, the PhyloChip is able to simultaneously identify any of thousands of taxa present in an environmental sample. The PhyloChip is shown to reveal greater diversity within a community than rRNA gene sequencing due to the placement of the entire gene product on the microarray compared with the analysis of up to thousands of individual molecules by traditional sequencing methods. A functional gene array that has been developed by the Zhou laboratory, the GeoChip, will be discussed as an example of a microarray that dynamically identifies functional activities of multiple members within a community. The recent version of GeoChip contains more than 24,000 50mer
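The PM/MM probe-pair logic described above can be sketched in a few lines. This is an illustrative reconstruction, not the published PhyloChip scoring: the `taxon_detected` function and its ratio and fraction thresholds are assumptions chosen for the example.

```python
import numpy as np

def taxon_detected(pm, mm, ratio=1.3, frac=0.9):
    """Hedged sketch of PM/MM scoring: a probe pair is 'positive'
    when its perfect-match intensity exceeds the paired mismatch
    by a fixed ratio, and a taxon is called present when a large
    fraction of its probe pairs are positive. The ratio and
    fraction thresholds here are illustrative, not the published
    PhyloChip values."""
    pm = np.asarray(pm, float)
    mm = np.asarray(mm, float)
    positive = pm > ratio * mm  # each pair votes independently
    return bool(positive.mean() >= frac)
```

The mismatch probe acts as a per-pair control for cross-hybridization: only intensity clearly above the mismatch counts as evidence for the target taxon.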
Gaharwar, Akhilesh K.; Arpanaei, Ayyoob; Andresen, Thomas Lars
Three dimensional (3D) biomaterial microarrays hold enormous promise for regenerative medicine because of their ability to accelerate the design and fabrication of biomimetic materials. Such tissue-like biomaterials can provide an appropriate microenvironment for stimulating and controlling stem cell differentiation into tissue-specific lineages. The use of 3D biomaterial microarrays can, if optimized correctly, result in a more than 1000-fold reduction in biomaterials and cells consumption when engineering optimal materials combinations, which makes these miniaturized systems very attractive for tissue engineering and drug screening applications.
Ekelund, Charlotte Kvist; Kopp, Tine Iskov; Tabor, Ann
…trimester ultrasound scan performed at all public hospitals in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics, ultrasonic, and biochemical variables are continuously sent from the fetal medicine units' Astraia databases to the central database via … analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database, which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with the input from already collected data. The database…
Rouse Richard JD
Full Text Available Abstract Background Successful microarray experimentation requires a complex interplay between the slide chemistry, the printing pins, the nucleic acid probes and targets, and the hybridization milieu. Optimization of these parameters and a careful evaluation of emerging slide chemistries are a prerequisite to any large scale array fabrication effort. We have developed a 'microarray meter' tool which assesses the inherent variations associated with microarray measurement prior to embarking on large scale projects. Findings The microarray meter consists of nucleic acid targets (reference and dynamic range control) and probe components. Different plate designs containing identical probe material were formulated to accommodate different robotic and pin designs. We examined the variability in probe quality and quantity (as judged by the amount of DNA printed and remaining post-hybridization) using three robots equipped with capillary printing pins. Discussion The generation of microarray data with minimal variation requires consistent quality control of the (DNA) microarray manufacturing and experimental processes. Spot reproducibility is a measure primarily of the variations associated with printing. The microarray meter assesses array quality by measuring the DNA content for every feature. It provides a post-hybridization analysis of array quality by scoring probe performance using three metrics: (a) a measure of variability in the signal intensities, (b) a measure of the signal dynamic range and (c) a measure of variability of the spot morphologies.
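The three post-hybridization metrics lend themselves to a compact sketch. The `probe_metrics` function and its formulas (coefficient of variation, log2 dynamic range) are illustrative stand-ins, not the published microarray-meter definitions:

```python
import numpy as np

def probe_metrics(signal, areas):
    """Toy versions of the three spot-quality scores:
    (a) coefficient of variation of replicate signal intensities,
    (b) log2 dynamic range between brightest and dimmest replicate,
    (c) coefficient of variation of spot areas, a stand-in for
    morphology variability. Formulas are illustrative only."""
    sig = np.asarray(signal, float)
    area = np.asarray(areas, float)
    cv_signal = float(sig.std(ddof=1) / sig.mean())
    dyn_range = float(np.log2(sig.max() / sig.min()))
    cv_area = float(area.std(ddof=1) / area.mean())
    return cv_signal, dyn_range, cv_area
```

For example, replicate intensities of 100, 200, 400 and 800 with identical spot areas give a log2 dynamic range of exactly 3.0 and zero morphology variability.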
Dai, Yilin; Guo, Ling; Li, Meng; Chen, Yi-Bu
Microarray data analysis presents a significant challenge to researchers who are unable to use the powerful Bioconductor and its numerous tools due to their lack of knowledge of R language. Among the few existing software programs that offer a graphic user interface to Bioconductor packages, none have implemented a comprehensive strategy to address the accuracy and reliability issue of microarray data analysis due to the well known probe design problems associated with many widely used microarray chips. There is also a lack of tools that would expedite the functional analysis of microarray results. We present Microarray Я US, an R-based graphical user interface that implements over a dozen popular Bioconductor packages to offer researchers a streamlined workflow for routine differential microarray expression data analysis without the need to learn R language. In order to enable a more accurate analysis and interpretation of microarray data, we incorporated the latest custom probe re-definition and re-annotation for Affymetrix and Illumina chips. A versatile microarray results output utility tool was also implemented for easy and fast generation of input files for over 20 of the most widely used functional analysis software programs. Coupled with a well-designed user interface, Microarray Я US leverages cutting edge Bioconductor packages for researchers with no knowledge in R language. It also enables a more reliable and accurate microarray data analysis and expedites downstream functional analysis of microarray results.
Welch, M.J.; Welles, B.W.
Accident statistics on all modes of transportation are available as risk assessment analytical tools through several federal agencies. This paper reports on the examination of the accident databases by personal contact with the federal staff responsible for administration of the database programs. This activity, sponsored by the Department of Energy through Sandia National Laboratories, is an overview of the national accident data on highway, rail, air, and marine shipping. For each mode, the definition or reporting requirements of an accident are determined and the method of entering the accident data into the database is established. Availability of the database to others, ease of access, costs, and who to contact were prime questions to each of the database program managers. Additionally, how the agency uses the accident data was of major interest
Full Text Available Abstract Background The retina is a multi-layered sensory tissue that lines the back of the eye and acts at the interface of input light and visual perception. Its main function is to capture photons and convert them into electrical impulses that travel along the optic nerve to the brain where they are turned into images. It consists of neurons, nourishing blood vessels and different cell types, of which neural cells predominate. Defects in any of these cells can lead to a variety of retinal diseases, including age-related macular degeneration, retinitis pigmentosa, Leber congenital amaurosis and glaucoma. Recent progress in genomics and microarray technology provides extensive opportunities to examine alterations in retinal gene expression profiles during development and diseases. However, there is no specific database that deals with retinal gene expression profiling. In this context we have built RETINOBASE, a dedicated microarray database for retina. Description RETINOBASE is a microarray relational database, analysis and visualization system that allows simple yet powerful queries to retrieve information about gene expression in retina. It provides access to gene expression meta-data and offers significant insights into gene networks in retina, resulting in better hypothesis framing for biological problems that can subsequently be tested in the laboratory. Public and proprietary data are automatically analyzed with 3 distinct methods, RMA, dChip and MAS5, then clustered using 2 different K-means and 1 mixture models method. Thus, RETINOBASE provides a framework to compare these methods and to optimize the retinal data analysis. RETINOBASE has three different modules, "Gene Information", "Raw Data System Analysis" and "Fold change system Analysis" that are interconnected in a relational schema, allowing efficient retrieval and cross comparison of data. Currently, RETINOBASE contains datasets from 28 different microarray experiments performed
Ballerstedt, H.; Volkers, R.J.M.; Mars, A.E.; Hallsworth, J.E.; Santos, V.A.M.D.; Puchalka, J.; Duuren, J. van; Eggink, G.; Timmis, K.N.; Bont, J.A.M. de; Wery, J.
Pseudomonas putida KT2440 is the only fully sequenced P. putida strain. Thus, for transcriptomics and proteomics studies with other P. putida strains, the P. putida KT2440 genomic database serves as standard reference. The utility of KT2440 whole-genome, high-density oligonucleotide microarrays for
Mocellin, Simone; Rossi, Carlo Riccardo
The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.
The main aim of this master's thesis was the simultaneous detection of four selected plant viruses (Apple mosaic virus, Plum pox virus, Prunus necrotic ringspot virus and Prune dwarf virus) by microarrays. An intermediate step in the detection process was the optimization of a multiplex polymerase chain reaction (PCR).
Oct 20, 2014 … the advent of DNA microarray techniques (Lee et al. 2007) … atoms of ribose to form a bicyclic ribosyl structure … 532 nm and emission at 570 nm. The signal … …sis and validation using real-time PCR. Nucleic Acids …
Hybridization of labeled cDNA to microarrays is an intuitively simple and a vastly underestimated process. If it is not performed, optimized, and standardized with the same attention to detail as e.g., RNA amplification, information may be overlooked or even lost. Careful balancing of the amount ...
Barnard, Betsy; Sussman, Michael; BonDurant, Sandra Splinter; Nienhuis, James; Krysan, Patrick
We have developed and optimized the necessary laboratory materials to make DNA microarray technology accessible to all high school students at a fraction of both cost and data size. The primary component is a DNA chip/array that students "print" by hand and then analyze using research tools that have been adapted for classroom use. The…
Thygesen, Helene H.; Zwinderman, Aeilko H.
Background: When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include
Full Text Available International fish trade reached an import value of 62.8 billion Euro in 2006, of which 44.6% is covered by the European Union. Species identification is a key problem throughout the life cycle of fishes: from eggs and larvae to adults in fisheries research and control, as well as in processed fish products in consumer protection. This study aims to evaluate the applicability of the three mitochondrial genes 16S rRNA (16S), cytochrome b (cyt b), and cytochrome oxidase subunit I (COI) for the identification of 50 European marine fish species by combining techniques of "DNA barcoding" and microarrays. In a DNA barcoding approach, Neighbour-Joining (NJ) phylogenetic trees of 369 16S, 212 cyt b, and 447 COI sequences indicated that cyt b and COI are suitable for unambiguous identification, whereas 16S failed to discriminate closely related flatfish and gurnard species. In the course of probe design for DNA microarray development, each of the markers yielded a high number of potentially species-specific probes in silico, although many of them were rejected based on microarray hybridisation experiments. None of the markers provided probes to discriminate the sibling flatfish and gurnard species. However, since 16S probes were less negatively influenced by the "position of label" effect and showed the lowest rejection rate and the highest mean signal intensity, 16S is more suitable for DNA microarray probe design than cyt b and COI. The large portion of rejected COI probes after hybridisation experiments (>90%) renders this DNA barcoding marker rather unsuitable for this high-throughput technology. Based on these data, a DNA microarray containing 64 functional oligonucleotide probes for the identification of 30 out of the 50 fish species investigated was developed. It represents the next step towards an automated and easy-to-handle method to identify fish, ichthyoplankton, and fish products.
Database replication is widely used for fault-tolerance, scalability and performance. The failure of one database replica does not stop the system from working as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed clients, if data copies are located close to clients. Despite its advantages, replication is not a straightforward technique to apply, and
Ambler, Scott W
Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems. Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies. This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone databas...
Boltaña, Sebastian; Castellana, Barbara; Goetz, Giles; Tort, Lluis; Teles, Mariana; Mulero, Victor; Novoa, Beatriz; Figueras, Antonio; Goetz, Frederick W; Gallardo-Escarate, Cristian; Planas, Josep V; Mackenzie, Simon
This study describes the development and validation of an enriched oligonucleotide-microarray platform for Sparus aurata (SAQ) to provide a platform for transcriptomic studies in this species. A transcriptome database was constructed by assembly of gilthead sea bream sequences derived from public repositories of mRNA together with reads from a large collection of expressed sequence tags (EST) from two extensive targeted cDNA libraries characterizing mRNA transcripts regulated by both bacterial and viral challenge. The developed microarray was further validated by analysing monocyte/macrophage activation profiles after challenge with two Gram-negative bacterial pathogen-associated molecular patterns (PAMPs; lipopolysaccharide (LPS) and peptidoglycan (PGN)). Of the approximately 10,000 EST sequenced, we obtained a total of 6837 EST longer than 100 nt, with 3778 and 3059 EST obtained from the bacterial-primed and from the viral-primed cDNA libraries, respectively. Functional classification of contigs from the bacterial- and viral-primed cDNA libraries by Gene Ontology (GO) showed that the top five represented categories were equally represented in the two libraries: metabolism (approximately 24% of the total number of contigs), carrier proteins/membrane transport (approximately 15%), effectors/modulators and cell communication (approximately 11%), nucleoside, nucleotide and nucleic acid metabolism (approximately 7.5%) and intracellular transducers/signal transduction (approximately 5%). Transcriptome analyses using this enriched oligonucleotide platform identified differential shifts in the response to PGN and LPS in macrophage-like cells, highlighting responsive gene-cassettes tightly related to PAMP host recognition. As observed in other fish species, PGN is a powerful activator of the inflammatory response in S. aurata macrophage-like cells. We have developed and validated an oligonucleotide microarray (SAQ) that provides a platform enriched for the study of gene
Full Text Available This paper presents microarray BASICA: an integrated image processing tool for background adjustment, segmentation, image compression, and analysis of cDNA microarray images. BASICA uses a fast Mann-Whitney test-based algorithm to segment cDNA microarray images, and performs postprocessing to eliminate segmentation irregularities. The segmentation results, along with the foreground and background intensities obtained with the background adjustment, are then used for independent compression of the foreground and background. We introduce a new distortion measurement for cDNA microarray image compression and devise a coding scheme by modifying the embedded block coding with optimized truncation (EBCOT) algorithm (Taubman, 2000) to achieve optimal rate-distortion performance in lossy coding while still maintaining outstanding lossless compression performance. Experimental results show that the bit rate required to ensure sufficiently accurate gene expression measurement varies and depends on the quality of the cDNA microarray images. For homogeneously hybridized cDNA microarray images, BASICA is able to provide, at bit rates as low as 5 bpp, gene expression data that are 99% in agreement with those of the original 32 bpp images.
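The Mann-Whitney segmentation idea, rank-comparing candidate spot pixels against the surrounding background, can be illustrated with SciPy. This is a crude sketch of the general principle, not BASICA's fast algorithm or its postprocessing; the `spot_present` function, the circular-mask geometry and the significance threshold are all assumptions for the example.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def spot_present(window, radius, alpha=0.01):
    """Pixels inside a circular mask around the window centre are
    rank-compared against the remaining background pixels; the spot
    is called present when the inside ranks significantly higher.
    Illustrative sketch only."""
    h, w = window.shape
    yy, xx = np.mgrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    fg, bg = window[mask], window[~mask]
    # One-sided test: foreground brighter than background?
    _, p = mannwhitneyu(fg, bg, alternative="greater")
    return bool(p < alpha)
```

Because the test is rank-based, it is insensitive to outlier pixels and to the absolute intensity scale, which is one reason nonparametric tests are attractive for spot segmentation.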
National Oceanic and Atmospheric Administration, Department of Commerce — This database was established to oversee documents issued in support of fishery research activities including experimental fishing permits (EFP), letters of...
National Oceanic and Atmospheric Administration, Department of Commerce — The Snowstorm Database is a collection of over 500 snowstorms dating back to 1900 and updated operationally. Only storms having large areas of heavy snowfall (10-20...
National Oceanic and Atmospheric Administration, Department of Commerce — The dealer reporting databases contain the primary data reported by federally permitted seafood dealers in the northeast. Electronic reporting was implemented May 1,...
Kristensen, Helen Grundtvig; Stjernø, Henrik
Article about the national database for nursing research established at the Danish Institute for Health and Nursing Research (Dansk Institut for Sundheds- og Sygeplejeforskning). The aim of the database is to gather knowledge about research and development activities within nursing.
Magwene, Paul M; Lizardi, Paul; Kim, Junhyong
Accurate time series for biological processes are difficult to estimate due to problems of synchronization, temporal sampling and rate heterogeneity. Methods are needed that can utilize multi-dimensional data, such as those resulting from DNA microarray experiments, in order to reconstruct time series from unordered or poorly ordered sets of observations. We present a set of algorithms for estimating temporal orderings from unordered sets of sample elements. The techniques we describe are based on modifications of a minimum-spanning tree calculated from a weighted, undirected graph. We demonstrate the efficacy of our approach by applying these techniques to an artificial data set as well as several gene expression data sets derived from DNA microarray experiments. In addition to estimating orderings, the techniques we describe also provide useful heuristics for assessing relevant properties of sample datasets such as noise and sampling intensity, and we show how a data structure called a PQ-tree can be used to represent uncertainty in a reconstructed ordering. Academic implementations of the ordering algorithms are available as source code (in the programming language Python) on our web site, along with documentation on their use. The artificial 'jelly roll' data set upon which the algorithm was tested is also available from this web site. The publicly available gene expression data may be found at http://genome-www.stanford.edu/cellcycle/ and http://caulobacter.stanford.edu/CellCycle/.
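The minimum-spanning-tree approach to recovering a temporal ordering can be sketched as follows. This simplified illustration (Prim's MST on pairwise Euclidean distances, then the tree's longest path found by a double depth-first sweep) omits the authors' modifications and their PQ-tree representation of ordering uncertainty.

```python
import numpy as np

def mst_ordering(X):
    """Estimate a temporal ordering of samples (rows of X): build the
    minimum spanning tree of pairwise Euclidean distances, then return
    the longest simple path through the tree as the candidate order."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Prim's algorithm
    in_tree = np.zeros(n, bool)
    in_tree[0] = True
    best = D[0].copy()
    parent = np.zeros(n, dtype=int)
    best[in_tree] = np.inf
    adj = {i: [] for i in range(n)}
    for _ in range(n - 1):
        j = int(np.argmin(best))          # closest node outside the tree
        in_tree[j] = True
        adj[j].append(int(parent[j]))
        adj[int(parent[j])].append(j)
        upd = D[j] < best                 # relax distances via the new node
        parent[upd] = j
        best = np.minimum(best, D[j])
        best[in_tree] = np.inf
    # Tree diameter path via two depth-first sweeps
    def farthest(start):
        far = (start, [start])
        stack, seen = [(start, [start])], {start}
        while stack:
            v, path = stack.pop()
            if len(path) > len(far[1]):
                far = (v, path)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, path + [w]))
        return far
    a, _ = farthest(0)
    _, path = farthest(a)
    return path
```

For samples drawn from a smooth one-dimensional progression, nearest neighbours in expression space tend to be temporal neighbours, so the MST approximates a chain and its diameter path recovers the order (up to direction).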
Full Text Available Abstract Background Most microarray experiments are carried out with the purpose of identifying genes whose expression varies in relation to specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered as not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods. Results We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely, 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC also included two sub-classes, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection of the shape of such plots allowed us to identify the two subclasses within one class in 13 cases (81%). Conclusion NPRC represent a new useful tool for the analysis of microarray data.
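The distinction between AUC and ABCR is easy to demonstrate numerically: a gene with a hidden subclass can have AUC near 0.5 (apparently non-discriminating) while its ABCR stays clearly positive, because the ROC curve crosses the diagonal. A minimal sketch with an empirical ROC built from synthetic scores; the function names and score construction are illustrative, not the paper's implementation:

```python
import numpy as np

def _trapezoid(y, x):
    # Manual trapezoidal rule (keeps the sketch NumPy-version agnostic)
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

def roc_points(scores, labels):
    """Empirical ROC: sort by decreasing score and accumulate
    true/false positive rates."""
    order = np.argsort(-np.asarray(scores, float))
    lab = np.asarray(labels, int)[order]
    tpr = np.concatenate(([0.0], np.cumsum(lab) / lab.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - lab) / (1 - lab).sum()))
    return fpr, tpr

def auc_abcr(scores, labels):
    fpr, tpr = roc_points(scores, labels)
    auc = _trapezoid(tpr, fpr)
    abcr = _trapezoid(np.abs(tpr - fpr), fpr)  # area between curve and diagonal
    return auc, abcr
```

For instance, if half of one class scores above all controls and half below, the ROC rises, crosses the diagonal, and rises again: AUC works out to exactly 0.5 while ABCR is 0.25, so only ABCR flags the gene.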
Guldberg, Rikke; Brostrøm, Søren; Hansen, Jesper Kjær
INTRODUCTION AND HYPOTHESIS: The Danish Urogynaecological Database (DugaBase) is a nationwide clinical database established in 2006 to monitor, ensure and improve the quality of urogynaecological surgery. We aimed to describe its establishment and completeness and to validate selected variables. This is the first study based on data from the DugaBase. METHODS: The database completeness was calculated as a comparison between urogynaecological procedures reported to the Danish National Patient Registry and to the DugaBase. Validity was assessed for selected variables from a random sample of 200 women in the DugaBase from 1 January 2009 to 31 October 2010, using medical records as a reference. RESULTS: A total of 16,509 urogynaecological procedures were registered in the DugaBase by 31 December 2010. The database completeness has increased by calendar time, from 38.2 % in 2007 to 93.2 % in 2010 for public…
INIST is a CNRS (Centre National de la Recherche Scientifique) laboratory devoted to the processing of scientific and technical information and to the management of this information compiled in a database. A reorientation of the database content was proposed in 1994 to increase the transfer of research towards enterprises and services, to develop more automated access to the information, and to create a quality assurance plan. The catalog of publications comprises 5800 periodical titles (1300 for fundamental research and 4500 for applied research). A multi-thematic science and technology database will be created in 1995 for the retrieval of applied and technical information. "Grey literature" (reports, theses, proceedings...) and human and social sciences data will be added to the base using information selected from the existing GRISELI and Francis databases. Strong modifications are also planned in the thematic coverage of Earth sciences and will considerably reduce the geological information content.
The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by John Wiley & Sons, Inc.
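As a minimal illustration of the relational option discussed in the unit, Python's standard-library sqlite3 module is enough to move experiment metadata out of files and directories and into an indexed, queryable store. The schema and rows here are hypothetical examples, not from the unit itself:

```python
import sqlite3

# An embedded relational engine: no server process, one file (or memory),
# full SQL. Suitable once metadata outgrows ad-hoc files and directories.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE experiment (
                    id INTEGER PRIMARY KEY,
                    platform TEXT NOT NULL,
                    tissue TEXT NOT NULL)""")
conn.executemany(
    "INSERT INTO experiment (platform, tissue) VALUES (?, ?)",
    [("Affymetrix", "retina"), ("Illumina", "retina"),
     ("Affymetrix", "liver")])
# Declarative querying replaces hand-written directory scans
rows = conn.execute(
    "SELECT platform, COUNT(*) FROM experiment "
    "WHERE tissue = ? GROUP BY platform ORDER BY platform",
    ("retina",)).fetchall()
print(rows)  # -> [('Affymetrix', 1), ('Illumina', 1)]
```

The trade-off sketched in the unit applies directly: a flat file is simpler to write but every query is a full scan, whereas the relational schema buys indexed lookups and joins at the cost of an upfront schema design.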
Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong
The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.
Full Text Available Abstract Background Tissue MicroArray technique is becoming increasingly important in pathology for the validation of experimental data from transcriptomic analysis. This approach produces many images which need to be properly managed, if possible with an infrastructure able to support tissue sharing between institutes. Moreover, the available frameworks oriented to Tissue MicroArray provide good storage for clinical patient, sample treatment and block construction information, but their utility is limited by the lack of data integration with biomolecular information. Results In this work we propose a Tissue MicroArray web oriented system to support researchers in managing bio-samples and, through the use of ontologies, enables tissue sharing aimed at the design of Tissue MicroArray experiments and results evaluation. Indeed, our system provides ontological description both for pre-analysis tissue images and for post-process analysis image results, which is crucial for information exchange. Moreover, working on well-defined terms it is then possible to query web resources for literature articles to integrate both pathology and bioinformatics data. Conclusions Using this system, users associate an ontology-based description to each image uploaded into the database and also integrate results with the ontological description of biosequences identified in every tissue. Moreover, it is possible to integrate the ontological description provided by the user with a full compliant gene ontology definition, enabling statistical studies about correlation between the analyzed pathology and the most commonly related biological processes.
Vanschoren, Joaquin; Blockeel, Hendrik
Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queryable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab, or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.
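The core of an experiment database as described above is a schema of algorithm executions that can be queried in aggregate. A minimal sketch with an in-memory SQLite table is shown below; the table layout, algorithm names, and accuracy figures are invented for illustration and are not from any actual repository.

```python
import sqlite3

# Minimal experiment database: one table of algorithm runs,
# queryable for aggregate comparisons across many prior studies.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE runs (
    algorithm TEXT, dataset TEXT, param_k INTEGER, accuracy REAL)""")
runs = [("kNN", "iris", 1, 0.95), ("kNN", "iris", 5, 0.97),
        ("tree", "iris", 0, 0.94), ("kNN", "wine", 5, 0.96)]
con.executemany("INSERT INTO runs VALUES (?,?,?,?)", runs)

# Query the combined results of prior experiments directly:
best = con.execute("""SELECT algorithm, AVG(accuracy) AS mean_acc
                      FROM runs GROUP BY algorithm
                      ORDER BY mean_acc DESC""").fetchall()
print(best)
```

The point of the design is that such queries replace re-running experiments: once runs are stored with principled descriptions, a single SQL statement answers questions that would otherwise require a new study.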
During the past few years, innovations in DNA sequencing technology have led to an explosion in available DNA sequence information. This has revolutionized biological research and promoted the development of high-throughput analysis methods that can take advantage of the vast amount of sequence data. For this, the DNA microarray technology has gained enormous popularity due to its ability to measure the presence or the activity of thousands of genes simultaneously. Microarrays for high-throughput data analyses are not limited to a few organisms but may be applied to everything from bacteria … at identifying the exact breakpoints where DNA has been gained or lost. In this thesis, three popular methods are compared and a realistic simulation model is presented for generating artificial data with known breakpoints and known DNA copy number. By using simulated data, we obtain a realistic evaluation …
Satish Balasaheb Nimse
Full Text Available The highly programmable positioning of molecules (biomolecules, nanoparticles, nanobeads, nanocomposite materials) on surfaces has potential applications in the fields of biosensors, biomolecular electronics, and nanodevices. However, conventional techniques, including self-assembled monolayers, fail to position the molecules on the nanometer scale required to produce highly organized monolayers on the surface. The present article elaborates on different techniques for the immobilization of biomolecules on the surface to produce microarrays, and on their diagnostic applications. The advantages and drawbacks of the various methods are compared. This article also sheds light on the applications of the different technologies for the detection and discrimination of viral/bacterial genotypes and the detection of biomarkers. A brief survey with 115 references covering the last 10 years on the biological applications of microarrays in various fields is also provided.
Schlecht, Ulrich; Primig, Michael
Gametogenesis is a key developmental process that involves complex transcriptional regulation of numerous genes including many that are conserved between unicellular eukaryotes and mammals. Recent expression-profiling experiments using microarrays have provided insight into the co-ordinated transcription of several hundred genes during mitotic growth and meiotic development in budding and fission yeast. Furthermore, microarray-based studies have identified numerous loci that are regulated during the cell cycle or expressed in a germ-cell specific manner in eukaryotic model systems like Caenorhabditis elegans, Mus musculus as well as Homo sapiens. The unprecedented amount of information produced by post-genome biology has spawned novel approaches to organizing biological knowledge using currently available information technology. This review outlines experiments that contribute to an emerging comprehensive picture of the molecular machinery governing sexual reproduction in eukaryotes.
Kierzek, Elzbieta; Kierzek, Ryszard; Turner, Douglas H; Catrina, Irina E
Determining RNA secondary structure is important for understanding structure-function relationships and identifying potential drug targets. This paper reports the use of microarrays with heptamer 2'-O-methyl oligoribonucleotides to probe the secondary structure of an RNA and thereby improve the prediction of that secondary structure. When experimental constraints from hybridization results are added to a free-energy minimization algorithm, the prediction of the secondary structure of Escherichia coli 5S rRNA improves from 27 to 92% of the known canonical base pairs. Optimization of buffer conditions for hybridization and application of 2'-O-methyl-2-thiouridine to enhance binding and improve discrimination between AU and GU pairs are also described. The results suggest that probing RNA with oligonucleotide microarrays can facilitate determination of secondary structure.
Gogalic, S.; Hageneder, S.; Ctortecka, C.; Bauch, M.; Khan, I.; Preininger, Claudia; Sauer, U.; Dostalek, J.
Plasmonic amplification of the fluorescence signal in bioassays with a microarray detection format is reported. A crossed relief diffraction grating was designed to couple an excitation laser beam to surface plasmons at the wavelength overlapping with the absorption and emission bands of the fluorophore Dy647 that was used as a label. The surface of the periodically corrugated sensor chip was coated with a surface plasmon-supporting gold layer and a thin SU8 polymer film carrying epoxy groups. These groups were employed for the covalent immobilization of capture antibodies at arrays of spots. The plasmonic amplification of the fluorescence signal on the developed microarray chip was tested by using an interleukin 8 sandwich immunoassay. The readout was performed ex situ after drying the chip by using a commercial scanner with a high numerical aperture collecting lens. The obtained results reveal an enhancement of the fluorescence signal by a factor of 5 when compared to a regular glass chip.
Barrios Mello, Rafael; Regis Silva, Maria Regina; Seixas Alves, Maria Teresa; Evison, Martin; Guimarães, Marco Aurélio; Francisco, Rafaella Arrabaça; Dias Astolphi, Rafael; Miazato Iwamura, Edna Sadayo
Taphonomic processes affecting bone post mortem are important in forensic, archaeological and palaeontological investigations. In this study, the application of tissue microarray (TMA) analysis to a sample of femoral bone specimens from 20 exhumed individuals of known period of burial and age at death is described. TMA allows multiplexing of subsamples, permitting standardized comparative analysis of adjacent sections in 3-D and of representative cross-sections of a large number of specimens....
Phelan, Don; Jackson, Carl; Redfern, R. Michael; Morrison, Alan P.; Mathewson, Alan
New Geiger Mode Avalanche Photodiodes (GM-APDs) have been designed and characterized specifically for use in microarray systems. Critical parameters such as excess reverse bias voltage, hold-off time and optimum operating temperature have been experimentally determined for these photon-counting devices. The photon detection probability, dark count rate and afterpulsing probability have been measured under different operating conditions. An active-quench circuit (AQC) is presented for operating these GM-APDs. This circuit is relatively simple and robust, and has such benefits as reducing average power dissipation and afterpulsing. Arrays of these GM-APDs have already been designed and, together with AQCs, open up the possibility of a solid-state microarray detector that enables parallel analysis on a single chip. Another advantage of these GM-APDs over current technology is their low-voltage CMOS compatibility, which could allow for the fabrication of an AQC on the same device. Small-area detectors have already been employed in the time-resolved detection of fluorescence from labeled proteins. It is envisaged that operating these new GM-APDs with this active-quench circuit will have numerous applications for the detection of fluorescence in microarray systems.
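The hold-off time mentioned above caps the count rate of any photon-counting detector: while the circuit quenches and re-arms, photons are missed. A standard non-paralyzable dead-time correction (a textbook formula, not taken from this paper, with illustrative numbers) recovers the true rate:

```python
def corrected_count_rate(measured_rate, dead_time):
    """Non-paralyzable dead-time correction for a photon-counting
    detector such as a Geiger-mode APD: the true rate exceeds the
    measured one because the detector is blind for `dead_time`
    seconds (the hold-off time) after each registered count."""
    return measured_rate / (1.0 - measured_rate * dead_time)

# e.g. 1e5 counts/s measured with an assumed 50 ns hold-off time
true_rate = corrected_count_rate(1e5, 50e-9)
```

The correction grows rapidly as the measured rate approaches 1/dead_time, which is why minimizing hold-off time (and afterpulsing, which the AQC addresses) matters for dynamic range.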
Nathan D Grubaugh
Full Text Available BACKGROUND: Arthropod-borne viruses are important emerging pathogens worldwide. Viruses transmitted by mosquitoes, such as dengue, yellow fever, and Japanese encephalitis viruses, infect hundreds of millions of people and animals each year. Global surveillance of these viruses in mosquito vectors using molecular-based assays is critical for prevention and control of the associated diseases. Here, we report an oligonucleotide DNA microarray design, termed ArboChip5.1, for multi-gene detection and identification of mosquito-borne RNA viruses from the genera Flavivirus (family Flaviviridae), Alphavirus (Togaviridae), Orthobunyavirus (Bunyaviridae), and Phlebovirus (Bunyaviridae). METHODOLOGY/PRINCIPAL FINDINGS: The assay utilizes targeted PCR amplification of three genes from each virus genus for electrochemical detection on a portable, field-tested microarray platform. Fifty-two viruses propagated in cell culture were used to evaluate the specificity of the PCR primer sets and the ArboChip5.1 microarray capture probes. The microarray detected all of the tested viruses and differentiated between many closely related viruses such as members of the dengue, Japanese encephalitis, and Semliki Forest virus clades. Laboratory-infected mosquitoes were used to simulate field samples and to determine the limits of detection. Additionally, we identified dengue virus type 3, Japanese encephalitis virus, Tembusu virus, Culex flavivirus, and a Quang Binh-like virus from mosquitoes collected in Thailand in 2011 and 2012. CONCLUSIONS/SIGNIFICANCE: We demonstrated that the described assay can be utilized in a comprehensive field surveillance program by the broad-range amplification and specific identification of arboviruses from infected mosquitoes. Furthermore, the microarray platform can be deployed in the field, and the workflow from viral RNA extraction to data analysis can occur in as little as 12 h. The information derived from the ArboChip5.1 microarray can help to establish
Full Text Available Abstract Background DNA microarrays are a powerful tool for monitoring the expression of tens of thousands of genes simultaneously. With the advance of microarray technology, the challenge becomes how to analyze a large amount of microarray data and make biological sense of it. Affymetrix GeneChips are widely used microarrays, for which a variety of statistical algorithms have been explored and used for detecting significant genes in an experiment. These methods rely solely on the quantitative data, i.e., signal intensity; however, qualitative data are also important parameters in detecting differentially expressed genes. Results AffyMiner is a tool developed for detecting differentially expressed genes in Affymetrix GeneChip microarray data and for associating gene annotation and gene ontology information with the genes detected. AffyMiner consists of the functional modules GeneFinder, for detecting significant genes in a treatment versus control experiment, and GOTree, for mapping genes of interest onto the Gene Ontology (GO) space, together with interfaces to run Cluster, a program for clustering analysis, and GenMAPP, a program for pathway analysis. AffyMiner has been used for analyzing GeneChip data and the results were presented in several publications. Conclusion AffyMiner fills an important gap in finding differentially expressed genes in Affymetrix GeneChip microarray data. AffyMiner effectively deals with multiple replicates in the experiment and takes into account both quantitative and qualitative data in identifying significant genes. AffyMiner reduces the time and effort needed to compare data from multiple arrays and to interpret the possible biological implications associated with significant changes in a gene's expression.
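The combination of quantitative and qualitative evidence described above can be illustrated as a simple filter: require both a fold-change threshold and a Present detection call. This is a generic sketch of the idea, not AffyMiner's actual implementation; all names and thresholds are assumptions.

```python
def significant_genes(log2fc, calls_treat, calls_ctrl, fc_cut=1.0):
    """Flag genes as differentially expressed only if the quantitative
    evidence (|log2 fold change| >= fc_cut) is backed by qualitative
    evidence (a 'P'resent detection call in at least one condition).

    All inputs are dicts keyed by gene id; call values are 'P'/'M'/'A'
    (Present / Marginal / Absent, as reported by Affymetrix software)."""
    hits = []
    for gene, fc in log2fc.items():
        detected = calls_treat.get(gene) == "P" or calls_ctrl.get(gene) == "P"
        if abs(fc) >= fc_cut and detected:
            hits.append(gene)
    return sorted(hits)
```

The qualitative filter removes genes whose large apparent fold changes arise from noise near the background, which intensity alone cannot distinguish.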
Full Text Available Abstract Background The increasing number of gene expression microarray studies represents an important resource in biomedical research. As a result, gene expression based diagnosis has entered clinical practice for patient stratification in breast cancer. However, the integration and combined analysis of microarray studies remains a challenge. We assessed the potential benefit of data integration on the classification accuracy and systematically evaluated the generalization performance of selected methods on four breast cancer studies comprising almost 1000 independent samples. To this end, we introduced an evaluation framework which aims to establish good statistical practice and a graphical way to monitor differences. The classification goal was to correctly predict estrogen receptor status (negative/positive) and histological grade (low/high) of each tumor sample in an independent study which was not used for the training. For the classification we chose support vector machines (SVM), predictive analysis of microarrays (PAM), random forest (RF) and k-top scoring pairs (kTSP). Guided by considerations relevant for classification across studies we developed a generalization of kTSP which we evaluated in addition. Our derived version (DV) aims to improve the robustness of the intrinsic invariance of kTSP with respect to technologies and preprocessing. Results For each individual study the generalization error was benchmarked via complete cross-validation and was found to be similar for all classification methods. The misclassification rates were substantially higher in classification across studies, when each single study was used as an independent test set while all remaining studies were combined for the training of the classifier. However, with an increasing number of independent microarray studies used in the training, the overall classification performance improved. DV performed better than the average and showed slightly less variance. In
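The appeal of top-scoring-pair methods for cross-study classification is that they use only the relative ordering of two genes within each sample, which is invariant to monotone per-array normalization. A minimal sketch of the pair-scoring step is shown below (a simplified illustration of the general TSP idea, not the kTSP or DV implementation evaluated in the study):

```python
import numpy as np

def tsp_score(expr, labels, i, j):
    """Score of the gene pair (i, j): the difference between the two
    classes in the probability that gene i is expressed above gene j.
    Rank-based, hence robust to monotone per-array transformations."""
    labels = np.asarray(labels)
    up = expr[i] > expr[j]                 # per-sample ordering of the pair
    p0 = up[labels == 0].mean()
    p1 = up[labels == 1].mean()
    return abs(p0 - p1)

def best_pair(expr, labels):
    """Exhaustively search all gene pairs; returns (score, i, j)."""
    n = expr.shape[0]
    return max((tsp_score(expr, labels, i, j), i, j)
               for i in range(n) for j in range(i + 1, n))
```

A fitted pair then classifies a new sample simply by checking whether gene i exceeds gene j, which is why such rules transfer across platforms and preprocessing pipelines.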
The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to thermophysical properties, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air conditioning and refrigeration equipment. It also references documents addressing compatibility of refrigerants and lubricants with other materials.
Full Text Available Abstract Background Mycotoxins are fungal secondary metabolites commonly present in feed and food, and are widely regarded as hazardous contaminants. Citrinin, one of the very well known mycotoxins that was first isolated from Penicillium citrinum, is produced by more than 10 kinds of fungi, and is possibly spread all over the world. However, information on the action mechanism of the toxin is limited. Thus, we investigated the citrinin-induced genomic response to evaluate its toxicity. Results Citrinin inhibited growth of yeast cells at concentrations higher than 100 ppm. We monitored the citrinin-induced mRNA expression profiles in yeast using the ORF DNA microarray and the Oligo DNA microarray, and the expression profiles were compared with those of other stress-inducing agents. Results obtained from both microarray experiments clustered together, but were different from those of the mycotoxin patulin. The oxidative stress response genes – AADs, FLR1, OYE3, GRE2, and MET17 – were significantly induced. In the functional category, expression of genes involved in "metabolism", "cell rescue, defense and virulence", and "energy" was significantly activated. In the category of "metabolism", genes involved in the glutathione synthesis pathway were activated, and in the category of "cell rescue, defense and virulence", the ABC transporter genes were induced. To alleviate the induced stress, these cells might pump out the citrinin after modification with glutathione. In contrast, citrinin treatment did not induce the genes involved in DNA repair. Conclusion Results from both microarray studies suggest that citrinin treatment induced oxidative stress in yeast cells. The genotoxicity was less severe than that of patulin, suggesting that citrinin is less toxic than patulin. The reproducibility of the expression profiles was much better with the Oligo DNA microarray. However, the Oligo DNA microarray did not completely overcome cross
Full Text Available We performed a screening of miRNAs regulated by dietary lipids in a cellular model of enterocytes, Caco-2 cells. Our aim was to describe new lipid-modified miRNAs with an implication in lipid homeostasis and cardiovascular disease [1,2]. For that purpose, we treated differentiated Caco-2 cells with micelles containing the assayed lipids (cholesterol, conjugated linoleic acid and docosahexaenoic acid), and the screening of miRNAs was carried out by microarray using the μParaflo® Microfluidic Biochip Technology of LC Sciences (Houston, TX, USA). Experimental design, microarray description and raw data have been made available in the GEO database under the reference number GSE59153. Here we describe in detail the experimental design and methods used to obtain the relative expression data.
Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as
Full Text Available Plants have evolved intricate mechanisms to cope with multiple environmental stresses. To adapt to biotic and abiotic stresses, plant responses involve changes at the cellular and molecular levels. The current study was designed to investigate the effects of combinations of different environmental stresses on the transcriptome of the Arabidopsis genome using public microarray databases. We investigated the role of cyclopentenones in mediating plant responses to environmental stress through the TGA (TGACG motif-binding factor) transcription factor, independently from jasmonic acid. Candidate genes were identified by comparing plants inoculated with Botrytis cinerea or treated with heat, salt or osmotic stress with non-inoculated or non-treated tissues. About 2.5% of heat-, 19% of salinity- and 41% of osmotic stress-induced genes were also upregulated by B. cinerea treatment; and 7.6%, 19% and 48% of the corresponding downregulated genes, respectively, were also downregulated by B. cinerea treatment. Our results indicate that plant responses to biotic and abiotic stresses are mediated by several common regulatory genes. Comparisons between transcriptome data from Arabidopsis stressed plants support our hypothesis that some molecular and biological processes involved in biotic and abiotic stress responses are conserved. Thirteen of the genes commonly regulated by abiotic and biotic stresses were studied in detail to determine their role in plant resistance to B. cinerea. Moreover, a T-DNA insertion mutant of the Responsive to Dehydration gene (rd20), encoding a member of the caleosin (lipid surface protein) family, showed enhanced sensitivity to B. cinerea infection and drought. Overall, the overlap of plant responses to abiotic and biotic stresses, coupled with the sensitivity of the rd20 mutant, may suggest new strategies for increasing plant resistance to multiple environmental stresses, and ultimately a plant's chances of survival. Future research
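The overlap percentages reported above come down to set intersections between regulated-gene lists. A small helper makes the computation explicit (the gene names in the example are invented placeholders, not the study's data):

```python
def shared_fraction(stress_genes, biotic_genes):
    """Fraction of genes regulated by an abiotic stress that are also
    regulated by the biotic treatment, as in comparing heat/salt/osmotic
    gene lists against the B. cinerea list."""
    stress_genes, biotic_genes = set(stress_genes), set(biotic_genes)
    return len(stress_genes & biotic_genes) / len(stress_genes)

# hypothetical lists: half of the "stress" genes overlap the "biotic" list
overlap = shared_fraction(["a", "b", "c", "d"], ["b", "d", "e"])
```

Note that the denominator is the stress-regulated list, so the fraction is directional: 41% of osmotic-stress genes overlapping the B. cinerea list is not the same statement as the reverse.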
Full Text Available To evaluate the accuracy of the sub-classification of renal cortical neoplasms using molecular signatures, a search of publicly available databases was performed to identify microarray datasets with multiple histologic sub-types of renal cortical neoplasms. Meta-analytic techniques were utilized to identify differentially expressed genes for each histologic subtype. The lists of genes obtained from the meta-analysis were used to create predictive signatures through the use of a pair-based method. These signatures were organized into an algorithm to sub-classify renal neoplasms. The use of these signatures according to our algorithm was validated on several independent datasets. We identified three Gene Expression Omnibus datasets that fit our criteria for developing a training set. All of the datasets in our study utilized the Affymetrix platform. The final training dataset included 149 samples representing the four most common histologic subtypes of renal cortical neoplasms: 69 clear cell, 41 papillary, 16 chromophobe, and 23 oncocytomas. When validation of our signatures was performed on external datasets, we were able to correctly classify 68 of the 72 samples (94%). The correct classification by subtype was 19/20 (95%) for clear cell, 14/14 (100%) for papillary, 17/19 (89%) for chromophobe, and 18/19 (95%) for oncocytomas. Through the use of meta-analytic techniques, we were able to create an algorithm that sub-classified renal neoplasms on a molecular level with 94% accuracy across multiple independent datasets. This algorithm may aid in selecting molecular therapies and may improve the accuracy of subtyping of renal cortical tumors.
Full Text Available Abstract Background Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy, and compared the results with DNA microarrays and quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome, including the discovery of thousands of new splice variants.
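Translating read counts into expression levels, as the pipeline above does, requires normalizing for both gene length and sequencing depth. The RPKM-style formula below is a generic scheme of this kind, not necessarily the paper's exact pipeline; the numbers in the example are illustrative.

```python
def rpkm(counts, gene_lengths, total_reads):
    """Reads Per Kilobase of transcript per Million mapped reads:
    counts[g] scaled by gene length (bp) and library size, so that
    expression levels are comparable across genes and sequencing runs."""
    return {g: counts[g] * 1e9 / (gene_lengths[g] * total_reads)
            for g in counts}

# a 1 kb gene with 10 reads in a library of one million mapped reads
levels = rpkm({"geneA": 10}, {"geneA": 1000}, 1_000_000)
```

Depth normalization is also why the abstract's quality metrics improve "systematically with increasing shotgun sequencing depth": deeper libraries shrink the Poisson counting noise on each gene's count without changing its normalized level.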
Muller, Jean; Mehlen, André; Vetter, Guillaume; Yatskou, Mikalai; Muller, Arnaud; Chalmel, Frédéric; Poch, Olivier; Friederich, Evelyne; Vallar, Laurent
Background The actin cytoskeleton plays a crucial role in supporting and regulating numerous cellular processes. Mutations or alterations in the expression levels affecting the actin cytoskeleton system or related regulatory mechanisms are often associated with complex diseases such as cancer. Understanding how qualitative or quantitative changes in expression of the set of actin cytoskeleton genes are integrated to control actin dynamics and organisation is currently a challenge and should provide insights in identifying potential targets for drug discovery. Here we report the development of a dedicated microarray, the Actichip, containing 60-mer oligonucleotide probes for 327 genes selected for transcriptome analysis of the human actin cytoskeleton. Results Genomic data and sequence analysis features were retrieved from GenBank and stored in an integrative database called Actinome. From these data, probes were designed using a home-made program (CADO4MI) allowing sequence refinement and improved probe specificity by combining the complementary information recovered from the UniGene and RefSeq databases. Actichip performance was analysed by hybridisation with RNAs extracted from epithelial MCF-7 cells and human skeletal muscle. Using thoroughly standardised procedures, we obtained microarray images with excellent quality resulting in high data reproducibility. Actichip displayed a large dynamic range extending over three logs with a limit of sensitivity between one and ten copies of transcript per cell. The array allowed accurate detection of small changes in gene expression and reliable classification of samples based on the expression profiles of tissue-specific genes. When compared to two other oligonucleotide microarray platforms, Actichip showed similar sensitivity and concordant expression ratios. Moreover, Actichip was able to discriminate the highly similar actin isoforms whereas the two other platforms did not. Conclusion Our data demonstrate that
Chapman, James B.; Kapp, Paul
A database containing previously published geochronologic, geochemical, and isotopic data on Mesozoic to Quaternary igneous rocks in the Himalayan-Tibetan orogenic system is presented. The database is intended to serve as a repository for new and existing igneous rock data and is publicly accessible through a web-based platform that includes an interactive map and a data table interface with search, filtering, and download options. To illustrate the utility of the database, the age, location, and εHf(t) composition of magmatism from the central Gangdese batholith in the southern Lhasa terrane are compared. The data identify three high-flux events, which peak at 93, 50, and 15 Ma. They are characterized by inboard arc migration and a temporal and spatial shift to more evolved isotopic compositions.
Videbech, Poul Bror Hemming; Deleuran, Anette
AIM OF DATABASE: The purpose of the Danish Depression Database (DDD) is to monitor and facilitate the improvement of the quality of the treatment of depression in Denmark. Furthermore, the DDD has been designed to facilitate research. STUDY POPULATION: Inpatients as well as outpatients with depression, aged above 18 years, and treated in the public psychiatric hospital system were enrolled. MAIN VARIABLES: Variables include whether the patient has been thoroughly somatically examined and has been interviewed about the psychopathology by a specialist in psychiatry. The Hamilton score as well as an evaluation of the risk of suicide are measured before and after treatment. Whether psychiatric aftercare has been scheduled for inpatients and the rate of rehospitalization are also registered. DESCRIPTIVE DATA: The database was launched in 2011. Every year since then ~5,500 inpatients and 7,500 outpatients …
Full Text Available Abstract Background Most microarray studies are made using labelling with one or two dyes, which allows the hybridization of one or two samples on the same slide. In such experiments, the most frequently used dyes are Cy3 and Cy5. Recent improvements in the technology (dye-labelling, scanners and image analysis) allow hybridization of up to four samples simultaneously. The two additional dyes are Alexa488 and Alexa494. The triple-target or four-target technology is very promising, since it allows more flexibility in the design of experiments, an increase in the statistical power when comparing gene expressions induced by different conditions, and a scaled-down number of slides. However, few methods have been proposed for statistical analysis of such data. Moreover, the lowess correction of the global dye effect is available for only two-color experiments, and even if its application can be derived, it does not allow simultaneous correction of the raw data. Results We propose a two-step normalization procedure for triple-target experiments. First the dye bleeding is evaluated and corrected if necessary. Then the signal in each channel is normalized using a generalized lowess procedure to correct a global dye bias. The normalization procedure is validated using triple-self experiments and by comparing the results of triple-target and two-color experiments. Although the focus is on triple-target microarrays, the proposed method can be used to normalize p differently labelled targets co-hybridized on the same array, for any value of p greater than 2. Conclusion The proposed normalization procedure is effective: the technical biases are reduced, the number of false positives is under control in the analysis of differentially expressed genes, and the triple-target experiments are more powerful than the corresponding two-color experiments. There is room for improving microarray experiments by simultaneously hybridizing more than two samples.
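The dye-bias correction at the heart of such procedures works on the MA scale: fit a smooth trend of the log-ratio M against the average log-intensity A, and subtract it. In the sketch below a low-order polynomial stands in for the lowess smoother used in practice (a simplified two-channel illustration, not the paper's generalized multi-channel procedure):

```python
import numpy as np

def normalize_channel(red, green, degree=2):
    """Intensity-dependent dye-bias correction on the MA scale:
    M = log2(R/G) is regressed on A = mean log-intensity, and the
    fitted trend is subtracted, leaving bias-corrected log-ratios.
    np.polyfit is a simple stand-in for a lowess smoother."""
    m = np.log2(red) - np.log2(green)
    a = 0.5 * (np.log2(red) + np.log2(green))
    trend = np.polyval(np.polyfit(a, m, degree), a)
    return m - trend
```

Generalizing this to p > 2 channels, as the abstract proposes, amounts to correcting each channel's bias jointly rather than pairwise, so that all targets on the slide are normalized simultaneously.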
Full Text Available Microarray study enables us to obtain hundreds of thousands of expressions of genes or genotypes at once, and it is an indispensable technology for genome research. The first step is the analysis of scanned microarray images. This is the most important procedure for obtaining biologically reliable data. Currently most microarray image processing systems require burdensome manual block/spot indexing work. Since the amount of experimental data is increasing very quickly, automated microarray image analysis software becomes important. In this paper, we propose two automated methods for analyzing microarray images. First, we propose the extended -regular sequence to index blocks and spots, which enables a novel automatic gridding procedure. Second, we provide a methodology, hierarchical metagrid alignment, to allow reliable and efficient batch processing for a set of microarray images. Experimental results show that the proposed methods are more reliable and convenient than the commercial tools.
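Automatic gridding of the kind described above is often bootstrapped from projection profiles: summing image intensity along one axis makes spot rows and columns appear as peaks separated by background valleys. The sketch below is a minimal stand-in for that first step, not the indexing method proposed in the paper:

```python
import numpy as np

def grid_lines(image, axis=0, thresh=0.5):
    """Locate spot column (or row) boundaries by projecting image
    intensity onto one axis: spots show up as high-intensity runs in
    the profile, background gaps as valleys. Returns the indices where
    the profile crosses the threshold, i.e. candidate grid lines."""
    profile = image.sum(axis=axis)
    mask = profile > thresh * profile.max()
    # positions where the mask switches between spot and background
    return np.flatnonzero(np.diff(mask.astype(int)) != 0) + 1
```

Real scanned images add rotation, uneven background, and missing spots, which is what makes robust indexing schemes and batch alignment (as in the paper's hierarchical metagrid alignment) necessary.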
Lee, Yun-Shien; Chen, Chun-Houh; Tsai, Chi-Neu; Tsai, Chia-Lung; Chao, Angel; Wang, Tzu-Hao
Interlaboratory comparison of microarray data, even when using the same platform, imposes several challenges on scientists. RNA quality, RNA labeling efficiency, hybridization procedures and data-mining tools can all contribute variation in each laboratory. In Affymetrix GeneChips, about 11–20 different 25-mer oligonucleotides are used to measure the level of each transcript. Here, we report that 'labeling extension values (LEVs)', which are correlation coefficients between probe intensities and probe positions, are highly correlated with the gene expression levels (GEVs) in eukaryotic Affymetrix microarray data. By analyzing LEVs and GEVs in the publicly available 2414 CEL files of 20 Affymetrix microarray types covering 13 species, we found that correlations between LEVs and GEVs only exist in eukaryotic RNAs, but not in prokaryotic ones. Surprisingly, Affymetrix results of the same specimens that were analyzed in different laboratories could be clearly differentiated only by LEVs, leading to the identification of 'laboratory signatures'. In the examined dataset, GSE10797, filtering out high-LEV genes did not compromise the discovery of biological processes that are constructed by differentially expressed genes. In conclusion, LEVs provide a new filtering parameter for microarray analysis of gene expression and may improve the inter- and intralaboratory comparability of Affymetrix GeneChips data. PMID:19295132
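As defined above, a labeling extension value is just the Pearson correlation between a probe set's intensities and the probe positions along the transcript, so it is direct to compute (a minimal sketch; the position encoding is simplified to the probe index):

```python
import numpy as np

def labeling_extension_value(probe_intensities):
    """LEV of one probe set: the Pearson correlation between probe
    intensity and probe position along the transcript. A positional
    trend in intensity (e.g. 3' bias left by the labeling reaction)
    drives the LEV away from zero."""
    positions = np.arange(len(probe_intensities))
    return np.corrcoef(positions, probe_intensities)[0, 1]
```

Computed per probe set across a batch of CEL files, such values could then be compared between laboratories, which is the basis of the "laboratory signatures" the authors report.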
Full Text Available Abstract Background During the past decade, many software packages have been developed for the analysis and visualization of various types of microarrays. We have developed and maintained the widely used dChip as a microarray analysis software package accessible to both biologists and data analysts. However, challenges arise when dChip users want to analyze a large number of arrays automatically and share data analysis procedures and parameters. Improvement is also needed when the dChip user support team tries to identify the causes of analysis errors or bugs reported by users. Results We report here the implementation and application of the dChip automation module. Through this module, dChip automation files can be created to include menu steps, parameters, and data viewpoints to run automatically. A data-packaging function allows convenient transfer from one user to another of the dChip software, microarray data, and analysis procedures, so that the second user can reproduce the entire analysis session of the first user. An analysis report file can also be generated during an automated run, including analysis logs, user comments, and viewpoint screenshots. Conclusion The dChip automation module is a step toward reproducible research, and it can promote a more convenient and reproducible mechanism for sharing microarray software, data, and analysis procedures and results. Automation data packages can also be used as publication supplements. Similar automation mechanisms could be valuable to the research community if implemented in other genomics and bioinformatics software packages.
Arigi, Emma; Blixt, Klas Ola; Buschard, Karsten
, the major classes of plant and fungal GSLs. In this work, a prototype "universal" GSL-based covalent microarray has been designed, and preliminary evaluation of its potential utility in assaying protein-GSL binding interactions investigated. An essential step in development involved the enzymatic release...... of the fatty acyl moiety of the ceramide aglycone of selected mammalian GSLs with sphingolipid N-deacylase (SCDase). Derivatization of the free amino group of a typical lyso-GSL, lyso-G(M1), with a prototype linker assembled from succinimidyl-[(N-maleimidopropionamido)-diethyleneglycol] ester and 2...
Li, Shuzhao; Pozhitkov, Alexander; Brouwer, Marius
Understanding the difference in probe properties holds the key to absolute quantification of DNA microarrays. So far, Langmuir-like models have failed to link sequence-specific properties to hybridization signals in the presence of a complex hybridization background. Data from washing experiments indicate that the post-hybridization washing has no major effect on the specifically bound targets, which give the final signals. Thus, the amount of specific targets bound to probes is likely determined before washing, by the competition against nonspecific binding. Our competitive hybridization model is a viable alternative to Langmuir-like models.
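The contrast the authors draw can be illustrated numerically: a single-species Langmuir isotherm ignores background, while a standard two-species competitive-binding form suppresses specific binding as nonspecific background grows. The constants below are invented for illustration only, not fitted values from the paper:

```python
def langmuir_fraction(c, K):
    """Langmuir isotherm: fraction of probe sites bound at target concentration c,
    with dissociation constant K and no competing background."""
    return c / (K + c)

def competitive_fraction(c_s, K_s, c_ns, K_ns):
    """Fraction of probe sites bound by the *specific* target when a
    nonspecific background species competes for the same sites
    (generic two-species competitive-binding form)."""
    return (c_s / K_s) / (1 + c_s / K_s + c_ns / K_ns)

lone = langmuir_fraction(1.0, 1.0)                    # no background
crowded = competitive_fraction(1.0, 1.0, 10.0, 1.0)   # 10x nonspecific background
```

With identical specific-target concentration, the competitive form yields a much smaller bound fraction, which is the qualitative behavior the abstract attributes to pre-wash competition.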
Massoullié, Grégoire; Wintzer-Wehekind, Jérome; Chenaf, Chouki; Mulliez, Aurélien; Pereira, Bruno; Authier, Nicolas; Eschalier, Alain; Clerfond, Guillaume; Souteyrand, Géraud; Tabassome, Simon; Danchin, Nicolas; Citron, Bernard; Lusson, Jean-René; Puymirat, Étienne; Motreff, Pascal; Eschalier, Romain
Multicentre registries of myocardial infarction management show a steady improvement in prognosis and greater access to myocardial revascularization in a more timely manner. While French registries are the standard references, the question arises: are data stemming solely from the activity of French cardiac intensive care units (ICUs) a true reflection of the entire French population with ST-segment elevation myocardial infarction (STEMI)? To compare data on patients hospitalized for STEMI from two French registries: the French registry of acute ST-elevation or non-ST-elevation myocardial infarction (FAST-MI) and the Échantillon généraliste des bénéficiaires (EGB) database. We compared patients treated for STEMI listed in the FAST-MI 2010 registry (n=1716) with those listed in the EGB database, which comprises a sample of 1/97th of the French population, also from 2010 (n=403). Compared with the FAST-MI 2010 registry, the EGB population was older (67.2±15.3 vs 63.3±14.5 years; P<0.001), had a higher percentage of women (36.0% vs 24.7%; P<0.001), was less likely to undergo emergency coronary angiography (75.2% vs 96.3%; P<0.001) and was less often treated in university hospitals (27.1% vs 37.0%; P=0.001). There were no significant differences between the two registries in terms of cardiovascular risk factors, comorbidities and drug treatment at admission. Thirty-day mortality was higher in the EGB database (10.2% vs 4.4%; P<0.001). Registries such as FAST-MI are indispensable, not only for assessing epidemiological changes over time, but also for evaluating the prognostic effect of modern STEMI management. Meanwhile, exploitation of data from general databases, such as EGB, provides additional relevant information, as they include a broader population not routinely admitted to cardiac ICUs. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Barik, Anwesha; Banerjee, Satarupa; Dhara, Santanu; Chakravorty, Nishant
Complexities in full genome expression studies hinder the extraction of tracker genes to analyze the course of biological events. In this study, we demonstrate the application of supervised machine learning methods to reduce the irrelevance in microarray data series and thereby extract robust molecular markers to track biological processes. The methodology has been illustrated by analyzing whole genome expression studies on bone-implant integration (osseointegration). Being a biological process, osseointegration is known to leave a trail of genetic footprints during its course. In spite of the enormous amount of raw data in public repositories, researchers still do not have access to a panel of genes that can definitively track osseointegration. The results from our study revealed that panels comprising matrix metalloproteinase and collagen genes were able to track osseointegration on implant surfaces (MMP9 and COL1A2 on micro-textured; MMP12 and COL6A3 on superimposed nano-textured surfaces) with 100% classification accuracy, specificity and sensitivity. Further, our analysis showed the importance of elapsed time in the establishment of the mechanical connection at the bone-implant surface. The findings from this study are expected to be useful to researchers investigating osseointegration of novel implant materials, especially at the early stage. The methodology demonstrated can be easily adapted by scientists in different fields to analyze large databases for other biological processes. Copyright © 2017 Elsevier Inc. All rights reserved.
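A filter-style version of such supervised marker extraction can be sketched by ranking genes with a t-like class-separation score. This is a generic stand-in, not the study's actual pipeline, and the toy expression matrix is invented:

```python
import numpy as np

def rank_marker_genes(expr, labels, top=2):
    """Rank genes by a t-like class-separation score.

    expr: (samples x genes) expression matrix; labels: 0/1 class per sample.
    Score = |mean difference| / (sum of class std devs); higher = better marker.
    """
    expr, labels = np.asarray(expr, float), np.asarray(labels)
    a, b = expr[labels == 0], expr[labels == 1]
    score = np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-9)
    return [int(g) for g in np.argsort(score)[::-1][:top]]

# toy data: gene 0 cleanly separates the two classes, gene 1 is noise
expr = [[1.0, 5.2], [1.1, 4.9], [3.0, 5.1], [3.1, 5.0]]
top = rank_marker_genes(expr, [0, 0, 1, 1], top=1)
```

In practice a wrapper method (cross-validated classification on candidate panels) would then confirm which ranked genes actually track the process.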
Cohen, Sara; Zohar, Aviv
Modern blockchain systems are a fresh look at the paradigm of distributed computing, applied under assumptions of large-scale public networks. They can be used to store and share information without a trusted central party. There has been much effort to develop blockchain systems for a myriad of uses, ranging from cryptocurrencies to identity control, supply chain management, etc. None of this work has directly studied the fundamental database issues that arise when using blockchains as the u...
Lukjancenko, Oksana; Ussery, David
-density microarray chip has been designed, using 116 Enterobacteriaceae genome sequences, taking into account the enteric pan-genome. Probes for the microarray were checked in silico and performance of the chip, based on experimental strains from four different genera, demonstrate a relatively high ability...... to distinguish those strains on genus, species, and pathotype/serovar levels. Additionally, the microarray performed well when investigating which genes were found in a given strain of interest. The Enterobacteriaceae pan-genome microarray, based on 116 genomes, provides a valuable tool for determination...
Strope, Pooja K; Chaverri, Priscila; Gazis, Romina; Ciufo, Stacy; Domrachev, Michael; Schoch, Conrad L
Abstract The ITS (nuclear ribosomal internal transcribed spacer) RefSeq database at the National Center for Biotechnology Information (NCBI) is dedicated to the clear association between name, specimen and sequence data. This database is focused on sequences obtained from type material stored in public collections. While the initial ITS sequence curation effort together with numerous fungal taxonomy experts attempted to cover as many orders as possible, we extended our latest focus to the family and genus ranks. We focused on Trichoderma for several reasons, mainly because the asexual and sexual synonyms were well documented, and a list of proposed names and type material was recently published. In this case study the recent taxonomic information was applied to perform a complete taxonomic audit of the genus Trichoderma in the NCBI Taxonomy database. A name status report is available here: https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi. As a result, the ITS RefSeq Targeted Loci database at NCBI has been augmented with more sequences from type and verified material from Trichoderma species. Additionally, to aid in the cross-referencing of data from single loci and genomes, we have collected a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions. During the process of curation, misidentified genomes were discovered, and sequence records from type material were found hidden under previous classifications. Source metadata curation, although more cumbersome, proved to be useful as confirmation of the type material designation. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353 PMID:29220466
Zwinderman Aeilko H
Full Text Available Abstract Background When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification, the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtracting the reference signal, smoothing (to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer. Results We used the ratio between biological variance and measurement variance (an F-like statistic) as a quality measure for transformation methods, and we demonstrate a method for maximizing that variance ratio on real data. We explore a number of transformation issues, including Box-Cox transformation, baseline shift, partial subtraction of the log-reference signal and smoothing. It appears that the optimal choice of parameters for the transformation methods depends on the data. Further, the behavior of the variance ratio, under the null hypothesis of zero biological variance, appears to depend on the choice of parameters. Conclusions The use of replicates in microarray experiments is important. Adjustment for the null-hypothesis behavior of the variance ratio is critical to the selection of transformation method.
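With replicates available, the variance ratio used above as a quality measure can be estimated as the between-sample variance of replicate means over the mean within-sample (technical) variance. The sketch below is one straightforward estimator under that reading, not necessarily the authors' exact formulation:

```python
import numpy as np

def variance_ratio(data):
    """F-like quality measure: biological variance over measurement variance.

    data: (biological samples x technical replicates) of transformed signals
    for one gene. Higher ratios mean the transformation preserves biological
    signal relative to measurement noise.
    """
    data = np.asarray(data, float)
    biological = np.var(data.mean(axis=1), ddof=1)       # between-sample variance
    measurement = np.mean(np.var(data, axis=1, ddof=1))  # mean within-sample variance
    return float(biological / measurement)

# three biological samples, duplicate measurements each
good = variance_ratio([[1.0, 1.1], [2.0, 2.1], [3.0, 2.9]])   # tight replicates
noisy = variance_ratio([[1.0, 3.0], [2.0, 0.5], [3.0, 1.0]])  # replicates disagree
```

Comparing this ratio across candidate transformations (Box-Cox parameters, baseline shifts, etc.) is then a matter of picking the transformation that maximizes it, subject to the null-hypothesis adjustment the abstract warns about.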
Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M
Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities, including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open-access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or NimbleGen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis).
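A minimal version of the condition-independent GCA described above computes Pearson correlations between expression profiles across conditions and reports highly correlated genes as candidate partners. VTCdb also offers other co-expression measures; the threshold and toy data here are invented for illustration:

```python
import numpy as np

def coexpression_partners(expr, gene, r_min=0.8):
    """Find genes co-expressed with `gene` by Pearson correlation.

    expr: (genes x conditions) expression matrix. Returns indices of other
    genes whose profiles correlate with `gene` at r >= r_min.
    """
    r = np.corrcoef(np.asarray(expr, float))[gene]
    return [g for g in range(len(expr)) if g != gene and r[g] >= r_min]

expr = [
    [1, 2, 3, 4, 5],    # gene 0
    [2, 4, 6, 8, 10],   # gene 1: perfectly correlated with gene 0
    [5, 1, 4, 2, 3],    # gene 2: unrelated profile
]
partners = coexpression_partners(expr, gene=0)
```

Networks are then built by connecting each gene to its partners, and modules emerge as densely interconnected subgraphs.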
U.S. Environmental Protection Agency — The Comprehensive Environmental Response, Compensation and Liability Information System (CERCLIS) (Superfund) Public Access Database (CPAD) contains a selected set...
Dacheux, Laurent; Berthet, Nicolas; Dissard, Gabriel; Holmes, Edward C; Delmas, Olivier; Larrous, Florence; Guigon, Ghislaine; Dickinson, Philip; Faye, Ousmane; Sall, Amadou A; Old, Iain G; Kong, Katherine; Kennedy, Giulia C; Manuguerra, Jean-Claude; Cole, Stewart T; Caro, Valérie; Gessain, Antoine; Bourhy, Hervé
The rapid and accurate identification of pathogens is critical in the control of infectious disease. To this end, we analyzed the capacity for viral detection and identification of a newly described high-density resequencing microarray (RMA), termed PathogenID, which was designed for multiple pathogen detection using database similarity searching. We focused on one of the largest and most diverse viral families described to date, the family Rhabdoviridae. We demonstrate that this approach has the potential to identify both known and related viruses for which precise sequence information is unavailable. In particular, we demonstrate that a strategy based on consensus sequence determination for analysis of RMA output data enabled successful detection of viruses exhibiting up to 26% nucleotide divergence with the closest sequence tiled on the array. Using clinical specimens obtained from rabid patients and animals, this method also shows a high species level concordance with standard reference assays, indicating that it is amenable for the development of diagnostic assays. Finally, 12 animal rhabdoviruses which were currently unclassified, unassigned, or assigned as tentative species within the family Rhabdoviridae were successfully detected. These new data allowed an unprecedented phylogenetic analysis of 106 rhabdoviruses and further suggest that the principles and methodology developed here may be used for the broad-spectrum surveillance and the broader-scale investigation of biodiversity in the viral world.
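Consensus-sequence determination from array output can be illustrated with a toy majority vote over aligned base calls ('N' marking no-calls); the real PathogenID analysis of resequencing-array data is of course far more involved:

```python
from collections import Counter

def consensus(reads):
    """Majority-vote consensus over aligned, equal-length base-call strings.

    'N' marks a no-call and is ignored; a column with no real calls
    stays 'N'. A toy illustration only.
    """
    out = []
    for col in zip(*reads):
        counts = Counter(b for b in col if b != 'N')
        out.append(counts.most_common(1)[0][0] if counts else 'N')
    return ''.join(out)

# three noisy call strings over the same tiled region
seq = consensus(["ACGT", "ACGA", "ANGT"])
```

The resulting consensus, rather than any single probe-level call, is what would then be compared against the sequence database to identify divergent viruses.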
About the LCI Database Project The U.S. Life Cycle Inventory (LCI) Database is a publicly available database that allows users to objectively review and compare analysis results that are based on similar sources of critically reviewed LCI data through its LCI Database Project. NREL's High-Performance
Dias Rodrigo A
Full Text Available Abstract Background Smallpox is a lethal disease that was endemic in many parts of the world until eradicated by massive immunization. Due to its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand the survival of smallpox-infected macaques, using systems biology approaches. Our goal is to improve knowledge about the progression of this disease. Results We used KEGG pathway annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights about the host response to this disease, such as increased expression of cytokines and ECM receptors in the individuals with higher survival times. These results could indicate that these gene groups could influence an effective response from the host to smallpox. Conclusion Macaques with higher survival times clearly express some specific pathways previously unidentified using regular gene-by-gene approaches. Our work also shows how third-party analysis of public datasets can be important in supporting new hypotheses for relevant biological problems.
deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher
This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves on the efficiency and analysis capabilities of existing database software, with better flexibility and documentation. It offers flexibility in the type of data that can be stored, and efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., the Gravity Recovery And Climate Experiment, GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.
Dong, Yang; Li, Ming; Liu, Puzhao; Song, Haiyan; Zhao, Yuping; Shi, Jianrong
Genes involved in immunity and apoptosis were associated with human presbycusis. CCR3 and GILZ played an important role in the pathogenesis of presbycusis, probably through regulating chemokine receptor, T-cell apoptosis, or T-cell activation pathways. The aims were to identify genes associated with human presbycusis and explore the molecular mechanism of presbycusis. Hearing function was tested by pure-tone audiometry. Microarray analysis was performed to identify presbycusis-correlated genes by Illumina Human-6 BeadChip using the peripheral blood samples of subjects. To identify biological process categories and pathways associated with presbycusis-correlated genes, bioinformatics analysis was carried out with Gene Ontology Tree Machine (GOTM) and the Database for Annotation, Visualization and Integrated Discovery (DAVID). Quantitative RT-PCR (qRT-PCR) was used to validate the microarray data. Microarray analysis identified 469 up-regulated genes and 323 down-regulated genes. Both the dominant biological processes by Gene Ontology (GO) analysis and the enriched pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG) and BIOCARTA showed that genes involved in immunity and apoptosis were associated with presbycusis. In addition, CCR3, GILZ, CXCL10, and CX3CR1 genes showed consistent differences between groups for both the gene chip and qRT-PCR data. The differences in CCR3 and GILZ between presbycusis patients and controls were statistically significant (p < 0.05).
Tárraga, Joaquín; Medina, Ignacio; Carbonell, José; Huerta-Cepas, Jaime; Minguez, Pablo; Alloza, Eva; Al-Shahrour, Fátima; Vegas-Azcárate, Susana; Goetz, Stefan; Escobar, Pablo; Garcia-Garcia, Francisco; Conesa, Ana; Montaner, David; Dopazo, Joaquín
Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state of the art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well-established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out, which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most quoted web tool in its field; it is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org. PMID:18508806
De Loof Arnold
Full Text Available Abstract Background For holometabolous insects there has been an explosion of proteomic and peptidomic information thanks to large genome sequencing projects. Heterometabolous insects, although comprising many important species, have been far less studied. The migratory locust Locusta migratoria, a heterometabolous insect, is one of the most infamous agricultural pests. Locusts undergo a well-known and profound phase transition from the relatively harmless solitary form to a ferocious gregarious form. The underlying regulatory mechanisms of this phase transition are not fully understood, but neuropeptides are undoubtedly involved. However, neuropeptide research in locusts is hampered by the absence of genomic information. Results Recently, EST (Expressed Sequence Tag) databases from Locusta migratoria were constructed. Using bioinformatics tools, we searched these EST databases specifically for neuropeptide precursors. Based on known locust neuropeptide sequences, we confirmed the sequence of several previously identified neuropeptide precursors (i.e. pacifastin-related peptides), which consolidated our method. In addition, we found two novel neuroparsin precursors and annotated the hitherto unknown tachykinin precursor. Besides one of the known tachykinin peptides, this EST contained an additional tachykinin-like sequence. Using neuropeptide precursors from Drosophila melanogaster as a query, we succeeded in annotating the Locusta neuropeptide F, allatostatin-C and ecdysis-triggering hormone precursors, which until now had not been identified in locusts or in any other heterometabolous insect. For the tachykinin precursor, the ecdysis-triggering hormone precursor and the allatostatin-C precursor, translation of the predicted neuropeptides in neural tissues was confirmed with mass spectrometric techniques. Conclusion In this study we describe the annotation of 6 novel neuropeptide precursors and the neuropeptides they encode from the
Full Text Available Abstract Background Microorganisms display vast diversity, and each one has its own set of genes, cell components and metabolic reactions. To assess their huge unexploited metabolic potential in different ecosystems, we need high-throughput tools, such as functional microarrays, that allow the simultaneous analysis of thousands of genes. However, most classical functional microarrays use specific probes that monitor only known sequences, and so fail to cover the full microbial gene diversity present in complex environments. We have thus developed an algorithm, implemented in the user-friendly program Metabolic Design, to design efficient explorative probes. Results First we validated our approach by studying eight enzymes involved in the degradation of polycyclic aromatic hydrocarbons from the model strain Sphingomonas paucimobilis sp. EPA505 using a designed microarray of 8,048 probes. As expected, microarray assays identified the targeted set of genes induced during biodegradation kinetics experiments with various pollutants. We then confirmed the identity of these new genes by sequencing, and corroborated the quantitative discrimination of our microarray by quantitative real-time PCR. Finally, we assessed the metabolic capacities of microbial communities in soil contaminated with aromatic hydrocarbons. Results show that our probe design (in sensitivity and explorative quality) can be used to study a complex environment efficiently. Conclusions We successfully used our microarray to detect gene expression encoding enzymes involved in polycyclic aromatic hydrocarbon degradation for the model strain. In addition, DNA microarray experiments performed on soil polluted by organic pollutants without prior sequence assumptions demonstrate high specificity and sensitivity for gene detection. Metabolic Design is thus a powerful, efficient tool that can be used to design explorative probes and monitor metabolic pathways in complex environments.
Full Text Available Abstract Background Large genomes contain families of highly similar genes that cannot be individually identified by microarray probes. This limitation is due to thermodynamic restrictions and cannot be resolved by any computational method. Since gene annotations are updated more frequently than microarrays, another common issue facing microarray users is that existing microarrays must be routinely reanalyzed to determine probes that are still useful with respect to the updated annotations. Results PICKY 2.0 can design shared probes for sets of genes that cannot be individually identified using unique probes. PICKY 2.0 uses novel algorithms to track sharable regions among genes and to strictly distinguish them from other highly similar but nontarget regions during thermodynamic comparisons. Therefore, PICKY does not sacrifice the quality of shared probes when choosing them. The latest PICKY 2.1 includes the new capability to reanalyze existing microarray probes against updated gene sets to determine probes that are still valid to use. In addition, more precise nonlinear salt effect estimates and other improvements are added, making PICKY 2.1 more versatile to microarray users. Conclusions Shared probes allow expressed gene family members to be detected; this capability is generally more desirable than not knowing anything about these genes. Shared probes also enable the design of cross-genome microarrays, which facilitate multiple species identification in environmental samples. The new nonlinear salt effect calculation significantly increases the precision of probes at a lower buffer salt concentration, and the probe reanalysis function improves existing microarray result interpretations.
Microarrays offer biologists an exciting tool that allows the simultaneous assessment of gene expression levels for thousands of genes at once. At the time of their inception, microarrays were hailed as the new dawn in cancer biology and oncology practice with the hope that within a decade diseases
DNA microarray technology is a powerful functional genomics tool increasingly used for investigating global gene expression in environmental studies. Microarrays can also be used in identifying biological networks, as they give insight on the complex gene-to-gene interactions, ne...
Hal, van N.L.W.; Vorst, O.; Houwelingen, van A.M.M.L.; Kok, E.J.; Peijnenburg, A.A.C.M.; Aharoni, A.; Tunen, van A.J.; Keijer, J.
DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed.
Full Text Available Biosensors such as DNA microarrays and microchips are gaining an increasing importance in medicinal, forensic, and environmental analyses. Such devices are based on the detection of supramolecular interactions called hybridizations that occur between complementary oligonucleotides, one linked to a solid surface (the probe), and the other one to be analyzed (the target). This paper focuses on the improvements that hyperbranched and perfectly defined nanomolecules called dendrimers can provide to this methodology. Two main uses of dendrimers for such purposes have been described up to now: either the dendrimer is used as a linker between the solid surface and the probe oligonucleotide, or the dendrimer is used as a multilabeled entity linked to the target oligonucleotide. In the first case the dendrimer generally induces a higher loading of probes and an easier hybridization, due to moving the probe away from the solid phase. In the second case the high number of localized labels (generally fluorescent) induces an increased sensitivity, allowing the detection of small quantities of biological entities.
Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim
The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaign datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations, processed in the same way as the satellite products. Before accessing the data, any user has to sign the AMMA data and publication policy. This charter only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and usage for commercial applications. Some collaboration between data producers and users, and mention of the AMMA project in any publication, is also required. The AMMA database and the associated online tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access the data of both data centres through a single web portal. This website is composed of different modules: - Registration: forms to register, and to read and sign the data use charter when a user visits for the first time; - Data access interface: a friendly tool allowing the user to build a data extraction request by selecting various criteria like location, time, parameters... The request can
Chaudhry, M. Ahmad [Department of Medical Laboratory and Radiation Sciences, College of Nursing and Health Sciences, University of Vermont, 302 Rowell Building, Burlington, VT 05405 (United States) and DNA Microarray Facility, University of Vermont, Burlington, VT 05405 (United States)]. E-mail: email@example.com
In cell populations exposed to ionizing radiation, the biological effects occur in a much larger proportion of cells than are estimated to be traversed by radiation. It has been suggested that irradiated cells are capable of providing signals to neighboring unirradiated cells, resulting in damage to these cells. This phenomenon is termed the bystander effect. The bystander effect induces persistent, long-term, transmissible changes that result in delayed death and neoplastic transformation. Because the bystander effect is relevant to carcinogenesis, it could have significant implications for risk estimation for radiation exposure. The nature of the bystander effect signal and how it impacts the unirradiated cells remain to be elucidated. Examination of the changes in gene expression could provide clues to understanding the bystander effect and could define the signaling pathways involved in sustaining damage to these cells. Microarray technology serves as a tool to gain insight into the molecular pathways leading to the bystander effect. Using medium from irradiated normal human diploid lung fibroblasts as a model system, we examined gene expression alterations in bystander cells. The microarray data revealed that the radiation-induced gene expression profile in irradiated cells is different from that in unirradiated bystander cells, suggesting that the pathways leading to biological effects in the bystander cells are different from those in the directly irradiated cells. The genes known to be responsive to ionizing radiation were observed in irradiated cells. Several genes were upregulated in cells receiving media from irradiated cells. Surprisingly, no genes were found to be downregulated in these cells. A number of genes belonging to extracellular signaling, growth factors and several receptors were identified in bystander cells. Interestingly, 15 genes involved in cell communication processes were found to be upregulated. The induction of receptors and the cell
Singh, Anup K.; Throckmorton, Daniel J.; Moran-Mirabal, Jose C.; Edel, Joshua B.; Meyer, Grant D.; Craighead, Harold G.
We present the use of micron-sized lipid domains, patterned onto planar substrates and within microfluidic channels, to assay the binding of bacterial toxins via total internal reflection fluorescence microscopy (TIRFM). The lipid domains were patterned using a polymer lift-off technique and consisted of ganglioside-populated DSPC:cholesterol supported lipid bilayers (SLBs). Lipid patterns were formed on the substrates by vesicle fusion followed by polymer lift-off, which revealed micron-sized SLBs containing either ganglioside GT1b or GM1. The ganglioside-populated SLB arrays were then exposed to either Cholera toxin subunit B (CTB) or Tetanus toxin fragment C (TTC). Binding was assayed on planar substrates by TIRFM down to a concentration of 1 nM for CTB and 100 nM for TTC. Apparent binding constants extracted from three different models applied to the binding curves suggest that binding of a protein to a lipid-based receptor is strongly affected by the lipid composition of the SLB and by the substrate on which the bilayer is formed. Patterning of SLBs inside microfluidic channels also allowed the preparation of lipid domains with different compositions on a single device. Arrays within microfluidic channels were used to achieve segregation and selective binding from a binary mixture of the toxin fragments in one device. The binding and segregation within the microfluidic channels were assayed with epifluorescence as proof of concept. We propose that the method used for patterning the lipid microarrays on planar substrates and within microfluidic channels can be easily adapted to proteins or nucleic acids and can be used for biosensor applications and cell stimulation assays under different flow conditions. Keywords: microarray, ganglioside, polymer lift-off, cholera toxin, tetanus toxin, TIRFM, binding constant.
Wing, Louise; Massoud, Tarik F
Quantitative, qualitative, and innovative application of bibliometric research performance indicators to anatomy and radiology research and education can enhance cross-fertilization between the two disciplines. We aim to use these indicators to identify long-term trends in dissemination of publications in neuroimaging anatomy (including both productivity and citation rates), which has subjectively waned in prestige during recent years. We examined publications over the last 40 years in two neuroradiological journals, AJNR and Neuroradiology, and selected and categorized all neuroimaging anatomy research articles according to theme and type. We studied trends in their citation activity over time, and mathematically analyzed these trends for 1977, 1987, and 1997 publications. We created a novel metric, "citation half-life at 10 years postpublication" (CHL-10), and used this to examine trends in the skew of citation numbers for anatomy articles each year. We identified 367 anatomy articles amongst a total of 18,110 in these journals: 74.2% were original articles, with study of normal anatomy being the commonest theme (46.7%). We recorded a mean of 18.03 citations for each anatomy article, 35% higher than for general neuroradiology articles. Graphs summarizing the rise (upslope) in citation rates after publication revealed similar trends spanning two decades. CHL-10 trends demonstrated that more recently published anatomy articles were likely to take longer to reach peak citation rate. Bibliometric analysis suggests that anatomical research in neuroradiology is not languishing. This novel analytical approach can be applied to other aspects of neuroimaging research, and within other subspecialties in radiology and anatomy, and also to foster anatomical education. © 2014 Wiley Periodicals, Inc.
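The CHL-10 metric described in the abstract can be sketched computationally. The sketch below assumes a plausible reading of the metric (the earliest post-publication year by which an article accrued half of the citations it received in its first 10 years); the authors' exact definition may differ, and the function name and toy data are hypothetical.

```python
def chl10(citations_by_year):
    """Sketch of a 'citation half-life at 10 years post-publication':
    the earliest year (1-10) by which the article had received half
    of all citations accrued during its first 10 years."""
    first10 = citations_by_year[:10]
    half = sum(first10) / 2
    running = 0
    for year, count in enumerate(first10, start=1):
        running += count
        if running >= half:
            return year
    return None

# A late-peaking citation curve yields a larger CHL-10,
# matching the trend the abstract reports for recent articles.
early = [8, 6, 4, 3, 2, 1, 1, 1, 0, 0]   # peaks soon after publication
late  = [0, 1, 2, 3, 4, 5, 5, 4, 3, 2]   # takes longer to peak
print(chl10(early), chl10(late))
```

The comparison of the two toy curves illustrates why a larger half-life indicates delayed peak citation activity.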
Roy, Sashwati; Sen, Chandan K.
The cDNA microarray technology and related bioinformatics tools present a wide range of novel application opportunities. The technology may be productively applied to address food safety. In this mini-review article, we present an update highlighting late-breaking discoveries that demonstrate the vitality of cDNA microarray technology as a tool to analyze food safety with reference to microbial pathogens and genetically modified foods. In order to bring microarray technology to mainstream food safety, it is important to develop robust, user-friendly tools that may be applied in a field setting. In addition, there needs to be a standardized process for regulatory agencies to interpret and act upon microarray-based data. The cDNA microarray approach is an emergent technology in diagnostics. Its value lies in being able to provide complementary molecular insight when employed in addition to traditional tests for food safety, as part of a more comprehensive battery of tests.
Daniel L Roden
Full Text Available Complex human diseases can show significant heterogeneity between patients with the same phenotypic disorder. An outlier detection strategy was developed to identify variants at the level of gene transcription that are of potential biological and phenotypic importance. Here we describe a graphical software package, z-score outlier detection (ZODET), that enables identification and visualisation of gross abnormalities in gene expression (outliers) in individuals, using whole genome microarray data. The mean and standard deviation of expression in a healthy control cohort are used to detect both over- and under-expressed probes in individual test subjects. We compared the potential of ZODET to detect outlier genes in gene expression datasets with a previously described statistical method, gene tissue index (GTI), using a simulated expression dataset and a publicly available monocyte-derived macrophage microarray dataset. Taken together, these results support ZODET as a novel approach to identify outlier genes of potential pathogenic relevance in complex human diseases. The algorithm is implemented using R packages and Java. The software is freely available from http://www.ucl.ac.uk/medicine/molecular-medicine/publications/microarray-outlier-analysis.
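The core z-score test that the abstract describes (control-cohort mean and standard deviation, flagging both over- and under-expression) can be sketched in a few lines. This is a minimal illustration, not ZODET's actual implementation or API; the function name and toy data are hypothetical.

```python
import statistics

def zscore_outliers(control, subject, threshold=3.0):
    """Flag probes whose expression in a single test subject deviates
    from a healthy control cohort by more than `threshold` standard
    deviations, capturing both over- and under-expression.
    `control` maps probe -> list of control values; `subject` maps
    probe -> the test subject's expression value."""
    outliers = {}
    for probe, value in subject.items():
        cohort = control.get(probe)
        if not cohort or len(cohort) < 2:
            continue  # need at least two controls for a stdev
        mu = statistics.mean(cohort)
        sd = statistics.stdev(cohort)
        if sd == 0:
            continue
        z = (value - mu) / sd
        if abs(z) > threshold:
            outliers[probe] = round(z, 2)
    return outliers

# Toy data: GENE2 is grossly over-expressed relative to controls.
controls = {"GENE1": [10.0, 10.2, 9.8, 10.1],
            "GENE2": [5.0, 5.1, 4.9, 5.0]}
subject = {"GENE1": 10.05, "GENE2": 7.5}
print(zscore_outliers(controls, subject))  # only GENE2 is flagged
```

Tight control cohorts make even modest absolute changes register as extreme z-scores, which is why such methods are sensitive to the choice of control population.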
Pedersen, Henriette Lodberg; Fangel, Jonatan Ulrik; McCleary, Barry
Microarrays are powerful tools for high throughput analysis, and hundreds or thousands of molecular interactions can be assessed simultaneously using very small amounts of analytes. Nucleotide microarrays are well established in plant research, but carbohydrate microarrays are much less establish...
Jin, S J; Liu, M; Long, W J; Luo, X P
Objective: To explore the clinical phenotypes and the genetic cause in a boy with unexplained growth retardation, nephrocalcinosis, auditory anomalies and multi-organ/system developmental disorders. Method: Routine G-banding and chromosome microarray analysis were applied to a child with unexplained growth retardation, nephrocalcinosis, auditory anomalies and multi-organ/system developmental disorders, treated in the Department of Pediatrics of Tongji Hospital Affiliated to Tongji Medical College of Huazhong University of Science and Technology in September 2015, and to his parents, to conduct chromosomal karyotype analysis and whole genome scanning. Deleted genes were searched in the Decipher and NCBI databases, and their relationships with the clinical phenotypes were analyzed. Result: A six-month-old boy was referred to us because of unexplained growth retardation and feeding intolerance. The affected child presented with abnormal manifestations such as characteristic facial features, umbilical hernia, growth retardation, hypothyroidism, congenital heart disease, right ear sensorineural deafness, hypercalcemia and nephrocalcinosis. The child's karyotype was 46,XY,16qh+, and his parents' karyotypes were normal. Chromosome microarray analysis revealed a 1,436-kb deletion in the 7q11.23 (72701098_74136633) region of the child. This region includes 23 protein-coding genes, which have been reported to correspond to Williams-Beuren syndrome and certain of its clinical phenotypes. His parents' results of chromosome microarray analysis were normal. Conclusion: A boy with characteristic manifestations of Williams-Beuren syndrome and rare nephrocalcinosis was diagnosed using chromosome microarray analysis. The deletion in 7q11.23 might be related to the clinical phenotypes of Williams-Beuren syndrome, yet further studies are needed.
Full Text Available Abstract Background High-throughput RNAi screening is widely applied in biological research, but remains expensive and infrastructure-intensive, and conversion of many assays to HTS applications in microplate format is not feasible. Results Here, we describe the optimization of a miniaturized cell spot microarray (CSMA) method, which facilitates utilization of the transfection microarray technique for disparate RNAi analyses. To promote rapid adaptation of the method, the concept has been tested with a panel of 92 adherent cell types, including primary human cells. We demonstrate the method in the systematic screening of 492 GPCR-coding genes for impact on growth and survival of cultured human prostate cancer cells. Conclusions The CSMA method facilitates reproducible preparation of highly parallel cell microarrays for large-scale gene knockdown analyses. This will be critical toward expanding cell-based functional genetic screens to include more RNAi constructs, and to allow combinatorial RNAi analyses, multi-parametric phenotypic readouts or comparative analysis of many different cell types.
Severgnini, Marco; Bicciato, Silvio; Mangano, Eleonora; Scarlatti, Francesca; Mezzelani, Alessandra; Mattioli, Michela; Ghidoni, Riccardo; Peano, Clelia; Bonnal, Raoul; Viti, Federica; Milanesi, Luciano; De Bellis, Gianluca; Battaglia, Cristina
Meta-analysis of microarray data is increasingly important, considering both the availability of multiple platforms using disparate technologies and the accumulation in public repositories of data sets from different laboratories. We addressed the issue of comparing gene expression profiles from two microarray platforms by devising a standardized investigative strategy. We tested this procedure by studying MDA-MB-231 cells, which undergo apoptosis on treatment with resveratrol. Gene expression profiles were obtained using high-density, short-oligonucleotide, single-color microarray platforms: GeneChip (Affymetrix) and CodeLink (Amersham). Interplatform analyses were carried out on 8414 common transcripts represented on both platforms, as identified by LocusLink ID, representing 70.8% and 88.6% of annotated GeneChip and CodeLink features, respectively. We identified 105 differentially expressed genes (DEGs) on CodeLink and 42 DEGs on GeneChip. Among them, only 9 DEGs were commonly identified by both platforms. Multiple analyses (BLAST alignment of probes with target sequences, gene ontology, literature mining, and quantitative real-time PCR) permitted us to investigate the factors contributing to the generation of platform-dependent results in single-color microarray experiments. An effective approach to cross-platform comparison involves microarrays of similar technologies, samples prepared by identical methods, and a standardized battery of bioinformatic and statistical analyses.
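The platform comparison in the abstract reduces to set operations: restrict each platform's probes to the transcripts shared under a common identifier (LocusLink ID in the study), then intersect the differentially expressed gene (DEG) calls. A minimal sketch of that logic, with hypothetical function name and toy gene IDs:

```python
def cross_platform_overlap(platform_a, platform_b, degs_a, degs_b):
    """Compare DEG calls from two microarray platforms on their
    shared transcripts, matched by a common gene identifier."""
    common = set(platform_a) & set(platform_b)
    degs_common_a = set(degs_a) & common
    degs_common_b = set(degs_b) & common
    return {
        "common_transcripts": len(common),
        "degs_a_on_common": len(degs_common_a),
        "degs_b_on_common": len(degs_common_b),
        "concordant_degs": sorted(degs_common_a & degs_common_b),
    }

# Toy example: only TP53 is called a DEG by both platforms.
a_probes = {"TP53", "BAX", "BCL2", "MYC"}
b_probes = {"TP53", "BAX", "CASP3", "MYC"}
result = cross_platform_overlap(a_probes, b_probes,
                                degs_a={"TP53", "BCL2"},
                                degs_b={"TP53", "MYC"})
print(result)
```

As in the study (9 concordant DEGs out of 105 and 42), the concordant set can be much smaller than either platform's DEG list, which motivates the probe-alignment and annotation checks the authors performed.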
Formal approaches to the semantics of databases and database languages can have immediate and practical consequences in extending database integration technologies to include a vastly greater range...
Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. The web-based interface gives researchers different query options for mining the database
Sigurdsson, J.A.; Getz, L.; Sjonell, G.
…0.78 to 0.92]. Our calculations are based on the World Health Organization and national databanks on death causes (ICD-10) and the mid-year number of inhabitants in the target group. For Finland, Denmark, Norway and Sweden, we used data for 2009. For Iceland, due to the population's small size, we […] cardiovascular diseases and accidents, with some national variations. Conclusions and implications Establishment of a screening programme for CRC for people aged 55-74 can be expected to affect only a minor proportion of all premature deaths in the Nordic setting. From a public health perspective, prioritizing…
The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others, as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.
Full Text Available Abstract Background Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks. Results We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining. Conclusion ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.
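Ensemble feature selection of the kind the abstract describes can be illustrated by one simple aggregation scheme: combining the ranked feature lists of several selection algorithms by mean rank. This is a generic sketch, not ArrayMining.net's actual algorithm; the function name and gene labels are hypothetical.

```python
def ensemble_rank(rankings, top_n=5):
    """Aggregate several per-algorithm feature rankings (best first)
    into one consensus ranking by mean position. Features absent
    from a ranking are penalised with rank len(ranking) + 1."""
    features = set()
    for r in rankings:
        features.update(r)

    def mean_rank(f):
        total = 0
        for r in rankings:
            total += r.index(f) + 1 if f in r else len(r) + 1
        return total / len(rankings)

    return sorted(features, key=mean_rank)[:top_n]

# Three hypothetical selectors mostly agree that GENE_A matters:
r1 = ["GENE_A", "GENE_B", "GENE_C"]
r2 = ["GENE_A", "GENE_C", "GENE_B"]
r3 = ["GENE_B", "GENE_A", "GENE_D"]
print(ensemble_rank([r1, r2, r3], top_n=2))  # ['GENE_A', 'GENE_B']
```

Rank aggregation is one of several possible combination rules; voting or score averaging are common alternatives, and the choice affects how strongly a single dissenting algorithm can demote a feature.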
Hsia, Chu Chieh; Chizhikov, Vladimir E.; Yang, Amy X.; Selvapandiyan, Angamuthu; Hewlett, Indira; Duncan, Robert; Puri, Raj K.; Nakhasi, Hira L.; Kaplan, Gerardo G.
Hepatitis B virus (HBV), hepatitis C virus (HCV), and human immunodeficiency virus type-1 (HIV-1) are transfusion-transmitted human pathogens that have a major impact on blood safety and public health worldwide. We developed a microarray multiplex assay for the simultaneous detection and discrimination of these three viruses. The microarray consists of 16 oligonucleotide probes immobilized on a silylated glass slide. Amplicons from multiplex PCR were labeled with Cy-5 and hybridized to the microarray. The assay detected 1 International Unit (IU) of HBV, 10 IU of HCV, and 20 IU of HIV-1 in a single multiplex reaction. The assay also detected and discriminated the presence of two or three of these viruses in a single sample. Our data represent a proof of concept for the possible use of a highly sensitive multiplex microarray assay to screen for and confirm the presence of these viruses in blood donors and patients.
Full Text Available Advances in lithographic approaches to fabricating bio-microarrays have been extensively explored over the last two decades. However, the need for pattern flexibility, high density, high resolution, affordability and on-demand fabrication is promoting the development of unconventional routes for microarray fabrication. This review highlights the development and uses of a new molecular lithography approach, called "microintaglio printing technology", for large-scale bio-microarray fabrication using a microreactor array (µRA)-based chip consisting of uniformly-arranged, femtoliter-size µRA molds. In this method, a single-molecule-amplified DNA microarray pattern is self-assembled onto a µRA mold and subsequently converted into a messenger RNA or protein microarray pattern by simultaneously producing and transferring (immobilizing) a messenger RNA or a protein from the µRA mold to a glass surface. Microintaglio printing allows the self-assembly and patterning of in situ-synthesized biomolecules into high-density (kilo- to giga-density), ordered arrays on a chip surface with µm-order precision. This holistic aim, which is difficult to achieve using conventional printing and microarray approaches, is expected to revolutionize and reshape proteomics. This review is written not comprehensively, but rather substantively, highlighting the versatility of microintaglio printing as a prerequisite platform for microarray technology in the postgenomic era.
Currently there is an enormous number of geoscience databases. Unfortunately, the only users of the majority of these databases are their elaborators. There are several reasons for that: incompatibility, specificity of tasks and objects, and so on. However, the main obstacles to wide usage of geoscience databases are complexity for elaborators and complication for users. The complexity of architecture leads to high costs that block public access. The complication prevents users from understanding when and how to use the database. Only databases associated with GoogleMaps don't have these drawbacks, but they could hardly be called "geoscience". Nevertheless, an open and simple geoscience database is necessary at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface to work with it, now accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) at a station with a certain position, associated with metadata: the date when the result was obtained; the type of station (lake, soil, etc.); the contributor that sent the result. Each contributor has their own profile, which allows the reliability of the data to be estimated. The results can be represented on a GoogleMaps space image as a point at a certain position, coloured according to the value of the parameter. There are default colour scales, and each registered user can create their own scale. The results can also be extracted as a *.csv file. For both types of representation one can select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Latitude (dd.dddddd); Longitude (ddd.dddddd); Station type; Parameter type; Parameter value; Date (yyyy-mm-dd). The contributor is recognised on login. This is the minimal set of features required to connect a value of a parameter with a position and see the results. All the complicated data
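The semicolon-separated upload format the abstract describes is straightforward to parse. A minimal sketch, with hypothetical sample rows (station names and values are invented for illustration):

```python
import csv
import io

# Hypothetical sample rows in the upload format described above:
# Name; Latitude; Longitude; Station type; Parameter type; Value; Date
raw = """Lake station 1;60.841500;31.492300;lake;pH;7.40;2012-06-15
Soil pit 3;59.938900;30.315400;soil;pH;5.80;2012-07-02"""

records = []
for row in csv.reader(io.StringIO(raw), delimiter=";"):
    name, lat, lon, stype, ptype, value, date = row
    records.append({
        "name": name,
        "lat": float(lat),
        "lon": float(lon),
        "station_type": stype,
        "parameter": ptype,
        "value": float(value),
        "date": date,
    })

# Select by object type, as the web interface allows:
lakes = [r for r in records if r["station_type"] == "lake"]
print(lakes[0]["value"])  # 7.4
```

Each parsed record carries exactly the position, value and metadata needed to plot a coloured point on the map view.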
James Anthony A
Full Text Available Abstract Background Aedes aegypti is the principal vector of dengue and yellow fever viruses. The availability of the sequenced and annotated genome enables genome-wide analyses of gene expression in this mosquito. The large amount of data resulting from these analyses requires efficient cataloguing before it becomes useful as the basis for new insights into gene expression patterns and studies of the underlying molecular mechanisms for generating these patterns. Findings We provide a publicly-accessible database and data-mining tool, aeGEPUCI, that integrates (1) microarray analyses of sex- and stage-specific gene expression in Ae. aegypti, (2) functional gene annotation, (3) genomic sequence data, and (4) computational sequence analysis tools. The database can be used to identify genes expressed in particular stages and patterns of interest, and to analyze putative cis-regulatory elements (CREs) that may play a role in coordinating these patterns. The database is accessible from the address http://www.aegep.bio.uci.edu. Conclusions The combination of gene expression, function and sequence data coupled with integrated sequence analysis tools allows for identification of expression patterns and streamlines the development of CRE predictions and experiments to assess how patterns of expression are coordinated at the molecular level.
Tsou, Ann-Ping; Sun, Yi-Ming; Liu, Chia-Lin; Huang, Hsien-Da; Horng, Jorng-Tzong; Tsai, Meng-Feng; Liu, Baw-Juine
Identification of transcriptional regulatory sites plays an important role in the investigation of gene regulation. For this purpose, we designed and implemented a data warehouse to integrate multiple heterogeneous biological data sources with data types such as text file, XML, image, MySQL database model, and Oracle database model. The utility of the biological data warehouse in predicting transcriptional regulatory sites of coregulated genes was explored using a synexpression group derived from a microarray study. Both the binding sites of known transcription factors and predicted over-represented (OR) oligonucleotides were demonstrated for the gene group. The potential biological roles of both the known oligonucleotides and one OR oligonucleotide were demonstrated using bioassays. The results from the wet-lab experiments therefore reinforce the power and utility of the data warehouse as an approach to the genome-wide search for important transcription regulatory elements that are the key to many complex biological systems.
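Over-represented oligonucleotide prediction, as used in the study above, amounts to comparing k-mer frequencies in coregulated promoter sequences against background sequences. The sketch below is a deliberately naive stand-in (the study's statistical model is certainly more sophisticated); function name, pseudocount scheme and toy sequences are hypothetical.

```python
from collections import Counter

def overrepresented_kmers(promoters, background, k=6, ratio=2.0):
    """Naively flag k-mers (oligonucleotides) that occur at least
    `ratio` times more frequently in a set of coregulated promoter
    sequences than in background sequences."""
    def kmer_freq(seqs):
        counts = Counter()
        for s in seqs:
            for i in range(len(s) - k + 1):
                counts[s[i:i + k]] += 1
        return counts, sum(counts.values()) or 1

    fg, fg_total = kmer_freq(promoters)
    bg, bg_total = kmer_freq(background)
    hits = {}
    for kmer, c in fg.items():
        fg_rate = c / fg_total
        bg_rate = (bg.get(kmer, 0) + 1) / (bg_total + 1)  # pseudocount
        if fg_rate / bg_rate >= ratio:
            hits[kmer] = round(fg_rate / bg_rate, 2)
    return hits

# Toy example: the motif GATAAG recurs in both promoters but not
# in the background, so it is flagged as over-represented.
promoters = ["AAGATAAGCC", "TTGATAAGGT"]
background = ["CCCCCCCCCC", "GGGGGGGGGG"]
print(overrepresented_kmers(promoters, background))
```

Real OR-oligonucleotide searches add significance testing and strand/background modelling; the point here is only the frequency-contrast structure of the computation.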
Walter Teixeira Lima Junior
Full Text Available The paper presents the preliminary results of the applied research project Connected Social Media Observatory (Observatório de Mídias Sociais Conectadas), called Neofluxo. Approved under a call of the National Council for Scientific and Technological Development (CNPq), the two-year project runs until June 2012. Its main objective is to identify the behavior of informational flow in social networks during the 2010 majority electoral process in Brazil, and to demonstrate the possibility of producing journalism through the intersection and visualization of data using APIs. The project stored more than 20.2 million mentions of candidates and of keywords defined by the researchers. For this, a dedicated computer program was developed, based on open-source software, capable of tracking Twitter users' posts by keyword, collecting them and storing them in a database. Neofluxo also recorded data from the official social networks of the candidates José Serra, Dilma Rousseff and Marina Silva, in order to identify, from these starting points, the informational flows until they reached Twitter.
van Hal, N L; Vorst, O; van Houwelingen, A M; Kok, E J; Peijnenburg, A; Aharoni, A; van Tunen, A J; Keijer, J
DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed; these comprise array manufacturing and design, array hybridisation, scanning, and data handling. Furthermore, it is discussed how DNA microarrays can be applied in the fields of safety, functionality and health of food, and of gene discovery and pathway engineering in plants.
Shen, Lishuang; Gong, Jian; Caldo, Rico A.; Nettleton, Dan; Cook, Dianne; Wise, Roger P.; Dickerson, Julie A.
BarleyBase (BB) (www.barleybase.org) is an online database for plant microarrays with integrated tools for data visualization and statistical analysis. BB houses raw and normalized expression data from the two publicly available Affymetrix genome arrays, Barley1 and Arabidopsis ATH1 with plans to include the new Affymetrix 61K wheat, maize, soybean and rice arrays, as they become available. BB contains a broad set of query and display options at all data levels, ranging from experiments to individual hybridizations to probe sets down to individual probes. Users can perform cross-experiment queries on probe sets based on observed expression profiles and/or based on known biological information. Probe set queries are integrated with visualization and analysis tools such as the R statistical toolbox, data filters and a large variety of plot types. Controlled vocabularies for gene and plant ontologies, as well as interconnecting links to physical or genetic map and other genomic data in PlantGDB, Gramene and GrainGenes, allow users to perform EST alignments and gene function prediction using Barley1 exemplar sequences, thus, enhancing cross-species comparison. PMID:15608273
Cotton, R G H; Auerbach, A D; Brown, A F; Carrera, P; Christodoulou, J; Claustres, M; Compton, J; Cox, D W; De Baere, E; den Dunnen, J T; Greenblatt, M; Fujiwara, M; Hilbert, P; Jani, A; Lehvaslaiho, H; Nebert, D W; Verma, I; Vihinen, M
Researchers and clinicians ideally need instant access to all the variation in their gene/locus of interest to conduct their research and genetic healthcare efficiently and to the highest standards. Currently much key data resides in laboratory books or patient records around the world, as there are many impediments to submitting such data. It would therefore be ideal if a semiautomated pathway were available to make the deidentified data publicly available for others to use, with a minimum of effort. The Human Variome Project (HVP) meeting listed 96 recommendations to work toward this situation. This article initiates a strategy to enhance the collection of phenotype and genotype data from the clinician/diagnostic laboratory nexus. Thus, the aim is to develop universally applicable forms that people can use when investigating patients for each inherited disease, to assist in satisfying many of the recommendations of the HVP Meeting [Cotton et al., 2007]. We call for comment and collaboration in this article. Copyright 2007 Wiley-Liss, Inc.
Ramya S Vokuda
Full Text Available In this era of modern revolutionisation in the field of medical laboratory technology, everyone is aiming at taking innovations from the laboratory to the bedside. One such technique that is most relevant to the pathology community is Tissue Microarray (TMA) technology. It is becoming quite popular amongst all members of this family, right from laboratory scientists to clinicians and residents to technologists. The popularity of this technique is attributed to its cost effectiveness and time-saving protocols. Though every technique is accompanied by disadvantages, the benefits outnumber them. This technique is very versatile, as many downstream molecular assays such as immunohistochemistry, cytogenetic studies, Fluorescent In situ Hybridisation (FISH), etc., can be carried out on a single slide with multiple samples. It is a very practical approach that effectively aids the identification of novel biomarkers in cancer diagnostics and therapeutics. It helps in assessing molecular markers on a large scale very quickly. Also, quality assurance protocols in pathology laboratories have exploited TMA to a great extent. However, the application of TMA technology extends beyond oncology. This review focuses on different aspects of this technology, such as construction of TMAs, instrumentation, types, advantages and disadvantages, and utilisation of the technique in various disease conditions.
Mello, Rafael Barrios; Silva, Maria Regina Regis; Alves, Maria Teresa Seixas; Evison, Martin Paul; Guimarães, Marco Aurelio; Francisco, Rafaella Arrabaca; Astolphi, Rafael Dias; Iwamura, Edna Sadayo Miazato
Taphonomic processes affecting bone post mortem are important in forensic, archaeological and palaeontological investigations. In this study, the application of tissue microarray (TMA) analysis to a sample of femoral bone specimens from 20 exhumed individuals of known period of burial and age at death is described. TMA allows multiplexing of subsamples, permitting standardized comparative analysis of adjacent sections in 3-D and of representative cross-sections of a large number of specimens. Standard hematoxylin and eosin, periodic acid-Schiff and silver methenamine, and picrosirius red staining, and CD31 and CD34 immunohistochemistry were applied to TMA sections. Osteocyte and osteocyte lacuna counts, percent bone matrix loss, and fungal spheroid element counts could be measured and collagen fibre bundles observed in all specimens. Decalcification with 7% nitric acid proceeded more rapidly than with 0.5 M EDTA and may offer better preservation of histological and cellular structure. No endothelial cells could be detected using CD31 and CD34 immunohistochemistry. Correlation between osteocytes per lacuna and age at death may reflect reported age-related responses to microdamage. Methodological limitations and caveats, and results of the TMA analysis of post mortem diagenesis in bone are discussed, and implications for DNA survival and recovery considered.
Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]
The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates. Citations in this report are divided into the following topics: thermophysical properties; materials compatibility; lubricants and tribology; application data; safety; test and analysis methods; impacts; regulatory actions; substitute refrigerants; identification; absorption and adsorption; research programs; and miscellaneous documents. Information is also presented on ordering instructions for the computerized version.
Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]
The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.
Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]
The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.
The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.
Hu, Wenchao; Liu, Yuting; Yan, Jun
Alternative polyadenylation (APA) is a post-transcriptional mechanism to generate diverse mRNA transcripts with different 3′UTRs from the same gene. In this study, we systematically searched for APA events with differential expression in public mouse microarray data. Hundreds of genes with over-represented differential APA events, and the corresponding experiments, were identified. We further revealed that global APA differential expression occurred prevalently in tissues such as brain compared with peripheral tissues, and in biological processes such as development, differentiation and immune responses. Interestingly, we also observed widespread differential APA events in RNA-binding protein (RBP) genes such as Rbm3, Eif4e2 and Elavl1. Given that RBPs are considered the main regulators of differential APA expression, we constructed a co-expression network between APAs and RBPs using the microarray data. Further incorporation of CLIP-seq data for selected RBPs showed that Nova2 represses and Mbnl1 promotes the polyadenylation of the closest poly(A) sites, respectively. Altogether, our study is the first microarray meta-analysis in a mammal on the regulation of APA by RBPs that integrated massive mRNA expression data under a wide range of biological conditions. Finally, we present our results as a comprehensive resource on a website for the research community. PMID:24622240
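The co-expression network described above links APA events to RBPs by correlating their expression profiles across experiments. A minimal illustrative sketch of that idea is below; the gene names come from the abstract, but the correlation-threshold approach and the cutoff value are our simplifying assumptions, not necessarily the authors' method:

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(apa_profiles, rbp_profiles, cutoff=0.8):
    """Edges linking an APA event to an RBP when their expression
    profiles across the same arrays correlate with |r| >= cutoff."""
    return [(a, r) for a, pa in apa_profiles.items()
                   for r, pr in rbp_profiles.items()
                   if abs(pearson(pa, pr)) >= cutoff]

# hypothetical profiles over four arrays
edges = coexpression_edges({'apa1': [1, 2, 3, 4]},
                           {'Nova2': [2, 4, 6, 8], 'Mbnl1': [1, 1, 2, 1]})
```

With these toy profiles only the perfectly correlated Nova2 profile survives the cutoff.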
Full Text Available Abstract Background There are several isolated tools for partial analysis of microarray expression data. To provide an integrative, easy-to-use and automated toolkit for the analysis of Affymetrix microarray expression data we have developed Array2BIO, an application that couples several analytical methods into a single web based utility. Results Array2BIO converts raw intensities into probe expression values, automatically maps those to genes, and subsequently identifies groups of co-expressed genes using two complementary approaches: (1) comparative analysis of signal versus control and (2) clustering analysis of gene expression across different conditions. The identified genes are assigned to functional categories based on Gene Ontology classification and KEGG protein interaction pathways. Array2BIO reliably handles low-expressor genes and provides a set of statistical methods for quantifying expression levels, including Benjamini-Hochberg and Bonferroni multiple testing corrections. An automated interface with the ECR Browser provides evolutionary conservation analysis for the identified gene loci while the interconnection with Crème allows prediction of gene regulatory elements that underlie observed expression patterns. Conclusion We have developed Array2BIO, a web based tool for rapid comprehensive analysis of Affymetrix microarray expression data, which also allows users to link expression data to Dcode.org comparative genomics tools and integrates a system for translating co-expression data into mechanisms of gene co-regulation. Array2BIO is publicly available at http://array2bio.dcode.org.
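The Benjamini-Hochberg correction mentioned above controls the false discovery rate over many simultaneous gene-level tests. A minimal sketch of the standard step-up procedure (an illustrative implementation, not Array2BIO's actual code):

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject every hypothesis whose
    p-value is at or below that cutoff.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = None
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff = pvals[i]
    if cutoff is None:
        return [False] * m
    return [p <= cutoff for p in pvals]
```

For example, `benjamini_hochberg([0.01, 0.02, 0.03, 0.5])` rejects the first three hypotheses: even 0.03 survives because the third-smallest threshold is (3/4)·0.05 = 0.0375.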
Introduction to Database Systems: Functions of a Database; Database Management System; Database Components; Database Development Process. Conceptual Design and Data Modeling: Introduction to Database Design Process; Understanding Business Process; Entity-Relationship Data Model; Representing Business Process with Entity-Relationship Model. Table Structure and Normalization: Introduction to Tables; Table Normalization. Transforming Data Models to Relational Databases: DBMS Selection; Transforming Data Models to Relational Databases; Enforcing Constraints; Creating Database for Business Process. Physical Design and Database
DNA/RNA and protein microarrays have proven their outstanding bioanalytical performance throughout the past decades, given the unprecedented level of parallelization by which molecular recognition assays can be performed and analyzed. Cell microarrays (CMAs) make use of similar construction principles. They are applied to profile a given cell population with respect to the expression of specific molecular markers and also to measure functional cell responses to drugs and chemicals. This review focuses on the use of cell-based microarrays for assessing the cytotoxicity of drugs, toxins, or chemicals in general. It also summarizes CMA construction principles with respect to the cell types that are used for such microarrays, the readout parameters to assess toxicity, and the various formats that have been established and applied. The review ends with a critical comparison of CMAs and well-established microtiter plate (MTP) approaches.
Tanackovic, Vanja; Rydahl, Maja Gro; Pedersen, Henriette Lodberg
In this study we introduce the starch-recognising carbohydrate binding module family 20 (CBM20) from Aspergillus niger for screening biological variations in starch molecular structure using high throughput carbohydrate microarray technology. Defined linear, branched and phosphorylated...
黄承志; 李原芳; 黄新华; 范美坤
The microarray of DNA probes with 5′-NH2 and 5′-Tex/3′-NH2 modified terminus on 10 μm carboxylate functional beads surface in the presence of 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC) is characterized in the present paper. It was found that the microarray capacity of DNA probes on the beads surface depends on the pH of the aqueous solution, the concentration of DNA probe and the total surface area of the beads. Under optimal conditions, the minimum distance of 20 mer single-stranded DNA probes microarrayed on the beads surface is about 14 nm, while that of 20 mer double-stranded DNA probes is about 27 nm. If the probe length increases from 20 mer to 35 mer, its microarray density decreases correspondingly. Mechanism study shows that the binding mode of DNA probes on the beads surface is nearly parallel to the beads surface.
The microarray of DNA probes with 5′-NH2 and 5′-Tex/3′-NH2 modified terminus on 10 μm carboxylate functional beads surface in the presence of 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC) is characterized in the present paper. It was found that the microarray capacity of DNA probes on the beads surface depends on the pH of the aqueous solution, the concentration of DNA probe and the total surface area of the beads. Under optimal conditions, the minimum distance of 20 mer single-stranded DNA probes microarrayed on the beads surface is about 14 nm, while that of 20 mer double-stranded DNA probes is about 27 nm. If the probe length increases from 20 mer to 35 mer, its microarray density decreases correspondingly. Mechanism study shows that the binding mode of DNA probes on the beads surface is nearly parallel to the beads surface.
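The reported minimum probe-to-probe distances imply an upper bound on how many probes a single bead can carry. A back-of-envelope sketch of that arithmetic follows; the spherical-bead and square-footprint assumptions are ours, for illustration only, not the authors' model:

```python
import math

def probes_per_bead(bead_diameter_nm, probe_spacing_nm):
    """Upper-bound probe count: sphere surface area divided by the square
    footprint implied by the minimum probe-to-probe distance."""
    r = bead_diameter_nm / 2
    surface_area = 4 * math.pi * r ** 2      # bead surface, in nm^2
    return surface_area / probe_spacing_nm ** 2

# 10 μm bead, 14 nm spacing (20 mer single-stranded probes): ~1.6 million probes
ss = probes_per_bead(10_000, 14)
# 27 nm spacing (double-stranded probes) packs ~(27/14)^2, i.e. ~3.7x, fewer
ds = probes_per_bead(10_000, 27)
```

The capacity scales with the inverse square of the spacing, which is why the double-stranded probes pack so much more sparsely.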
Conclusion: The microarray method provides a more accurate and rapid diagnostic tool for bacterial meningitis compared to traditional culture methods. Clinical application of this new technique may reduce the potential risk of delay in treatment.
Wang, Yuedong; Ma, Yanyuan; Carroll, Raymond J.
Microarrays are one of the most widely used high throughput technologies. One of the main problems in the area is that conventional estimates of the variances that are required in the t-statistic and other statistics are unreliable owing
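The unreliability of per-gene variance estimates noted above stems from the small number of replicate arrays. A common remedy, sketched here in deliberately simplified form, shrinks each gene's sample variance toward the across-gene average before forming a t-like statistic; the fixed shrinkage weight `w` is our illustrative assumption, not a quantity from the paper:

```python
import math

def moderated_t(mean_diffs, per_gene_vars, n, w=0.5):
    """t-like statistics with per-gene variances shrunk toward their
    across-gene mean: shrunk_v = w * mean(v) + (1 - w) * v_g.

    Assumes two groups of n arrays each, sharing variance v_g per gene.
    Shrinkage keeps genes with accidentally tiny sample variance from
    producing wildly inflated t values.
    """
    prior = sum(per_gene_vars) / len(per_gene_vars)
    return [d / math.sqrt(2 * (w * prior + (1 - w) * v) / n)
            for d, v in zip(mean_diffs, per_gene_vars)]
```

With two genes of equal effect size but variances 0.01 and 1.0, the shrunk statistic damps the first gene's t far below its raw value while preserving the ranking.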
The authors developed a novel macro and nanoporous silicon surface for protein microarrays to facilitate high-throughput biomarker discovery, and high-density protein-chip array analyses of complex biological samples...
Full Text Available Abstract Background The aim of this paper was to describe and compare the methods used and the results obtained by the participants in a joint EADGENE (European Animal Disease Genomic Network of Excellence) and SABRE (Cutting Edge Genomics for Sustainable Animal Breeding) workshop focusing on post-analysis of microarray data. The participating groups were provided with identical lists of microarray probes, including test statistics for three different contrasts, and the normalised log-ratios for each array, to be used as the starting point for interpreting the affected probes. The data originated from a microarray experiment conducted to study the host reactions in broilers occurring shortly after a secondary challenge with either a homologous or heterologous species of Eimeria. Results Several conceptually different analytical approaches, using both commercial and publicly available software, were applied by the participating groups. The following tools were used: Ingenuity Pathway Analysis, MAPPFinder, LIMMA, GOstats, GOEAST, GOTM, Globaltest, TopGO, ArrayUnlock, Pathway Studio, GIST and AnnotationDbi. The main focus of the approaches was to utilise the relation between probes/genes and their gene ontology and pathways to interpret the affected probes/genes. The lack of a well-annotated chicken genome did, however, limit the possibilities to fully explore the tools. The main results from these analyses showed that the biological interpretation is highly dependent on the statistical method used but that some common biological conclusions could be reached. Conclusion It is highly recommended to test different analytical methods on the same data set and compare the results to obtain a reliable biological interpretation of the affected genes in a DNA microarray experiment.
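Several of the tools named above (GOstats, GOEAST, TopGO) test Gene Ontology terms for over-representation among affected genes, typically with a hypergeometric test. A minimal sketch of that core calculation, as a generic illustration rather than any one tool's implementation:

```python
import math

def enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): the chance of seeing
    at least k GO-term-annotated genes among n affected genes drawn
    from a genome of N genes, of which K carry the term."""
    def pmf(x):
        # math.comb returns 0 when the lower argument exceeds the upper,
        # so impossible configurations contribute nothing
        return math.comb(K, x) * math.comb(N - K, n - x) / math.comb(N, n)
    return sum(pmf(x) for x in range(k, min(K, n) + 1))
```

For instance, drawing all 5 annotated genes out of a 10-gene universe in a 5-gene hit list has probability 1/252, i.e. about 0.004.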
Tsoi, Lam C; Qin, Tingting; Slate, Elizabeth H; Zheng, W Jim
To utilize the large volume of gene expression information generated from different microarray experiments, several meta-analysis techniques have been developed. Despite these efforts, there remain significant challenges to effectively increasing the statistical power and decreasing the Type I error rate while pooling heterogeneous datasets from public resources. The objective of this study is to develop a novel meta-analysis approach, Consistent Differential Expression Pattern (CDEP), to identify genes with common differential expression patterns across different datasets. We combined False Discovery Rate (FDR) estimation and the non-parametric RankProd approach to estimate the Type I error rate in each microarray dataset of the meta-analysis. These Type I error rates from all datasets were then used to identify genes with common differential expression patterns. Our simulation study showed that CDEP achieved higher statistical power and maintained a low Type I error rate when compared with two recently proposed meta-analysis approaches. We applied CDEP to analyze microarray data from different laboratories that compared transcription profiles between metastatic and primary cancer of different types. Many genes identified as differentially expressed consistently across different cancer types are in pathways related to metastatic behavior, such as ECM-receptor interaction, focal adhesion, and blood vessel development. We also identified novel genes such as AMIGO2, Gem, and CXCL11 that have not been shown to associate with, but may play roles in, metastasis. CDEP is a flexible approach that borrows information from each dataset in a meta-analysis in order to identify genes that are consistently differentially expressed. We have shown that CDEP can gain higher statistical power than other existing approaches under a variety of settings considered in the simulation study, suggesting its robustness and insensitivity to data variation commonly associated with microarray
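CDEP builds on the non-parametric RankProd idea: a gene that ranks near the top in every dataset gets a small rank product. A simplified sketch of that core statistic (our illustration of the general idea, not CDEP's full procedure, which also incorporates FDR estimation):

```python
import math

def rank_products(datasets):
    """Geometric-mean rank of each gene across datasets.

    `datasets` is a list of {gene: score} dicts; within each dataset,
    genes are ranked by decreasing score (rank 1 = strongest signal).
    Small values flag genes consistently near the top everywhere.
    """
    genes = set.intersection(*(set(d) for d in datasets))
    ranks = {g: [] for g in genes}
    for d in datasets:
        for r, g in enumerate(sorted(genes, key=lambda g: -d[g]), start=1):
            ranks[g].append(r)
    k = len(datasets)
    return {g: math.prod(rs) ** (1 / k) for g, rs in ranks.items()}
```

In practice the observed rank products are compared against a permutation null to attach p-values, which is where the per-dataset Type I error estimates enter.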
Full Text Available Abstract Background Obtaining reliable and reproducible two-color microarray gene expression data is critically important for understanding the biological significance of perturbations made on a cellular system. Microarray design, RNA preparation and labeling, hybridization conditions and data acquisition and analysis are variables difficult to simultaneously control. A useful tool for monitoring and controlling intra- and inter-experimental variation is Universal Reference RNA (URR), developed with the goal of providing hybridization signal at each microarray probe location (spot). Measuring signal at each spot as the ratio of experimental RNA to reference RNA targets, rather than relying on absolute signal intensity, decreases variability by normalizing signal output in any two-color hybridization experiment. Results Human, mouse and rat URR (UHRR, UMRR and URRR, respectively) were prepared from pools of RNA derived from individual cell lines representing different tissues. A variety of microarrays were used to determine the percentage of spots hybridizing with URR and producing signal above a user-defined threshold (microarray coverage). Microarray coverage was consistently greater than 80% for all arrays tested. We confirmed that individual cell lines contribute their own unique set of genes to URR, arguing for a pool of RNA from several cell lines as a better configuration for URR as opposed to a single cell line source for URR. Microarray coverage comparing two separately prepared batches each of UHRR, UMRR and URRR were highly correlated (Pearson's correlation coefficients of 0.97). Conclusion Results of this study demonstrate that large quantities of pooled RNA from individual cell lines are reproducibly prepared and possess diverse gene representation. This type of reference provides a standard for reducing variation in microarray experiments and allows more reliable comparison of gene expression data within and between experiments and
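Two-color data of the kind described above are analysed as per-spot ratios of experimental to reference signal, and "microarray coverage" is the fraction of spots where the reference hybridises above a threshold. A minimal sketch of both quantities; the threshold value and the toy intensities are arbitrary placeholders, not numbers from the study:

```python
import math

def log_ratios_and_coverage(sample, reference, threshold=100.0):
    """Per-spot log2(sample / reference), plus the fraction of spots
    whose reference-channel signal exceeds `threshold` (coverage)."""
    ratios = [math.log2(s / r) if r > 0 else float('nan')
              for s, r in zip(sample, reference)]
    coverage = sum(1 for r in reference if r > threshold) / len(reference)
    return ratios, coverage

# four spots; the reference hybridises above threshold on three of them
ratios, cov = log_ratios_and_coverage([200, 400, 50, 300], [120, 200, 10, 150])
```

Taking the ratio cancels spot-to-spot effects (print quality, local background) that act on both channels alike, which is the normalising benefit the abstract describes.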
Salehi-Reyhani, Ali; Burgin, Edward; Ces, Oscar; Willison, Keith R; Klug, David R
Addressable droplet microarrays are potentially attractive as a way to achieve miniaturised, reduced volume, high sensitivity analyses without the need to fabricate microfluidic devices or small volume chambers. We report a practical method for producing oil-encapsulated addressable droplet microarrays which can be used for such analyses. To demonstrate their utility, we undertake a series of single cell analyses, to determine the variation in copy number of p53 proteins in cells of a human cancer cell line.
Nicolaisen, Mogens; Nyskjold, Henriette; Bertaccini, Assunta
Detection and identification of phytoplasmas is a laborious process often involving nested PCR followed by restriction enzyme analysis and fine-resolution gel electrophoresis. To improve throughput, other methods are needed. Microarray technology offers a generic assay that can potentially detect and differentiate all types of phytoplasmas in one assay. The present protocol describes a microarray-based method for identification of phytoplasmas to the 16Sr group level.
Wullschleger, Stan D; Difazio, Stephen P
Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide high-throughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology were selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.
Stephen P. Difazio
Full Text Available Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide high-throughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology were selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.
Lodha, T D; Basak, J
Plant defense responses are mediated by elementary regulatory proteins that affect expression of thousands of genes. Over the last decade, microarray technology has played a key role in deciphering the underlying networks of gene regulation in plants that lead to a wide variety of defence responses. Microarray is an important tool to quantify and profile the expression of thousands of genes simultaneously, with two main aims: (1) gene discovery and (2) global expression profiling. Several microarray technologies are currently in use; most include a glass slide platform with spotted cDNA or oligonucleotides. To date, microarray technology has been used in the identification of regulatory genes and end-point defence genes, and to understand the signal transduction processes underlying disease resistance and its intimate links to other physiological pathways. Microarray technology can be used for in-depth, simultaneous profiling of host/pathogen genes as the disease progresses from infection to resistance/susceptibility at different developmental stages of the host, which can be done in different environments, for clearer understanding of the processes involved. A thorough knowledge of plant disease resistance, using a successful combination of microarray and other high throughput techniques, as well as biochemical, genetic, and cell biological experiments, is needed for practical application to secure and stabilize the yield of many crop plants. This review starts with a brief introduction to microarray technology, followed by the basics of plant-pathogen interaction, the use of DNA microarrays over the last decade to unravel the mysteries of plant-pathogen interaction, and ends with the future prospects of this technology.
Full Text Available Abstract Background Pseudomonas aeruginosa is an opportunistic pathogen which has the potential to become extremely harmful in the nosocomial environment, especially for cystic fibrosis (CF) patients, who are easily affected by chronic lung infections. For epidemiological purposes, discriminating P. aeruginosa isolates is a critical step, to define the distribution of clones among hospital departments, to predict occurring microevolution events and to correlate clones to their source. A collection of 182 P. aeruginosa clinical strains isolated within Italian hospitals from patients with chronic infections, i.e. cystic fibrosis (CF) patients, and with acute infections were genotyped. Molecular typing was performed with the ArrayTube (AT) multimarker microarray (Alere Technologies GmbH, Jena, Germany), a cost-effective, time-saving and standardized method, which addresses genes from both the core and accessory P. aeruginosa genome. Pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing (MLST) were employed as reference genotyping techniques to estimate the ArrayTube resolution power. Results 41 AT-genotypes were identified within our collection, among which 14 were novel and 27 had been previously described in publicly available AT-databases. Almost 30% of the genotypes belonged to a main cluster of clones. 4B9A, EC2A and 3C2A were mostly associated with CF patients, whereas F469, 2C1A and 6C22 with non-CF patients. An investigation of co-infection events revealed that almost 40% of CF patients were colonized by more than one genotype, whereas less than 4% of non-CF patients were. The presence of the exoU gene correlated with non-CF patients within the intensive care unit (ICU) whereas the pKLC102-like island appeared to be prevalent in the CF centre. The congruence between the ArrayTube typing and PFGE or MLST was 0.077 and 0.559 (Adjusted Rand coefficient), respectively. AT typing of this Italian collection could be easily integrated with the global P
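The congruence figures above (0.077 and 0.559) are Adjusted Rand coefficients between two typings of the same strains. A compact sketch of that computation from two label vectors, as a generic illustration of the statistic:

```python
import math
from collections import Counter

def adjusted_rand(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items:
    (index - expected) / (max_index - expected), counted over pairs."""
    n = len(labels_a)
    cont = Counter(zip(labels_a, labels_b))          # contingency table
    sum_ij = sum(math.comb(c, 2) for c in cont.values())
    sum_a = sum(math.comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(math.comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / math.comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:                        # degenerate partitions
        return 1.0
    return (sum_ij - expected) / (max_index - expected)
```

The index is 1 for identical partitions (up to relabelling) and hovers around 0 for independent ones, so 0.077 indicates near-random agreement between AT typing and PFGE while 0.559 indicates substantial agreement with MLST.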
Full Text Available Abstract Background Veterinary drugs such as clenbuterol (CL) and sulfamethazine (SM2) are low molecular weight (… Results The artificial antigens were spotted on microarray slides. Standard concentrations of the compounds were added to compete with the spotted antigens for binding to the antisera to determine the IC50. Our microarray assay showed IC50 values of 39.6 ng/ml for CL and 48.8 ng/ml for SM2, while the traditional competitive indirect ELISA (ci-ELISA) showed IC50 values of 190.7 ng/ml for CL and 156.7 ng/ml for SM2. We further validated the two methods with CL-fortified chicken muscle tissues; the protein microarray assay showed a 90% recovery rate while the ci-ELISA had a 76% recovery rate. When tested with CL-fed chicken muscle tissues, the protein microarray assay had higher sensitivity (0.9 ng/g) than the ci-ELISA (0.1 ng/g) for detection of CL residues. Conclusions The protein microarrays showed 4.5 and 3.5 times lower IC50 values than the ci-ELISA detection for CL and SM2, respectively, suggesting that immunodetection of small molecules with protein microarrays is a better approach than the traditional ELISA technique.
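The IC50 determination described in this abstract (the competitor concentration at which the assay signal falls to half its maximum) can be sketched in a few lines. This is an illustrative linear-interpolation approach, not the authors' actual curve-fitting procedure; the function name and data are hypothetical.

```python
def ic50_from_curve(concs, signals):
    """Estimate IC50 by linear interpolation: the competitor concentration
    at which the competitive signal drops to half of its maximum
    (illustrative sketch; real assays usually fit a 4-parameter logistic)."""
    half = max(signals) / 2.0
    pts = sorted(zip(concs, signals))
    for (c1, s1), (c2, s2) in zip(pts, pts[1:]):
        # find the interval where the signal crosses the half-maximum
        if (s1 - half) * (s2 - half) <= 0 and s1 != s2:
            return c1 + (c2 - c1) * (s1 - half) / (s1 - s2)
    return None  # curve never crosses half-maximum
```

For a hypothetical dose-response series `[100, 80, 40, 20]` at concentrations `[1, 10, 100, 1000]` ng/ml, the crossing is interpolated within the 10-100 ng/ml interval.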
Smith Andrew M
Full Text Available Abstract Background Microarrays are an invaluable tool in many modern genomic studies. It is generally perceived that decreasing the size of microarray features leads to arrays with higher resolution (due to greater feature density), but this increase in resolution can compromise sensitivity. Results We demonstrate that barcode microarrays with smaller features are equally capable of detecting variation in DNA barcode intensity when compared to larger feature sizes within a specific microarray platform. The barcodes used in this study are the well-characterized set derived from the Yeast KnockOut (YKO) collection used for screens of pooled yeast (Saccharomyces cerevisiae) deletion mutants. We treated these pools with the glycosylation inhibitor tunicamycin as a test compound. Three generations of barcode microarrays at 30, 8 and 5 μm feature sizes independently identified the primary target of tunicamycin to be ALG7. Conclusion We show that the data obtained with the 5 μm feature size are of comparable quality to the 30 μm size and propose that further shrinking of features could yield barcode microarrays with equal or greater resolving power and, more importantly, higher density.
Full Text Available Carbohydrates play a crucial role in host-microorganism interactions and many host glycoconjugates are receptors or co-receptors for microbial binding. Host glycosylation varies with species and location in the body, and this contributes to species specificity and tropism of commensal and pathogenic bacteria. Additionally, bacterial glycosylation is often the first bacterial molecular species encountered and responded to by the host system. Accordingly, characterising and identifying the exact structures involved in these critical interactions is an important priority in deciphering microbial pathogenesis. Carbohydrate-based microarray platforms have been an underused tool for screening bacterial interactions with specific carbohydrate structures, but they have been growing in popularity in recent years. In this review, we discuss carbohydrate-based microarrays that have been profiled with whole bacteria, recombinantly expressed adhesins or serum antibodies. Three main types of carbohydrate-based microarray platform are considered: (i) conventional carbohydrate or glycan microarrays; (ii) whole mucin microarrays; and (iii) microarrays constructed from bacterial polysaccharides or their components. Determining the nature of the interactions between bacteria and host can help clarify the molecular mechanisms of carbohydrate-mediated interactions in microbial pathogenesis, infectious disease and the host immune response, and may lead to new strategies to boost therapeutic treatments.
M J Pont
Full Text Available Cellular immunotherapy has proven to be effective in the treatment of hematological cancers by donor lymphocyte infusion after allogeneic hematopoietic stem cell transplantation and, more recently, by targeted therapy with chimeric antigen receptor- or T-cell receptor-engineered T cells. However, depending on the tissue distribution of the antigens that are targeted, anti-tumor responses can be accompanied by undesired side effects. Therefore, detailed tissue distribution analysis is essential to estimate potential efficacy and toxicity of candidate targets for immunotherapy of hematological malignancies. We performed microarray gene expression analysis of hematological malignancies of different origins, healthy hematopoietic cells and various non-hematopoietic cell types from organs that are often targeted in detrimental immune responses after allogeneic stem cell transplantation leading to graft-versus-host disease. Non-hematopoietic cells were also cultured in the presence of IFN-γ to analyze gene expression under inflammatory circumstances. Gene expression was investigated with Illumina HT12.0 microarrays, and quality control analysis was performed to confirm the cell-type origin and exclude contamination of non-hematopoietic cell samples with peripheral blood cells. Microarray data were validated by quantitative RT-PCR, showing strong correlations between both platforms. Detailed gene expression profiles were generated for various minor histocompatibility antigens and B-cell surface antigens to illustrate the value of the microarray dataset for estimating efficacy and toxicity of candidate targets for immunotherapy. In conclusion, our microarray database provides a relevant platform to analyze and select candidate antigens with hematopoietic (lineage-)restricted expression as potential targets for immunotherapy of hematological cancers.
Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna
The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873
Friis-Andersen, Hans; Bisgaard, Thue
AIM OF DATABASE: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. STUDY POPULATION: Patients ≥18 years operated for groin hernia. MAIN VARIABLES: Type and size … access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles … the medical management of the database. RESULTS: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). CONCLUSION: The Danish Inguinal Hernia...
The traditional publication will be overhauled by the 'Enhanced Publication'. This is a publication that is enhanced with research data, extra materials, post publication data, and database records. It has an object-based structure with explicit l
COMPADRE contains demographic information on hundreds of plant species. The data in COMPADRE are in the form of matrix population models and our goal is to make these publicly available to facilitate their use for research and teaching purposes. COMPADRE is an open-access database. We only request...
ir. Sander van Laar
A formal description of a database consists of the description of the relations (tables) of the database together with the constraints that must hold on the database. Furthermore the contents of a database can be retrieved using queries. These constraints and queries for databases can very well be
Strakova, Eva; Zikova, Alice; Vohradsky, Jiri
A computational model of gene expression was applied to a novel test set of microarray time-series measurements to reveal regulatory interactions between transcriptional regulators, represented by 45 sigma factors, and the genes expressed during germination of the prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded at 13 time points, which provided a database of gene expression time series on a genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
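The kind of kinetic regulator-target relation modeled in such studies can be illustrated with a minimal simulation. This is a generic Hill-type sketch under assumed parameters, not the paper's actual model; all names and constants here are illustrative.

```python
def simulate_target(reg_series, dt, k1, k2, n=2, K=1.0, z0=0.0):
    """Euler integration of a minimal regulator->target kinetic model:
        dz/dt = k1 * r^n / (K^n + r^n) - k2 * z
    where r is the regulator (e.g. sigma factor) level at each time step.
    A Hill-type form often used in such network models; the paper's exact
    equations may differ."""
    z, out = z0, []
    for r in reg_series:
        z += dt * (k1 * r ** n / (K ** n + r ** n) - k2 * z)
        out.append(z)
    return out
```

With a constant regulator level the target expression rises monotonically toward the steady state k1·h/k2, where h is the Hill term; fitting such curves to measured time series is what makes a proposed sigma factor-target link "kinetically plausible".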
Raddatz, Barbara B; Spitzbarth, Ingo; Matheis, Katja A; Kalkuhl, Arno; Deschl, Ulrich; Baumgärtner, Wolfgang; Ulrich, Reiner
High-throughput, genome-wide transcriptome analysis is now commonly used in all fields of life science research and is on the cusp of medical and veterinary diagnostic application. Transcriptomic methods such as microarrays and next-generation sequencing generate enormous amounts of data. The pathogenetic expertise acquired from understanding of general pathology provides veterinary pathologists with a profound background, which is essential in translating transcriptomic data into meaningful biological knowledge, thereby leading to a better understanding of underlying disease mechanisms. The scientific literature concerning high-throughput data-mining techniques usually addresses mathematicians or computer scientists as the target audience. In contrast, the present review provides the reader with a clear and systematic basis from a veterinary pathologist's perspective. Therefore, the aims are (1) to introduce the reader to the necessary methodological background; (2) to introduce the sequential steps commonly performed in a microarray analysis including quality control, annotation, normalization, selection of differentially expressed genes, clustering, gene ontology and pathway analysis, analysis of manually selected genes, and biomarker discovery; and (3) to provide references to publicly available and user-friendly software suites. In summary, the data analysis methods presented within this review will enable veterinary pathologists to analyze high-throughput transcriptome data obtained from their own experiments, supplemental data that accompany scientific publications, or public repositories in order to obtain a more in-depth insight into underlying disease mechanisms.
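One of the sequential steps listed above, selection of differentially expressed genes, is usually done under false-discovery-rate control. A minimal sketch of the standard Benjamini-Hochberg step-up procedure (assuming per-gene p-values have already been computed by a test of the analyst's choice):

```python
def benjamini_hochberg(pvals, fdr=0.05):
    """Benjamini-Hochberg step-up procedure: return the sorted indices of
    tests declared significant while controlling the false-discovery rate.
    A p-value at rank r (1-based, ascending) passes if p <= r * fdr / m."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value passes its threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * fdr / m:
            k = rank
    # everything at or below the largest passing rank is significant
    return sorted(order[:k])
```

For example, with p-values `[0.01, 0.02, 0.03, 0.5]` and FDR 0.05, the first three genes pass their rank-wise thresholds (0.0125, 0.025, 0.0375) and are retained.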
Dupľáková, Nikoleta; Reňák, David; Hovanec, P.; Honysová, Barbora; Twell, D.; Honys, David
Vol. 7 (2007), Article Number: 39. ISSN 1471-2229. R&D Projects: GA MŠk(CZ) LC06004; GA ČR GA522/06/0896. Institutional research plan: CEZ:AV0Z50380511. Source of funding: V - other public sources. Keywords: STANFORD MICROARRAY DATABASE; EXPRESSION ANALYSIS; DNA MICROARRAYS. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 3.232, year: 2007
... and US Department of Agriculture Dietary Supplement Ingredient Database ... values can be saved to build a small database or add to an existing database for national, ...
Consumption Database The California Energy Commission has created this on-line database for informal reporting of … classifications. The database also provides easy downloading of energy consumption data into Microsoft Excel (XLSX)
Full Text Available Abstract Background Image analysis of microarrays and, in particular, spot quantification and spot quality control, is one of the most important steps in the statistical analysis of microarray data. Recent methods of spot quality control are still at an early stage of development, often leading to underestimation of true positive microarray features and, consequently, to loss of important biological information. Therefore, improving and standardizing the statistical approaches to spot quality control is essential to facilitate the overall analysis of microarray data and the subsequent extraction of biological information. Findings We evaluated the performance of two image analysis packages, MAIA and GenePix (GP), using two complementary experimental approaches with a focus on the statistical analysis of spot quality factors. First, we developed control microarrays with a priori known fluorescence ratios to verify the accuracy and precision of the ratio estimation of signal intensities. Next, we developed advanced semi-automatic protocols of spot quality evaluation in MAIA and GP and compared their performance with the available facilities for quantitative spot filtering in GP. We evaluated these algorithms for standardised spot quality analysis in a whole-genome microarray experiment assessing well-characterised transcriptional modifications induced by the transcription regulator SNAI1. Using a set of RT-PCR- or qRT-PCR-validated microarray data, we found that the semi-automatic protocol of spot quality control we developed with MAIA allowed recovering approximately 13% more spots and 38% more differentially expressed genes (at FDR = 5%) than GP with default spot filtering conditions. Conclusion Careful control of spot quality characteristics with advanced spot quality evaluation can significantly increase the amount of confident and accurate data, resulting in more meaningful biological conclusions.
Comparison of sequencing the D2 region of the large subunit ribosomal RNA gene (MicroSEQ®) versus the internal transcribed spacer (ITS) regions using two public databases for identification of common and uncommon clinically relevant fungal species.
Arbefeville, S; Harris, A; Ferrieri, P
Fungal infections cause considerable morbidity and mortality in immunocompromised patients. Rapid and accurate identification of fungi is essential to guide accurately targeted antifungal therapy. With the advent of molecular methods, clinical laboratories can use new technologies to supplement traditional phenotypic identification of fungi. The aims of the study were to evaluate the sole commercially available MicroSEQ® D2 LSU rDNA Fungal Identification Kit compared to the in-house-developed internal transcribed spacer (ITS) regions assay in identifying moulds, using two well-known online public databases to analyze the sequence data. 85 common and uncommon clinically relevant fungi isolated from clinical specimens were sequenced for the D2 region of the large subunit (LSU) of the ribosomal RNA (rRNA) gene with the MicroSEQ® Kit and for the ITS regions with the in-house-developed assay. The generated sequence data were analyzed with the online GenBank and MycoBank public databases. The D2 region of the LSU rRNA gene identified 89.4% or 92.9% of the 85 isolates to the genus level and the full ITS region (f-ITS) 96.5% or 100%, using GenBank or MycoBank, respectively, when compared to the consensus ID. When comparing species-level designations to the consensus ID, the D2 region of the LSU rRNA gene aligned with 44.7% (38/85) or 52.9% (45/85) of these isolates in GenBank or MycoBank, respectively. By comparison, f-ITS possessed greater specificity, followed by the ITS1, then ITS2 regions, using GenBank or MycoBank. Using GenBank or MycoBank, the D2 region of the LSU rRNA gene outperformed phenotype-based ID at the genus level. Comparing rates of ID between the D2 region of the LSU rRNA gene and the ITS regions in GenBank or MycoBank at the species level against the consensus ID, f-ITS and ITS2 exceeded the performance of the D2 region of the LSU rRNA gene, but ITS1 had similar performance to the D2 region of the LSU rRNA gene using MycoBank. Our results indicated that the MicroSEQ® D2 LSU r
Ariel M Pani
Full Text Available Intellectual disability (ID) affects 2-3% of the population and may occur with or without multiple congenital anomalies (MCA) or other medical conditions. Established genetic syndromes and visible chromosome abnormalities account for a substantial percentage of ID diagnoses, although for approximately 50% the molecular etiology is unknown. Individuals with features suggestive of various syndromes but lacking their associated genetic anomalies pose a formidable clinical challenge. With the advent of microarray techniques, submicroscopic genome alterations not associated with known syndromes are emerging as a significant cause of ID and MCA. High-density SNP microarrays were used to determine genome-wide copy number in 42 individuals: 7 with confirmed alterations in the WS region but atypical clinical phenotypes, 31 with ID and/or MCA, and 4 controls. One individual from the first group had the most telomeric gene in the WS critical region deleted along with 2 Mb of flanking sequence. A second person had the classic WS deletion and a rearrangement on chromosome 5p within the Cri du Chat syndrome (OMIM: 123450) region. Six individuals from the ID/MCA group had large rearrangements (3 deletions, 3 duplications), one of whom had a large inversion associated with a deletion that was not detected by the SNP arrays. Combining SNP microarray analyses and qPCR allowed us to clone and sequence 21 deletion breakpoints in individuals with atypical deletions in the WS region and/or ID or MCA. Comparison of these breakpoints to databases of genomic variation revealed that 52% occurred in regions harboring structural variants in the general population. For two probands the genomic alterations were flanked by segmental duplications, which frequently mediate recurrent genome rearrangements; these may represent new genomic disorders. While SNP arrays and related technologies can identify potentially pathogenic deletions and duplications, obtaining sequence information
The metagenomic data obtained from marine environments are significantly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach to metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data as well as an increase in data complexity. Moreover, when the metagenomic approach is used to monitor changes in marine environments over time at multiple locations in the seawater, metagenomic data will accumulate at an enormous speed. Because this kind of situation has started to become a reality at many marine research institutions and stations all over the world, it seems obvious that data management and analysis will be confronted by the so-called Big Data issues, such as how the database can be constructed in an efficient way and how useful knowledge should be extracted from a vast amount of data. In this review, we summarize the outline of all the major databases of marine metagenomes that are currently publicly available, noting that there is no database devoted exclusively to marine metagenomes, and that the number of metagenome databases that include marine metagenome data is six, unexpectedly still small. We also extend our explanation to the databases, which we call reference databases, that will be useful for constructing a marine metagenome database as well as for complementing it with important information. Then, we point out a number of challenges to be conquered in constructing the marine metagenome database.
US Agency for International Development — The Collecting Taxes Database contains performance and structural indicators about national tax systems. The database contains quantitative revenue performance...
US Agency for International Development — The Anticorruption Projects Database (Database) includes information about USAID projects with anticorruption interventions implemented worldwide between 2007 and...
This thesis deals with database systems referred to as NoSQL databases. In the second chapter, I explain basic terms and the theory of database systems. A short explanation is dedicated to database systems based on the relational data model and the SQL standardized query language. The third chapter explains the concept and history of NoSQL databases, and also presents database models, major features and the use of NoSQL databases in comparison with traditional database systems. In the fourth ...
Executive Office of the President — This file contains governmental receipts for 1962 through the current budget year, as well as four years of projections. It can be used to reproduce many of the...
Yeh, Hsiang-Yuan; Cheng, Shih-Wu; Lin, Yu-Chun; Yeh, Cheng-Yu; Lin, Shih-Fang; Soo, Von-Wun
Prostate cancer is a worldwide leading cancer and it is characterized by its aggressive metastasis. Reflecting its clinical heterogeneity, prostate cancer displays different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes during the disease processes, the important gene regulations remain unclear. We present a computational method for inferring genetic regulatory networks from microarray data automatically, with transcription factor analysis and conditional independence testing, to explore the potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to determine the precise expression values. We applied web services technology to wrap the bioinformatics toolkits and databases to automatically extract the promoter regions of DNA sequences, and predicted the transcription factors that regulate the gene expressions. We adopted a microarray dataset consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as a target dataset to evaluate our method. The predicted results showed possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with high-grade cancer compared with low-grade cancer. Enhancer of Zeste Homolog 2 (EZH2), regulated by RUNX1 and STAT3, is correlated with the pathological stage. We provide a computational framework to reconstruct
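The KNN imputation step mentioned above can be sketched as follows. This is a generic illustration of the technique (nearest genes by Euclidean distance over shared observed columns, missing entry filled with the neighbours' mean); the authors' exact distance metric and parameters are not specified in the abstract.

```python
import math

def knn_impute(matrix, k=3):
    """Impute missing values (None) in a genes x samples expression matrix
    using the mean of the k nearest genes, where distance is the normalised
    Euclidean distance over the columns both genes have observed."""
    def dist(a, b):
        pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not pairs:
            return float('inf')
        return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

    filled = [row[:] for row in matrix]
    for gi, row in enumerate(matrix):
        for ci, val in enumerate(row):
            if val is None:
                # candidate donors: other genes with this sample observed
                neighbours = sorted(
                    (dist(row, other), other[ci])
                    for oi, other in enumerate(matrix)
                    if oi != gi and other[ci] is not None)
                donors = [v for _, v in neighbours[:k]]
                if donors:
                    filled[gi][ci] = sum(donors) / len(donors)
    return filled
```

For a gene missing one sample, the two most similar genes (by their shared columns) donate the average of their values in that sample.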
Full Text Available Abstract Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As in previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, and also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known
Wain, Karen E; Riggs, Erin; Hanson, Karen; Savage, Melissa; Riethmaier, Darlene; Muirhead, Andrea; Mitchell, Elyse; Packard, Bethanny Smith; Faucett, W Andrew
The International Standards for Cytogenomic Arrays (ISCA) Consortium is a worldwide collaborative effort dedicated to optimizing patient care by improving the quality of chromosomal microarray testing. The primary effort of the ISCA Consortium has been the development of a database of copy number variants (CNVs) identified during the course of clinical microarray testing. This database is a powerful resource for clinicians, laboratories, and researchers, and can be utilized for a variety of applications, such as facilitating standardized interpretations of certain CNVs across laboratories or providing phenotypic information for counseling purposes when published data are sparse. A recognized limitation to the clinical utility of this database, however, is the quality of clinical information available for each patient. Clinical genetic counselors are uniquely suited to facilitate the communication of this information to the laboratory by virtue of their existing clinical responsibilities, case management skills, and appreciation of the evolving nature of scientific knowledge. We intend to highlight the critical role that genetic counselors play in ensuring optimal patient care through contributing to the clinical utility of the ISCA Consortium's database, as well as the quality of individual patient microarray reports provided by contributing laboratories. Current tools (paper and electronic forms) created to maximize this collaboration are shared. In addition to making a professional commitment to providing complete clinical information, genetic counselors are invited to become ISCA members and to become involved in the discussions and initiatives within the Consortium.
Full Text Available Abstract Background Genes that are determined to be significantly differentially regulated in microarray analyses often appear to have functional commonalities, such as being components of the same biochemical pathway. This results in certain words being under- or overrepresented in the list of genes. Distinguishing between biologically meaningful trends and artifacts of annotation and analysis procedures is of the utmost importance, as only true biological trends are of interest for further experimentation. A number of sophisticated methods for the identification of significant lexical trends are currently available, but these methods are generally too cumbersome for practical use by most microarray users. Results We have developed a tool, LACK, for calculating the statistical significance of apparent lexical bias in microarray datasets. The frequency of a user-specified list of search terms in a list of genes which are differentially regulated is assessed for statistical significance by comparison to randomly generated datasets. The simplicity of the input files and user interface targets the average microarray user who wishes to have a statistical measure of apparent lexical trends in analyzed datasets without the need for bioinformatics skills. The software is available as Perl source or a Windows executable. Conclusion We have used LACK in our laboratory to generate biological hypotheses based on our microarray data. We demonstrate the program's utility using an example in which we confirm significant upregulation of the SPI-2 pathogenicity island of Salmonella enterica serovar Typhimurium by the cation chelator dipyridyl.
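The comparison to randomly generated gene sets described above is, in essence, a permutation test. A minimal sketch in that spirit (LACK itself is Perl and its exact statistic is not given in the abstract, so the function and data below are illustrative):

```python
import random

def lexical_bias_pvalue(all_descriptions, hit_descriptions, term,
                        trials=2000, seed=0):
    """Empirical p-value for lexical bias: how often does a random gene set
    of the same size contain at least as many descriptions matching `term`
    as the observed differentially regulated set?"""
    rng = random.Random(seed)
    observed = sum(term in d for d in hit_descriptions)
    extreme = 0
    for _ in range(trials):
        sample = rng.sample(all_descriptions, len(hit_descriptions))
        if sum(term in d for d in sample) >= observed:
            extreme += 1
    return extreme / trials
```

If all five genes matching a term land in a five-gene hit list drawn from a 100-gene genome, random sets essentially never match, so the empirical p-value is near zero.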
Wu, Min; Thao, Cheng; Mu, Xiangming; Munson, Ethan V
Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface--an electronic table (E-table) that uses fisheye distortion technology. The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.
Full Text Available An automatic cDNA microarray image processing method using an improved fuzzy clustering algorithm is presented in this paper. The spot segmentation algorithm proposed uses the gridding technique developed by the authors earlier for finding the co-ordinates of each spot in an image. Automatic cropping of spots from the microarray image is done using these co-ordinates. The present paper proposes an improved fuzzy clustering algorithm, possibility fuzzy local information c-means (PFLICM), to segment the spot foreground (FG) from the background (BG). PFLICM improves the fuzzy local information c-means (FLICM) algorithm by incorporating the typicality of a pixel along with gray level information and local spatial information. The performance of the algorithm is validated using a set of simulated cDNA microarray images corrupted with different levels of additive white Gaussian noise (AWGN). The strength of the algorithm is tested by computing parameters such as the segmentation matching factor (SMF), probability of error (pe), discrepancy distance (D) and normalized mean square error (NMSE). The SMF value obtained for the PFLICM algorithm shows an improvement of 0.9% and 0.7% for high-noise and low-noise microarray images, respectively, compared to the FLICM algorithm. The PFLICM algorithm is also applied to real microarray images and gene expression values are computed.
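For orientation, the plain fuzzy c-means baseline that FLICM and PFLICM extend can be sketched on one-dimensional pixel intensities. This is a minimal sketch: the local spatial information and typicality terms that distinguish PFLICM are deliberately omitted, and the initialization at the intensity extremes is an assumption:

```python
def fuzzy_cmeans_1d(values, n_iter=50, m=2.0):
    """Plain fuzzy c-means with c=2 on a list of pixel intensities.
    Returns the two cluster centres and the membership matrix u,
    where u[i][k] is the degree to which pixel i belongs to cluster k."""
    c = [min(values), max(values)]  # initialise centres at the extremes
    u = [[0.0, 0.0] for _ in values]
    for _ in range(n_iter):
        # membership update: inverse-distance weighting with fuzzifier m
        for i, v in enumerate(values):
            d = [abs(v - ck) or 1e-12 for ck in c]  # guard against zero distance
            for k in (0, 1):
                u[i][k] = 1.0 / sum((d[k] / dj) ** (2.0 / (m - 1.0)) for dj in d)
        # centre update: membership-weighted mean
        for k in (0, 1):
            num = sum((u[i][k] ** m) * v for i, v in enumerate(values))
            den = sum(u[i][k] ** m for i in range(len(values)))
            c[k] = num / den
    return c, u
```

Pixels with high membership in the brighter cluster would be treated as spot foreground; PFLICM additionally penalizes assignments that disagree with a pixel's spatial neighbours.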
Qin, Li; Rueda, Luis; Ali, Adnan; Ngom, Alioune
Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in the experimental approaches utilising microarrays is to extract quantitative information from the spots, which represent genes in a given experiment. For this process, the initial stages are important and they influence future steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.
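The equivalence noted above can be illustrated with one-dimensional, two-class k-means on raw pixel intensities, here as a minimal Python sketch. Real pipelines operate on gridded spot windows, and the initialization at the intensity extremes is an assumption for the sketch:

```python
def kmeans_1d_two_class(values, n_iter=100):
    """Standard 1-D k-means with k=2. Returns (bg_mean, fg_mean, labels),
    where labels[i] == 1 for pixels assigned to the brighter (foreground) centre."""
    lo, hi = min(values), max(values)
    c = [lo, hi]  # initialise centres at the intensity extremes
    labels = [0] * len(values)
    for _ in range(n_iter):
        # assignment step: nearest centre by absolute intensity difference
        labels = [0 if abs(v - c[0]) <= abs(v - c[1]) else 1 for v in values]
        # update step: each centre becomes the mean of its members
        new_c = []
        for k in (0, 1):
            members = [v for v, l in zip(values, labels) if l == k]
            new_c.append(sum(members) / len(members) if members else c[k])
        if new_c == c:  # converged
            break
        c = new_c
    return c[0], c[1], labels
```

On a cropped spot window, the two recovered means estimate background and foreground intensity, from which a spot's expression signal is derived.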
Schax, Emilia; Walter, Johanna-Gabriela; Märzhäuser, Helene; Stahl, Frank; Scheper, Thomas; Agard, David A; Eichner, Simone; Kirschning, Andreas; Zeilinger, Carsten
Based on the importance of heat shock proteins (HSPs) in diseases such as cancer, Alzheimer's disease or malaria, inhibitors of these chaperones are needed. Today's state-of-the-art techniques to identify HSP inhibitors are performed in microplate format, requiring large amounts of proteins and potential inhibitors. In contrast, we have developed a miniaturized protein microarray-based assay to identify novel inhibitors, allowing analysis with 300 pmol of protein. The assay is based on competitive binding of fluorescence-labeled ATP and potential inhibitors to the ATP-binding site of HSP. Therefore, the developed microarray enables the parallel analysis of different ATP-binding proteins on a single microarray. We have demonstrated the possibility of multiplexing by immobilizing full-length human HSP90α and HtpG of Helicobacter pylori on microarrays. Fluorescence-labeled ATP was competed by novel geldanamycin/reblastatin derivatives with IC50 values in the range of 0.5 nM to 4 μM and Z(*)-factors between 0.60 and 0.96. Our results demonstrate the potential of a target-oriented multiplexed protein microarray to identify novel inhibitors for different members of the HSP90 family. Copyright © 2014 Elsevier B.V. All rights reserved.
Basic, I.; Vrbanic, I.; Zabric, I.; Savli, S.
Aspects of plant ageing management (AM) have gained increasing attention over the last ten years. Numerous technical studies have been performed to study the impact of ageing mechanisms on the safe and reliable operation of nuclear power plants. National research activities have been initiated or are in progress to provide the technical basis for decision-making processes. The long-term operation of nuclear power plants is influenced by economic considerations, the socio-economic environment including public acceptance, developments in research and the regulatory framework, and the availability of technical infrastructure to maintain and service the systems, structures and components, as well as of qualified personnel. Besides national activities there are a number of international activities, in particular under the umbrella of the IAEA, the OECD and the EU. The paper discusses the process, procedure and database developed for Slovenian Nuclear Safety Administration (SNSA) surveillance of the ageing process of Nuclear Power Plant Krsko. (author)
Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of the large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment)-supportive database using MySQL, underlying a data-mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Owing to its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web-based and platform-independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.
Noriyuki, Nakatsu; Igarashi, Yoshinobu; Ono, Atsushi; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro
An important technology used in toxicogenomic drug discovery research is the microarray, which enables researchers to simultaneously analyze the expression of a large number of genes. To build a database and data analysis system for use in assessing the safety of drugs and drug candidates, in 2002 we launched a 5-year collaborative study in the Toxicogenomics Project (TGP1) in Japan. Experimental data generated by such studies must be validated by different laboratories for robust and accurate analysis. For this purpose, we conducted intra- and inter-laboratory validation studies with participating companies in the second collaborative study in the Toxicogenomics Project (TGP2). Gene expression in the liver of rats treated with acetaminophen (APAP) was independently examined by the participating companies using Affymetrix GeneChip microarrays. The intra- and inter-laboratory reproducibility of the data was evaluated using hierarchical clustering analysis. The toxicogenomics results were highly reproducible, indicating that the gene expression data generated in our TGP1 project are reliable and compatible with the data generated by the participating laboratories.
Full Text Available Abstract Motivation Identification of differentially expressed genes from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the statistical t-test rank genes based on a single statistic. The false positive rate of these methods can be reduced by considering other features of differentially expressed genes. Results We propose a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of the average difference of gene expression and the average expression level. A density-based pruning algorithm (DB pruning) is developed to screen out potential differentially expressed genes, which are usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus (GEO) database with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as the t-test, rank product, and fold change. Conclusions Density-based pruning of non-differentially expressed genes is an effective method for enhancing statistical-testing-based algorithms for identifying differentially expressed genes. It improves the t-test, rank product, and fold change by 11% to 50% in the numbers of identified true differentially expressed genes. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
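The feature mapping and density-based pruning can be sketched as follows. The grid-count notion of density and the `max_density` threshold are hypothetical simplifications for illustration; the paper's actual density estimator may differ:

```python
from collections import Counter

def density_prune(genes, exprs_a, exprs_b, grid=4, max_density=2):
    """Map each gene to the 2-D feature space (average difference, average level)
    and keep only genes falling in sparse grid cells, where differentially
    expressed genes tend to lie. exprs_a/exprs_b map gene -> replicate values."""
    feats = {}
    for g in genes:
        a = sum(exprs_a[g]) / len(exprs_a[g])
        b = sum(exprs_b[g]) / len(exprs_b[g])
        feats[g] = (a - b, (a + b) / 2.0)  # (average difference, average level)
    xs = [f[0] for f in feats.values()]
    ys = [f[1] for f in feats.values()]

    def cell(f):
        # bin a feature pair into a (grid x grid) cell index
        def idx(v, lo, hi):
            span = (hi - lo) or 1.0
            return min(grid - 1, int((v - lo) / span * grid))
        return (idx(f[0], min(xs), max(xs)), idx(f[1], min(ys), max(ys)))

    counts = Counter(cell(f) for f in feats.values())
    # genes in densely populated cells are pruned as likely non-differential
    return [g for g in genes if counts[cell(feats[g])] <= max_density]
```

The surviving boundary-region genes would then be passed to a standard test such as the t-test or rank product.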
Full Text Available Abstract Background The induction of genomic deletions by physical or chemical agents is an easy and inexpensive means to generate a genome-saturating collection of mutations. Different mutagens can be selected to ensure a mutant collection with a range of deletion sizes. This would allow identification of mutations in single genes or, alternatively, a deleted group of genes that might collectively govern a trait (e.g., quantitative trait loci, QTL). However, deletion mutants have not been widely used in functional genomics, because the mutated genes are not tagged and therefore difficult to identify. Here, we present a microarray-based approach to identify deleted genomic regions in rice mutants selected from a large collection generated by gamma ray or fast neutron treatment. Our study focuses not only on the utility of this method for forward genetics, but also on its potential as a reverse genetics tool through accumulation of hybridization data for a collection of deletion mutants harboring multiple genetic lesions. Results We demonstrate that hybridization of labeled genomic DNA directly onto the Affymetrix Rice GeneChip® allows rapid localization of deleted regions in rice mutants. Deletions ranged in size from one gene model to ~500 kb and were predicted on all 12 rice chromosomes. The utility of the technique as a tool in forward genetics was demonstrated in combination with an allelic series of mutants to rapidly narrow the genomic region, and eventually identify a candidate gene responsible for a lesion mimic phenotype. Finally, the positions of mutations in 14 mutants were aligned onto the rice pseudomolecules in a user-friendly genome browser to allow for rapid identification of untagged mutations http://irfgc.irri.org/cgi-bin/gbrowse/IR64_deletion_mutants/. Conclusion We demonstrate the utility of oligonucleotide arrays to discover deleted genes in rice. The density and distribution of deletions suggests the feasibility of a
Full Text Available Hans Friis-Andersen1,2, Thue Bisgaard2,3 1Surgical Department, Horsens Regional Hospital, Horsens, Denmark; 2Steering Committee, Danish Hernia Database, 3Surgical Gastroenterological Department 235, Copenhagen University Hospital, Hvidovre, Denmark Aim of database: To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Study population: Patients ≥18 years operated for groin hernia. Main variables: Type and size of hernia, primary or recurrent, type of surgical repair procedure, mesh and mesh fixation methods. Descriptive data: According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3-minute registration time). All institutions have continuous access to their own data stratified on individual surgeons. Registrations are based on a closed, protected Internet system requiring personal codes also identifying the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. Results: The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). Conclusion: The Danish Inguinal Hernia Database is fully active, monitoring surgical quality, and contributes to the national and international surgical society to improve outcome after groin hernia repair. Keywords: nation-wide, recurrence, chronic pain, femoral hernia, surgery, quality improvement
CD-ROM has rapidly evolved as a new information medium with large capacity. In the U.S. it is predicted that it will become a two-hundred-billion-yen market in three years, and thus CD-ROM is a strategic target of the database industry. Here in Japan the movement toward its commercialization has been active since this year. Will the CD-ROM business ever conquer the information market as an on-disk database or electronic publication? Referring to some application cases in the U.S., the author reviews the marketability and the future trend of this new optical disk medium.
Hirakawa, Hideki; Mun, Terry; Sato, Shusei
Since the genome sequence of Lotus japonicus, a model plant of the family Fabaceae, was determined in 2008 (Sato et al. 2008), the genomes of other members of the Fabaceae family, soybean (Glycine max) (Schmutz et al. 2010) and Medicago truncatula (Young et al. 2011), have been sequenced. In this section, we introduce representative, publicly accessible online resources related to plant materials, integrated databases containing legume genome information, and databases for genome sequence and derived marker information of legume species including L. japonicus.
Gottfried, John C.
This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.
National Oceanic and Atmospheric Administration, Department of Commerce — The World Ocean Database (WOD) is the world's largest publicly available uniform format quality controlled ocean profile dataset. Ocean profile data are sets of...
Earth Data Analysis Center, University of New Mexico — The Protected Areas Database of the United States (PAD-US) is a geodatabase, managed by USGS GAP, that illustrates and describes public land ownership, management...
Thomsen, Steven R.
Finds that corporate public relations practitioners felt they were able, using online database and information services, to intercept issues earlier in the "issue cycle" and thus enable their organizations to develop more "proactionary" or "catalytic" issues management response strategies. (SR)
National Oceanic and Atmospheric Administration, Department of Commerce — In the Pacific Northwest Salmon Habitat Project Database Across the Pacific Northwest, both public and private agents are working to improve riverine habitat for a...
Brinkmann, Falko; Hirtz, Michael; Haller, Anna; Gorges, Tobias M.; Vellekoop, Michael J.; Riethdorf, Sabine; Müller, Volkmar; Pantel, Klaus; Fuchs, Harald
Analyses of rare events occurring at extremely low frequencies in body fluids are still challenging. We established a versatile microarray-based platform able to capture single target cells from large background populations. As a use case we chose the challenging application of detecting circulating tumor cells (CTCs), about one cell in a billion normal blood cells. After incubation with an antibody cocktail, targeted cells are extracted on a microarray in a microfluidic chip. The accessibility of our platform allows for subsequent recovery of targets for further analysis. The microarray facilitates exclusion of false positive capture events by co-localization, allowing for detection without fluorescent labelling. Analyzing blood samples from cancer patients with our platform matched and in part exceeded gold-standard performance, demonstrating feasibility for clinical application. Clinical researchers' free choice of antibody cocktail, without any need for altered chip manufacturing or incubation protocols, allows virtually arbitrary targeting of capture species and therefore widespread applications in biomedical sciences.
PrimateLit: A bibliographic database for primatology. The PrimateLit database is no longer being updated. The database is supported by the National Center for Research Resources, National Institutes of Health, and is a collaborative project of the Wisconsin Primate...
NaKnowBase is a relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations and poten...
NaKnowBase is an internal relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations...
Jeong, Kwan Seong; Lee, Yong Bum; Jeong, Hae Yong; Ha, Kwi Seok
KALIMER database is an advanced database to utilize the integration management for liquid metal reactor design technology development using Web applications. KALIMER design database is composed of results database, Inter-Office Communication (IOC), 3D CAD database, and reserved documents database. Results database is a research results database during all phase for liquid metal reactor design technology development of mid-term and long-term nuclear R and D. IOC is a linkage control system inter sub project to share and integrate the research results for KALIMER. 3D CAD database is a schematic overview for KALIMER design structure. And reserved documents database is developed to manage several documents and reports since project accomplishment.
Full Text Available Protein microarray technology has gone through numerous innovative developments in recent decades. In this review, we focus on the development of protein detection methods embedded in the technology. Early microarrays utilized useful chromophores and versatile biochemical techniques dominated by high-throughput illumination. Recently, the realization of label-free techniques has been greatly advanced by the combination of knowledge in material sciences, computational design and nanofabrication. These rapidly advancing techniques aim to provide data without the intervention of label molecules. Here, we present a brief overview of this remarkable innovation from the perspectives of label and label-free techniques in transducing nano‑biological events.
Garmany, John; Clark, Terry
INTRODUCTION TO LOGICAL DATABASE DESIGN: Understanding a Database; Database Architectures; Relational Databases; Creating the Database; System Development Life Cycle (SDLC); Systems Planning: Assessment and Feasibility; System Analysis: Requirements; System Analysis: Requirements Checklist; Models Tracking and Schedules; Design Modeling; Functional Decomposition Diagram; Data Flow Diagrams; Data Dictionary; Logical Structures and Decision Trees; System Design: Logical. SYSTEM DESIGN AND IMPLEMENTATION: The ER Approach; Entities and Entity Types; Attribute Domains; Attributes; Set-Valued Attributes; Weak Entities; Constraint...
Slobodanka Ključanin; Zdravko Galić
The concept of producing a prototype of an interoperable cartographic database is explored in this paper, including the possibilities of integrating different geospatial data into a database management system and visualizing them on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relational database, spatial analysis, and definition and visualization of the database content in the form of a map on t...
Initially launched in 1983, the CHEMTOX Database was among the first microcomputer databases containing hazardous chemical information. The database is used in many industries and government agencies in more than 17 countries. Updated quarterly, the CHEMTOX Database provides detailed environmental and safety information on 7500-plus hazardous substances covered by dozens of regulatory and advisory sources. This brief listing describes the method of accessing data and provides ordering information for those wishing to obtain the CHEMTOX Database
Foncy, Julie; Estève, Aurore; Degache, Amélie; Colin, Camille; Cau, Jean Christophe; Malaquin, Laurent; Vieu, Christophe; Trévisiol, Emmanuelle
Biomolecule microarrays are generally produced by conventional microarrayers, i.e., by contact or inkjet printing. Microcontact printing represents an alternative way of depositing biomolecules on solid supports, but even if various biomolecules have been successfully microcontact printed, the routine production of biomolecule microarrays by microcontact printing remains a challenging task and needs an effective, fast, robust, and low-cost automation process. Here, we describe the production of biomolecule microarrays composed of extracellular matrix protein for the fabrication of cell microarrays by using an automated microcontact printing device. Large-scale cell microarrays can be reproducibly obtained by this method.
Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did no...
Chen, Jie; Fu, Ziyi; Ji, Chenbo; Gu, Pingqing; Xu, Pengfei; Yu, Ningzhu; Kan, Yansheng; Wu, Xiaowei; Shen, Rong; Shen, Yan
Human uterine cervix carcinoma is one of the most common malignant tumors of the reproductive system, threatening women's health globally. However, the mechanisms of the oncogenesis and development of cervix carcinoma are not yet fully understood. Long non-coding RNAs (lncRNAs) have been proved to play key roles in various biological processes, especially the development of cancer. The function and mechanism of lncRNAs in cervix carcinoma are still rarely reported. We selected 3 cervix cancer and 3 normal cervix tissues, then performed a lncRNA microarray to detect the differentially expressed lncRNAs. Subsequently, we explored the potential functions of these dysregulated lncRNAs through online bioinformatics databases. Finally, quantitative real-time PCR was carried out to confirm the expression levels of these dysregulated lncRNAs in cervix cancer and normal tissues. We uncovered the profiles of differentially expressed lncRNAs between normal and cervix carcinoma tissues by using the microarray technique, and found 1622 upregulated and 3026 downregulated lncRNAs (fold-change > 2.0) in cervix carcinoma compared to normal cervical tissue. Furthermore, we found HOXA11-AS might participate in cervix carcinogenesis by regulating HOXA11, which is involved in regulating biological processes of cervix cancer. This study afforded expression profiles of lncRNAs in cervix carcinoma tissue and normal cervical tissue, which could provide a database for further research on the function and mechanism of key lncRNAs in cervix carcinoma, and might be helpful to explore potential diagnostic factors and therapeutic targets for cervix carcinoma. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
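The fold-change screen described above (fold-change > 2.0) can be sketched as follows; the expression values and transcript names in the test are invented for illustration, and real analyses normalize and average over replicates first:

```python
def fold_change_filter(case, control, threshold=2.0):
    """Classify transcripts as up- or downregulated by fold change.
    `case` and `control` map transcript name -> (positive) expression value."""
    up, down = [], []
    for t in case:
        fc = case[t] / control[t]
        if fc > threshold:
            up.append(t)          # e.g. fold-change > 2.0 means upregulated
        elif fc < 1.0 / threshold:
            down.append(t)        # fold-change < 0.5 means downregulated
    return up, down
```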
Helgstrand, Frederik; Jorgensen, Lars Nannestad
Aim: The Danish Ventral Hernia Database (DVHD) provides national surveillance of current surgical practice and clinical postoperative outcomes. The intention is to reduce postoperative morbidity and hernia recurrence, evaluate new treatment strategies, and facilitate nationwide implementation ... to the surgical repair are recorded. Data registration is mandatory. Data may be merged with other Danish health registries and information from patient questionnaires or clinical examinations. Descriptive data: More than 37,000 operations have been registered. Data have demonstrated high agreement with patient ... of operations and is an excellent tool for observing changes over time, including adjustment of several confounders. This national database registry has impacted on clinical practice in Denmark and led to a high number of scientific publications in recent years.
Daniel Victor Guebel
Full Text Available Motivation: In the brain of elderly healthy individuals, the effects of sexual dimorphism and those due to normal ageing appear overlapped. Discrimination of these two dimensions would powerfully contribute to a better understanding of the aetiology of some neurodegenerative diseases, such as sporadic Alzheimer's. Methods: Following a systems biology approach, top-down and bottom-up strategies were combined. First, public transcriptome data corresponding to the transition from adulthood to the ageing stage in normal human hippocampus were analysed through an optimized microarray post-processing (Q-GDEMAR) method together with a proper experimental design (full factorial analysis). Second, the identified genes were placed in context by building compatible networks. The subsequent ontology analyses carried out on these networks clarify the main functionalities involved. Results: Noticeably, we could identify large sets of genes according to three groups: those that exclusively depend on sex, those that exclusively depend on age, and those that depend on particular combinations of sex and age (interaction). The genes identified were validated against three independent sources (a proteomic study of ageing, a senescence database, and a mitochondrial genetic database). We arrived at several new inferences about the biological functions compromised during ageing in two ways: by taking into account the sex-independent effects of ageing, and by considering the interaction between age and sex where pertinent. In particular, we discuss the impact of our findings on the functions of mitochondria, autophagy, mitophagy, and microRNAs. Conclusions: The evidence obtained herein supports the occurrence of significant neurobiological differences in the hippocampus, not only between adult and elderly individuals, but between old-healthy women and old-healthy men. Hence, to obtain realistic results in further analysis of the transition from the normal ageing to
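The full factorial aspect, separating sex effects, age effects, and their interaction, can be illustrated for a 2×2 design of cell means. This is the textbook contrast decomposition, not the Q-GDEMAR method itself, and the factor labels are illustrative:

```python
def two_by_two_effects(means):
    """Effect decomposition for a 2x2 full factorial design.
    `means` maps (sex, age) -> mean expression, e.g. ("F", "old") -> 5.0.
    Returns (sex_effect, age_effect, interaction) on the usual contrast scale."""
    m = means
    # main effect of sex: male average minus female average
    sex = ((m[("M", "adult")] + m[("M", "old")])
           - (m[("F", "adult")] + m[("F", "old")])) / 2.0
    # main effect of age: old average minus adult average
    age = ((m[("M", "old")] + m[("F", "old")])
           - (m[("M", "adult")] + m[("F", "adult")])) / 2.0
    # interaction: does the ageing change differ between the sexes?
    inter = ((m[("M", "old")] - m[("M", "adult")])
             - (m[("F", "old")] - m[("F", "adult")])) / 2.0
    return sex, age, inter
```

Genes with a large interaction term are those whose ageing trajectory depends on sex, the third group the study singles out.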
Johana A. Luna Coronell
Full Text Available Characterization of the colon cancer immunome and its autoantibody signature from differentially-reactive antigens (DIRAGs) could provide insights into aberrant cellular mechanisms or enriched networks associated with diseases. The purpose of this study was to characterize the antibody profile of plasma samples from 32 colorectal cancer (CRC) patients and 32 controls using proteins isolated from 15,417 human cDNA expression clones on microarrays. 671 unique DIRAGs were identified and 632 were more highly reactive in CRC samples. Bioinformatics analyses reveal that, compared to control samples, the immunoproteomic IgG profiling of CRC samples is mainly associated with cell death, survival, and proliferation pathways, especially proteins involved in EIF2 and mTOR signaling. Ribosomal proteins (e.g., RPL7, RPL22, and RPL27A) and CRC-related genes such as APC, AXIN1, E2F4, MSH2, PMS2, and TP53 were highly enriched. In addition, differential pathways were observed between the CRC and control samples. Furthermore, 103 DIRAGs were reported in the SEREX antigen database, demonstrating our ability to identify known and new reactive antigens. We also found an overlap of 7 antigens with 48 "CRC genes." These data indicate that immunomics profiling on protein microarrays is able to reveal the complexity of immune responses in cancerous diseases and faithfully reflects the underlying pathology. Keywords: Autoantibody tumor biomarker, Cancer immunology, Colorectal cancer, Immunomics, Protein microarray
Full Text Available Intermittent hypoxia (IH) during sleep is one of the major abnormalities occurring in patients suffering from obstructive sleep apnea (OSA), a highly prevalent disorder affecting 6–15% of the general population, particularly among obese people. IH has been proposed as a major determinant of oncogenetically-related processes such as tumor growth, invasion and metastasis. During the growth and expansion of tumors, fragmented DNA is released into the bloodstream and enters the circulation. Circulating tumor DNA (cirDNA) conserves the genetic and epigenetic profiles of the tumor of origin and can be isolated from the plasma fraction. Here we report a microarray-based epigenetic profiling of cirDNA isolated from blood samples of mice engrafted with TC1 epithelial lung cancer cells and controls, which were exposed during sleep to IH (XenoIH group, n = 3) or to control conditions, i.e., room air (RA; XenoRA group, n = 3). To prepare the targets for microarray hybridization, we applied a previously developed method that enriches the modified fraction of the cirDNA without amplification of genomic DNA. Regions of differential cirDNA modification between the two groups were identified by hybridizing the enriched fractions for each sample to Affymetrix GeneChip Human Promoter Arrays 1.0R. Microarray raw and processed data were deposited in NCBI's Gene Expression Omnibus (GEO) database (accession number GSE61070).
Feng, L.; Lu, H.J.
In this paper, we propose a data-mining-based approach to public buffer management for a multiuser database system, where database buffers are organized into two areas – public and private. While the private buffer areas contain pages to be updated by particular users, the public
Ile Kristina E
Full Text Available Abstract Background The ADGE technique is a method designed to magnify the ratios of gene expression before detection. It improves detection sensitivity to small changes in gene expression and requires only a small amount of starting material. However, the throughput of ADGE is low. We integrated ADGE with DNA microarray (ADGE microarray) and compared it with regular microarray. Results When ADGE was integrated with DNA microarray, a quantitative relationship of a power function between detected and input ratios was found. Because of ratio magnification, ADGE microarray was better able to detect small changes in gene expression in a drug-resistant model cell line system. The PCR amplification of templates and efficient labeling reduced the requirement for starting material to as little as 125 ng of total RNA for one slide hybridization and enhanced the signal intensity. Integration of ratio magnification, template amplification and efficient labeling in ADGE microarray reduced artifacts in microarray data and improved detection fidelity. The results of ADGE microarray were less variable and more reproducible than those of regular microarray. A gene expression profile generated with ADGE microarray characterized the drug-resistant phenotype, particularly with reference to glutathione, proliferation and kinase pathways. Conclusion ADGE microarray magnified the ratios of differential gene expression in a power function, improved detection sensitivity and fidelity, and reduced the requirement for starting material while maintaining high throughput. ADGE microarray generated a more informative expression pattern than regular microarray.
Caivano, Jose L.
The paper describes the methodology and results of a project under development, aimed at the elaboration of an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated on various occasions, and now available on the Internet, with more than 2,000 entries. The interactive database will expand that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but also allowing rearrangements or selections by author, subject and keywords.
Turnbull Arran K
Full Text Available Abstract Background Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single-channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately many of these studies are underpowered and it is desirable to improve power by combining data from more than one study; we sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis. Results Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets we evaluated several approaches to directly integrate intensity-level expression data from the two platforms. After mapping probe sequences to Ensembl genes we demonstrate that ComBat and cross-platform normalisation (XPN) significantly outperform mean-centering and distance-weighted discrimination (DWD) in terms of minimising inter-platform variance. In particular we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially removing legitimate biological differences from integrated datasets. Conclusion Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.
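Of the methods compared in the abstract above, mean-centering is the simplest to state precisely: each gene's expression is shifted to zero mean within each platform before the datasets are pooled. A minimal NumPy sketch of that baseline (array layout and function name are illustrative, not taken from the paper; ComBat and XPN additionally adjust variance):

```python
import numpy as np

def mean_center_by_platform(expr, platform):
    """Per-platform mean-centering, the simplest cross-platform adjustment.

    expr: (genes x samples) float array of expression values.
    platform: length-(samples) array of platform labels, e.g. "affy"/"illumina".
    Each gene is shifted to zero mean separately within each platform.
    """
    out = expr.astype(float).copy()
    for p in np.unique(platform):
        cols = platform == p                      # samples from this platform
        out[:, cols] -= out[:, cols].mean(axis=1, keepdims=True)
    return out

# Toy example: two genes, two samples per platform
expr = np.array([[1.0, 2.0, 10.0, 12.0],
                 [0.0, 4.0, 8.0, 8.0]])
platform = np.array(["A", "A", "B", "B"])
centered = mean_center_by_platform(expr, platform)
```

After centering, each gene has zero mean within each platform, so a systematic additive offset between platforms is removed while within-platform differences are preserved.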
Full Text Available Genomic microarrays are powerful research tools in bioinformatics and modern medicinal research because they enable massively-parallel assays and simultaneous monitoring of the expression of thousands of genes in biological samples. However, a simple microarray experiment often produces very high-dimensional data and a huge amount of information; this vast amount of data challenges researchers to extract the important features and reduce the high dimensionality. In this paper, a nonlinear dimensionality reduction kernel method based on locally linear embedding (LLE) is proposed, and a fuzzy K-nearest neighbors algorithm which denoises datasets is introduced as a replacement for the classical LLE's KNN algorithm. In addition, a kernel-method-based support vector machine (SVM) is used to classify genomic microarray data sets. We demonstrate the application of the techniques to two published DNA microarray data sets. The experimental results confirm the superiority and high success rates of the presented method.
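The fuzzy K-nearest-neighbour rule mentioned above weights each neighbour's class vote by inverse distance rather than counting votes equally, which damps the influence of noisy, far-away neighbours. A minimal NumPy sketch of a Keller-style fuzzy k-NN classifier (function name and parameter defaults are illustrative; the paper's exact variant, used inside LLE's neighbourhood search, may differ):

```python
import numpy as np

def fuzzy_knn_predict(X_train, y_train, X_test, k=3, m=2.0):
    """Fuzzy k-NN: class memberships are inverse-distance-weighted sums
    over the k nearest training points (fuzzifier m > 1 controls how
    sharply weight decays with distance)."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)      # distances to all training points
        idx = np.argsort(d)[:k]                      # k nearest neighbours
        # inverse-distance weights; epsilon avoids division by zero
        w = 1.0 / (d[idx] ** (2.0 / (m - 1.0)) + 1e-12)
        memberships = [w[y_train[idx] == c].sum() for c in classes]
        preds.append(classes[int(np.argmax(memberships))])
    return np.array(preds)
```

A test point lying between two clusters is thus assigned to the class whose members are nearest in aggregate, not merely most numerous among the k neighbours.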
Lenz, Ondřej; Petrzik, Karel; Špak, Josef
Roč. 148, July (2009), s. 27 ISSN 1866-590X. [International Conference on Virus and other Graft Transmissible Diseases of Fruit Crops /21./. 05.07.2009-10.07.2009, Neustadt] R&D Projects: GA MŠk OC 853.001 Institutional research plan: CEZ:AV0Z50510513 Keywords : microarray * detection * virus Subject RIV: EE - Microbiology, Virology
It is estimated that more than 160,000 miles of rivers and streams in the United States are impaired due to the presence of waterborne pathogens. These pathogens typically originate from human and other animal fecal pollution sources; therefore, a rapid microbial source tracking (MST) method is needed to facilitate water quality assessment and impaired water remediation. We report a novel qualitative DNA microarray technology consisting of 453 probes for the detection of general fecal and host-associated bacteria, viruses, antibiotic resistance, and other environmentally relevant genetic indicators. A novel data normalization and reduction approach is also presented to help alleviate false positives often associated with high-density microarray applications. To evaluate the performance of the approach, DNA and cDNA were isolated from swine, cattle, duck, goose and gull fecal reference samples, as well as soiled poultry litter and raw municipal sewage. Based on nonmetric multidimensional scaling analysis of results, findings suggest that the novel microarray approach may be useful for pathogen detection and identification of fecal contamination in recreational waters. The ability to simultaneously detect a large collection of environmentally important genetic indicators in a single test has the potential to provide water quality managers with a wide range of information in a short period of time. Future research is warranted to measure microarray performance...
Full Text Available Dimension reduction has become essential for pre-processing of high-dimensional data, and gene expression microarray data are an instance of such high-dimensional data. Gene expression microarray data measure a very large number of genes (features) simultaneously at a molecular level, with a very small number of samples. The copious numbers of genes are usually all provided to a learning algorithm to produce a complete characterization of the classification task. However, most of the time the majority of the genes are irrelevant or redundant to the learning task, which deteriorates learning accuracy and training speed and leads to the problem of overfitting. Thus, dimension reduction of microarray data is a crucial preprocessing step for prediction and classification of disease. Various feature selection and feature extraction techniques have been proposed in the literature to identify the genes that have a direct impact on the various machine learning algorithms for classification and to eliminate the remaining ones. This paper describes a taxonomy of dimension reduction methods with their characteristics, evaluation criteria, advantages and disadvantages. It also presents a review of numerous dimension reduction approaches for microarray data, mainly those methods that have been proposed over the past few years.
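As a concrete instance of the filter-style feature selection surveyed above, genes can be ranked by a simple per-gene statistic such as variance across samples, and only the top-ranked subset passed to the learning algorithm. A minimal NumPy sketch (function name and layout are illustrative, not from the paper):

```python
import numpy as np

def top_variance_genes(expr, n_keep):
    """Filter-style dimension reduction: keep the n_keep genes with the
    highest variance across samples.

    expr: (genes x samples) array of expression values.
    Returns indices of the retained genes, highest variance first.
    """
    var = expr.var(axis=1)                 # one variance per gene
    return np.argsort(var)[::-1][:n_keep]  # descending order, truncated

# Toy example: gene 2 varies most, gene 0 is constant (uninformative)
expr = np.array([[1.0, 1.0, 1.0, 1.0],
                 [1.0, 2.0, 1.0, 2.0],
                 [0.0, 10.0, 0.0, 10.0]])
keep = top_variance_genes(expr, 2)
reduced = expr[keep]                       # (2 x samples) reduced matrix
```

Variance filtering is unsupervised; supervised filters (e.g. a per-gene t-statistic between classes) follow the same rank-and-truncate pattern with a different scoring function.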
Knudsen, Steen; Workman, Christopher; Sicheritz-Ponten, T.
GenePublisher, a system for automatic analysis of data from DNA microarray experiments, has been implemented with a web interface at http://www.cbs.dtu.dk/services/GenePublisher. Raw data are uploaded to the server together with a specification of the data. The server performs normalization...
Smistrup, Kristian; Bruus, Henrik; Hansen, Mikkel Fougt
to use larger currents and obtain forces of longer range than from thin current lines at a given power limit. Guiding of magnetic beads in the hybrid magnetic separator and the construction of a programmable microarray of magnetic beads in the microfluidic channel by hydrodynamic focusing is presented....
In the 2007 Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) project, we analyzed HL-60 DNA with five platforms: Agilent, Affymetrix 500K, Affymetrix U133 Plus 2.0, Illumina, and RPCI 19K BAC arrays. Copy number variation (CNV) was analyzed ...
The generation of corroborative data has become a commonly used approach for ensuring the veracity of microarray data. Indeed, the need to conduct corroborative studies has now become official editorial policy for at least two journals, and several more are considering introducing...
Lucas, J M
Progress in nanotechnology and DNA recombination techniques has produced tools for the diagnosis and investigation of allergy at the molecular level. The most advanced examples of such progress are microarray techniques, which have expanded not only within research in the field of proteomics but also into application in the clinical setting. Microarrays of allergenic components offer results relating to hundreds of allergenic components in a single test, using a small amount of serum which can be obtained from capillary blood. The availability of new molecules will allow the development of panels including new allergenic components and sources, which will require evaluation for clinical use. Their application opens the door to component-based diagnosis, to the holistic perception of sensitisation as represented by molecular allergy, and to patient-centred medical practice, by allowing great diagnostic accuracy and the definition of individualised immunotherapy for each patient. The present article reviews the application of allergenic component microarrays in allergology for diagnosis, management in the form of specific immunotherapy, and epidemiological studies. A review is also made of the use of protein and gene microarray techniques in basic research and in allergological diseases. Lastly, an evaluation is made of the challenges we face in introducing such techniques into clinical practice, and of the future perspectives of this new technology. Copyright 2010 SEICAP. Published by Elsevier Espana. All rights reserved.
Herbáth, Melinda; Balogh, Andrea; Matkó, János; Papp, Krisztián; Prechl, József
Protein microarray technology is becoming the method of choice for identifying protein interaction partners, detecting specific proteins, carbohydrates and lipids, and characterizing protein interactions and serum antibodies in a massively parallel manner. The availability of the well-established instrumentation of DNA arrays and the development of new fluorescent detection instruments have promoted the spread of this technique. Fluorescent detection has the advantages of high sensitivity, specificity, simplicity and the wide dynamic range required by most measurements. Fluorescence through specifically designed probes and an increasing variety of detection modes offers an excellent tool for such microarray platforms. Measuring, for example, the level of antibodies, their isotypes and/or antigen specificity simultaneously can offer more complex and comprehensive information about the investigated biological phenomenon, especially considering that hundreds of samples can be measured in a single assay. Not only body fluids, but also cell lysates, extracted cellular components, and intact living cells can be analyzed on protein arrays to monitor functional responses to samples printed on the surface. As a rapidly evolving area, protein microarray technology offers a wealth of information and a new depth of knowledge. These features endow protein arrays with wide applicability and robust sample-analyzing capability. On the whole, protein arrays are emerging tools not just in proteomics but also in glycomics and lipidomics, and are likewise important for immunological research. In this review we attempt to summarize the technical aspects of planar fluorescent microarray technology along with a description of its main immunological applications. (topical review)
Transcriptional profiling experiments utilizing DNA microarrays to study the intracellular accumulation of PHB in Synechocystis have proved difficult, in large part because strains showing differences in PHB significant enough to justify global analysis of gene expression have not been isolated.
Børsting, Claus; Sanchez Sanchez, Juan Jose; Morling, Niels
We describe a single nucleotide polymorphism (SNP) typing protocol developed for the NanoChip electronic microarray. The NanoChip array consists of 100 electrodes covered by a thin hydrogel layer containing streptavidin. An electric current can be applied to one, several, or all electrodes...
Helweg-Larsen, Rehannah Borup
The overall purpose of this thesis is to evaluate the use of microarray analysis to investigate the transcriptome of human cancers and human follicular cells and define the correlation between expression of human genes and specific cancer types as well as the developmental competence of the oocyte...
Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D
The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies will not only identify, isotype, and serotype pathogenic bacteria, but will also aid in the discovery of new gene functions by detecting gene expression under different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising, and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by 454 pyrosequencing is so cost-effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing is a simple-to-use technique that can produce accurate and quantitative analysis of DNA sequences with great speed. The deposition of massive amounts of bacterial genomic information in databanks is enabling fingerprint phylogenetic analysis that will ultimately replace several technologies such as pulsed-field gel electrophoresis. In this chapter, we review (1) the use of DNA microarrays with fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) the use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
Molenaar, D.; Bringel, F.; Schuren, F.H.; Vos, de W.M.; Siezen, R.J.; Kleerebezem, M.
Lactobacillus plantarum is a versatile and flexible species that is encountered in a variety of niches and can utilize a broad range of fermentable carbon sources. To assess if this versatility is linked to a variable gene pool, microarrays containing a subset of small genomic fragments of L.
von Götz, Franz
Despite the controversy over whether genetically modified organisms (GMOs) are beneficial or harmful for humans, animals, and/or ecosystems, the number of cultivated GMOs is increasing every year. Many countries and federations have implemented safety and surveillance systems for GMOs. Potent testing technologies need to be developed and implemented to monitor the increasing number of GMOs. First, these GMO tests need to be comprehensive, i.e., they should detect all, or at least the most important, GMOs on the market. This type of GMO screening requires a high degree of parallel testing, or multiplexing. To date, DNA microarrays offer the highest multiplexing capability for nucleic acid analysis. This trend article focuses on the evolution of DNA microarrays for GMO testing. Over the last 7 years, combinations of multiplex PCR and microarray detection have been developed to qualitatively assess the presence of GMOs. One example is the commercially available DualChip GMO (Eppendorf, Germany; http://www.eppendorf-biochip.com), which is the only GMO screening system successfully validated in a multicenter study. With the use of innovative amplification techniques, promising steps have recently been taken to make GMO detection with microarrays quantitative.
Gorte, M.; Horstman, A.; Page, R.B.; Heidstra, R.; Stromberg, A.; Boutilier, K.A.
Microarray analysis is widely used to identify transcriptional changes associated with genetic perturbation or signaling events. Here we describe its application in the identification of plant transcription factor target genes with emphasis on the design of suitable DNA constructs for controlling TF
Larsen, Martin J; Thomassen, Mads; Tan, Qihua
analyzed the same 234 breast cancers on two different microarray platforms. One dataset contained known batch-effects associated with the fabrication procedure used. The aim was to assess the significance of correcting for systematic batch-effects when integrating data from different platforms. We here...
Tete, Stefano; Mastrangelo, Filiberto; Scioletti, Anna Paola; Tranasi, Michelangelo; Raicu, Florina; Paolantonio, Michele; Stuppia, Liborio; Vinci, Raffaele; Gherlone, Enrico; Ciampoli, Cristian; Sberna, Maria Teresa; Conti, Pio
Microarray is a recently developed technique for the simultaneous analysis of the expression patterns of thousands of genes. The aim of this research was to evaluate the expression profile of healthy human dental pulp in order to identify genes activated and encoding proteins involved in the physiological processes of human dental pulp. We report data obtained by analyzing expression profiles of human tooth pulp from single subjects, using an approach based on the amplification of total RNA from the pulp of a single tooth. Experiments were performed on a high-density array able to analyse about 21,000 oligonucleotide sequences of about 70 bases in duplicate. The data were analyzed using the S.A.M. system (Significance Analysis of Microarrays), and genes were grouped according to their molecular functions and biological processes by the Onto-Express software. The microarray analysis revealed 362 genes with specific pulp expression. Genes showing significantly high expression were classified as genes involved in tooth development, proto-oncogenes, and genes for collagen, DNases, metallopeptidases and growth factors. We report a microarray analysis carried out by extraction of total RNA from specimens of healthy human dental pulp tissue. This approach represents a powerful tool in the study of normal and pathological human pulp, minimizing the genetic variability due to the pooling of samples from different individuals.
Microarray analysis of the gene expression profile in triethylene glycol dimethacrylate-treated human dental pulp cells. ... Conclusions: Our results suggest that TEGDMA can alter many functions of hDPCs through large changes in gene expression levels and complex interactions with different signaling pathways.
This fourth edition of the Directory of IAEA Databases has been prepared within the Division of NESI. Its main objective is to describe the computerized information sources available to the public. This directory contains all publicly available databases which are produced at the IAEA, including databases stored on the mainframe, LAN servers and user PCs. All IAEA Division Directors have been requested to register the existence of their databases with NESI. At the date of printing, some of the information in the directory will already be obsolete; for the most up-to-date information please see the IAEA's World Wide Web site at URL: http:/www.iaea.or.at/databases/dbdir/. Refs, figs, tabs
Full Text Available Charlotte Kvist Ekelund,1 Tine Iskov Kopp,2 Ann Tabor,1 Olav Bjørn Petersen3 1Department of Obstetrics, Center of Fetal Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; 2Registry Support Centre (East) – Epidemiology and Biostatistics, Research Centre for Prevention and Health, Glostrup, Denmark; 3Fetal Medicine Unit, Aarhus University Hospital, Aarhus Nord, Denmark Aim: The aim of this study is to set up a database in order to monitor the detection rates and false-positive rates of first-trimester screening for chromosomal abnormalities and prenatal detection rates of fetal malformations in Denmark. Study population: Pregnant women with a first- or second-trimester ultrasound scan performed at any public hospital in Denmark are registered in the database. Main variables/descriptive data: Data on maternal characteristics and ultrasonic and biochemical variables are continuously sent from the fetal medicine units' Astraia databases to the central database via a web service. Information about the outcome of pregnancy (miscarriage, termination, live birth, or stillbirth) is received from the National Patient Register and National Birth Register and linked via the Danish unique personal registration number. Furthermore, the results of all pre- and postnatal chromosome analyses are sent to the database. Conclusion: It has been possible to establish a fetal medicine database which monitors first-trimester screening for chromosomal abnormalities and second-trimester screening for major fetal malformations with input from already collected data. The database is valuable for assessing performance at a regional level and for comparing Danish performance with international results at a national level. Keywords: prenatal screening, nuchal translucency, fetal malformations, chromosomal abnormalities
Full Text Available Abstract Background Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. Results We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R
Stropp, Thomas; McPhillips, Timothy; Ludäscher, Bertram; Bieda, Mark
Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R
Fleischmann Robert D
Full Text Available Abstract Background In the postgenomic era, high-throughput protein expression and protein microarray technologies have progressed markedly, permitting screening of therapeutic reagents and discovery of novel protein functions. Hexa-histidine is one of the most commonly used fusion tags for protein expression due to its small size and convenient purification via immobilized metal ion affinity chromatography (IMAC). This purification process has been adapted to the protein microarray format, but the quality of in situ His-tagged protein purification on slides has not been systematically evaluated. We established methods to determine the level of purification of such proteins on metal chelate-modified slide surfaces. Optimized in situ purification of His-tagged recombinant proteins has the potential to become the new gold standard for cost-effective generation of high-quality and high-density protein microarrays. Results Two slide surfaces were examined, chelated Cu2+ slides suspended on a polyethylene glycol (PEG) coating and chelated Ni2+ slides immobilized on a support without PEG coating. Using PEG-coated chelated Cu2+ slides, consistently higher purities of recombinant proteins were measured. An optimized wash buffer (PBST: 10 mM phosphate buffer, 2.7 mM KCl, 140 mM NaCl and 0.05% Tween 20, pH 7.4) further improved protein purity levels. Using Escherichia coli cell lysates expressing 90 recombinant Streptococcus pneumoniae proteins, 73 proteins were successfully immobilized, and 66 proteins were in situ purified with greater than 90% purity. We identified several antigens among the in situ-purified proteins via assays with anti-S. pneumoniae rabbit antibodies and a human patient antiserum, as a demonstration project of large-scale microarray-based immunoproteomics profiling. The methodology is compatible with higher-throughput formats of in vivo protein expression, eliminates the need for resin-based purification and circumvents
Full Text Available Kristian Antonsen,1 Charlotte Vallentin Rosenstock,2 Lars Hyldborg Lundstrøm2 1Board of Directors, Copenhagen University Hospital, Bispebjerg and Frederiksberg Hospital, Capital Region of Denmark, Denmark; 2Department of Anesthesiology, Copenhagen University Hospital, Nordsjællands Hospital-Hillerød, Capital Region of Denmark, Denmark Aim of database: The aim of the Danish Anaesthesia Database (DAD) is the nationwide collection of data on all patients undergoing anesthesia. Collected data are used for quality assurance and quality development, and serve as a basis for research projects. Study population: The DAD was founded in 2004 as a part of the Danish Clinical Registries (Regionernes Kliniske Kvalitetsudviklings Program [RKKP]). Patients undergoing general anesthesia, regional anesthesia with or without combined general anesthesia, as well as patients under sedation are registered. Data are retrieved from public and private anesthesia clinics, single centers as well as multihospital corporations across Denmark. In 2014 a total of 278,679 unique entries representing a national coverage of ~70% were recorded, and data completeness is steadily increasing. Main variables: Records are aggregated to determine 13 defined quality indicators and 11 defined complications, covering the anesthetic process from the preoperative assessment through anesthesia and surgery until the end of the postoperative recovery period. Descriptive data: Registered variables include patients' individual social security numbers (assigned to all Danes), direct patient-related lifestyle factors enabling a quantification of patients' comorbidity, and variables strictly related to the type, duration, and safety of the anesthesia. Data and specific data combinations can be extracted within each department in order to monitor patient treatment. In addition, an annual DAD report serves as a benchmark for departments nationwide. Conclusion: The DAD is covering the
Microarray technology is being used widely in various biomedical research areas; the corresponding microarray data analysis is an essential step toward making the best use of array technologies. Here we review two components of microarray data analysis: low-level analysis, which emphasizes the design, quality control, and preprocessing of microarray experiments, and high-level analysis, which focuses on domain-specific microarray applications such as tumor classification, biomarker prediction, analysis of array CGH experiments, and reverse engineering of gene expression networks. Additionally, we review recent developments in building predictive models in genome expression and regulation studies. This review may help biologists grasp a basic knowledge of microarray bioinformatics as well as its potential impact on the future evolution of biomedical research fields.
Microarrays represent a core technology in pharmacogenomics and toxicogenomics; however, before this technology can successfully and reliably be applied in clinical practice and regulatory decision-making, standards and quality measures need to be developed. The Microarray Qualit...
Hoffmann, Katrin; Firth, Martin J; Beesley, Alex H; Klerk, Nicholas H de; Kees, Ursula R
Recent findings from microarray studies have raised the prospect of a standardized diagnostic gene expression platform to enhance accurate diagnosis and risk stratification in paediatric acute lymphoblastic leukaemia (ALL). However, the robustness as well as the format for such a diagnostic test remains to be determined. As a step towards clinical application of these findings, we have systematically analyzed a published ALL microarray data set using Robust Multi-array Analysis (RMA) and Random Forest (RF). We examined published microarray data from 104 ALL patient specimens representing six different subgroups defined by cytogenetic features and immunophenotypes. Using the decision-tree-based supervised learning algorithm RF, we determined a small set of genes for optimal subgroup distinction and subsequently validated their predictive power in an independent patient cohort. We achieved very high overall ALL subgroup prediction accuracies of about 98%, and were able to verify the robustness of these genes in an independent panel of 68 specimens obtained from a different institution and processed in a different laboratory. Our study established that the selection of discriminating genes is strongly dependent on the analysis method. This may have profound implications for clinical use, particularly when the classifier is reduced to a small set of genes. We have demonstrated that as few as 26 genes yield accurate class prediction and, importantly, that almost 70% of these genes have not been previously identified as essential for class distinction of the six ALL subgroups. Our finding supports the feasibility of qRT-PCR technology for standardized diagnostic testing in paediatric ALL and should, in conjunction with conventional cytogenetics, lead to a more accurate classification of the disease. In addition, we have demonstrated that microarray findings from one study can be confirmed in an independent study, using an entirely independent patient cohort.
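The abstract does not reproduce the RMA/Random Forest pipeline itself, but its core idea (rank genes by a discriminative score, then classify with a small gene set) can be sketched on synthetic data. Everything below is an illustrative stand-in: the matrix sizes, the F-like ranking score, and the nearest-centroid classifier are not the study's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an ALL expression matrix: 60 specimens x 500 genes,
# two hypothetical subgroups separated on 26 "discriminating" genes.
n_per_group, n_genes, n_top = 30, 500, 26
X = rng.normal(size=(2 * n_per_group, n_genes))
y = np.repeat([0, 1], n_per_group)
X[y == 1, :n_top] += 4.0            # subgroup-specific expression shift

# Rank genes by a simple two-class F-like score and keep the top 26.
m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
v0, v1 = X[y == 0].var(0, ddof=1), X[y == 1].var(0, ddof=1)
score = (m0 - m1) ** 2 / (v0 / n_per_group + v1 / n_per_group)
top = np.argsort(score)[-n_top:]

# Nearest-centroid prediction on the reduced gene set (resubstitution only;
# a real study validates on an independent cohort, as the abstract describes).
Xt = X[:, top]
c0, c1 = Xt[y == 0].mean(0), Xt[y == 1].mean(0)
pred = (((Xt - c1) ** 2).sum(1) < ((Xt - c0) ** 2).sum(1)).astype(int)
accuracy = (pred == y).mean()
print(f"subgroup prediction accuracy: {accuracy:.2f}")
```

The sketch also mirrors the abstract's caveat: the genes selected depend on the ranking score used, so a different score would generally select a different 26-gene set.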
Full Text Available BACKGROUND: Phototrophy of the extremely halophilic archaeon Halobacterium salinarum has been explored for decades. The research was mainly focused on the expression of bacteriorhodopsin and its functional properties. In contrast, less is known about genome-wide transcriptional changes and their impact on the physiological adaptation to phototrophy. The tool of choice to record transcriptional profiles is the DNA microarray technique. However, the technique is still rarely used for transcriptome analysis in archaea. METHODOLOGY/PRINCIPAL FINDINGS: We developed a whole-genome DNA microarray based on our sequence data of the Hbt. salinarum strain R1 genome. The potential of our tool is exemplified by the comparison of cells growing under aerobic and phototrophic conditions, respectively. We processed the raw fluorescence data by several stringent filtering steps and a subsequent MAANOVA analysis. The study revealed numerous transcriptional differences between the two cell states. We found that the transcriptional changes were relatively weak, though significant. Finally, the DNA microarray data were independently verified by a real-time PCR analysis. CONCLUSION/SIGNIFICANCE: This is the first DNA microarray analysis of Hbt. salinarum cells that were actually grown under phototrophic conditions. By comparing the transcriptomics data with current knowledge, we showed that our DNA microarray tool is well suited for transcriptome analysis in the extremely halophilic archaeon Hbt. salinarum. The reliability of our tool rests on both the high-quality array of DNA probes and the stringent data handling, including MAANOVA analysis. Among the regulated genes, more than 50% had unknown functions. This underlines the fact that haloarchaeal phototrophy is still far from being completely understood. Hence, the data recorded in this study will be subject to future systems biology analyses.
Full Text Available Abstract Background Post-hybridization washing is an essential part of microarray experiments. Both the quality of the experimental washing protocol and adequate consideration of washing in intensity calibration ultimately affect the quality of the expression estimates extracted from the microarray intensities. Results We conducted experiments on GeneChip microarrays with altered protocols for washing, scanning and staining to study the probe-level intensity changes as a function of the number of washing cycles. For calibration and analysis of the intensity data we make use of the 'hook' method, which allows intensity contributions due to non-specific and specific hybridization of perfect match (PM) and mismatch (MM) probes to be disentangled in a sequence-specific manner. On average, washing according to the standard protocol removes about 90% of the non-specific background, about 30-50% of the specific targets from the MM probes, and less than 10% from the PM probes. Analysis of the washing kinetics shows that the signal-to-noise ratio doubles roughly every ten stringent washing cycles. Washing can be characterized by time-dependent rate constants which reflect the heterogeneous character of target binding to microarray probes. We propose an empirical washing function which estimates the survival of probe-bound targets. It depends on the intensity contribution due to specific and non-specific hybridization per probe, which can be estimated for each probe using existing methods. The washing function allows probe intensities to be calibrated for the effect of washing. On a relative scale, proper calibration for washing markedly increases expression measures, especially in the limit of small and large values. Conclusions Washing is among the factors which potentially distort expression measures. The proposed first-order correction method allows direct implementation in existing calibration algorithms for microarray data. We provide an experimental
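A minimal numeric sketch of a first-order washing correction, assuming simple exponential survival of probe-bound targets. The rate constants below are chosen only to mimic the quoted ~90% non-specific versus <10% specific (PM) removal; the paper's actual washing function is sequence- and intensity-dependent.

```python
import math

def washing_survival(n_cycles: int, k: float) -> float:
    """First-order (exponential) survival of probe-bound targets
    after n stringent washing cycles with rate constant k."""
    return math.exp(-k * n_cycles)

# Hypothetical rate constants: a 10-cycle standard wash removes ~90% of
# non-specific background but only ~5% of specific PM-bound targets.
cycles = 10
k_nonspecific = math.log(10) / cycles        # survival 0.10 after 10 cycles
k_specific = math.log(1 / 0.95) / cycles     # survival 0.95 after 10 cycles

ns = washing_survival(cycles, k_nonspecific)
sp = washing_survival(cycles, k_specific)
print(f"non-specific survival: {ns:.2f}, specific survival: {sp:.2f}")

# Calibrating for washing: divide the washed (observed) specific intensity
# by its survival factor to estimate the pre-wash specific signal.
observed_specific = 1200.0   # arbitrary washed PM intensity
corrected = observed_specific / sp
```

Because the two rate constants differ, the specific-to-non-specific ratio grows with every wash cycle, which is the mechanism behind the signal-to-noise improvement the abstract reports.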
Suela, Javier; López-Expósito, Isabel; Querejeta, María Eugenia; Martorell, Rosa; Cuatrecasas, Esther; Armengol, Lluis; Antolín, Eugenia; Domínguez Garrido, Elena; Trujillo-Tiebas, María José; Rosell, Jordi; García Planells, Javier; Cigudosa, Juan Cruz
Microarray technology, recently implemented in international prenatal diagnosis systems, has become one of the main techniques in this field in terms of detection rate and objectivity of the results. This guideline attempts to provide background information on this technology, including technical and diagnostic aspects to be considered. Specifically, this guideline defines: the different prenatal sample types to be used, as well as their characteristics (chorionic villi samples, amniotic fluid, fetal cord blood or miscarriage tissue material); variant reporting policies (including variants of uncertain significance) to be considered in informed consents and prenatal microarray reports; microarray limitations inherent to the technique and which must be taken into account when recommending microarray testing for diagnosis; a detailed clinical algorithm recommending the use of microarray testing and its introduction into routine clinical practice within the context of other genetic tests, including pregnancies in families with a genetic history or specific syndrome suspicion, first trimester increased nuchal translucency or second trimester heart malformation and ultrasound findings not related to a known or specific syndrome. This guideline has been coordinated by the Spanish Association for Prenatal Diagnosis (AEDP, «Asociación Española de Diagnóstico Prenatal»), the Spanish Human Genetics Association (AEGH, «Asociación Española de Genética Humana») and the Spanish Society of Clinical Genetics and Dysmorphology (SEGCyD, «Sociedad Española de Genética Clínica y Dismorfología»). Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
Full Text Available Abstract Background DNA microarrays and other genomics-inspired technologies provide large datasets that often include hidden patterns of correlation between genes reflecting the complex processes that underlie cellular metabolism and physiology. The challenge in analyzing large-scale expression data has been to extract biologically meaningful inferences regarding these processes – often represented as networks – in an environment where the datasets are often imperfect and biological noise can obscure the actual signal. Although many techniques have been developed in an attempt to address these issues, to date their ability to extract meaningful and predictive network relationships has been limited. Here we describe a method that draws on prior information about gene-gene interactions to infer biologically relevant pathways from microarray data. Our approach consists of using preliminary networks derived from the literature and/or protein-protein interaction data as seeds for a Bayesian network analysis of microarray results. Results Through a bootstrap analysis of gene expression data derived from a number of leukemia studies, we demonstrate that seeded Bayesian Networks have the ability to identify high-confidence gene-gene interactions which can then be validated by comparison to other sources of pathway data. Conclusion The use of network seeds greatly improves the ability of Bayesian Network analysis to learn gene interaction networks from gene expression data. We demonstrate that the use of seeds derived from the biomedical literature or high-throughput protein-protein interaction data, or the combination, provides improvement over a standard Bayesian Network analysis, allowing networks involving dynamic processes to be deduced from the static snapshots of biological systems that represent the most common source of microarray data. Software implementing these methods has been included in the widely used TM4 microarray analysis package.
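As a toy illustration of combining prior "seed" knowledge with data-driven network inference, one can add a prior bonus for literature-derived edges to bootstrap edge confidences computed from expression data. This is a deliberately simpler scheme than the paper's seeded Bayesian network learning; the gene indices, thresholds, and bonus below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy expression data (genes x samples): gene 0 drives gene 1; gene 2 is noise.
n = 100
g0 = rng.normal(size=n)
data = np.vstack([g0, g0 + 0.3 * rng.normal(size=n), rng.normal(size=n)])
seed_edges = {(0, 1)}   # hypothetical literature-derived seed interaction

def bootstrap_edge_confidence(expr, n_boot=200, r_cut=0.5, seed_bonus=0.1):
    """Fraction of bootstrap resamples in which |correlation| exceeds r_cut,
    with a small prior bonus (capped at 1.0) for seed edges."""
    counts = np.zeros((expr.shape[0], expr.shape[0]))
    for _ in range(n_boot):
        idx = rng.integers(0, expr.shape[1], expr.shape[1])
        counts += np.abs(np.corrcoef(expr[:, idx])) > r_cut
    conf = counts / n_boot
    for i, j in seed_edges:
        conf[i, j] = conf[j, i] = min(1.0, conf[i, j] + seed_bonus)
    return conf

conf = bootstrap_edge_confidence(data)
print(f"edge (0,1): {conf[0, 1]:.2f}, edge (0,2): {conf[0, 2]:.2f}")
```

The intended takeaway matches the abstract: the seed raises confidence in plausible edges without manufacturing support for edges the data contradict.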
Full Text Available Database Description (LSDB Archive): PSCDB. Creator: Takayuki Amemiya, National Institute of Advanced Industrial Science and Technology (AIST). Database classification: Structure Databases - Protein structure. Reference: Nucleic Acids Res. 40: D554-D558. Web services and user registration: not available.
The first edition of the Directory of IAEA Databases is intended to describe the computerized information sources available to IAEA staff members. It contains a listing of all databases produced at the IAEA, together with information on their availability
(Indian Health Board) Native Health Database (NHD): a searchable database of Native health resources, with basic and advanced search interfaces and a tutorial video on searching the database.
U.S. Department of Health & Human Services — The Cell Centered Database (CCDB) is a web accessible database for high resolution 2D, 3D and 4D data from light and electron microscopy, including correlated imaging.
US Agency for International Development — E3 Staff database is maintained by the E3 PDMS (Professional Development & Management Services) office. The database is MySQL. It is manually updated by E3 staff as...
Recently, library staff arranged and compiled the original research papers written by researchers in the 33 years since the National Institute of Radiological Sciences (NIRS) was established. This paper describes how the internal database of original research papers was created. It is a small example of a hand-made database, accumulated by staff members with knowledge of computers and computer programming. (author)
Burnham, Judy F
The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review covers the key points of the database and compares it to Web of Science. Neither database is all-inclusive; rather, the two complement each other. If a library can afford only one, the choice must be based on institutional needs.
Ensuring database stability and steady performance in the modern world of agile computing is a major challenge. Changes at any level of the computing infrastructure (OS parameters and packages, kernel versions, database parameters and patches, or even schema changes) can potentially harm production services. This presentation shows how automatic and regular testing of Oracle databases can be achieved in such an agile environment.
Pels, H.J.; Lans, van der R.F.; Pels, H.J.; Meersman, R.A.
This article introduces the main concepts relevant to databases and gives an overview of the objectives, functions, and components of database systems. Although the function of a database is intuitively fairly clear, it is nevertheless, from a technological standpoint, a complex
Full Text Available Abstract Background To interpret microarray experiments, several ontological analysis tools have been developed. However, current tools are limited to specific organisms. Results We developed a bioinformatics system to assign the probe set sequences of any organism to a hierarchical functional classification modelled on the KEGG ontology. The GeneBins database currently supports the functional classification of expression data from four Affymetrix arrays: Arabidopsis thaliana, Oryza sativa, Glycine max and Medicago truncatula. An online analysis tool to identify relevant functions is also provided. Conclusion GeneBins provides resources to interpret gene expression results from microarray experiments. It is available at http://bioinfoserver.rsbs.anu.edu.au/utils/GeneBins/
Full Text Available Abstract Background High-density oligonucleotide microarray technology enables the discovery of genes that are transcriptionally modulated in different biological samples due to physiology, disease or intervention. Methods for the identification of these so-called "differentially expressed genes" (DEG) would largely benefit from a deeper knowledge of the intrinsic measurement variability. Though it is clear that variance of repeated measures is highly dependent on the average expression level of a given gene, there is still a lack of consensus on how signal reproducibility is linked to signal intensity. The aim of this study was to empirically model the variance versus mean dependence in microarray data to improve the performance of existing methods for identifying DEG. Results In the present work we used data generated by our lab as well as publicly available data sets to show that dispersion of repeated measures depends on location of the measures themselves following a power law. This enables us to construct a power law global error model (PLGEM) that is applicable to various Affymetrix GeneChip data sets. A new DEG identification method is therefore proposed, consisting of a statistic designed to make explicit use of model-derived measurement spread estimates and a resampling-based hypothesis testing algorithm. Conclusions The new method provides control of the false positive rate, a good sensitivity vs. specificity trade-off, and consistent results with varying numbers of replicates, even using single samples.
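The power-law mean-spread relationship at the heart of PLGEM can be illustrated by recovering the exponent of sd = c·mean^β from a log-log linear fit. The constants below are arbitrary illustrative values, not PLGEM's fitted parameters.

```python
import numpy as np

# Construct mean/spread pairs that follow an exact power law
# sd = c * mean**beta, then recover the exponent by a log-log fit.
c_true, beta_true = 0.5, 0.8          # illustrative values only
mean = np.logspace(1, 4, 50)          # expression levels from 10 to 10,000
sd = c_true * mean ** beta_true

# In log space the power law is a straight line: log sd = log c + beta*log mean.
beta_fit, log_c_fit = np.polyfit(np.log(mean), np.log(sd), 1)
print(f"fitted exponent: {beta_fit:.3f}, fitted scale: {np.exp(log_c_fit):.3f}")
```

With real microarray data the per-gene (mean, sd) points scatter around the line, and the fitted model then supplies a spread estimate for any expression level, which is what the proposed DEG statistic consumes.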
Wang, Zheng; Vora, Gary J; Stenger, David A
Entamoeba histolytica, Giardia lamblia, and Cryptosporidium parvum are the most frequently identified protozoan parasites causing waterborne disease outbreaks. The morbidity and mortality associated with these intestinal parasitic infections warrant the development of rapid and accurate detection and genotyping methods to aid public health efforts aimed at preventing and controlling outbreaks. In this study, we describe the development of an oligonucleotide microarray capable of detecting and discriminating between E. histolytica, Entamoeba dispar, G. lamblia assemblages A and B, and C. parvum types 1 and 2 in a single assay. Unique hybridization patterns for each selected protozoan were generated by amplifying six to eight diagnostic sequences/organism by multiplex PCR; fluorescent labeling of the amplicons via primer extension; and subsequent hybridization to a set of genus-, species-, and subtype-specific covalently immobilized oligonucleotide probes. The profile-based specificity of this methodology not only permitted the unequivocal identification of the six targeted species and subtypes, but also demonstrated its potential in identifying related species such as Cryptosporidium meleagridis and Cryptosporidium muris. In addition, sensitivity assays demonstrated a detection limit as low as five G. lamblia trophozoites. Taken together, the specificity and sensitivity of the microarray-based approach suggest that this methodology may provide a promising tool to detect and genotype protozoa from clinical and environmental samples.
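The profile-based identification step can be sketched as matching an observed hybridization pattern against per-organism probe signatures. The probe patterns below are invented for illustration; they are not the study's diagnostic sequences or actual probe set.

```python
# Hypothetical probe signatures (1 = probe hybridizes, 0 = it does not).
signatures = {
    "E. histolytica":   (1, 1, 0, 0, 0, 0),
    "E. dispar":        (1, 0, 1, 0, 0, 0),
    "G. lamblia A":     (0, 0, 0, 1, 1, 0),
    "G. lamblia B":     (0, 0, 0, 1, 0, 1),
    "C. parvum type 1": (0, 1, 0, 0, 1, 1),
    "C. parvum type 2": (0, 1, 0, 0, 1, 0),
}

def identify(observed):
    """Return the organism whose signature is closest in Hamming distance."""
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    return min(signatures, key=lambda org: dist(signatures[org], observed))

print(identify((0, 0, 0, 1, 1, 0)))   # exact match -> "G. lamblia A"
print(identify((0, 0, 0, 1, 1, 1)))   # one probe off; nearest signature wins
```

Using a whole profile rather than a single probe is what makes the real assay robust to an occasional failed or cross-hybridizing probe, and lets near-miss profiles flag related species.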
De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric
Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
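The shrinkage and regularized t tests compared here share one idea: stabilize the denominator of the t statistic when variance estimates from few replicates are unreliable. A minimal sketch of that idea follows; the fudge constant s0 and the data are illustrative, not any published method's defaults.

```python
import numpy as np

def regularized_t(x, y, s0=0.05):
    """Two-sample t-like statistic with a constant s0 added to the standard
    error, stabilising genes whose few replicates give tiny variance estimates."""
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    return (x.mean() - y.mean()) / (se + s0)

rng = np.random.default_rng(2)
x = rng.normal(1.0, 0.01, size=2)   # two replicates per condition, with
y = rng.normal(0.0, 0.01, size=2)   # a deceptively small observed spread
t_plain = regularized_t(x, y, s0=0.0)
t_reg = regularized_t(x, y, s0=0.05)
print(f"plain t: {t_plain:.1f}, regularized t: {t_reg:.1f}")
```

With only two replicates, the plain t statistic can explode whenever the sampled variance happens to be tiny; the additive constant caps such spurious scores, which is why these methods dominate the two-replicate comparisons the abstract mentions.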
Full Text Available The concept of producing a prototype of an interoperable cartographic database is explored in this paper, including the possibilities of integrating different geospatial data into the database management system and visualizing them on the Internet. The implementation includes vectorization of the concept of a single map page, creation of the cartographic database in an object-relational database, spatial analysis, and definition and visualization of the database content in the form of a map on the Internet.
Yu, Jeffrey Xu; Chang, Lijun
It has become highly desirable to provide users with flexible ways to query and search information over databases as simply as with keyword search in Google. This book surveys the recent developments in keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. The structural information to be returned can be either trees or subgraphs representing how the objects that contain the required keywords are interconnected in a relational database or in an XML database. Structural keyword search is completely different from
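A minimal sketch of the tree-answer idea: model the database as a graph of objects linked by references, map each keyword to an object containing it, and return a connecting path. All names are illustrative, and the BFS shortest path is a two-keyword simplification of the Steiner-tree-style answers real systems compute.

```python
from collections import deque

# Toy "database as a graph": tuples as nodes, foreign-key links as edges.
edges = {
    "paper1": ["author1", "conf1"],
    "author1": ["paper1", "paper2"],
    "paper2": ["author1", "conf2"],
    "conf1": ["paper1"],
    "conf2": ["paper2"],
}
keyword_nodes = {"microarray": "paper1", "bayesian": "paper2"}

def connect(a, b):
    """BFS shortest path between two keyword-containing nodes:
    a minimal tree answer for a two-keyword query."""
    prev, queue = {a: None}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:                      # reconstruct the path back to a
            path = [node]
            while prev[node] is not None:
                node = prev[node]
                path.append(node)
            return path[::-1]
        for nxt in edges.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

print(connect(keyword_nodes["microarray"], keyword_nodes["bayesian"]))
```

The returned path ("paper1" connected to "paper2" through their shared author) is exactly the kind of interconnection structure keyword search over databases aims to surface.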
Ding Xiaoming; Li Lin; Zhao Shiping
The nuclear power economic database (NPEDB), based on ORACLE V6.0, consists of three parts: an economic database of nuclear power stations, an economic database of the nuclear fuel cycle, and an economic database of nuclear power planning and the nuclear environment. The economic database of nuclear power stations includes data on general economics, technology, capital cost, and benefit. The economic database of the nuclear fuel cycle includes data on technology and nuclear fuel prices. The economic database of nuclear power planning and the nuclear environment includes data on energy history, forecasts, energy balance, electric power, and energy facilities.
Ovacik, Meric A.; Sen, Banalata; Euling, Susan Y.; Gaido, Kevin W.; Ierapetritou, Marianthi G.; Androulakis, Ioannis P.
Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may reveal whether there are additional affected pathways that could suggest additional modes of action linked to DBP developmental toxicity. We show that pathway activity analysis may be considered for a more comprehensive analysis of microarray data.
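The SVD step can be sketched directly: for a single pathway's expression matrix, the first right singular vector gives a per-sample activity score, and the leading singular value measures how much variance that score captures. The toy matrix below is synthetic, with an arbitrary gene count and time course.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy pathway: 8 genes x 6 samples whose expression follows one shared
# activity profile (e.g. a short time course) plus noise.
true_activity = np.array([0.0, 0.2, 0.5, 1.0, 1.5, 2.0])
loadings = rng.normal(1.0, 0.2, size=8)      # per-gene response strengths
X = np.outer(loadings, true_activity) + 0.05 * rng.normal(size=(8, 6))

# Pathway activity level via SVD of the row-centered pathway matrix:
# the first right singular vector scores each sample's pathway activity.
U, s, Vt = np.linalg.svd(X - X.mean(axis=1, keepdims=True), full_matrices=False)
activity = Vt[0]                        # per-sample pathway activity score
explained = s[0] ** 2 / (s ** 2).sum()  # variance captured by this component
print(f"variance explained by the first singular component: {explained:.2f}")
```

Summarizing a whole pathway into one score per sample is what lets this approach detect coordinated but individually modest expression changes that gene-by-gene tests can miss.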
Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.
The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855
Ingeholm, Peter; Gögenur, Ismail; Iversen, Lene H
The aim of the database, which has existed for registration of all patients with colorectal cancer in Denmark since 2001, is to improve the prognosis for this patient group. It covers all Danish patients with newly diagnosed colorectal cancer who are either diagnosed or treated in a surgical department of a public Danish hospital. The database comprises an array of surgical, radiological, oncological, and pathological variables. The surgeons record data such as diagnostics performed, including type and results of radiological examinations, lifestyle factors, comorbidity and performance, treatment including the surgical procedure, urgency of surgery, and intra- and postoperative complications within 30 days after surgery. The pathologists record data such as tumor type, number of lymph nodes and metastatic lymph nodes, surgical margin status, and other pathological risk factors. The database has had >95% completeness in including patients with colorectal adenocarcinoma, with >54,000 patients registered so far, comprising approximately one-third rectal cancers and two-thirds colon cancers, with an overrepresentation of men among rectal cancer patients. The stage distribution was more or less constant until 2014, with a tendency toward a lower rate of stage IV and a higher rate of stage I after introduction of the national screening program in 2014. The 30-day mortality rate after elective surgery has been reduced from >7% in 2001-2003. The database is a national population-based clinical database with high patient and data completeness for the perioperative period. The resolution of data is high for description of the patient at the time of diagnosis, including comorbidities, and for characterizing diagnosis, surgical interventions, and short-term outcomes. The database does not have high-resolution oncological data and does not register recurrences after primary surgery. The Danish Colorectal Cancer Group provides high-quality data and has been documenting an increase in short- and long
Full Text Available ase Description General information of database Database name RPD Alternative name Rice Proteome Database...titute of Crop Science, National Agriculture and Food Research Organization Setsuko Komatsu E-mail: Database... classification Proteomics Resources Plant databases - Rice Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database... description Rice Proteome Database contains information on protei...and entered in the Rice Proteome Database. The database is searchable by keyword,
Full Text Available base Description General information of database Database name JSNP Alternative nam...n Science and Technology Agency Creator Affiliation: Contact address E-mail : Database...sapiens Taxonomy ID: 9606 Database description A database of about 197,000 polymorphisms in Japanese populat...1):605-610 External Links: Original website information Database maintenance site Institute of Medical Scien...er registration Not available About This Database Database Description Download License Update History of This Database
Full Text Available abase Description General information of database Database name ASTRA Alternative n...tics Journal Search: Contact address Database classification Nucleotide Sequence Databases - Gene structure,...3702 Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The database represents classified p...(10):1211-6. External Links: Original website information Database maintenance site National Institute of Ad... for user registration Not available About This Database Database Description Dow
Full Text Available abase Description General information of database Database name PLACE Alternative name A Database...Kannondai, Tsukuba, Ibaraki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Databas...e classification Plant databases Organism Taxonomy Name: Tracheophyta Taxonomy ID: 58023 Database...99, Vol.27, No.1 :297-300 External Links: Original website information Database maintenance site National In...- Need for user registration Not available About This Database Database Descripti
Full Text Available Arabidopsis Phenome Database — Database Description. General information: Creator affiliation: … BioResource Center (Hiroshi Masuya); Database classification: Plant databases – Arabidopsis thaliana; Organism: Arabidopsis thaliana (Taxonomy ID: 3702). Database description: The Arabidopsis thaliana phenome i… their effective application. We developed the new Arabidopsis Phenome Database integrating two novel databases … The other, the “Database of Curated Plant Phenome,” focusing …
Friis-Andersen, Hans; Bisgaard, Thue
To monitor and improve nation-wide surgical outcome after groin hernia repair based on scientific evidence-based surgical strategies for the national and international surgical community. Patients ≥18 years operated on for groin hernia. Type and size of hernia, primary or recurrent repair, type of surgical repair procedure, and mesh and mesh-fixation methods. According to the Danish National Health Act, surgeons are obliged to register all hernia repairs immediately after surgery (3-minute registration time). All institutions have continuous access to their own data, stratified by individual surgeon. Registrations are based on a closed, protected Internet system requiring personal codes that also identify the operating institution. A national steering committee consisting of 13 voluntary and dedicated surgeons, 11 of whom are unpaid, handles the medical management of the database. The Danish Inguinal Hernia Database comprises intraoperative data from >130,000 repairs (May 2015). A total of 49 peer-reviewed national and international publications have been published from the database (June 2015). The Danish Inguinal Hernia Database is fully active, monitors surgical quality, and contributes to the national and international surgical community's efforts to improve outcome after groin hernia repair.
MacBeath, Gavin; Schreiber, Stuart L.
Systematic efforts are currently under way to construct defined sets of cloned genes for high-throughput expression and purification of recombinant proteins. To facilitate subsequent studies of protein function, we have developed miniaturized assays that accommodate extremely low sample volumes and enable the rapid, simultaneous processing of thousands of proteins. A high-precision robot designed to manufacture complementary DNA microarrays was used to spot proteins onto chemically derivatized glass slides at extremely high spatial densities. The proteins attached covalently to the slide surface yet retained their ability to interact specifically with other proteins, or with small molecules, in solution. Three applications for protein microarrays were demonstrated: screening for protein-protein interactions, identifying the substrates of protein kinases, and identifying the protein targets of small molecules.
Calabrese, Barbara; Cannataro, Mario
High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that requires large data storage and computing power. Cloud computing offers massively scalable computing and storage, data sharing, and on-demand, anytime-and-anywhere access to resources and applications, and thus may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services, both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.
Isager Ahl, Louise; Grace, Olwen M; Pedersen, Henriette Lodberg
As the popularity of Aloe vera extracts continues to rise, the desire to fully understand the individual polymer components of the leaf mesophyll, their relation to one another, and the effects they have on the human body is increasing. Polysaccharides present in the leaf mesophyll have been identified as the components responsible for the biological activities of Aloe vera, and they have been widely studied in the past decades. However, the commonly used methods do not provide the desired platform for conducting large comparative studies of polysaccharide compositions, as most of them require a complete or near-complete fractionation of the polymers. The objective of this study was to assess whether carbohydrate microarrays could be used for the high-throughput analysis of cell wall polysaccharides in Aloe leaf mesophyll. The method we chose is known as Comprehensive Microarray Polymer Profiling …
Liu-Stratton, Yiwen; Roy, Sashwati; Sen, Chandan K
The quality and quantity of diet is a key determinant of health and disease. Molecular diagnostics may play a key role in food safety related to genetically modified foods, food-borne pathogens, and novel nutraceuticals. Functional outcomes in biology are determined, for the most part, by the net balance between sets of genes related to the specific outcome in question. DNA microarray technology offers a new dimension of strength in molecular diagnostics by permitting the simultaneous analysis of large sets of genes. Automation of the assay and novel bioinformatics tools make DNA microarrays a robust technology for diagnostics. Since its development a few years ago, this technology has been used in toxicogenomics, pharmacogenomics, cell biology, and clinical investigations addressing the prevention of and intervention in disease. Optimizing this technology to specifically address food safety remains a vast resource to be mined. Efforts to develop custom diagnostic arrays and simplified bioinformatics tools for field use are warranted.
Bae, Jin-Woo; Park, Yong-Ha
Microbial ecological microarrays have been developed for investigating the composition and functions of microorganism communities in environmental niches. These arrays include microbial identification microarrays, which use oligonucleotides, gene fragments or microbial genomes as probes. In this article, the advantages and disadvantages of each type of probe are reviewed. Oligonucleotide probes are currently useful for probing uncultivated bacteria that are not amenable to gene fragment probing, whereas the functional gene fragments amplified randomly from microbial genomes require phylogenetic and hierarchical categorization before use as microbial identification probes, despite their high resolution for both specificity and sensitivity. Until more bacteria are sequenced and gene fragment probes are thoroughly validated, heterogeneous bacterial genome probes will provide a simple, sensitive and quantitative tool for exploring the ecosystem structure.
Full Text Available Growing interest in the future medical applications of nanotechnology is leading to the emergence of a new scientific field called “nanomedicine”. Nanomedicine may be defined as investigating, treating, reconstructing, and controlling human biology and health at the molecular level, using engineered nanodevices and nanostructures. Microarray technology is a revolutionary tool for elucidating the roles of genes in infectious diseases, shifting research from traditional methods to integrated approaches. This technology has great potential to provide medical diagnosis, monitor treatment, and help in the development of new tools for infectious disease prevention and/or management. The aim of this paper is to provide an overview of the current application of microarray platforms and nanomedicine in the study of experimental microbiology and the impact of this technology in clinical settings.
Title 40, Protection of Environment, § 1400.13 Read-only database (2010-07-01). … INFORMATION, Other Provisions, § 1400.13 Read-only database. The Administrator is authorized to establish … public off-site consequence analysis information by means of a central database under the control of the …
Legislative Districts. In order for others to use the information in the Census TIGER database in a geographic information system or for other geographic applications, the Census Bureau releases to the public extracts of the database in the form of TIGER/Line files. Published in 2006, 1:24,000 (1 in = 2,000 ft) scale, Louisiana State University (LSU).
Ladayya, Faroh; Purnami, Santi Wulan; Irhamah
DNA microarray data contain gene expression measurements with small sample sizes and a high number of features. Furthermore, imbalanced classes are a common problem in microarray data: a dataset is dominated by a class that has significantly more instances than the minority classes. Therefore, a classification method is needed that can solve the problems of high dimensionality and imbalanced data. Support Vector Machine (SVM) is a classification method capable of handling large or small samples, nonlinearity, high dimensionality, overlearning and local minima. SVM has been widely applied to DNA microarray data classification, and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data remain a problem because SVM treats all samples with the same importance, so the results are biased toward the majority class. To overcome the imbalance, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM such that different input points make different contributions to the classifier. The minority classes receive large fuzzy memberships, so FSVM can pay more attention to the samples with larger fuzzy membership. Given that DNA microarray data are high dimensional with a very large number of features, it is necessary to perform feature selection first, using the Fast Correlation-Based Filter (FCBF). In this study, SVM, FSVM, and both methods with FCBF applied are analyzed and their classification performance compared. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.
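The core FSVM idea — weighting each training point by a fuzzy membership so that minority-class samples contribute more to the classifier — can be sketched with scikit-learn's per-sample weights. The membership function (distance to the class centroid, scaled by inverse class frequency) and the toy data are illustrative choices, not the study's exact formulation:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy imbalanced two-class data standing in for (feature-selected) microarray profiles.
X = np.vstack([rng.normal(0, 1, (90, 5)), rng.normal(1.5, 1, (10, 5))])
y = np.array([0] * 90 + [1] * 10)

def fuzzy_memberships(X, y, delta=1e-6):
    """Membership decays with distance to the class centroid and is
    scaled up for the minority class so misclassifying it costs more."""
    m = np.empty(len(y))
    counts = np.bincount(y)
    for c in np.unique(y):
        idx = y == c
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        base = 1.0 - d / (d.max() + delta)          # in (0, 1]
        m[idx] = base * (counts.max() / counts[c])  # boost minority class
    return m

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y, sample_weight=fuzzy_memberships(X, y))
print(clf.score(X, y))
```

Standard SVM corresponds to passing no `sample_weight`; comparing the two fits on held-out minority-class samples shows the effect of the fuzzy weighting.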
Jaakson, K; Zernant, J; Külm, M; Hutchinson, A; Tonisson, N; Glavac, D; Ravnik-Glavac, M; Hawlina, M; Meltzer, M R; Caruso, R C; Testa, F; Maugeri, A; Hoyng, C B; Gouras, P; Simonelli, F; Lewis, R A; Lupski, J R; Cremers, F P M; Allikmets, R
Genetic variation in the ABCR (ABCA4) gene has been associated with five distinct retinal phenotypes, including Stargardt disease/fundus flavimaculatus (STGD/FFM), cone-rod dystrophy (CRD), and age-related macular degeneration (AMD). Comparative genetic analyses of ABCR variation and diagnostics have been complicated by substantial allelic heterogeneity and by differences in screening methods. To overcome these limitations, we designed a genotyping microarray (gene chip) for ABCR that includes all approximately 400 disease-associated and other variants currently described, enabling simultaneous detection of all known ABCR variants. The ABCR genotyping microarray (the ABCR400 chip) was constructed by the arrayed primer extension (APEX) technology. Each sequence change in ABCR was included on the chip by synthesis and application of sequence-specific oligonucleotides. We validated the chip by screening 136 confirmed STGD patients and 96 healthy controls, each of whom we had analyzed previously by single strand conformation polymorphism (SSCP) technology and/or heteroduplex analysis. The microarray was >98% effective in determining the existing genetic variation and was comparable to direct sequencing in that it yielded many sequence changes undetected by SSCP. In STGD patient cohorts, the efficiency of the array to detect disease-associated alleles was between 54% and 78%, depending on the ethnic composition and degree of clinical and molecular characterization of a cohort. In addition, chip analysis suggested a high carrier frequency (up to 1:10) of ABCR variants in the general population. The ABCR genotyping microarray is a robust, cost-effective, and comprehensive screening tool for variation in one gene in which mutations are responsible for a substantial fraction of retinal disease. The ABCR chip is a prototype for the next generation of screening and diagnostic tools in ophthalmic genetics, bridging clinical and scientific research. Copyright 2003 Wiley
Travensolo, Regiane F.; Carareto-Alves, Lucia M.; Costa, Maria V.C.G.; Lopes, Tiago J.S.; Carrilho, Emanuel; Lemos, Eliana G.M.
Xylella fastidiosa genome sequencing has generated valuable data by identifying genes acting either in metabolic pathways or in associated pathogenicity and virulence. Based on the available information on these genes, new strategies for studying their expression patterns, such as microarray technology, were employed. A total of 2,600 primer pairs were synthesized and then used to generate fragments by PCR. The arrays were hybridized against cDNAs labeled during reverse transcription …
Tang, C S; Dusseiller, M; Makohliso, S; Heuschkel, M; Sharma, S; Keller, B; Vörös, J
Microarray technology is a powerful tool that provides a high throughput of bioanalytical information within a single experiment. These miniaturized and parallelized binding assays are highly sensitive and have found widespread popularity, especially during the genomic era. However, as drug diagnostics studies are often targeted at membrane proteins, the current arraying technologies are ill-equipped to handle the fragile nature of the protein molecules. In addition, to understand the complex structure and functions of proteins, different strategies to immobilize the probe molecules selectively onto a platform for protein microarrays are required. We propose a novel approach to create a (membrane) protein microarray by using an indium tin oxide (ITO) microelectrode array with an electronic multiplexing capability. A polycationic, protein- and vesicle-resistant copolymer, poly(l-lysine)-grafted-poly(ethylene glycol) (PLL-g-PEG), is exposed to and adsorbed uniformly onto the microelectrode array as a passivating adlayer. An electronic stimulation is then applied to the individual ITO microelectrodes, resulting in the localized release of the polymer and thus revealing a bare ITO surface. Different polymer and biological moieties are specifically immobilized onto the activated ITO microelectrodes while the other regions remain protein-resistant, as they are unaffected by the induced electrical potential. The desorption process of the PLL-g-PEG is observed to be highly selective, rapid, and reversible without compromising the integrity and performance of the conductive ITO microelectrodes. As such, we have successfully created a stable and heterogeneous microarray of biomolecules by using selective electronic addressing on ITO microelectrodes. Both pharmaceutical diagnostics and biomedical technology are expected to benefit directly from this unique method.
Full Text Available Abstract Background Composting is one of the methods utilised in recycling organic communal waste. The composting process is dependent on aerobic microbial activity and proceeds through a succession of different phases, each dominated by certain microorganisms. In this study, a ligation-detection-reaction (LDR) based microarray method was adapted for species-level detection of compost microbes characteristic of each stage of the composting process. LDR utilises the specificity of the ligase enzyme to covalently join two adjacently hybridised probes. A zip-oligo is attached to the 3'-end of one probe and a fluorescent label to the 5'-end of the other probe. Upon ligation, the probes are combined in the same molecule and can be detected at a specific location on a universal microarray with complementary zip-oligos, enabling equivalent hybridisation conditions for all probes. The method was applied to samples from Nordic composting facilities after testing and optimisation with fungal pure cultures and environmental clones. Results Probes targeted for fungi were able to detect 0.1 fmol of target ribosomal PCR product in an artificial reaction mixture containing 100 ng of competing fungal ribosomal internal transcribed spacer (ITS) area or herring sperm DNA. The detection level was therefore approximately 0.04% of total DNA. Clone libraries were constructed from eight compost samples. The LDR microarray results were in concordance with the clone library sequencing results. In addition, a control probe was used to monitor the per-spot hybridisation efficiency on the array. Conclusion This study demonstrates that the LDR microarray method is capable of sensitive and accurate species-level detection from a complex microbial community. The method can detect key species from compost samples, making it a basis for a tool for compost process monitoring in industrial facilities.
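As a back-of-the-envelope check, 0.1 fmol of PCR product against 100 ng of competing DNA works out to roughly 0.04% by mass if one assumes an amplicon of about 600 bp and an average of 650 g/mol per base pair of double-stranded DNA (both assumptions ours, not stated in the abstract):

```python
# Convert 0.1 fmol of a hypothetical ~600 bp ribosomal amplicon to mass
# and compare against 100 ng of competing DNA.
target_mol = 0.1e-15                     # 0.1 fmol of ribosomal PCR product
amplicon_bp = 600                        # assumed amplicon length
mass_g = target_mol * amplicon_bp * 650  # ~650 g/mol per bp of dsDNA
mass_ng = mass_g * 1e9                   # ≈ 0.039 ng
fraction = mass_ng / 100.0               # against 100 ng competing DNA
print(f"{fraction:.2%}")                 # → 0.04%
```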
Full Text Available Objective To study the application of the DNA microarray technique for screening and identifying multiple food-borne pathogens. Methods Oligonucleotide probes were designed with Clustal X and Oligo 6.0 at the conserved regions of specific genes of multiple food-borne pathogens, and then validated by bioinformatic analyses. The 5' end of each probe was modified with an amino group and 10 poly-T, and the optimized probes were synthesized and spotted onto aldehyde-coated slides. The bacterial DNA template, incubated with Klenow enzyme, was amplified by arbitrarily primed PCR, and PCR products incorporating aminoallyl-dUTP were coupled with fluorescent dye. After hybridization of the purified PCR products with the DNA microarray, hybridization images and fluorescence intensity analyses were acquired with ScanArray and GenePix Pro 5.1 software. A series of detection conditions, such as arbitrarily primed PCR and microarray hybridization, were optimized. The specificity of this approach was evaluated with DNA from 16 different bacteria, and the sensitivity and reproducibility were verified with DNA from 4 food-borne pathogens. Samples of multiple bacterial DNAs and simulated water samples of Shigella dysenteriae were detected. Results Nine different food-borne bacteria were successfully discriminated under the same conditions. The sensitivity for genomic DNA was 10²–10³ pg/μl, and the coefficient of variation (CV) for the reproducibility of the assay was less than 15%. The corresponding specific hybridization maps of the multiple bacterial DNA samples were obtained, and the detection limit for the simulated water sample of Shigella dysenteriae was 3.54×10⁵ cfu/ml. Conclusions The DNA microarray detection system based on arbitrarily primed PCR can be employed for effective detection of multiple food-borne pathogens, and this assay may offer a new high-throughput platform for detecting bacteria.
Full Text Available Abstract Background In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. Results We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services, in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. Conclusions Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.
Kodaira, M.; Sasaki, K.; Tagawa, H.; Omine, H.; Kushiro, J.; Takahashi, N.; Katayama, H.
We are trying to evaluate the genetic effects of radiation on humans using mutation frequency as an indicator. For efficient detection of mutations, it is important to understand the mechanism and characteristics of radiation-induced mutations. We have started the analysis of hypoxanthine-guanine phosphoribosyl transferase (HPRT) mutants induced by X-rays in order to clarify deletion size and mutation distribution. We analyzed 39 human X-ray-induced HPRT-deletion mutants using microarray-CGH. The array for this analysis contains 57 BAC clones covering as much as possible of the 4 Mb on the 5' side and 10 Mb on the 3' side of the HPRT gene, based on the NCBI genome database. DNA from the parent strain and each HPRT-mutant strain was labeled with Cy5 and Cy3, respectively, then mixed and hybridized on the array. The fluorescence intensity ratios of the resulting spots were analyzed using software we developed to identify clones corresponding to the deleted region. The deletions in these strains ranged up to 3.5 Mb on the 5' side and 6 Mb on the 3' side of the HPRT gene. Deletions in 13 strains ended around BAC clones located at about 3 Mb on the 5' side. On the 3' side, deletions extended up to specific clones located at 1.5 Mb in 11 strains. The mutations appear to be complex at the 3' end of the deletion; some were accompanied by duplications, and others could not be explained by a single mutation event. We need to confirm these results, taking into account experimental reproducibility and the accuracy of the published genetic map. The results obtained using microarray-CGH help us to search for regions where deletions are easily induced and to identify the factors affecting the extent of deletions.
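The ratio analysis described above — labeling parent and mutant DNA with different dyes and flagging clones whose mutant signal collapses — can be sketched with hypothetical per-clone intensities; the two-fold log-ratio cutoff and all numbers are illustrative, not the authors' actual thresholds or data:

```python
import numpy as np

# Hypothetical fluorescence intensities per BAC clone (parent = Cy5, mutant = Cy3).
# HPRT is X-linked, so a deletion drives the mutant signal toward background.
clones = ["BAC_%02d" % i for i in range(8)]
cy5 = np.array([1000, 980, 1020, 990, 1010, 1005, 995, 1000], dtype=float)  # parent
cy3 = np.array([1010, 990,   60,  55,   50, 1000, 990, 1005], dtype=float)  # mutant

log2_ratio = np.log2(cy3 / cy5)
# Call a clone deleted when the mutant signal drops at least two-fold
# below the parent's (log2 ratio < -1, an illustrative cutoff).
deleted = [c for c, r in zip(clones, log2_ratio) if r < -1.0]
print(deleted)  # contiguous clones spanning the deletion
```

In practice the contiguous run of flagged clones delimits the deletion breakpoints on the physical map.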
Bryant, Susan V
Full Text Available Abstract Background Microarray analysis and 454 cDNA sequencing were used to investigate a centuries-old problem in regenerative biology: the basis of nerve-dependent limb regeneration in salamanders. Innervated (NR) and denervated (DL) forelimbs of Mexican axolotls were amputated and transcripts were sampled after 0, 5, and 14 days of regeneration. Results Considerable similarity was observed between NR and DL transcriptional programs at 5 and 14 days post amputation (dpa). Genes with extracellular functions that are critical to wound healing were upregulated while muscle-specific genes were downregulated. Thus, many processes that are regulated during early limb regeneration do not depend upon nerve-derived factors. The majority of the transcriptional differences between NR and DL limbs were correlated with blastema formation; cell numbers increased in NR limbs after 5 dpa and this yielded distinct transcriptional signatures of cell proliferation in NR limbs at 14 dpa. These transcriptional signatures were not observed in DL limbs. Instead, gene expression changes within DL limbs suggest more diverse and protracted wound-healing responses. 454 cDNA sequencing complemented the microarray analysis by providing deeper sampling of transcriptional programs and associated biological processes. Assembly of new 454 cDNA sequences with existing expressed sequence tag (EST) contigs from the Ambystoma EST database more than doubled (3,935 to 9,411) the number of non-redundant human-A. mexicanum orthologous sequences. Conclusion Many new candidate gene sequences were discovered for the first time and these will greatly enable future studies of wound healing, epigenetics, genome stability, and nerve-dependent blastema formation and outgrowth using the axolotl model.
Full Text Available Desert locusts (Schistocerca gregaria) show an extreme form of phenotypic plasticity and can transform between a cryptic solitarious phase and a swarming gregarious phase. The two phases differ extensively in behavior, morphology and physiology, but very little is known about the molecular basis of these differences. We used our recently generated Expressed Sequence Tag (EST) database derived from S. gregaria central nervous system (CNS) to design oligonucleotide microarrays and compare the expression of thousands of genes in the CNS of long-term gregarious and solitarious adult desert locusts. This identified 214 differentially expressed genes, of which 40% have been annotated to date. These include genes encoding proteins that are associated with CNS development and modeling, sensory perception, stress response and resistance, and fundamental cellular processes. Our microarray analysis has identified genes whose altered expression may enable locusts of either phase to deal with the different challenges they face. Genes for heat shock proteins and proteins which confer protection from infection were upregulated in gregarious locusts, which may allow them to respond to acute physiological challenges. By contrast, the longer-lived solitarious locusts appear to be more strongly protected from the slowly accumulating effects of ageing by an upregulation of genes related to anti-oxidant systems, detoxification and anabolic renewal. Gregarious locusts also had a greater abundance of transcripts for proteins involved in sensory processing and in nervous system development and plasticity. Gregarious locusts live in a more complex sensory environment than solitarious locusts and may require a greater turnover of proteins involved in sensory transduction, and possibly greater neuronal plasticity.
Hinman, R.; Thrall, B.; Wong, K.
A cDNA microarray allows biologists to examine the expression of thousands of genes simultaneously. Researchers may analyze the complete transcriptional program of an organism in response to specific physiological or developmental conditions. By design, a cDNA microarray is an experiment with many variables and few controls. One question that inevitably arises when working with a cDNA microarray is data reproducibility. How easy is it to confirm mRNA expression patterns? In this paper, a case study involving the treatment of a murine macrophage RAW 264.7 cell line with tumor necrosis factor alpha (TNF) was used to obtain a rough estimate of data reproducibility. Two trials were examined and a list of genes displaying either a > 2-fold or > 4-fold increase in gene expression was compiled. Variations in signal mean ratios between the two slides were observed. We can assume that errors in reproducibility may be compensated for by greater induction levels of similar genes. Steps taken to obtain results included serum starvation of cells before treatment, tests of mRNA for quality and consistency, and data normalization.
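The two-trial comparison described above amounts to compiling a >2-fold gene list per slide and measuring how well the lists agree. A minimal sketch with simulated signal-mean ratios (all numbers invented for illustration, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(1)
genes = [f"gene_{i}" for i in range(200)]
# Simulated treated/control signal-mean ratios for two replicate slides,
# with multiplicative slide-to-slide noise; genes 0-19 are truly induced 4-fold.
true_fc = np.where(np.arange(200) < 20, 4.0, 1.0)
trial1 = true_fc * rng.lognormal(0.0, 0.3, 200)
trial2 = true_fc * rng.lognormal(0.0, 0.3, 200)

# Compile the >2-fold list for each slide, as in the paper's analysis.
up1 = {g for g, r in zip(genes, trial1) if r > 2}
up2 = {g for g, r in zip(genes, trial2) if r > 2}

# One simple reproducibility measure: Jaccard overlap of the two call sets.
overlap = len(up1 & up2) / max(len(up1 | up2), 1)
print(round(overlap, 2))
```

Strongly induced genes survive the noise on both slides, while borderline calls account for most of the disagreement between trials.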
Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph
The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow, consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R-scripts that document MAGMA's entire data processing steps, thereby allowing the user to regenerate all results in a local R installation. The implementation of MAGMA follows the model-view-controller design pattern that strictly separates the R-based statistical data processing, the web representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art JavaServer Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch.
Xu, Jiucheng; Mu, Huiyu; Wang, Yun; Huang, Fangzhou
The selection of feature genes with high recognition ability from gene expression profiles has gained great significance in biology. However, most existing methods have high time complexity and poor classification performance. Motivated by this, an effective feature selection method based on locally linear embedding and correlation coefficient algorithms, called supervised locally linear embedding with Spearman's rank correlation coefficient (SLLE-SC²), is proposed. Supervised locally linear embedding takes class label information into account and improves classification performance. Furthermore, Spearman's rank correlation coefficient is used to remove coexpressed genes. Experimental results obtained on four public tumor microarray datasets illustrate that our method is valid and feasible.
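The redundancy-removal step — dropping a gene when its Spearman rank correlation with an already-selected gene is high — can be sketched as a greedy filter; the 0.9 cutoff and the toy data are our own illustrative choices, not the paper's:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
# Toy expression matrix: 30 samples x 5 genes; gene 1 is a noisy copy of gene 0.
X = rng.normal(size=(30, 5))
X[:, 1] = X[:, 0] + rng.normal(scale=0.05, size=30)  # coexpressed pair

def drop_coexpressed(X, threshold=0.9):
    """Greedily keep a gene only if its |Spearman rho| against every
    already-kept gene stays below the threshold (illustrative cutoff)."""
    kept = []
    for j in range(X.shape[1]):
        if all(abs(spearmanr(X[:, j], X[:, k])[0]) < threshold for k in kept):
            kept.append(j)
    return kept

print(drop_coexpressed(X))  # the redundant copy (gene 1) is filtered out
```

Rank correlation is preferred over Pearson here because coexpression in microarray data is often monotonic but not linear.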
Grams, W H
The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U.S. Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3, 4, and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process, and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from t…
Full Text Available Almost every organization has a database at its centre. The database provides support for conducting different activities, whether production, sales and marketing, or internal operations. Every day, a database is accessed for help in strategic decisions. Meeting such needs therefore requires high-quality security and availability. Those needs can be met using a DBMS (Database Management System), which is, in fact, the software for a database. Technically speaking, it is software that uses a standard method of cataloguing, recovery, and running different data queries. A DBMS manages the input data, organizes it, and provides ways for its users or other programs to modify or extract the data. Managing the database is an operation that requires periodic updates, optimization, and monitoring.
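The cataloguing, modification, and query capabilities described above can be illustrated with Python's built-in sqlite3 module, a minimal embedded DBMS (table and values are invented for the example):

```python
import sqlite3

# Cataloguing: the schema declares what data the DBMS manages.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (product TEXT, amount REAL)")

# Modification: input data is handed to the DBMS, which organizes it.
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("widget", 10.0), ("widget", 5.5), ("gadget", 7.25)])
con.commit()

# Querying: a declarative request; the DBMS decides how to answer it.
total = con.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()
print(total)  # [('gadget', 7.25), ('widget', 15.5)]
con.close()
```

The same SQL would run unchanged against a full client-server DBMS, which is the point of the standard interface.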
Kannegaard, Pia Nimann; Vinding, Kirsten L; Hare-Bruun, Helle
AIM OF DATABASE: The aim of the National Database of Geriatrics is to monitor the quality of interdisciplinary diagnostics and treatment of patients admitted to a geriatric hospital unit. STUDY POPULATION: The database population consists of patients who were admitted to a geriatric hospital unit....... Geriatric patients cannot be defined by specific diagnoses. A geriatric patient is typically a frail multimorbid elderly patient with decreasing functional ability and social challenges. The database includes 14-15,000 admissions per year, and the database completeness has been stable at 90% during the past......, percentage of discharges with a rehabilitation plan, and the proportion of cases where an interdisciplinary conference has taken place. Data are recorded by doctors, nurses, and therapists in a database and linked to the Danish National Patient Register. DESCRIPTIVE DATA: Descriptive patient-related data include...
Homer, Collin G.; Fry, Joyce A.; Barnes, Christopher A.
The National Land Cover Database (NLCD) serves as the definitive Landsat-based, 30-meter resolution, land cover database for the Nation. NLCD provides spatial reference and descriptive data for characteristics of the land surface such as thematic class (for example, urban, agriculture, and forest), percent impervious surface, and percent tree canopy cover. NLCD supports a wide variety of Federal, State, local, and nongovernmental applications that seek to assess ecosystem status and health, understand the spatial patterns of biodiversity, predict effects of climate change, and develop land management policy. NLCD products are created by the Multi-Resolution Land Characteristics (MRLC) Consortium, a partnership of Federal agencies led by the U.S. Geological Survey. All NLCD data products are available for download at no charge to the public from the MRLC Web site: http://www.mrlc.gov.
Full Text Available Global gene expression analysis using microarrays and, more recently, RNA-seq, has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of these tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED). Our method uses PED to build a classifier on the experimental data to rank genes by importance. In place of cross-validation, which is required by most similar methods but not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.
Juntunen, R. (Risto)
In a distributed database, data is spread throughout the network into separate nodes with different DBMS systems (Date, 2000). According to the CAP theorem, three database properties (consistency, availability, and partition tolerance) cannot be achieved simultaneously in distributed database systems. Two of these properties can be achieved, but not all three at the same time (Brewer, 2000). Since this theorem there has b...
Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A.; Trukhachev, Vladimir I.; Kostyukova, Elena I.; Gerasimov, Alexey N.; Kitas, George D.
Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and d...
The Supply Chain Initiatives Database (SCID) presents innovative approaches to engaging industrial suppliers in efforts to save energy, increase productivity, and improve environmental performance. This comprehensive and freely accessible database was developed by the Institute for Industrial Productivity (IIP). IIP acknowledges Ecofys for their valuable contributions. The database contains case studies searchable by the types of activities buyers are undertaking to motivate suppliers, target sector, organization leading the initiative, and program or partnership linkages.
Full Text Available Database Description - SAHG | LSDB Archive. Database name: SAHG. Database classification: Structure Databases; Protein properties. Organism: Homo sapiens (Taxonomy ID: 9606). Contact: Chie Motono, Tel: +81-3-3599-8067. Database maintenance site: The Molecular Profiling Research Center for D...