Matsuda, Fumio; Shinbo, Yoko; Oikawa, Akira; Hirai, Masami Yokota; Fiehn, Oliver; Kanaya, Shigehiko; Saito, Kazuki
Background In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions with similar theoretical masses. However, incorrect hits derived from errors in mass analysis will be included in the results of elemental composition searches. To assess the quality of peak annotation information, this study presents a novel methodology for evaluating false discovery rates (FDR). Based on the FDR analyses, several aspects of an elemental composition search are discussed, including setting a threshold, estimating the FDR, and the types of elemental composition databases that are most reliable for searching. Methodology/Principal Findings The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30–50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that the FDR could be improved by using a compound database with fewer entries but higher completeness. Conclusions/Significance High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR metabolome data. PMID:19847304
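The mass-tolerance search and the hit rate mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' FDR procedure (the four Monte Carlo parameters are not given here); the tiny database, the function names `search_by_mass` and `hit_rate`, and the tolerance values are assumptions chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical mini-database of (monoisotopic mass in Da, elemental composition).
# The masses are standard monoisotopic values; the selection is illustrative only.
DATABASE = sorted([
    (180.06339, "C6H12O6"),
    (165.07898, "C9H11NO2"),
    (146.06914, "C5H10N2O3"),
    (146.05791, "C6H10O4"),
])


def search_by_mass(query_mass, tol_ppm, database=DATABASE):
    """Return all compositions whose mass lies within +/- tol_ppm of query_mass."""
    masses = [mass for mass, _ in database]
    delta = query_mass * tol_ppm * 1e-6  # convert the ppm tolerance to Da
    lo = bisect_left(masses, query_mass - delta)
    hi = bisect_right(masses, query_mass + delta)
    return [formula for _, formula in database[lo:hi]]


def hit_rate(query_masses, tol_ppm):
    """Fraction of queries that return at least one database hit."""
    hits = sum(1 for mass in query_masses if search_by_mass(mass, tol_ppm))
    return hits / len(query_masses)


print(search_by_mass(146.0690, 10))    # tight window
print(search_by_mass(146.0690, 100))   # loose window admits an extra candidate
print(hit_rate([146.0690, 200.0], 10))
```

Widening the tolerance from 10 to 100 ppm adds a second candidate for the same query, which is the mechanism by which less accurate mass data inflate the number of incorrect hits.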
Bouhifd, Mounir; Beger, Richard; Flynn, Thomas; Guo, Lining; Harris, Georgina; Hogberg, Helena; Kaddurah-Daouk, Rima; Kamp, Hennicke; Kleensang, Andre; Maertens, Alexandra; Odwin-DaCosta, Shelly; Pamies, David; Robertson, Donald; Smirnova, Lena; Sun, Jinchun; Zhao, Liang; Hartung, Thomas
Metabolomics promises a holistic phenotypic characterization of biological responses to toxicants. This technology is based on advanced chemical analytical tools with reasonable throughput, including mass-spectroscopy and NMR. Quality assurance, however - from experimental design, sample preparation, metabolite identification, to bioinformatics data-mining - is urgently needed to assure both quality of metabolomics data and reproducibility of biological models. In contrast to microarray-based transcriptomics, where consensus on quality assurance and reporting standards has been fostered over the last two decades, quality assurance of metabolomics is only now emerging. Regulatory use in safety sciences, and even proper scientific use of these technologies, demand quality assurance. In an effort to promote this discussion, an expert workshop discussed the quality assurance needs of metabolomics. The goals for this workshop were 1) to consider the challenges associated with metabolomics as an emerging science, with an emphasis on its application in toxicology and 2) to identify the key issues to be addressed in order to establish and implement quality assurance procedures in metabolomics-based toxicology. Consensus has still to be achieved regarding best practices to make sure sound, useful, and relevant information is derived from these new tools.
The term 'metabolomics' refers to the discipline that determines the set of small molecules (metabolites) produced by an organism at a given time. Metabolomic analysis requires complex technological platforms that first separate the different molecules (by liquid or gas chromatography) and then identify them on the basis of their characteristic mass-to-charge ratio (m/z). This study is motivated by estimates that the climate changes expected in the coming decades will include a rapid increase in the concentration of CO2 in the atmosphere. In this context, it is essential to predict how these climatic changes will affect the quality of the plant products that form the basis of our diet.
Mahieu, Nathaniel G; Patti, Gary J
When using liquid chromatography/mass spectrometry (LC/MS) to perform untargeted metabolomics, it is now routine to detect tens of thousands of features from biological samples. Poor understanding of the data, however, has complicated interpretation and masked the number of unique metabolites actually being measured in an experiment. Here we place an upper bound on the number of unique metabolites detected in Escherichia coli samples analyzed with one untargeted metabolomics method. We first group multiple features arising from the same analyte, which we call "degenerate features", using a context-driven annotation approach. Surprisingly, this analysis revealed thousands of previously unreported degeneracies that reduced the number of unique analytes to ∼2961. We then applied an orthogonal approach to remove nonbiological features from the data using the 13 C-based credentialing technology. This further reduced the number of unique analytes to less than 1000. Our 90% reduction in data is 5-fold greater than previously published studies. On the basis of the results, we propose an alternative approach to untargeted metabolomics that relies on thoroughly annotated reference data sets. To this end, we introduce the creDBle database ( http://creDBle.wustl.edu ), which contains accurate mass, retention time, and MS/MS fragmentation data as well as annotations of all credentialed features.
Dudzik, Danuta; Barbas-Bernardos, Cecilia; García, Antonia; Barbas, Coral
Untargeted metabolomics, as a global approach, has already proven its great potential and capabilities for the investigation of health and disease, as well as its wide applicability to other research areas. Although great progress has been made on the feasibility of metabolomics experiments, challenges remain, including all sources of fluctuation and bias affecting every step of multiplatform untargeted metabolomics studies. Identifying and reducing the main sources of unwanted variation in the pre-analytical, analytical and post-analytical phases of metabolomics experiments is essential to ensure high data quality. There is still a lack of harmonized guidelines for quality assurance comparable to those available for targeted analysis. In this review, sources of variation to be considered and minimized are discussed, along with methodologies and strategies for monitoring and improving the quality of the results. The information given is based on evidence from different groups, on our own experience, and on recommendations for each stage of the metabolomics workflow. The comprehensive overview and tools presented here may serve other researchers interested in monitoring, controlling and improving the reliability of their findings by implementing good experimental quality practices in untargeted metabolomics studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Kamstrup-Nielsen, Maja Hermann
Metabolomics is the analysis of the whole metabolome and the focus in metabolomics studies is to measure as many metabolites as possible. The use of chemometrics in metabolomics studies is widespread, but there is a clear lack of validation in the developed models. The focus in this thesis has been...... how to properly handle complex metabolomics data, in order to achieve reliable and valid multivariate models. This has been illustrated by three case studies with examples of forecasting breast cancer and early detection of colorectal cancer based on data from nuclear magnetic resonance (NMR...... is a presentation of a core consistency diagnostic aiding in determining the number of components in a PARAFAC2 model. It is of great importance to validate especially PLS-DA models and if not done properly, the developed models might reveal spurious groupings. Furthermore, data from metabolomics studies contain...
Boudah, Samia; Olivier, Marie-Françoise; Aros-Calt, Sandrine; Oliveira, Lydie; Fenaille, François; Tabet, Jean-Claude; Junot, Christophe
This work aims at evaluating the relevance and versatility of liquid chromatography coupled to high resolution mass spectrometry (LC/HRMS) for performing a qualitative and comprehensive study of the human serum metabolome. To this end, three different chromatographic systems based on a reversed phase (RP), hydrophilic interaction chromatography (HILIC) and a pentafluorophenylpropyl (PFPP) stationary phase were used, with detection in both positive and negative electrospray modes. LC/HRMS platforms were first assessed for their ability to detect, retain and separate 657 metabolite standards representative of the chemical families occurring in biological fluids. More than 75% were efficiently retained in either one LC-condition and less than 5% were exclusively retained by the RP column. These three LC/HRMS systems were then evaluated for their coverage of serum metabolome. The combination of RP, HILIC and PFPP based LC/HRMS methods resulted in the annotation of about 1328 features in the negative ionization mode, and 1358 in the positive ionization mode on the basis of their accurate mass and precise retention time in at least one chromatographic condition. Less than 12% of these annotations were shared by the three LC systems, which highlights their complementarity. HILIC column ensured the greatest metabolome coverage in the negative ionization mode, whereas PFPP column was the most effective in the positive ionization mode. Altogether, 192 annotations were confirmed using our spectral database and 74 others by performing MS/MS experiments. This resulted in the formal or putative identification of 266 metabolites, among which 59 are reported for the first time in human serum. Copyright © 2014 Elsevier B.V. All rights reserved.
The identification of translation initiation sites (TISs) constitutes an important aspect of sequence-based genome analysis. An erroneous TIS annotation can impair the identification of regulatory elements and N-terminal signal peptides, and may also flaw the determination of descent for any particular gene. We have formulated a reference-free method to score the TIS annotation quality. The method is based on a comparison of the observed and expected distribution of all TISs in a particular genome given prior gene-calling. We have assessed the TIS annotations for all available NCBI RefSeq microbial genomes and found that approximately 87% are of appropriate quality, whereas 13% need substantial improvement. We have analyzed a number of factors that could affect TIS annotation quality, such as GC-content, taxonomy, the fraction of genes with a Shine-Dalgarno sequence and the year of publication. The analysis showed that only the first factor has a clear effect. We have then formulated a straightforward Principal Component Analysis-based TIS identification strategy to self-organize and score potential TISs. The strategy is independent of reference data and a priori calculations. A representative set of 277 genomes was subjected to the analysis and we found a clear increase in TIS annotation quality for the genomes with a low quality score. The PCA-based annotation was also compared with annotation by the current tool of reference, Prodigal. The comparison for the model genome of Escherichia coli K12 showed that both methods supplement each other and that prediction agreement can be used as an indicator of a correct TIS annotation. Importantly, the data suggest that the addition of a PCA-based strategy to a Prodigal prediction can be used to 'flag' TIS annotations for re-evaluation and, in addition, can be used to evaluate a given annotation in case a Prodigal annotation is lacking.
Welzenbach, Julia; Neuhoff, Christiane; Looft, Christian; Schellander, Karl; Tholen, Ernst; Große-Brinkhaus, Christine
The aim of this study was to elucidate the underlying biochemical processes to identify potential key molecules of meat quality traits drip loss, pH of meat 1 h post-mortem (pH1), pH in meat 24 h post-mortem (pH24) and meat color. An untargeted metabolomics approach detected the profiles of 393 annotated and 1,600 unknown metabolites in 97 Duroc × Pietrain pigs. Despite obvious differences regarding the statistical approaches, the four applied methods, namely correlation analysis, principal component analysis, weighted network analysis (WNA) and random forest regression (RFR), revealed mainly concordant results. Our findings lead to the conclusion that meat quality traits pH1, pH24 and color are strongly influenced by processes of post-mortem energy metabolism like glycolysis and pentose phosphate pathway, whereas drip loss is significantly associated with metabolites of lipid metabolism. In case of drip loss, RFR was the most suitable method to identify reliable biomarkers and to predict the phenotype based on metabolites. On the other hand, WNA provides the best parameters to investigate the metabolite interactions and to clarify the complex molecular background of meat quality traits. In summary, it was possible to attain findings on the interaction of meat quality traits and their underlying biochemical processes. The detected key metabolites might be better indicators of meat quality especially of drip loss than the measured phenotype itself and potentially might be used as bio indicators. PMID:26919205
Background Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform routinely with a mass accuracy of Results Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and to calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data. Conclusion We conclude that although ultra-high accurate mass instruments provide major insight into the chemical diversity of biological extracts, the facile annotation of a large proportion of signals is not possible by simple, automated query of current databases using computed molecular formulae. Parameterising MZedDB to take into account predicted ionisation behaviour and the biological source of any sample greatly improves both the frequency and accuracy of potential annotation 'hits' in ESI-MS data.
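The idea of computing the exact mass of each potential ionisation product from a neutral structure can be sketched in Python as follows. This is not MZedDB's rule set: the short adduct table, the name `adduct_mz` and the restriction to singly charged ions are assumptions made for illustration. The mass shifts themselves are standard values (proton 1.007276 Da, Na+ 22.989218 Da, K+ 38.963158 Da, water 18.010565 Da).

```python
# Mass shift in Da added to the neutral monoisotopic mass for a few common,
# singly charged ESI ionisation products (hand-picked, illustrative table).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,                   # gain of a proton
    "[M+Na]+": 22.989218,                 # sodium adduct
    "[M+K]+": 38.963158,                  # potassium adduct
    "[M-H]-": -1.007276,                  # loss of a proton
    "[M+H-H2O]+": 1.007276 - 18.010565,   # protonation plus neutral loss of water
}


def adduct_mz(neutral_mass, adduct):
    """m/z of a singly charged ionisation product of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def all_targets(neutral_mass):
    """Every ionisation product in the table, as annotation search targets."""
    return {name: round(adduct_mz(neutral_mass, name), 5) for name in ADDUCT_SHIFTS}


glucose = 180.06339  # monoisotopic mass of C6H12O6
for name, mz in all_targets(glucose).items():
    print(f"{name:12s} {mz:.5f}")
```

Searching measured m/z values against such per-adduct targets, rather than against neutral masses alone, is what allows salt adducts and neutral loss fragments to be recognised as belonging to one metabolite.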
Habchi, Baninia; Alves, Sandra; Jouan-Rimbaud Bouveresse, Delphine; Appenzeller, Brice; Paris, Alain; Rutledge, Douglas N; Rathahao-Paris, Estelle
Due to the presence of pollutants in the environment and food, the assessment of human exposure is required. This necessitates high-throughput approaches enabling large-scale analysis and, as a consequence, the use of high-performance analytical instruments to obtain highly informative metabolomic profiles. In this study, direct introduction mass spectrometry (DIMS) was performed using a Fourier transform ion cyclotron resonance (FT-ICR) instrument equipped with a dynamically harmonized cell. Data quality was evaluated based on mass resolving power (RP), mass measurement accuracy, and ion intensity drifts from repeated injections of a quality control (QC) sample along the analytical process. The large DIMS data size entails the use of bioinformatic tools for the automatic selection of common ions found in all QC injections and for robustness assessment and correction of possible technical drifts. RP values greater than 10^6 and mass measurement accuracy below 1 ppm were obtained using broadband mode, resulting in the detection of isotopic fine structure. Hence, a very accurate relative isotopic mass defect (RΔm) value was calculated. This significantly reduces the number of elemental composition (EC) candidates and greatly improves compound annotation. A very satisfactory estimate of repeatability of both peak intensity and mass measurement was demonstrated. Although a non-negligible ion intensity drift was observed for negative ion mode data, a normalization procedure was easily applied to correct this phenomenon. This study illustrates the performance and robustness of the dynamically harmonized FT-ICR cell for large-scale high-throughput metabolomic analyses in routine conditions. Graphical abstract Analytical performance of FT-ICR instrument equipped with a dynamically harmonized cell.
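Two of the quality figures named in this abstract, mass measurement accuracy in ppm and the repeatability of QC intensities, follow from standard definitions and can be sketched in Python. The function names and the example numbers are assumptions for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass measurement error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz, fwhm):
    """Mass resolving power m / delta-m, with delta-m the peak width at half maximum."""
    return mz / fwhm


def rsd_percent(intensities):
    """Relative standard deviation (%) of one ion across repeated QC injections."""
    return stdev(intensities) / mean(intensities) * 100


# Hypothetical readings for one ion across four QC injections.
print(round(ppm_error(181.07085, 181.070666), 3))
print(resolving_power(400.0, 0.0004))
print(round(rsd_percent([100.0, 102.0, 98.0, 100.0]), 3))
```

A low RSD across QC injections indicates good repeatability, while a systematic trend in the same values over injection order would point to an intensity drift of the kind the abstract reports for negative ion mode.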
Rezvan, Mohammadreza; Shekarpour, Saeedeh; Balasuriya, Lakshika; Thirunarayan, Krishnaprasad; Shalin, Valerie; Sheth, Amit
Having a quality annotated corpus is essential, especially for applied research. Despite the recent focus of the Web science community on cyberbullying research, the community still does not have standard benchmarks. In this paper, we publish first, a quality annotated corpus and second, an offensive words lexicon capturing different types of harassment: (i) sexual harassment, (ii) racial harassment, (iii) appearance-related harassment, (iv) intellectual harassment, and (v) politic...
Shu, Yisong; Liu, Zhenli; Zhao, Siyu; Song, Zhiqian; He, Dan; Wang, Menglei; Zeng, Honglian; Lu, Cheng; Lu, Aiping; Liu, Yuanyan
Traditional Chinese medicine (TCM) exerts its therapeutic effect in a holistic fashion with the synergistic function of multiple characteristic constituents. The holism philosophy of TCM is coincident with global and systematic theories of metabolomics. The proposed pseudotargeted metabolomics methodologies were employed for the establishment of reliable quality control markers for use in the screening strategy of TCMs. Pseudotargeted metabolomics integrates the advantages of both targeted and untargeted methods. In the present study, targeted metabolomics equipped with the gold standard RRLC-QqQ-MS method was employed for in vivo quantitative plasma pharmacochemistry study of characteristic prototypic constituents. Meanwhile, untargeted metabolomics using UHPLC-QE Orbitrap HRMS with better specificity and selectivity was employed for identification of untargeted metabolites in the complex plasma matrix. In all, 32 prototypic metabolites were quantitatively determined, and 66 biotransformed metabolites were convincingly identified after being orally administered with standard extracts of four labeled Citrus TCMs. The global absorption and metabolism process of complex TCMs was depicted in a systematic manner.
Kirwan, Jennifer A; Weber, Ralf J M; Broadhurst, David I; Viant, Mark R
Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to maintain data quality. This dataset represents a systematic evaluation of the reproducibility of a multi-batch DIMS metabolomics study of cardiac tissue extracts. It comprises twenty biological samples (cow vs. sheep) that were analysed repeatedly, in 8 batches across 7 days, together with a concurrent set of quality control (QC) samples. Data are presented from each step of the workflow and are available in MetaboLights. The strength of the dataset is that intra- and inter-batch variation can be corrected using QC spectra and the quality of this correction assessed independently using the repeatedly-measured biological samples. Originally designed to test the efficacy of a batch-correction algorithm, it will enable others to evaluate novel data processing algorithms. Furthermore, this dataset serves as a benchmark for DIMS metabolomics, derived using best-practice workflows and rigorous quality assessment. PMID:25977770
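The principle of correcting inter-batch variation with QC samples can be illustrated with a deliberately simple Python sketch. It is not the batch-correction algorithm the dataset was designed to test: it only rescales each batch so that its QC median matches the overall QC median, for a single feature. The function name and the example values are hypothetical.

```python
from statistics import median


def qc_median_correct(intensities, batches, is_qc):
    """Scale every batch so that its QC median equals the overall QC median.

    intensities: measured intensity of one feature, one value per injection
    batches:     batch label of each injection
    is_qc:       True where the injection is a QC sample
    """
    overall_qc = median(x for x, qc in zip(intensities, is_qc) if qc)
    factors = {}
    for batch in set(batches):
        batch_qc = [x for x, b, qc in zip(intensities, batches, is_qc)
                    if b == batch and qc]
        factors[batch] = overall_qc / median(batch_qc)
    return [x * factors[b] for x, b in zip(intensities, batches)]


# Hypothetical feature measured in two batches; batch 2 reads twice as high.
raw = [100.0, 110.0, 90.0, 200.0, 220.0, 180.0]
batch = [1, 1, 1, 2, 2, 2]
qc = [True, False, False, True, False, False]
print(qc_median_correct(raw, batch, qc))
```

Because the correction factors are estimated from QC injections only, the repeatedly measured biological samples remain available as an independent check of how well the correction worked, which is the design strength the abstract highlights.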
Background Metabolomic studies are targeted at identifying and quantifying all metabolites in a given biological context. Among the tools used for metabolomic research, mass spectrometry is one of the most powerful. However, metabolomics by mass spectrometry always reveals a high number of unknown compounds, which complicates in-depth mechanistic or biochemical understanding. In principle, mass spectrometry can be utilized within strategies of de novo structure elucidation of small molecules, starting with the computation of the elemental composition of an unknown metabolite using accurate masses with errors Results High mass accuracy (95% of false candidates. This orthogonal filter can condense several thousand candidates down to only a small number of molecular formulas. Example calculations for 10, 5, 3, 1 and 0.1 ppm mass accuracy are given. Corresponding software scripts can be downloaded from http://fiehnlab.ucdavis.edu. A comparison of eight chemical databases revealed that PubChem and the Dictionary of Natural Products can be recommended for automatic queries using molecular formulae. Conclusion More than 1.6 million molecular formulae in the range 0–500 Da were generated in an exhaustive manner under strict observation of mathematical and chemical rules. Assuming that ion species are fully resolved (either by chromatography or by high resolution mass spectrometry), we conclude that a mass spectrometer capable of 3 ppm mass accuracy and 2% error for isotopic abundance patterns outperforms mass spectrometers with less than 1 ppm mass accuracy or even hypothetical mass spectrometers with 0.1 ppm mass accuracy that do not include isotope information in the calculation of molecular formulae.
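How the number of candidate molecular formulae depends on mass accuracy can be explored with a small brute-force Python sketch. It is not the software referred to in the abstract: it is restricted to C, H, N and O, the element limits are arbitrary, and the only chemical rule applied is a simple cap on hydrogen count (at most 2C + N + 2), all of which are assumptions for illustration.

```python
# Monoisotopic masses of the most abundant isotopes, in Da.
MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def candidate_formulas(target, tol_ppm, max_c=12, max_h=30, max_n=4, max_o=8):
    """Enumerate CHNO formulae whose monoisotopic mass is within tol_ppm of target."""
    delta = target * tol_ppm * 1e-6
    found = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                if heavy > target + delta:
                    break  # adding more oxygen only increases the mass
                # Nearest hydrogen count that could close the remaining gap.
                h = round((target - heavy) / MASS["H"])
                # Simple saturation cap (assumed rule): H <= 2C + N + 2.
                if not 0 <= h <= min(max_h, 2 * c + n + 2):
                    continue
                if abs(heavy + h * MASS["H"] - target) <= delta:
                    found.append(
                        f"C{c}" + (f"H{h}" if h else "")
                        + (f"N{n}" if n else "") + (f"O{o}" if o else "")
                    )
    return found


for ppm in (10, 5, 3, 1, 0.1):
    print(ppm, candidate_formulas(180.06339, ppm))
```

Printing the candidate lists for the tolerances mentioned in the abstract shows how the list can only shrink or stay the same as the window narrows; an isotopic abundance filter of the kind the abstract describes would be applied on top of this mass-only step.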
Lee, Kyung-Min; Jeon, Jun-Yeong; Lee, Byeong-Ju; Lee, Hwanhui; Choi, Hyung-Kyoon
Metabolomics has been used as a powerful tool for the analysis and quality assessment of the natural product (NP)-derived medicines. It is increasingly being used in the quality control and standardization of NP-derived medicines because they are composed of hundreds of natural compounds. The most common techniques that are used in metabolomics consist of NMR, GC-MS, and LC-MS in combination with multivariate statistical analyses including principal components analysis (PCA) and partial least squares-discriminant analysis (PLS-DA). Currently, the quality control of the NP-derived medicines is usually conducted using HPLC and is specified by one or two indicators. To create a superior quality control framework and avoid adulterated drugs, it is necessary to be able to determine and establish standards based on multiple ingredients using metabolic profiling and fingerprinting. Therefore, the application of various analytical tools in the quality control of NP-derived medicines forms the major part of this review. Veregen ® (Medigene AG, Planegg/Martinsried, Germany), which is the first botanical prescription drug approved by US Food and Drug Administration, is reviewed as an example that will hopefully provide future directions and perspectives on metabolomics technologies available for the quality control of NP-derived medicines.
In recent years, omic sciences have been increasingly employed in a multitude of research fields thanks to their high-throughput capabilities and holistic approach. Among the omic sciences, metabolomics and foodomics have recently emerged in the investigation of food and nutrition...... and their relation to the individual health and wellness status (Chapter 1). The analytical platforms used are ideal for non-targeted analysis, due to their capability of detecting and identifying a large set of variables (or metabolites) in complex biological samples. The most employed metabolomics techniques...... carried out both in Italy and in Denmark, outlines the analytical pipeline of the foodomic approach and highlights the current challenges in the field (Chapter 2.3). The thesis traces the path of modern foodomics and metabolomics from the definition and description of food quality (Chapters 3 to 6...
This study characterized the changes in quality and quantity of saliva, and changes in the salivary metabolomic profile, to understand the effects of masticatory stimulation. Stimulated and unstimulated saliva samples were collected from 55 subjects and salivary hydrophilic metabolites were comprehensively quantified using capillary electrophoresis-time-of-flight mass spectrometry. In total, 137 metabolites were identified and quantified. The concentrations of 44 metabolites in stimulated saliva were significantly higher than those in unstimulated saliva. Pathway analysis identified the upregulation of the urea cycle and of the synthesis and degradation pathways of glycine, serine, cysteine and threonine in stimulated saliva. A principal component analysis revealed that the effect of masticatory stimulation on salivary metabolomic profiles was less dependent on sample population sex, age, and smoking. The concentrations of only 1 metabolite in unstimulated saliva, and of 3 metabolites in stimulated saliva, showed significant correlation with salivary secretion volume, indicating that the salivary metabolomic profile and salivary secretion volume were independent factors. Masticatory stimulation affected not only salivary secretion volume, but also metabolite concentration patterns. A low correlation between the secretion volume and these patterns supports the conclusion that the salivary metabolomic profile may be a new indicator to characterize masticatory stimulation.
Liu, Shao; Liang, Yi-Zeng; Liu, Hai-Tao
Traditional Chinese medicines (TCMs) pose a great challenge for quality control and efficacy evaluation because of the complexity of their chemical composition. Chemometric techniques provide a good opportunity for mining more useful chemical information from TCMs. The application of chemometrics in the field of TCMs is therefore natural and necessary. This review focuses on important recent chemometrics tools for chromatographic fingerprinting, including peak alignment information features and baseline correction, and on applications of chemometrics in metabolomics and the modernization of TCMs, including authentication and evaluation of the quality of TCMs, evaluation of the efficacy of TCMs and the essence of TCM syndrome. In the conclusions, the general trends and some recommendations for improving chromatographic metabolomics data analysis are provided. Copyright © 2016 Elsevier B.V. All rights reserved.
Simader, Alexandra Maria; Kluger, Bernhard; Neumann, Nora Katharina Nicole; Bueschl, Christoph; Lemmens, Marc; Lirk, Gerald; Krska, Rudolf; Schuhmacher, Rainer
Metabolomics experiments often comprise large numbers of biological samples resulting in huge amounts of data. This data needs to be inspected for plausibility before data evaluation to detect putative sources of error e.g. retention time or mass accuracy shifts. Especially in liquid chromatography-high resolution mass spectrometry (LC-HRMS) based metabolomics research, proper quality control checks (e.g. for precision, signal drifts or offsets) are crucial prerequisites to achieve reliable and comparable results within and across experimental measurement sequences. Software tools can support this process. The software tool QCScreen was developed to offer a quick and easy data quality check of LC-HRMS derived data. It allows a flexible investigation and comparison of basic quality-related parameters within user-defined target features and the possibility to automatically evaluate multiple sample types within or across different measurement sequences in a short time. It offers a user-friendly interface that allows an easy selection of processing steps and parameter settings. The generated results include a coloured overview plot of data quality across all analysed samples and targets and, in addition, detailed illustrations of the stability and precision of the chromatographic separation, the mass accuracy and the detector sensitivity. The use of QCScreen is demonstrated with experimental data from metabolomics experiments using selected standard compounds in pure solvent. The application of the software identified problematic features, samples and analytical parameters and suggested which data files or compounds required closer manual inspection. QCScreen is an open source software tool which provides a useful basis for assessing the suitability of LC-HRMS data prior to time consuming, detailed data processing and subsequent statistical analysis. It accepts the generic mzXML format and thus can be used with many different LC-HRMS platforms to process both multiple
The FDA mandates that digital electrocardiograms (ECGs) from 'thorough' QTc trials be submitted into the ECG Warehouse in Health Level 7 extended markup language format with annotated onset and offset points of waveforms. The FDA did not disclose the exact Warehouse metrics and minimal acceptable quality standards. The author describes the Warehouse scoring algorithms and metrics used by FDA, points out ways to improve FDA review and suggests Warehouse benefits for pharmaceutical sponsors. The Warehouse ranks individual ECGs according to their score for each quality metric and produces histogram distributions with Warehouse-specific thresholds that identify ECGs of questionable quality. Automatic Warehouse algorithms assess the quality of QT annotation and duration of manual QT measurement by the central ECG laboratory.
The assessment of oocyte quality in human in vitro fertilization (IVF) is receiving increasing attention from embryologists. Oocyte selection and the identification of the best oocytes would help to limit embryo overproduction and to improve the results of oocyte cryostorage programs. Follicular fluid (FF) is easily available during oocyte pick-up and theoretically represents an optimal source of non-invasive biochemical predictors of oocyte quality. Unfortunately, however, studies aiming to find a good molecular predictor of oocyte quality in FF have not been able to identify substances that could be used as reliable markers of oocyte competence for fertilization, embryo development and pregnancy. In recent years, a clear trend has been observed away from the search for single molecular markers and toward more complex techniques that study all metabolites of FF. The metabolomic approach is a powerful tool to study biochemical predictors of oocyte quality in FF, but its application in this area is still in its infancy. This review provides an overview of current knowledge about the biochemical predictors of oocyte quality in FF, describing both the results of studies on single biochemical markers and those of the most recent metabolomics studies.
Johannesen, Lars; Galeotti, Loriano
An algorithm to determine the quality of electrocardiograms (ECGs) can enable inexperienced nurses and paramedics to record ECGs of sufficient diagnostic quality. Previously, we proposed an algorithm for determining if ECG recordings are of acceptable quality, which was entered in the PhysioNet Challenge 2011. In the present work, we propose an improved two-step algorithm, which first rejects ECGs with macroscopic errors (signal absent, large voltage shifts or saturation) and subsequently quantifies the noise (baseline, powerline or muscular noise) on a continuous scale. The performance of the improved algorithm was evaluated using the PhysioNet Challenge database (1500 ECGs rated by humans for signal quality). We achieved a classification accuracy of 92.3% on the training set and 90.0% on the test set. The improved algorithm is capable of detecting ECGs with macroscopic errors and giving the user a score of the overall quality. This allows the user to assess the degree of noise and decide if it is acceptable depending on the purpose of the recording.
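The two-step idea in the abstract above (reject gross failures first, then grade the remaining noise on a continuous scale) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the thresholds (`flat_tol`, `sat_level`, `max_step`) and the moving-average residual used as a noise score are all assumptions chosen only to make the structure concrete.

```python
import math

def macroscopic_error(signal, flat_tol=1e-3, sat_level=5.0, max_step=2.0):
    """Step 1 (hypothetical thresholds): return a reason string if the
    lead is unusable, otherwise None."""
    if max(signal) - min(signal) < flat_tol:
        return "signal absent"
    if any(abs(v) >= sat_level for v in signal):
        return "saturation"
    if any(abs(b - a) > max_step for a, b in zip(signal, signal[1:])):
        return "large voltage shift"
    return None

def noise_score(signal, window=5):
    """Step 2 (assumed metric): RMS of the residual left after subtracting
    a centered moving average. Larger values mean a noisier trace."""
    half = window // 2
    total = 0.0
    for i, value in enumerate(signal):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smooth = sum(signal[lo:hi]) / (hi - lo)
        total += (value - smooth) ** 2
    return math.sqrt(total / len(signal))

def assess(signal):
    """Run the rejection step, then score noise only for accepted signals."""
    reason = macroscopic_error(signal)
    if reason is not None:
        return {"accepted": False, "reason": reason, "noise": None}
    return {"accepted": True, "reason": None, "noise": noise_score(signal)}

# Toy signals: a clean sine wave and the same wave with alternating noise.
clean = [math.sin(2 * math.pi * i / 50) for i in range(200)]
noisy = [v + (0.2 if i % 2 else -0.2) for i, v in enumerate(clean)]
```

Keeping the rejection step separate means that a flat or saturated lead never receives a misleadingly low noise score; the continuous score is only reported for recordings that pass the first step, and the user decides what level is acceptable for the purpose at hand.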
Guo, An Chi; Jewison, Timothy; Wilson, Michael; Liu, Yifeng; Knox, Craig; Djoumbou, Yannick; Lo, Patrick; Mandal, Rupasri; Krishnamurthy, Ram; Wishart, David S.
The Escherichia coli Metabolome Database (ECMDB, http://www.ecmdb.ca) is a comprehensively annotated metabolomic database containing detailed information about the metabolome of E. coli (K-12). Modelled closely on the Human and Yeast Metabolome Databases, the ECMDB contains >2600 metabolites with links to ~1500 different genes and proteins, including enzymes and transporters. The information in the ECMDB has been collected from dozens of textbooks, journal articles and electronic databases. E...
strategy influences the patterns identified as important for the nutritional question under study. Therefore, in depth understanding of the study design and the specific effects of the analytical technology on the produced data is extremely important to achieve high quality data handling. Besides data......Metabolomics provides a holistic approach to investigate the perturbations in human metabolism with respect to a specific exposure. In nutritional metabolomics, the research question is generally related to the effect of a specific food intake on metabolic profiles commonly of plasma or urine....... Application of multiple analytical strategies may provide comprehensive information to reach a valid answer to these research questions. In this thesis, I investigated several analytical technologies and data handling strategies in order to evaluate their effects on the biological answer. In metabolomics, one...
Diet, dietary patterns, and other environmental factors such as exposure to toxins play an important role in the prevention or development of many diseases, such as obesity and type 2 diabetes, and consequently in the health status of individuals. A major challenge nowadays is to identify novel biomarkers to detect metabolic dysfunction as early as possible and to predict the evolution of health status in order to refine nutritional advice to specific population groups. Omics technologies such as genomics, transcriptomics, proteomics, and metabolomics coupled with statistical and bioinformatics tools have already shown great potential in this research field, even if so far only a few biomarkers have been validated. For the past two decades, important analytical techniques have been developed to detect as many metabolites as possible in human biofluids such as urine, blood, and saliva. In the field of food science and nutrition, many studies have been carried out for food authenticity, quality, and safety, as well as for food processing. Furthermore, metabolomic investigations have been carried out to discover new early biomarkers of metabolic dysfunction and predictive biomarkers of developing pathologies (obesity, metabolic syndrome, type-2 diabetes, etc.). Great emphasis is also placed on the development of methodologies to identify and validate biomarkers of nutrient exposure. © 2017 Elsevier Inc. All rights reserved.
Goji (fruits of Lycium barbarum L. and L. chinense Mill.) has been used in China as food and medicine for millennia, and globally has been consumed increasingly as a healthy food. Ningxia, with a semi-arid climate, has always had the reputation of producing the best goji quality (daodi area). Recently, increasing market demand has pushed cultivation into new regions with different climates. We therefore ask: How does goji quality differ among production areas of various climatic regions? Historical records are used to trace the spread of goji production in China over time. Quality measurements of 51 samples were correlated with the four main production areas in China: monsoon (Hebei), semi-arid (Ningxia, Gansu, and Inner Mongolia), plateau (Qinghai) and arid regions (Xinjiang). We include morphological characteristics, sugar and polysaccharide content, antioxidant activity, and metabolomic profiling to compare goji among climatic regions. Goji cultivation probably began in the East (Hebei) of China around 100 CE and later shifted westward to the semi-arid regions. Goji from monsoon, plateau and arid regions differ according to fruit morphology, whereas semi-arid goji cannot be separated from the other regions. L. chinense fruits, which are exclusively cultivated in Hebei (monsoon), are significantly lighter, smaller and brighter in color, while the heaviest and largest fruits (L. barbarum) stem from the plateau. The metabolomic profiling separates the two species but not the regions of cultivation. Lycium chinense and samples from the semi-arid regions have significantly (p < 0.01) lower sugar contents and L. chinense shows the highest antioxidant activity. Our results do not justify superiority of a specific production area over other areas. Instead it will be essential to distinguish goji from different regions based on the specific morphological and chemical traits with the aim to understand what its intended uses are.
Farag, Mohamed A
The number of botanical dietary supplements in the market has recently increased primarily due to increased health awareness. Standardization and quality control of the constituents of these plant extracts is an important topic, particularly when such ingredients are used long term as dietary supplements, or in cases where higher doses are marketed as drugs. The development of fast, comprehensive, and effective untargeted analytical methods for plant extracts is of high interest. Nuclear magnetic resonance spectroscopy and mass spectrometry are the most informative tools, each of which enables high-throughput and global analysis of hundreds of metabolites in a single step. Although only one of the two techniques is utilized in the majority of plant metabolomics applications, there is a growing interest in combining the data from both platforms to effectively unravel the complexity of plant samples. The application of combined MS and NMR in the quality control of nutraceuticals forms the major part of this review. Finally I will look at the future developments and perspectives of these two technologies for the quality control of herbal materials.
Tedersoo, Leho; Abarenkov, Kessy; Nilsson, R Henrik; Schüssler, Arthur; Grelet, Gwen-Aëlle; Kohout, Petr; Oja, Jane; Bonito, Gregory M; Veldre, Vilmar; Jairus, Teele; Ryberg, Martin; Larsson, Karl-Henrik; Kõljalg, Urmas
Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source were supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.
Saito, Kazuki; Matsuda, Fumio
Metabolomics now plays a significant role in fundamental plant biology and applied biotechnology. Plants collectively produce a huge array of chemicals, far more than are produced by most other organisms; hence, metabolomics is of great importance in plant biology. Although substantial improvements have been made in the field of metabolomics, the uniform annotation of metabolite signals in databases and informatics through international standardization efforts remains a challenge, as does the development of new fields such as fluxome analysis and single cell analysis. The principle of transcript and metabolite cooccurrence, particularly transcriptome coexpression network analysis, is a powerful tool for decoding the function of genes in Arabidopsis thaliana. This strategy can now be used for the identification of genes involved in specific pathways in crops and medicinal plants. Metabolomics has gained importance in biotechnology applications, as exemplified by quantitative loci analysis, prediction of food quality, and evaluation of genetically modified crops. Systems biology driven by metabolome data will aid in deciphering the secrets of plant cell systems and their application to biotechnology.
Sarapa, Nenad; Mortara, Justin L; Brown, Barry D; Isola, Lamberto; Badilini, Fabio
The US Food and Drug Administration recommends submission of digital electrocardiograms in the standard HL7 XML format into the electrocardiogram warehouse to support preapproval review of new drug applications. The Food and Drug Administration scrutinizes electrocardiogram quality by viewing the annotated waveforms and scoring electrocardiogram quality by the warehouse algorithms. Part of the Food and Drug Administration warehouse is commercially available to sponsors as the E-Scribe Warehouse. The authors tested the performance of E-Scribe Warehouse algorithms by quantifying electrocardiogram acquisition quality, adherence to QT annotation protocol, and T-wave signal strength in 2 data sets: "reference" (104 digital electrocardiograms from a phase I study with sotalol in 26 healthy subjects with QT annotations by computer-assisted manual adjustment) and "test" (the same electrocardiograms with an intentionally introduced predefined number of quality issues). The E-Scribe Warehouse correctly detected differences between the 2 sets expected from the number and pattern of errors in the "test" set (except for 1 subject with QT misannotated in different leads of serial electrocardiograms) and confirmed the absence of differences where none was expected. E-Scribe Warehouse scores below the threshold value identified individual electrocardiograms with questionable T-wave signal strength. The E-Scribe Warehouse showed satisfactory performance in detecting electrocardiogram quality issues that may impair reliability of QTc assessment in clinical trials in healthy subjects.
Ma, Yue; Lévy, François; Ghimire, Sudeep
Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...
Gas chromatography-mass spectrometry (GC-MS)-based metabolomics is ideal for identifying and quantitating small molecular metabolites (metabolomics easily allows integrating targeted assays for absolute quantification of specific metabolites with untargeted metabolomics to discover novel compounds. Complemented by database annotations using large spectral libraries and validated, standardized standard operating procedures, GC-MS can identify and semi-quantify over 200 compounds per study in human body fluids (e.g., plasma, urine or stool) samples. Deconvolution software enables detection of more than 300 additional unidentified signals that can be annotated through accurate mass instruments with appropriate data processing workflows, similar to liquid chromatography-MS untargeted profiling (LC-MS). Hence, GC-MS is a mature technology that not only uses classic detectors (‘quadrupole’) but also target mass spectrometers (‘triple quadrupole’) and accurate mass instruments (‘quadrupole-time of flight’). This unit covers the following aspects of GC-MS-based metabolomics: (i) sample preparation from mammalian samples, (ii) acquisition of data, (iii) quality control, and (iv) data processing. PMID:27038389
Overmars, L.; Siezen, R.J.; Francke, C.
The identification of translation initiation sites (TISs) constitutes an important aspect of sequence-based genome analysis. An erroneous TIS annotation can impair the identification of regulatory elements and N-terminal signal peptides, and also may flaw the determination of descent, for any
Chagoyen, Mónica; López-Ibáñez, Javier; Pazos, Florencio
Metabolomics aims at characterizing the repertory of small chemical compounds in a biological sample. As it becomes more massive and larger sets of compounds are detected, a functional analysis is required to convert these raw lists of compounds into biological knowledge. The most common way of performing such analysis is "annotation enrichment analysis," also used in transcriptomics and proteomics. This approach extracts the annotations overrepresented in the set of chemical compounds arising in a given experiment. Here, we describe the protocols for performing such analysis as well as for visualizing a set of compounds in different representations of the metabolic networks, in both cases using freely accessible web tools.
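As an illustration of what "annotation enrichment analysis" computes, the following sketch scores each annotation with a one-sided hypergeometric test, which is a common choice for over-representation analysis. The abstract does not state which statistic the described web tools use, so the test, the function names and the toy compound identifiers below are assumptions.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n compounds are drawn without replacement from a
    background of N compounds, K of which carry the annotation."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def enrichment(hits, annotations, background):
    """Return (annotation, overlap, p-value) tuples sorted by p-value.
    Annotations with no overlap with the hit list are skipped."""
    background = set(background)
    hits = set(hits) & background
    results = []
    for term, members in annotations.items():
        members = set(members) & background
        overlap = len(hits & members)
        if overlap == 0:
            continue
        p = hypergeom_pvalue(overlap, len(hits), len(members), len(background))
        results.append((term, overlap, p))
    return sorted(results, key=lambda r: r[2])

# Hypothetical toy data: ten background compounds, two annotated pathways.
background = [f"c{i}" for i in range(1, 11)]
annotations = {"glycolysis": ["c1", "c2", "c3"], "TCA cycle": ["c4", "c5", "c6"]}
detected = ["c1", "c2", "c3"]
ranked = enrichment(detected, annotations, background)
```

In practice the p-values would also need a multiple-testing correction, since many annotations are tested at once; that step is omitted here to keep the sketch short.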
Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke
A metabolome--the collection of comprehensive quantitative data on metabolites in an organism--has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.
Picone, Gianfranco; Engelsen, Søren Balling; Savorani, Francesco; Testi, Silvia; Badiani, Anna; Capozzi, Francesco
The molecular profiles of perchloric acid solutions extracted from the flesh of Sparus aurata fish specimens, produced according to different aquaculture systems, have been investigated. The 1H-NMR spectra of aqueous extracts are indicative of differences in the metabolite content of fish reared under different conditions that are already distinguishable at their capture, and substantially maintain the same differences in their molecular profiles after sixteen days of storage under ice. The fish metabolic profiles are studied by top-down chemometric analysis. The results of this exploratory investigation show that the fish metabolome accurately reflects the rearing conditions. The levels of many metabolites co-vary with the rearing conditions, and a few metabolites are quantified, including glycogen (stress indicator), histidine, alanine and glycine, which all display significant changes dependent on the aquaculture system and on the storage times. PMID:22254093
Vanzo, Andreja; Jenko, Mojca; Vrhovsek, Urska; Stopar, Matej
Apple quality was investigated in the scab-resistant 'Liberty', 'Santana', and 'Topaz' cultivars and the scab-susceptible 'Golden Delicious' cultivar. Trees subjected to the same crop load were cultivated using either an organic (ORG) or an integrated production (IP) system. Physicochemical properties, phenolic content, and sensorial quality of fruit from both systems were compared. There were no significant differences in fruit mass, starch, and total soluble solid content (the latter was higher in ORG 'Liberty') between ORG and IP fruit, whereas significantly higher flesh firmness was found in ORG fruit (except no difference in 'Golden Delicious'). Significantly higher total phenolic content in ORG fruit was found in 'Golden Delicious', whereas differences in other cultivars were not significant. Targeted metabolomic profiling of multiple classes of phenolics confirmed the impact of the production system on the 'Golden Delicious' phenolic profile as higher levels of 4-hydroxybenzoic acid, neo- and chlorogenic acids, phloridzin, procyanidin B2+B4, -3-O-glucoside and -3-O-galactoside of quercetin, kaempferol-3-O-rutinoside, and rutin being found in ORG fruit. The results obtained suggested that scab resistance influenced the phenolic biosynthesis in relation to the agricultural system. Sensorial evaluation indicated significantly better flavor (except for 'Topaz') and better appearance of IP fruit.
Wang, Ya-Qin; Hu, Li-Ping; Liu, Guang-Min; Zhang, De-Shuang; He, Hong-Ju
Chinese kale (Brassica alboglabra Bailey) is a widely consumed vegetable which is rich in antioxidants and anticarcinogenic compounds. Herein, we used an untargeted ultra-high-performance liquid chromatography (UHPLC)-Quadrupole-Orbitrap MS/MS-based metabolomics strategy to study the nutrient profiles of Chinese kale. Seven Chinese kale cultivars and three different edible parts were evaluated, and amino acids, sugars, organic acids, glucosinolates and phenolic compounds were analysed simultaneously. We found that two cultivars, a purple-stem cultivar W1 and a yellow-flower cultivar Y1, had more health-promoting compounds than others. The multivariate statistical analysis results showed that gluconapin was the most important contributor for discriminating both cultivars and edible parts. The purple-stem cultivar W1 had higher levels of some phenolic acids and flavonoids than the green stem cultivars. Compared to stems and leaves, the inflorescences contained more amino acids, glucosinolates and most of the phenolic acids. Meanwhile, the stems had the least amounts of phenolic compounds among the organs tested. Metabolomics is a powerful approach for the comprehensive understanding of vegetable nutritional quality. The results provide the basis for future metabolomics-guided breeding and nutritional quality improvement.
Bouatra, Souhaila; Aziat, Farid; Mandal, Rupasri; Guo, An Chi; Wilson, Michael R.; Knox, Craig; Bjorndahl, Trent C.; Krishnamurthy, Ramanarayan; Saleem, Fozia; Liu, Philip; Dame, Zerihun T.; Poelzer, Jenna; Huynh, Jessica; Yallou, Faizath S.; Psychogios, Nick; Dong, Edison; Bogumil, Ralf; Roehring, Cornelia; Wishart, David S.
Urine has long been a “favored” biofluid among metabolomics researchers. It is sterile, easy-to-obtain in large volumes, largely free from interfering proteins or lipids and chemically complex. However, this chemical complexity has also made urine a particularly difficult substrate to fully understand. As a biological waste material, urine typically contains metabolic breakdown products from a wide range of foods, drinks, drugs, environmental contaminants, endogenous waste metabolites and bacterial by-products. Many of these compounds are poorly characterized and poorly understood. In an effort to improve our understanding of this biofluid we have undertaken a comprehensive, quantitative, metabolome-wide characterization of human urine. This involved both computer-aided literature mining and comprehensive, quantitative experimental assessment/validation. The experimental portion employed NMR spectroscopy, gas chromatography mass spectrometry (GC-MS), direct flow injection mass spectrometry (DFI/LC-MS/MS), inductively coupled plasma mass spectrometry (ICP-MS) and high performance liquid chromatography (HPLC) experiments performed on multiple human urine samples. This multi-platform metabolomic analysis allowed us to identify 445 and quantify 378 unique urine metabolites or metabolite species. The different analytical platforms were able to identify (quantify) a total of: 209 (209) by NMR, 179 (85) by GC-MS, 127 (127) by DFI/LC-MS/MS, 40 (40) by ICP-MS and 10 (10) by HPLC. Our use of multiple metabolomics platforms and technologies allowed us to identify several previously unknown urine metabolites and to substantially enhance the level of metabolome coverage. It also allowed us to critically assess the relative strengths and weaknesses of different platforms or technologies. The literature review led to the identification and annotation of another 2206 urinary compounds and was used to help guide the subsequent experimental studies. An online database containing
Anna A Vanyushkina
We present a systematic study of three bacterial species that belong to the class Mollicutes, the smallest and simplest bacteria: Spiroplasma melliferum, Mycoplasma gallisepticum, and Acholeplasma laidlawii. To understand the difference in the basic principles of metabolism regulation and adaptation to environmental conditions in the three species, we analyzed the metabolome of these bacteria. Metabolic pathways were reconstructed using the proteogenomic annotation data provided by our lab. The results of metabolome, proteome and genome profiling suggest a fundamental difference in the adaptation of the three closely related Mollicute species to stress conditions. As the transaldolase is not annotated in Mollicutes, we propose variants of the pentose phosphate pathway catalyzed by annotated enzymes for the three species. For metabolite detection we employed high performance liquid chromatography coupled with mass spectrometry. We used hydrophilic interaction chromatography with a silica column, as it effectively separates highly polar cellular metabolites prior to their detection by the mass spectrometer.
The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-...
syndrome are complex disorders and are not caused by a high-calorie diet and low exercise level alone. The specific nature of the nutrients, independent of their caloric value, also play a role. The question is which. In the quest to answer this question the qualitative intake of protein is of special...... and prevention of the metabolic syndrome related to obesity and diabetes. In this thesis the effects of whey intake on the human metabolome was investigated using a metabolomics approach. We demonstrated that intake of whey causes a decreased rate of gastric emptying compared to other protein sources....... Therefore this thesis will also present and discuss state-of-the-art tools for computer-assisted compound identification, including: annotation of adducts and fragments, determination of the molecular ion, in silico fragmentation, retention time mapping between analytical systems and de novo retention time...
Bernillon, Stéphane; Biais, Benoit; Deborde, Catherine
Melon (Cucumis melo L.) is a global crop in terms of economic importance and nutritional quality. The aim of this study was to explore the variability in metabolite and elemental composition of several commercial varieties of melon in various environmental conditions. Volatile and non...
Bernillon, S.; Biais, B.; Deborde, C.; Maucort, M.; Cabasson, C.; Gibon, Y.; Hansen, T.; Husted, S.; Vos, de R.C.H.; Mumm, R.; Jonker, H.; Ward, J.L.; Miller, S.J.; Baker, J.M.; Burger, J.; Tadmor, Y.; Beale, M.H.; Schjoerring, J.K.; Schaffer, A.; Rolin, D.; Hall, R.D.; Moing, A.
Melon (Cucumis melo L.) is a global crop in terms of economic importance and nutritional quality. The aim of this study was to explore the variability in metabolite and elemental composition of several commercial varieties of melon in various environmental conditions. Volatile and non-volatile
Thissen, U.; Coulier, L.; Overkamp, K.M.; Jetten, J.; Werff, B.J.C. van de; Ven, T. van de; Werf, M.J. van der
In agricultural and food products, typical quality parameters are sensory properties, shelf-life, safety, health, nutritional value, crop yield per area and disease resistance. It is known that these parameters are importantly determined by the metabolites in the crops and food products.
Keshavan, Anisha; Madan, Christopher; Datta, Esha; McDonough, Ian
Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.
The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855
Nemkov, Travis; Hansen, Kirk C; Dumont, Larry J; D'Alessandro, Angelo
Biochemical investigations on the regulatory mechanisms of red blood cell (RBC) and platelet (PLT) metabolism have fostered a century of advances in the field of transfusion medicine. Owing to these advances, storage of RBCs and PLT concentrates has become a lifesaving practice in clinical and military settings. There, however, remains room for improvement, especially with regard to the introduction of novel storage and/or rejuvenation solutions, alternative cell processing strategies (e.g., pathogen inactivation technologies), and quality testing (e.g., evaluation of novel containers with alternative plasticizers). Recent advancements in mass spectrometry-based metabolomics and systems biology, the bioinformatics integration of omics data, promise to speed up the design and testing of innovative storage strategies developed to improve the quality, safety, and effectiveness of blood products. Here we review the currently available metabolomics technologies and briefly describe the routine workflow for transfusion medicine-relevant studies. The goal is to provide transfusion medicine experts with adequate tools to navigate through the otherwise overwhelming amount of metabolomics data burgeoning in the field during the past few years. Descriptive metabolomics data have represented the first step omics researchers have taken into the field of transfusion medicine. However, to up the ante, clinical and omics experts will need to merge their expertise to investigate correlative and mechanistic relationships among metabolic variables and transfusion-relevant variables, such as 24-hour in vivo recovery for transfused RBCs. Integration with systems biology models will potentially allow for in silico prediction of metabolic phenotypes, thus streamlining the design and testing of alternative storage strategies and/or solutions. © 2015 AABB.
Chagoyen, Monica; Pazos, Florencio
The so-called 'omics' approaches used in modern biology aim at massively characterizing the molecular repertoires of living systems at different levels. Metabolomics is one of the latest additions to the 'omics' family and deals with the characterization of the set of metabolites in a given biological system. As metabolomic techniques scale up and allow larger sets of metabolites to be characterized, automatic methods for analyzing these sets to obtain meaningful biological information are required. Only recently have the first tools specifically designed for this task in metabolomics appeared. They are based on approaches previously used in transcriptomics and other 'omics', such as annotation enrichment analysis. These, together with generic tools for metabolic analysis and visualization not specifically designed for metabolomics, will certainly be in the toolbox of researchers doing metabolomic experiments in the near future.
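The annotation enrichment analysis mentioned above is commonly framed as a one-sided hypergeometric test: given a background of metabolites, how surprising is it that a measured set contains this many members of one annotation class (for example a pathway)? The following is a minimal sketch under that assumption; the function name and all counts are hypothetical and do not come from any of the tools the abstract refers to.

```python
from math import comb


def enrichment_p_value(background: int, annotated: int, selected: int, hits: int) -> float:
    """Return P(X >= hits) for X ~ Hypergeometric(background, annotated, selected).

    background: number of metabolites in the reference set
    annotated:  how many of them carry the annotation being tested
    selected:   size of the experimental metabolite set
    hits:       how many selected metabolites carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 1000 background metabolites, 40 in the pathway,
    # 50 metabolites detected as changed, 8 of which belong to the pathway.
    print(enrichment_p_value(1000, 40, 50, 8))
```

In practice the p-values from many annotation classes would also need a multiple-testing correction, which this sketch leaves out.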
Gil de la Fuente, Alberto; Grace Armitage, Emily; Otero, Abraham; Barbas, Coral; Godzien, Joanna
Metabolite identification is one of the most challenging steps in metabolomics studies and one of the greatest bottlenecks in the entire workflow. The success of this step determines the success of the entire study, so the quality of the annotations requires special attention. A variety of tools and resources are available to aid metabolite identification or annotation, offering different and often complementary functionalities. In preparation for this article, almost 50 databases were reviewed, from which 17 were selected for discussion, chosen for their online ESI-MS functionality. The general characteristics and functions of each database are discussed in turn, considering the advantages and limitations of each along with recommendations for optimal use of each tool, as derived from experiences encountered at the Centre for Metabolomics and Bioanalysis (CEMBIO) in Madrid. These databases were evaluated considering their utility in non-targeted metabolomics, including aspects such as identifier assignment, structural assignment and interpretation of results. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Metabolomic analysis of plants broadens understanding of how plants may benefit humans, animals and the environment, provide sustainable food and energy, and improve current agricultural, pharmacological and medicinal practices in order to bring about healthier and longer life. The quality...... and amount of the extractible biological information is largely determined by data acquisition, data processing and analysis methodologies of the plant metabolomics studies. This PhD study focused mainly on the development and implementation of new metabolomics methodologies for improved data acquisition...... and data processing. The study mainly concerned the three most commonly applied analytical techniques in plant metabolomics, GC-MS, LC-MS and NMR. In addition, advanced chemometrics methods e.g. PARAFAC2 and ASCA have been extensively used for development of complex metabolomics data processing...
Nima Ranjbar Sistani
In field peas, ascochyta blight, caused by Didymella pinodes, is one of the most common fungal diseases. Despite the high diversity of pea cultivars, little resistance has been developed to date, and the disease still leads to significant losses in grain yield. Rhizobia, as plant growth promoting endosymbionts, are the main partners for establishment of symbiosis with pea plants. The key role of Rhizobium as an effective nitrogen source for improving legume seed quality and quantity is in line with sustainable agriculture and food security programs. Besides these growth promoting effects, Rhizobium symbiosis has been shown to have a priming impact on the plant's immune system that enhances resistance against environmental perturbations. This is the first integrative study that investigates the effect of Rhizobium leguminosarum bv. viceae (Rlv) on phenotypic seed quality, quantity and fungal disease in pot-grown pea (Pisum sativum) cultivars with two different resistance levels against D. pinodes through metabolomics and proteomics analyses. In addition, the pathogen effects on seed quantity components and quality are assessed at the morphological and molecular level. Rhizobium inoculation decreased disease severity by significantly reducing the seed infection level. The Rhizobium symbiont enhanced yield through increased seed fresh and dry weights based on better seed filling. Rhizobium inoculation also induced changes in the seed proteome and metabolome involved in the enhanced resistance of P. sativum against D. pinodes. Besides increased redox and cell wall adjustments, light is shed on the role of late embryogenesis abundant proteins and metabolites such as the seed triterpenoid Soyasapogenol. The results of this study open new insights into the significance of symbiotic Rhizobium interactions for crop yield, health and seed quality enhancement and reveal new metabolite candidates involved in pathogen resistance.
The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. We here critically discuss structured elucidation approaches and software that are designed to help during the annotation of unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases such as the coalition of MassBank repositories and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (critical assessment of small molecule identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included.
Progress in improving crop growth is an absolute goal despite the influence multifactorial components have on crop yield and quality. An Avalon × Cadenza doubled-haploid wheat mapping population was used to study the leaf metabolome of field-grown wheat at weekly intervals during the time in which the canopy contributes to grain filling, i.e., from anthesis to 5 weeks post-anthesis. Wheat was grown under four different nitrogen supplies ranging from residual soil N to a luxury over-fertilization (0, 100, 200, and 350 kg N ha−1). Four lines from a segregating doubled-haploid population derived from a cross of the wheat elite cvs. Avalon and Cadenza were chosen as they showed pairwise differences in either N utilization efficiency (NUtE) or senescence timing. 108 annotated metabolites of primary metabolism and ions were determined. The analysis did not provide genotype-specific markers because of a remarkable stability of the metabolome between lines. We speculate that the failure to identify genotypic markers might be due to insufficient genetic diversity of the wheat parents and/or the known tendency of plants to keep metabolome homeostasis even under adverse conditions through multiple adaptations and rescue mechanisms. The data, however, provided a consistent catalogue of metabolites and their respective responses to environmental and developmental factors and may bode well for future systems biology approaches, and support plant breeding and crop improvement.
Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography, which contains almost 100 citations of articles/books/resources involving topics related to communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators, Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted past years' proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners
Mass spectrometry (MS)-based metabolomics is the popular platform for metabolome analyses. Computational techniques for the processing of MS raw data, for example, feature detection, peak alignment, and the exclusion of false-positive peaks, have been established. The next stage of untargeted metabolomics would be to decipher the mass fragmentation of small molecules for the global identification of human-, animal-, plant-, and microbiota metabolomes, resulting in a deeper understanding of metabolisms. This review is an update on the latest computational metabolomics including known/expected structure databases, chemical ontology classifications, and mass spectrometry cheminformatics for the interpretation of mass fragmentations and for the elucidation of unknown metabolites. The importance of metabolome 'databases' and 'repositories' is also discussed because novel biological discoveries are often attributable to the accumulation of data, to relational databases, and to their statistics. Lastly, a practical guide for metabolite annotations is presented as the summary of this review. Copyright © 2018 Elsevier Ltd. All rights reserved.
Kusonmano, Kanthida; Vongsangnak, Wanwipa; Chumnanpuen, Pramote
Metabolome profiling of biological systems is a powerful way to understand their metabolic functional states in response to environmental factors or other perturbations. Large volumes of metabolomics data have accumulated since the pre-metabolomics era, driven directly by high-throughput analytical techniques, especially mass spectrometry (MS)- and nuclear magnetic resonance (NMR)-based techniques. In parallel, a significant number of informatics techniques for data processing, statistical analysis, and data mining have been developed. The resulting tools and databases provide the metabolomics community with useful information, e.g., chemical structures, mass spectrum patterns for peak identification, metabolite profiles, biological functions, dynamic metabolite changes, and biochemical transformations of thousands of small molecules. In this chapter, we aim to introduce metabolomics studies overall, from the pre- to the post-metabolomics era, and their impact on society. Focusing on the post-metabolomics era, we provide a conceptual framework of informatics techniques for metabolomics and show useful examples of techniques, tools, and databases for metabolomics data analysis, from preprocessing to functional interpretation. This framework can further be used as a scaffold for translational biomedical research, which can in turn reveal new metabolite biomarkers, potential metabolic targets, or key metabolic pathways for future disease therapy.
Bujak, Renata; Struck-Lewicka, Wiktoria; Markuszewski, Michał J; Kaliszan, Roman
Metabolomics is an emerging approach in the field of systems biology. Due to continuous development in advanced analytical techniques and in bioinformatics, metabolomics has been extensively applied as a novel, holistic diagnostic tool in clinical and biomedical studies. Measurement of the metabolome, as a chemical reflection of the current phenotype of a particular biological system, is nowadays frequently used to understand pathophysiological processes involved in disease progression as well as to search for new diagnostic or prognostic biomarkers of various disorders. In this review, we discuss the research strategies and analytical platforms commonly applied in metabolomics studies. The applications of metabolomics in laboratory diagnostics in the last 5 years are also reviewed according to the type of biological sample used in the metabolome analysis. We also discuss some limitations and further improvements which should be considered, bearing in mind potential applications of metabolomic research and practice. Copyright © 2014 Elsevier B.V. All rights reserved.
Kessler, Nikolas; Walter, Frederik; Persicke, Marcus; Albaum, Stefan P; Kalinowski, Jörn; Goesmann, Alexander; Niehaus, Karsten; Nattkemper, Tim W
Adduct formation, fragmentation events and matrix effects impose special challenges to the identification and quantitation of metabolites in LC-ESI-MS datasets. An important step in compound identification is the deconvolution of mass signals. During this processing step, peaks representing adducts, fragments, and isotopologues of the same analyte are allocated to a distinct group, in order to separate peaks from coeluting compounds. From these peak groups, neutral masses and pseudo spectra are derived and used for metabolite identification via mass decomposition and database matching. Quantitation of metabolites is hampered by matrix effects and nonlinear responses in LC-ESI-MS measurements. A common approach to correct for these effects is the addition of a U-13C-labeled internal standard and the calculation of mass isotopomer ratios for each metabolite. Here we present a new web-platform for the analysis of LC-ESI-MS experiments. ALLocator covers the workflow from raw data processing to metabolite identification and mass isotopomer ratio analysis. The integrated processing pipeline for spectra deconvolution "ALLocatorSD" generates pseudo spectra and automatically identifies peaks emerging from the U-13C-labeled internal standard. Information from the latter improves mass decomposition and annotation of neutral losses. ALLocator provides an interactive and dynamic interface to explore and enhance the results in depth. Pseudo spectra of identified metabolites can be stored in user- and method-specific reference lists that can be applied on succeeding datasets. The potential of the software is exemplified in an experiment, in which abundance fold-changes of metabolites of the l-arginine biosynthesis in C. glutamicum type strain ATCC 13032 and l-arginine producing strain ATCC 21831 are compared. Furthermore, the capability for detection and annotation of uncommon large neutral losses is shown by the identification of (γ-)glutamyl dipeptides in the same strains
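The role of the U-13C-labeled internal standard described above can be illustrated with a little arithmetic: a fully labeled metabolite with n carbon atoms appears shifted by n times the 13C/12C mass difference (about 1.003355 Da, divided by the charge), and the ratio of the unlabeled to the labeled peak intensity gives a matrix-corrected relative abundance. The sketch below assumes that simple picture. It is not ALLocator's algorithm; the function names, the tolerance and the peak list are hypothetical.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C in Da


def labeled_mz(mz: float, n_carbons: int, charge: int = 1) -> float:
    """Expected m/z of the fully 13C-labeled counterpart of an ion."""
    return mz + n_carbons * C13_C12_DELTA / abs(charge)


def find_labeled_partner(peaks, mz, n_carbons, tol=0.005, charge=1):
    """Return the most intense (m/z, intensity) peak within `tol` of the
    expected labeled m/z, or None if no peak matches."""
    target = labeled_mz(mz, n_carbons, charge)
    candidates = [p for p in peaks if abs(p[0] - target) <= tol]
    return max(candidates, key=lambda p: p[1]) if candidates else None


def isotopomer_ratio(unlabeled_intensity: float, labeled_intensity: float) -> float:
    """Ratio of the 12C analyte signal to the U-13C standard signal."""
    return unlabeled_intensity / labeled_intensity


if __name__ == "__main__":
    # Hypothetical peak list; 175.1190 is the [M+H]+ ion of L-arginine (C6H14N4O2).
    peaks = [(175.1190, 1000.0), (181.1391, 500.0), (190.0000, 50.0)]
    partner = find_labeled_partner(peaks, 175.1190, n_carbons=6)
    if partner is not None:
        print(partner, isotopomer_ratio(1000.0, partner[1]))
```

Comparing such ratios between two samples that received the same amount of labeled standard yields fold-changes that are less sensitive to matrix effects than raw intensities.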
Misra, Biswapriya B; van der Hooft, Justin J J
Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources--in the form of tools, software, and databases--is currently lacking. Thus, here we provide an overview of freely-available, and open-source, tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview Table. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Evolution of potent odorants within the volatile metabolome of high-quality hazelnuts (Corylus avellana L.): evaluation by comprehensive two-dimensional gas chromatography coupled with mass spectrometry.
Rosso, Marta Cialiè; Liberto, Erica; Spigolon, Nicola; Fontana, Mauro; Somenzi, Marco; Bicchi, Carlo; Cordero, Chiara
Within the pattern of volatiles released by food products (volatilome), potent odorants are bio-active compounds that trigger aroma perception by activating a complex array of odor receptors (ORs) in the regio olfactoria. Their informative role is fundamental to select optimal post-harvest and storage conditions and preserve food sensory quality. This study addresses the volatile metabolome from high-quality hazelnuts (Corylus avellana L.) from the Ordu region (Turkey) and Tonda Romana from Italy, and investigates its evolution throughout the production chain (post-harvest, industrial storage, roasting) to find functional correlations between technological strategies and product quality. The volatile metabolome is analyzed by headspace solid-phase microextraction combined with comprehensive two-dimensional gas chromatography and mass spectrometry. Dedicated pattern recognition, based on 2D data (targeted fingerprinting), is used to mine analytical outputs, while principal component analysis (PCA), Fisher ratio, hierarchical clustering, and analysis of variance are used to find decision makers among the most informative chemicals. Low-temperature drying (18-20 °C) has a decisive effect on quality; it correlates negatively with bacteria and mold metabolic activity, nut viability, and lipid oxidation products (2-methyl-1-propanol, 3-methyl-1-butanol, 2-ethyl-1-hexanol, 2-octanol, 1-octen-3-ol, hexanal, octanal and (E)-2-heptenal). Protective atmosphere storage (99% N2-1% O2) effectively limits lipid oxidation for 9-12 months after nut harvest. The combination of optimal drying and storage preserves the aroma potential; after roasting at different shelf-lives, key odorants responsible for malty and buttery (2- and 3-methylbutanal, 2,3-butanedione and 2,3-pentanedione), earthy (methylpyrazine, 2-ethyl-5-methyl pyrazine and 3-ethyl-2,5-dimethyl pyrazine) and caramel-like and musty notes (2,5-dimethyl-4-hydroxy-3(2H)-furanone - furaneol and acetyl pyrrole) show no
Brown Alfred L
Background Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%. Conclusion While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.
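The idea of adding errors at known rates and regressing precision on them can be sketched with a toy model. The abstract does not state the regression model, so the code below makes a strong simplifying assumption that is ours, not the authors': precision falls linearly with the total error rate, reaches 1.0 when there are no errors, and added errors simply stack on top of the unknown baseline. Under that assumption, extrapolating the fitted line back to precision 1.0 recovers the baseline error rate. All names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_baseline_error(added_error_rates, precisions):
    """Extrapolate the regression line to precision 1.0.

    Toy assumption: precision = 1 - k * (baseline + added), so the line
    reaches 1.0 at added = -baseline, i.e. baseline = (1 - intercept) / -slope.
    """
    slope, intercept = fit_line(added_error_rates, precisions)
    return (1.0 - intercept) / -slope


if __name__ == "__main__":
    # Synthetic, noise-free measurements generated from a known baseline of 0.29
    # and an arbitrary sensitivity k = 0.8 (both values are illustrative only).
    added = [0.0, 0.1, 0.2, 0.3, 0.4]
    precision = [1.0 - 0.8 * (0.29 + a) for a in added]
    print(estimate_baseline_error(added, precision))
```

With real data the precision measurements would be noisy, so the estimate would come with a confidence interval, which may be why the abstract reports ranges such as 28% to 30%.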
The Metabolomics and Epidemiology (MetEpi) Working Group promotes metabolomics analyses in population-based studies, as well as advancement in the field of metabolomics for broader biomedical and public health research.
Wu, Baoyuan; Jia, Fan; Liu, Wei; Ghanem, Bernard
In this work we study the task of image annotation, the goal of which is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list with a limited number of tags (e.g., 3) that covers as much of the image's information as possible. The tags in such a short list should be representative and diverse: they must not only correspond to the contents of the image, but also differ from each other. To this end, we treat image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. In addition, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) consider only representation and ignore diversity. We therefore propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags than existing image annotation methods.
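How a DPP trades off representation against diversity can be seen in a small sketch. A DPP scores a subset by the determinant of the corresponding kernel submatrix, where the kernel combines per-tag quality with pairwise similarity, so highly similar tags shrink the determinant. The code below swaps in a plain greedy determinant maximization with hand-set scores in place of the learned conditional DPP and the sampling algorithm the abstract describes; the function names, the tags, the scores and the forbidden pair are all hypothetical.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return result


def greedy_dpp_tags(quality, similarity, k, forbidden=frozenset()):
    """Greedily pick up to k tag indices that maximize det(L_Y), where
    L[i][j] = quality[i] * similarity[i][j] * quality[j].

    `forbidden` holds frozenset({i, j}) pairs (e.g. synonyms or tags on the
    same hierarchy path) that must not be selected together.
    """
    n = len(quality)
    kernel = [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
              for i in range(n)]
    chosen = []
    for _ in range(k):
        best, best_det = None, 0.0
        for i in range(n):
            if i in chosen:
                continue
            if any(frozenset((i, j)) in forbidden for j in chosen):
                continue
            idx = chosen + [i]
            score = det([[kernel[r][c] for c in idx] for r in idx])
            if score > best_det:
                best, best_det = i, score
        if best is None:
            break
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    tags = ["dog", "puppy", "grass", "sky"]
    quality = [0.9, 0.85, 0.6, 0.5]
    similarity = [
        [1.0, 0.95, 0.1, 0.1],
        [0.95, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.2],
        [0.1, 0.1, 0.2, 1.0],
    ]
    synonyms = frozenset({frozenset((0, 1))})  # "dog" and "puppy" may not co-occur
    picked = greedy_dpp_tags(quality, similarity, k=3, forbidden=synonyms)
    print([tags[i] for i in picked])
```

Greedy selection is only an approximation to the most probable subset, but it makes the two ingredients visible: the similarity term already discourages near-duplicate tags, and the forbidden pairs turn the synonym and hierarchy requirement into a hard constraint.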
Hall, R.D.; Brouwer, I.D.; Fitzgerald, M.A.
With the growing interest in the use of metabolomic technologies for a wide range of biological targets, food applications related to nutrition and quality are rapidly emerging. Metabolomics offers us the opportunity to gain deeper insights into, and have better control of, the fundamental
Selenium (Se) is an essential nutrient for humans, due to its antioxidant properties, whereas its essentiality to plants has yet to be demonstrated. Nevertheless, plant growth is enhanced when Se is added to the cultivation substrate. However, the concentration of Se in agricultural soils is very variable, ranging from 0.01 mg kg-1 up to 10 mg kg-1 in seleniferous areas. Several studies have therefore aimed at bio-fortifying crops with Se, mainly through the application of Se fertilizers. The aim of the present research was to assess the biofortification potential of Se in hydroponically grown strawberry fruits and its effects on qualitative parameters and nutraceutical compounds. Supplementation with Se did not negatively affect the growth or yield of strawberries, and induced an accumulation of Se in fruits. Furthermore, the metabolomic analyses highlighted an increase in flavonoid and polyphenol compounds, which contribute to the organoleptic features and antioxidant capacity of fruits; in addition, an increase in fruit sweetness was detected in biofortified strawberries. In conclusion, based on our observations, strawberry plants seem a good target for Se biofortification, thus allowing an increase in the human intake of this essential micronutrient.
Cox, James E; Thummel, Carl S; Tennessen, Jason M
Metabolomic analysis provides a powerful new tool for studies of Drosophila physiology. This approach allows investigators to detect thousands of chemical compounds in a single sample, representing the combined contributions of gene expression, enzyme activity, and environmental context. Metabolomics has been used for a wide range of studies in Drosophila, often providing new insights into gene function and metabolic state that could not be obtained using any other approach. In this review, we survey the uses of metabolomic analysis since its entry into the field. We also cover the major methods used for metabolomic studies in Drosophila and highlight new directions for future research. Copyright © 2017 by the Genetics Society of America.
Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem
Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.
Lijin K. Gopi
The current era of functional genomics is enriched with good-quality draft genomes and annotations for many thousands of species and varieties, supported by advances in next generation sequencing (NGS) technologies. Around 25,250 genomes of organisms from various kingdoms have been submitted to the NCBI genome resource to date. Each of these genomes was annotated using the tools and knowledge-bases that were available at the time of annotation. These annotations can be expected to improve if the same genome is annotated using improved tools and knowledge-bases. Here we present a new genome annotation pipeline, strengthened with various tools and knowledge-bases, that is capable of producing better quality annotations from the consensus of the predictions from different tools. This resource also performs various additional annotations, apart from the usual gene predictions and functional annotations, which involve SSRs, novel repeats, paralogs, proteins with transmembrane helices, signal peptides, etc. This new annotation resource is trained to evaluate and integrate all the predictions together to resolve the overlaps and ambiguities of the boundaries. One of the important highlights of this resource is the capability of predicting the phylogenetic relations of the repeats using evolutionary trace analysis and orthologous gene clusters. We also present a case study of the pipeline, in which we upgrade the genome annotation of Nelumbo nucifera (sacred lotus). It is demonstrated that this resource is capable of producing an improved annotation for a better understanding of the biology of various organisms.
Franceschi, Pietro; Mylonas, Roman; Shahaf, Nir; Scholz, Matthias; Arapitsas, Panagiotis; Masuero, Domenico; Weingart, Georg; Carlin, Silvia; Vrhovsek, Urska; Mattivi, Fulvio; Wehrens, Ron
Due to their sensitivity and speed, mass-spectrometry based analytical technologies are widely used in metabolomics to characterize biological phenomena. To address issues like metadata organization, quality assessment, data processing, data storage, and, finally, submission to public repositories, bioinformatic pipelines of a non-interactive nature are often employed, complementing the interactive software used for initial inspection and visualization of the data. These pipelines are often created as open-source software, allowing the complete and exhaustive documentation of each step and ensuring the reproducibility of the analysis of extensive and often expensive experiments. In this paper, we review the major steps which constitute such a data processing pipeline, discussing them in the context of an open-source software for untargeted MS-based metabolomics experiments recently developed at our institute. The software has been developed by integrating our metaMS R package with a user-friendly web-based application written in Grails. MetaMS takes care of data pre-processing and annotation, while the interface deals with the creation of the sample lists, the organization of the data storage, and the generation of survey plots for quality assessment. Experimental and biological metadata are stored in the ISA-Tab format, making the proposed pipeline fully integrated with the Metabolights framework.
Liu, Xiaojing; Locasale, Jason W
Metabolomics generates a profile of small molecules that are derived from cellular metabolism and can directly reflect the outcome of complex networks of biochemical reactions, thus providing insights into multiple aspects of cellular physiology. Technological advances have enabled rapid and increasingly expansive data acquisition with samples as small as single cells; however, substantial challenges in the field remain. In this primer we provide an overview of metabolomics, especially mass spectrometry (MS)-based metabolomics, which uses liquid chromatography (LC) for separation, and discuss its utilities and limitations. We identify and discuss several areas at the frontier of metabolomics. Our goal is to give the reader a sense of what might be accomplished when conducting a metabolomics experiment, now and in the near future. Copyright © 2017 Elsevier Ltd. All rights reserved.
Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Werner
Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab
Psychogios, Nikolaos; Hau, David D.; Peng, Jun; Guo, An Chi; Mandal, Rupasri; Bouatra, Souhaila; Sinelnikov, Igor; Krishnamurthy, Ramanarayan; Eisner, Roman; Gautam, Bijaya; Young, Nelson; Xia, Jianguo; Knox, Craig; Dong, Edison; Huang, Paul; Hollander, Zsuzsanna; Pedersen, Theresa L.; Smith, Steven R.; Bamforth, Fiona; Greiner, Russ; McManus, Bruce; Newman, John W.; Goodfriend, Theodore; Wishart, David S.
Continuing improvements in analytical technology, along with an increased interest in performing comprehensive, quantitative metabolic profiling, are leading to increased pressure within the metabolomics community to develop centralized metabolite reference resources for certain clinically important biofluids, such as cerebrospinal fluid, urine and blood. As part of an ongoing effort to systematically characterize the human metabolome through the Human Metabolome Project, we have undertaken the task of characterizing the human serum metabolome. In doing so, we have combined targeted and non-targeted NMR, GC-MS and LC-MS methods with computer-aided literature mining to identify and quantify a comprehensive, if not absolutely complete, set of metabolites commonly detected and quantified (with today's technology) in the human serum metabolome. Our use of multiple metabolomics platforms and technologies allowed us to substantially enhance the level of metabolome coverage while critically assessing the relative strengths and weaknesses of these platforms or technologies. Tables containing the complete set of 4229 confirmed and highly probable human serum compounds, their concentrations, related literature references and links to their known disease associations are freely available at http://www.serummetabolome.ca. PMID:21359215
Kuhlisch, Constanze; Pohnert, Georg
Chemical ecology elucidates the nature and role of natural products as mediators of organismal interactions. The emerging techniques that can be summarized under the concept of metabolomics provide new opportunities to study such environmentally relevant signaling molecules. Especially comparative tools in metabolomics enable the identification of compounds that are regulated during interaction situations and that might play a role as e.g. pheromones, allelochemicals or in induced and activated defenses. This approach helps overcoming limitations of traditional bioassay-guided structure elucidation approaches. But the power of metabolomics is not limited to the comparison of metabolic profiles of interacting partners. Especially the link to other -omics techniques helps to unravel not only the compounds in question but the entire biosynthetic and genetic re-wiring, required for an ecological response. This review comprehensively highlights successful applications of metabolomics in chemical ecology and discusses existing limitations of these novel techniques. It focuses on recent developments in comparative metabolomics and discusses the use of metabolomics in the systems biology of organismal interactions. It also outlines the potential of large metabolomics initiatives for model organisms in the field of chemical ecology.
Huang, Weiliang; Brewer, Luke K; Jones, Jace W; Nguyen, Angela T; Marcu, Ana; Wishart, David S; Oglesby-Sherrouse, Amanda G; Kane, Maureen A; Wilks, Angela
The Pseudomonas aeruginosa Metabolome Database (PAMDB, http://pseudomonas.umaryland.edu) is a searchable, richly annotated metabolite database specific to P. aeruginosa. P. aeruginosa is a soil organism and significant opportunistic pathogen that adapts to its environment through a versatile energy metabolism network. Furthermore, P. aeruginosa is a model organism for the study of biofilm formation, quorum sensing, and bioremediation processes, each of which is dependent on unique pathways and metabolites. The PAMDB is modelled on the Escherichia coli (ECMDB), yeast (YMDB) and human (HMDB) metabolome databases and contains >4370 metabolites and 938 pathways with links to over 1260 genes and proteins. The database information was compiled from electronic databases, journal articles and mass spectrometry (MS) metabolomic data obtained in our laboratories. For each metabolite entered, we provide detailed compound descriptions, names and synonyms, structural and physicochemical information, nuclear magnetic resonance (NMR) and MS spectra, enzymes and pathway information, as well as gene and protein sequences. The database allows extensive searching via chemical names, structure and molecular weight, together with gene, protein and pathway relationships. The PAMDB and its future iterations will provide a valuable resource to biologists, natural product chemists and clinicians in identifying active compounds, potential biomarkers and clinical diagnostics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
R. R. Furina
The review presents the results of metabolomic studies in pulmonology. The key idea of metabolomics is to detect specific biomarkers in a biological sample for the diagnosis of diseases of the bronchi and lung. The main methods used in metabolomics for the separation and identification of volatile organic substances as biomarkers (gas chromatography, mass spectrometry, and nuclear magnetic resonance spectrometry) are described. A solid-phase microextraction method used for sample preparation is also covered. The results of laboratory tests for biomarkers of lung cancer, acute respiratory distress syndrome, chronic obstructive pulmonary disease, cystic fibrosis, chronic infections, and pulmonary tuberculosis are presented. In addition, emphasis is placed on the possibilities of metabolomics in experimental medicine, including the study of asthma. The information is of interest to both theorists and practitioners.
Scalbert, Augustin; Brennan, Lorraine; Manach, Claudine
The food metabolome is defined as the part of the human metabolome directly derived from the digestion and biotransformation of foods and their constituents. With >25,000 compounds known in various foods, the food metabolome is extremely complex, with a composition varying widely according to the diet. By its very nature it represents a considerable and still largely unexploited source of novel dietary biomarkers that could be used to measure dietary exposures with a high level of detail and precision. Most dietary biomarkers currently have been identified on the basis of our knowledge of food...... by the recent identification of novel biomarkers of intakes for fruit, vegetables, beverages, meats, or complex diets. Moreover, examples also show how the scrutiny of the food metabolome can lead to the discovery of bioactive molecules and dietary factors associated with diseases. However, researchers still...
Putri, Sastia P; Yamamoto, Shinya; Tsugawa, Hiroshi; Fukusaki, Eiichiro
Metabolomics, the global quantitative assessment of metabolites in a biological system, has played a pivotal role in various fields of science in the post-genomic era. Metabolites are the result of the interaction of the system's genome with its environment and are not merely the end product of gene expression, but also form part of the regulatory system in an integrated manner. Therefore, metabolomics is often considered a powerful tool to provide an instantaneous snapshot of the physiology of a cell. The power of metabolomics lies in the acquisition of analytical data in which metabolites in a cellular system are quantified, and in the extraction of the most meaningful elements of the data using various data analysis tools. In this review, we discuss the latest developments in analytical techniques and data analysis methods in metabolomics studies. Copyright © 2013 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Food research has been handed a powerful tool. Metabolomics is the way forward, says Professor Søren Balling Engelsen of the University of Copenhagen...
Vieth, M; Quirke, P; Lambert, R; von Karsa, L; Risio, M
Multidisciplinary, evidence-based guidelines for quality assurance in colorectal cancer screening and diagnosis have been developed by experts in a project coordinated by the International Agency for Research on Cancer. The full guideline document covers the entire process of population-based screening. It consists of 10 chapters and over 250 recommendations, graded according to the strength of the recommendation and the supporting evidence. The 450-page guidelines and the extensive evidence base have been published by the European Commission. The chapter on quality assurance in pathology was supplemented by an annex describing in greater detail some issues raised in the chapter, particularly details of special interest to pathologists. The content of the annex is presented here to promote international discussion and collaboration by making the issues discussed in the guidelines known to a wider professional and scientific community. © Georg Thieme Verlag KG Stuttgart · New York.
Marc Schreiber; Kai Barkschat; Bodo Kraft; Albert Zundorf
More and more domain-specific applications on the internet make use of Natural Language Processing (NLP) tools (e.g. Information Extraction systems). The output quality of these applications relies on the output quality of the NLP tools used. Often, the quality can be increased by annotating a domain-specific corpus. However, annotating a corpus is a time-consuming and exhausting task. To reduce the annotation time we present...
Hansen, Frank Allan
Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general...... requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation...... systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations...
Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E
Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. 
The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
Gresham Cathy R
Background Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that lets microarray researchers directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a considerable amount of time in searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via AgBase website and
Su, Qiao; Guan, Tianbing; Lv, Haitao
Uropathogenic Escherichia coli (UPEC) growth in women's bladders during urinary tract infection (UTI) incurs substantial chemical exchange, termed the "interactive metabolome", which primarily accounts for the metabolic costs (utilized metabolome) and metabolic donations (excreted metabolome) between UPEC and human urine. Here, we attempted to identify the individualized interactive metabolome between UPEC and human urine. We were able to distinguish UPEC from non-UPEC by employing a combination of metabolomics and genetics. Our results revealed that the interactive metabolome between UPEC and human urine was markedly different from that between non-UPEC and human urine, and that UPEC triggered much stronger perturbations in the interactive metabolome in human urine. Furthermore, siderophore biosynthesis coordinately modulated the individualized interactive metabolome, which we found to be a critical component of UPEC virulence. The individualized virulence-associated interactive metabolome contained 31 different metabolites and 17 central metabolic pathways that were annotated to host these different metabolites, including energetic metabolism, amino acid metabolism, and gut microbe metabolism. Changes in the activities of these pathways mechanistically pinpointed the virulent capability of siderophore biosynthesis. Together, our findings provide novel insights into UPEC virulence, and we propose that siderophores are potential targets for further discovery of drugs to treat UPEC-induced UTI.
Carroll Adam J
Background Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas-chromatography/mass-spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis. Description Here, we present MetabolomeExpress, a new File Transfer Protocol (FTP) server and web-tool for the online storage, processing, visualisation and statistical re-analysis of publicly submitted GC/MS metabolomics datasets. Users may search a quality-controlled database of metabolite response statistics from publicly submitted datasets by a number of parameters (e.g. metabolite, species, organ/biofluid, etc.). Users may also perform meta-analysis comparisons of multiple independent experiments or re-analyse public primary datasets via user-friendly tools for t-test, principal components analysis, hierarchical cluster analysis and correlation analysis. They may interact with chromatograms, mass spectra and peak detection results via an integrated raw data viewer. Researchers who register for a free account may upload (via FTP) their own data to the server for online processing via a novel raw data processing pipeline. Conclusions MetabolomeExpress https://www.metabolome-express.org provides a new opportunity for the general metabolomics community to transparently present online the raw and processed GC/MS data underlying their metabolomics publications. Transparent sharing of these data will allow researchers to
Understanding and harnessing the interactions between nanoparticles and biological molecules is at the forefront of applications of nanotechnology to modern biology. Metabolomics has emerged as a prominent player in systems biology as a complement to genomics, transcriptomics and proteomics. Its focus is the systematic study of metabolite identities and concentration changes in living systems. Despite significant progress over the recent past, important challenges in metabolomics remain, such as the deconvolution of the spectra of complex mixtures with strong overlaps, the sensitive detection of metabolites at low abundance, unambiguous identification of known metabolites, structure determination of unknown metabolites and standardized sample preparation for quantitative comparisons. Recent research has demonstrated that some of these challenges can be substantially alleviated with the help of nanoscience. Nanoparticles in particular have found applications in various areas of bioanalytical chemistry and metabolomics. Their chemical surface properties and increased surface-to-volume ratio endow them with a broad range of binding affinities to biomacromolecules and metabolites. The specific interactions of nanoparticles with metabolites or biomacromolecules help, for example, simplify metabolomics spectra, improve the ionization efficiency for mass spectrometry or reveal relationships between spectral signals that belong to the same molecule. Lessons learned from nanoparticle-assisted metabolomics may also benefit other emerging areas, such as nanotoxicity and nanopharmaceutics.
Barbosa-Breda, João; Himmelreich, Uwe; Ghesquière, Bart; Rocha-Sousa, Amândio; Stalmans, Ingeborg
Glaucoma is one of the leading causes of irreversible blindness worldwide. However, there are no biomarkers that accurately help clinicians perform an early diagnosis or detect patients with a high risk of progression. Metabolomics is the study of all metabolites in an organism, and it has the potential to provide a biomarker. This review summarizes the findings of metabolomics in glaucoma patients and explains why this field is promising for new research. We identified published studies that focused on metabolomics and ophthalmology. After providing an overview of metabolomics in ophthalmology, we focused on human glaucoma studies. Five studies have been conducted in glaucoma patients and all compared patients to healthy controls. Using mass spectrometry, significant differences were found in blood plasma in the metabolic pathways that involve palmitoylcarnitine, sphingolipids, vitamin D-related compounds, and steroid precursors. For nuclear magnetic resonance spectroscopy, a high glutamine-glutamate/creatine ratio was found in the vitreous and lateral geniculate body; no differences were detected in the optic radiations, and a lower N-acetylaspartate/choline ratio was observed in the geniculocalcarine and striate areas. Metabolomics can move glaucoma care towards a personalized approach and provide new knowledge concerning the pathophysiology of glaucoma, which can lead to new therapeutic options. © 2017 S. Karger AG, Basel.
Bean, Heather D.; Hill, Jane E.; Dimandja, Jean-Marie D.
The potential of high-resolution analytical technologies like GC×GC/TOF MS in untargeted metabolomics and biomarker discovery has been limited by the development of fully automated software that can efficiently align and extract information from multiple chromatographic data sets. In this work we report the first investigation on a peak-by-peak basis of the chromatographic factors that impact GC×GC data alignment. A representative set of 16 compounds of different chromatographic characteristics were followed through the alignment of 63 GC×GC chromatograms. We found that varying the mass spectral match parameter had a significant influence on the alignment for poorly-resolved peaks, especially those at the extremes of the detector linear range, and no influence on well-chromatographed peaks. Therefore, optimized chromatography is required for proper GC×GC data alignment. Based on these observations, a workflow is presented for the conservative selection of biomarker candidates from untargeted metabolomics analyses. PMID:25857541
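The role of a mass spectral match parameter in peak alignment can be illustrated with a minimal sketch. It assumes that spectra are stored as dictionaries mapping nominal m/z to intensity and that the match score is a plain cosine similarity; the function names, the threshold values and the scoring itself are hypothetical and are not taken from the alignment software used in the study.

```python
from math import sqrt


def cosine_match(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(v * v for v in spec_a.values()))
    norm_b = sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def same_compound(spec_a, spec_b, min_match=0.8):
    """Treat two peaks as the same compound if their score reaches min_match.

    min_match plays the role of the (hypothetical) spectral match parameter.
    """
    return cosine_match(spec_a, spec_b) >= min_match


# A clean spectrum and a distorted copy, e.g. from a poorly resolved peak.
clean = {73: 100.0, 147: 50.0}
distorted = {73: 100.0, 147: 10.0, 207: 60.0}

# Whether the two peaks are aligned depends on the chosen threshold.
print(same_compound(clean, distorted, min_match=0.7))
print(same_compound(clean, distorted, min_match=0.9))
```

In this toy case the distorted spectrum is aligned with the clean one under the looser threshold but rejected under the stricter one, which is the kind of sensitivity the abstract describes for poorly resolved peaks.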
Sartor, Maureen A; Ade, Alex; Wright, Zach; States, David; Omenn, Gilbert S; Athey, Brian; Karnovsky, Alla
Progress in high-throughput genomic technologies has led to the development of a variety of resources that link genes to functional information contained in the biomedical literature. However, tools that link small molecules to normal and diseased physiology and to published data relevant to biologists and clinical investigators are still lacking. With metabolomics rapidly emerging as a new omics field, the task of annotating small molecule metabolites becomes highly relevant. Our tool Metab2MeSH uses a statistical approach to reliably and automatically annotate compounds with concepts defined in Medical Subject Headings, the National Library of Medicine's controlled vocabulary for biomedical concepts. These annotations provide links from compounds to biomedical literature and complement existing resources such as PubChem and the Human Metabolome Database.
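One way such a statistical compound-to-term annotation could work is sketched below. The abstract does not name the test, so this is an assumption: a one-sided Fisher's exact test on how often a compound and a subject heading are assigned to the same articles. The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(both, compound_only, term_only, neither):
    """One-sided Fisher's exact test for over-representation.

    The four arguments are article counts in a 2x2 table:
    both          - articles linked to the compound AND the term
    compound_only - articles linked to the compound but not the term
    term_only     - articles linked to the term but not the compound
    neither       - articles linked to neither
    Returns P(overlap >= both) under the hypergeometric null.
    """
    with_compound = both + compound_only
    with_term = both + term_only
    total = both + compound_only + term_only + neither
    denom = comb(total, with_compound)
    p_value = 0.0
    for k in range(both, min(with_compound, with_term) + 1):
        p_value += comb(with_term, k) * comb(total - with_term, with_compound - k) / denom
    return p_value


# Hypothetical counts: 3 of the 4 articles on a compound carry the term,
# and 4 of 8 articles overall carry it.
print(fisher_exact_greater(both=3, compound_only=1, term_only=1, neither=3))
```

A small p-value would suggest that the compound and the term co-occur more often than chance, so the term is a candidate annotation; in practice a correction for multiple testing across many compound-term pairs would also be needed.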
Lamichhane, Santosh; Sen, Partho; Dickens, Alex M
It is well established that gut microbes and their metabolic products regulate host metabolism. The interactions between the host and its gut microbiota are highly dynamic and complex. In this review we present and discuss the metabolomic strategies to study the gut microbial ecosystem. We...... highlight the metabolic profiling approaches to study faecal samples aimed at deciphering the metabolic product derived from gut microbiota. We also discuss how metabolomics data can be integrated with metagenomics data derived from gut microbiota and how such approaches may lead to better understanding...
Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W
WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.
Koen, Nadia; Du Preez, Ilse; Loots, Du Toit
Current clinical practice strongly relies on the prognosis, diagnosis, and treatment of diseases using methods determined and averaged for the specific diseased cohort/population. Although this approach complies positively with most patients, misdiagnosis, treatment failure, relapse, and adverse drug effects are common occurrences in many individuals, which subsequently hamper the control and eradication of a number of diseases. These incidences can be explained by individual variation in the genome, transcriptome, proteome, and metabolome of a patient. Various "omics" approaches have investigated the influence of these factors on a molecular level, with the intention of developing personalized approaches to disease diagnosis and treatment. Metabolomics, the newest addition to the "omics" domain and the closest to the observed phenotype, reflects changes occurring at all molecular levels, as well as influences resulting from other internal and external factors. By comparing the metabolite profiles of two or more disease phenotypes, metabolomics can be applied to identify biomarkers related to the perturbation being investigated. These biomarkers can, in turn, be used to develop personalized prognostic, diagnostic, and treatment approaches, and can also be applied to the monitoring of disease progression, treatment efficacy, predisposition to drug-related side effects, and potential relapse. In this review, we discuss the contributions that metabolomics has made, and can potentially still make, towards the field of personalized medicine. © 2016 Elsevier Inc. All rights reserved.
Heinemann, Matthias; Zenobi, Renato
Recent discoveries suggest that cells of a clonal population often display multiple metabolic phenotypes at the same time. Motivated by the success of mass spectrometry (MS) in the investigation of population-level metabolomics, the analytical community has initiated efforts towards MS-based single
Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese
Plant genomes vary in size and are highly complex, with a high proportion of repeats, genome duplication and tandem duplication. Genes encode a wealth of information useful in studying an organism, and it is critical to have high-quality and stable gene annotation. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. To use these vast amounts of sequence data for gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homologous peptides and transcript ORFs. See Methods for detail. Here we present genome annotations of JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for chlamy, which was done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front page snapshot are shown below.
Kirpich, Alexander S; Ibarra, Miguel; Moskalenko, Oleksandr; Fear, Justin M; Gerken, Joseph; Mi, Xinlei; Ashrafi, Ali; Morse, Alison M; McIntyre, Lauren M
Metabolomics has the promise to transform the area of personalized medicine with the rapid development of high-throughput technology for untargeted analysis of metabolites. Open-access, easy-to-use analytic tools that are broadly accessible to the biological community need to be developed. While the technology used in metabolomics varies, most metabolomics studies have a set of features identified. Galaxy is an open-access platform that enables scientists at all levels to interact with big data. Galaxy promotes reproducibility by saving histories and enabling the sharing of workflows among scientists. SECIMTools (SouthEast Center for Integrated Metabolomics) is a set of Python applications that are available both as standalone tools and wrapped for use in Galaxy. The suite includes a comprehensive set of quality control metrics (retention time window evaluation and various peak evaluation tools), visualization techniques (hierarchical cluster heatmap, principal component analysis, modular modularity clustering), basic statistical analysis methods (partial least squares - discriminant analysis, analysis of variance, t-test, Kruskal-Wallis non-parametric test), advanced classification methods (random forest, support vector machines), and advanced variable selection tools (least absolute shrinkage and selection operator LASSO and Elastic Net). SECIMTools leverages the Galaxy platform and enables integrated workflows for metabolomics data analysis made from building blocks designed for easy use and interpretability. Standard data formats and a set of utilities allow arbitrary linkages between tools to encourage novel workflow designs. The Galaxy framework enables future data integration for metabolomics studies with other omics data.
Sundekilde, Ulrik; Larsen, Lotte Bach; Bertram, Hanne Christine S.
and processing capabilities of bovine milk is closely associated to milk composition. Metabolomics is ideal in the study of the low-molecular-weight compounds in milk, and this review focuses on the recent nuclear magnetic resonance (NMR)-based metabolomics trends in milk research, including applications linking...... compounds. Furthermore, metabolomics applications elucidating how the differential regulated genes affects milk composition are also reported. This review will highlight the recent advances in NMR-based metabolomics on milk, as well as give a brief summary of when NMR spectroscopy can be useful for gaining...
Bean, Heather D; Hill, Jane E; Dimandja, Jean-Marie D
The potential of high-resolution analytical technologies like GC×GC/TOF MS in untargeted metabolomics and biomarker discovery has been limited by the development of fully automated software that can efficiently align and extract information from multiple chromatographic data sets. In this work we report the first investigation on a peak-by-peak basis of the chromatographic factors that impact GC×GC data alignment. A representative set of 16 compounds of different chromatographic characteristics were followed through the alignment of 63 GC×GC chromatograms. We found that varying the mass spectral match parameter had a significant influence on the alignment for poorly-resolved peaks, especially those at the extremes of the detector linear range, and no influence on well-chromatographed peaks. Therefore, optimized chromatography is required for proper GC×GC data alignment. Based on these observations, a workflow is presented for the conservative selection of biomarker candidates from untargeted metabolomics analyses. Copyright © 2015 Elsevier B.V. All rights reserved.
Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus ribosome-binding site. Additionally, we conducted laboratory experiments to test H. utahensis growth and enzyme activity. Current annotation practices need to improve in order to more accurately reflect a genome's biological potential. We make specific recommendations that could improve the quality of microbial annotation projects.
Kale, Namrata S; Haug, Kenneth; Conesa, Pablo; Jayseelan, Kalaivani; Moreno, Pablo; Rocca-Serra, Philippe; Nainala, Venkata Chandrasekhar; Spicer, Rachel A; Williams, Mark; Li, Xuefei; Salek, Reza M; Griffin, Julian L; Steinbeck, Christoph
MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standard Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These studies are then automatically assigned a stable and unique identifier (e.g., MTBLS1) that can be used for publication reference. The MetaboLights Reference Layer associates metabolites with metabolomics studies in the archive and is extensively annotated with data fields such as structural and chemical information, NMR and MS spectra, target species, metabolic pathways, and reactions. The database is manually curated with no specific release schedules. MetaboLights is also recommended by journals for metabolomics data deposition. This unit provides a guide to using MetaboLights, downloading experimental data, and depositing metabolomics datasets using user-friendly submission tools. Copyright © 2016 John Wiley & Sons, Inc.
A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (6) A pilot study is recommended in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work are evaluated relative to those of their neuro-typical peers.
Goldansaz, Seyed Ali; Guo, An Chi; Sajed, Tanvir; Steele, Michael A; Plastow, Graham S; Wishart, David S
Metabolomics uses advanced analytical chemistry techniques to comprehensively measure large numbers of small molecule metabolites in cells, tissues and biofluids. The ability to rapidly detect and quantify hundreds or even thousands of metabolites within a single sample is helping scientists paint a far more complete picture of system-wide metabolism and biology. Metabolomics is also allowing researchers to focus on measuring the end-products of complex, hard-to-decipher genetic, epigenetic and environmental interactions. As a result, metabolomics has become an increasingly popular “omics” approach to assist with the robust phenotypic characterization of humans, crop plants and model organisms. Indeed, metabolomics is now routinely used in biomedical, nutritional and crop research. It is also being increasingly used in livestock research and livestock monitoring. The purpose of this systematic review is to quantitatively and objectively summarize the current status of livestock metabolomics and to identify emerging trends, preferred technologies and important gaps in the field. In conducting this review we also critically assessed the applications of livestock metabolomics in key areas such as animal health assessment, disease diagnosis, bioproduct characterization and biomarker discovery for highly desirable economic traits (i.e., feed efficiency, growth potential and milk production). A secondary goal of this critical review was to compile data on the known composition of the livestock metabolome (for 5 of the most common livestock species namely cattle, sheep, goats, horses and pigs). These data have been made available through an open access, comprehensive livestock metabolome database (LMDB, available at http://www.lmdb.ca). The LMDB should enable livestock researchers and producers to conduct more targeted metabolomic studies and to identify where further metabolome coverage is needed. PMID:28531195
Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J
Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved. PMID:21839162
Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)
The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation including. The annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".
Hong, Jun; Yang, Litao; Zhang, Dabing; Shi, Jianxin
As genomes of many plant species have been sequenced, demand for functional genomics has dramatically accelerated the improvement of other omics including metabolomics. Despite a large amount of metabolites still remaining to be identified, metabolomics has contributed significantly not only to the understanding of plant physiology and biology from the view of small chemical molecules that reflect the end point of biological activities, but also in past decades to the attempts to improve plant behavior under both normal and stressed conditions. Hereby, we summarize the current knowledge on the genetic and biochemical mechanisms underlying plant growth, development, and stress responses, focusing further on the contributions of metabolomics to practical applications in crop quality improvement and food safety assessment, as well as plant metabolic engineering. We also highlight the current challenges and future perspectives in this inspiring area, with the aim to stimulate further studies leading to better crop improvement of yield and quality. PMID:27258266
Background. This paper presents the literature on biomarkers of in vitro fertilisation (IVF) outcome, demonstrating the progression of these studies towards metabolite profiling, specifically metabolomics. The need for more, and improved, metabolomics studies in the field of assisted conception is discussed. Methods. Searches were performed on ISI Web of Knowledge for literature associated with biomarkers of oocyte and embryo quality, and biomarkers of IVF outcome in embryo culture medium, follicular fluid (FF), and blood plasma in female mammals. Results. Metabolomics in the field of female reproduction is still in its infancy. Metabolomics investigations of embryo culture medium for embryo selection have been the most common, but only within the last five years. Only in 2012 has the first metabolomics investigation of FF for biomarkers of oocyte quality been reported. The only metabolomics studies of human blood plasma in this context have been aimed at identifying women with polycystic ovary syndrome (PCOS). Conclusions. Metabolomics is becoming more established in the field of assisted conception, but the studies performed so far have been preliminary and not all potential applications have yet been explored. With further improved metabolomics studies, the possibility of identifying a method for predicting IVF outcome may become a reality.
Töpfer, Nadine; Kleessen, Sabrina; Nikoloski, Zoran
Metabolite levels together with their corresponding metabolic fluxes are integrative outcomes of biochemical transformations and regulatory processes and they can be used to characterize the response of biological systems to genetic and/or environmental changes. However, while changes in transcript or to some extent protein levels can usually be traced back to one or several responsible genes, changes in fluxes and particularly changes in metabolite levels do not follow such rationale and are often the outcome of complex interactions of several components. The increasing quality and coverage of metabolomics technologies have fostered the development of computational approaches for integrating metabolic read-outs with large-scale models to predict the physiological state of a system. Constraint-based approaches, relying on the stoichiometry of the considered reactions, provide a modeling framework amenable to analyses of large-scale systems and to the integration of high-throughput data. Here we review the existing approaches that integrate metabolomics data in variants of constrained-based approaches to refine model reconstructions, to constrain flux predictions in metabolic models, and to relate network structural properties to metabolite levels. Finally, we discuss the challenges and perspectives in the developments of constraint-based modeling approaches driven by metabolomics data.
Vaniya, Arpana; Fiehn, Oliver
Identification of unknown metabolites is the bottleneck in advancing metabolomics, leaving interpretation of metabolomics results ambiguous. The chemical diversity of metabolism is vast, making structure identification arduous and time consuming. Currently, comprehensive analysis of mass spectra in metabolomics is limited to library matching, but tandem mass spectral libraries are small compared to the large number of compounds found in the biosphere, including xenobiotics. Resolving this bottleneck requires richer data acquisition and better computational tools. Multi-stage mass spectrometry (MSn) trees show promise to aid in this regard. Fragmentation trees explore the fragmentation process, generate fragmentation rules and aid in sub-structure identification, while mass spectral trees delineate the dependencies in multi-stage MS of collision-induced dissociations. This review covers advancements over the past 10 years as a tool for metabolite identification, including algorithms, software and databases used to build and to implement fragmentation trees and mass spectral annotations.
Heaton, Pamela; Wallace, Gregory L.
Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…
Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.
We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete
Wishart, David S; Jewison, Timothy; Guo, An Chi; Wilson, Michael; Knox, Craig; Liu, Yifeng; Djoumbou, Yannick; Mandal, Rupasri; Aziat, Farid; Dong, Edison; Bouatra, Souhaila; Sinelnikov, Igor; Arndt, David; Xia, Jianguo; Liu, Philip; Yallou, Faizath; Bjorndahl, Trent; Perez-Pineiro, Rolando; Eisner, Roman; Allen, Felicity; Neveu, Vanessa; Greiner, Russ; Scalbert, Augustin
The Human Metabolome Database (HMDB) (www.hmdb.ca) is a resource dedicated to providing scientists with the most current and comprehensive coverage of the human metabolome. Since its first release in 2007, the HMDB has been used to facilitate research for nearly 1000 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 3.0) has been significantly expanded and enhanced over the 2009 release (version 2.0). In particular, the number of annotated metabolite entries has grown from 6500 to more than 40,000 (a 600% increase). This enormous expansion is a result of the inclusion of both 'detected' metabolites (those with measured concentrations or experimental confirmation of their existence) and 'expected' metabolites (those for which biochemical pathways are known or human intake/exposure is frequent but the compound has yet to be detected in the body). The latest release also has greatly increased the number of metabolites with biofluid or tissue concentration data, the number of compounds with reference spectra and the number of data fields per entry. In addition to this expansion in data quantity, new database visualization tools and new data content have been added or enhanced. These include better spectral viewing tools, more powerful chemical substructure searches, an improved chemical taxonomy and better, more interactive pathway maps. This article describes these enhancements to the HMDB, which was previously featured in the 2009 NAR Database Issue. (Note to referees, HMDB 3.0 will go live on 18 September 2012.).
Stanley, N.E.; Thurow, T.L.; Russell, B.F.; Sullivan, J.F.
This annotated bibliography covers the following topics: algae, wetland ecosystems; institutional aspects; macrophytes - general, production rates, and mineral absorption; trace metal absorption; wetland soils; water quality; and other aspects of marsh ecosystems. (MHR)
Simó, Carolina; Ibáñez, Clara; Valdés, Alberto; Cifuentes, Alejandro; García-Cañas, Virginia
Metabolomic-based approaches are increasingly applied to analyse genetically modified organisms (GMOs), making it possible to obtain broader and deeper information on the composition of GMOs compared to that obtained from traditional analytical approaches. The combination in metabolomics of advanced analytical methods and bioinformatics tools provides wide chemical compositional data that contributes to corroborate (or not) the substantial equivalence and occurrence of unintended changes resulting from genetic transformation. This review provides insight into recent progress in metabolomics studies on transgenic crops, focusing mainly on papers published in the last decade. PMID:25334064
Wang, San-Yuan; Kuo, Ching-Hua; Tseng, Yufeng J
Able to detect known and unknown metabolites, untargeted metabolomics has shown great potential in identifying novel biomarkers. However, elucidating all possible liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) ion signals in a complex biological sample remains challenging since many ions are not the products of metabolites. Methods of reducing ions not related to metabolites or simply directly detecting metabolite related (pure) ions are important. In this work, we describe PITracer, a novel algorithm that accurately detects the pure ions of an LC/TOF-MS profile to extract pure ion chromatograms and detect chromatographic peaks. PITracer estimates the relative mass difference tolerance of ions and calibrates the mass over charge (m/z) values for peak detection algorithms, with an additional option for further mass correction with respect to a user-specified metabolite. PITracer was evaluated using two data sets containing 373 human metabolite standards, including 5 saturated standards considered to be split peaks resultant from huge m/z fluctuation, and 12 urine samples spiked with 50 forensic drugs of varying concentrations. Analysis of these data sets shows that PITracer outperformed the existing state-of-the-art algorithm: it extracted the pure ion chromatograms of the 5 saturated standards without generating split peaks and detected the forensic drugs with high recall, precision, and F-score and small mass error.
Fazelzadeh, P.; Hangelbroek, R.W.J.; Tieland, M.; de Groot, C.P.G.M.; Verdijk, L.B.; van Loon, L.J.C.; Smilde, A.K.; Alves, R.D.A.M.; Vervoort, J.; Müller, M.; van Duynhoven, J.P.M.; Boekschoten, M.V.
Populations around the world are aging rapidly. Age-related loss of physiological functions negatively affects quality of life. A major contributor to the frailty syndrome of aging is loss of skeletal muscle. In this study we assessed the skeletal muscle biopsy metabolome of healthy young, healthy
Guitton, Yann; Tremblay-Franco, Marie; Le Corguillé, Gildas; Martin, Jean-François; Pétéra, Mélanie; Roger-Mele, Pierrick; Delabrière, Alexis; Goulitquer, Sophie; Monsoor, Misharl; Duperier, Christophe; Canlet, Cécile; Servien, Rémi; Tardivel, Patrick; Caron, Christophe; Giacomoni, Franck; Thévenot, Etienne A
Metabolomics is a key approach in modern functional genomics and systems biology. Due to the complexity of metabolomics data, the variety of experimental designs, and the multiplicity of bioinformatics tools, providing experimenters with a simple and efficient resource to conduct comprehensive and rigorous analysis of their data is of utmost importance. In 2014, we launched the Workflow4Metabolomics (W4M; http://workflow4metabolomics.org) online infrastructure for metabolomics built on the Galaxy environment, which offers user-friendly features to build and run data analysis workflows including preprocessing, statistical analysis, and annotation steps. Here we present the new W4M 3.0 release, which contains twice as many tools as the first version, and provides two features which are, to our knowledge, unique among online resources. First, data from the four major metabolomics technologies (i.e., LC-MS, FIA-MS, GC-MS, and NMR) can be analyzed on a single platform. By using three studies in human physiology, alga evolution, and animal toxicology, we demonstrate how the 40 available tools can be easily combined to address biological issues. Second, the full analysis (including the workflow, the parameter values, the input data and output results) can be referenced with a permanent digital object identifier (DOI). Publication of data analyses is of major importance for robust and reproducible science. Furthermore, the publicly shared workflows are of high-value for e-learning and training. The Workflow4Metabolomics 3.0 e-infrastructure thus not only offers a unique online environment for analysis of data from the main metabolomics technologies, but it is also the first reference repository for metabolomics workflows. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dominguez Del Angel, Victoria; Hjerde, Erik; Sterck, Lieven; Capella-Gutierrez, Salvadors; Notredame, Cederic; Vinnere Pettersson, Olga; Amselem, Joelle; Bouri, Laurent; Bocs, Stephanie; Klopp, Christophe; Gibrat, Jean-Francois; Vlasova, Anna; Leskosek, Brane L.; Soler, Lucile; Binzer-Panchal, Mahesh; Lantz, Henrik
As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). PMID:29568489
Song, Dezhao; Chute, Christopher G; Tao, Cui
To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manually annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfactory. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.
U.S. Department of Health & Human Services — The Metabolomics Program's Data Repository and Coordinating Center (DRCC), housed at the San Diego Supercomputer Center (SDSC), University of California, San Diego,...
Metabolomics is the scientific discipline that identifies and quantifies endogenous and exogenous metabolites in different biological samples. Metabolites are crucial components of a biological system and they are highly informative about its functional state, due to their closeness to the organism...... focused on the analysis of various samples covering a wide range of fields, namely, food and nutraceutical sciences, cell metabolomics and medicine using a metabolomics approach. Indeed, the first part of the thesis describes two exploratory studies performed on Algerian extra virgin olive oil and apple...... juice from ancient Danish apple cultivars. Both studies revealed variety-related peculiarities that would have been difficult to detect by means of traditional analysis. The second part of the project includes four metabolomics studies performed on samples of biological origin. In particular, the first...
Following a general introduction, this book includes details of metabolomics of model species including Arabidopsis and tomato. Further chapters provide in-depth coverage of abiotic stress, data integration, systems biology, genetics, genomics, chemometrics and biostatistics. Applications of plant
Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A
With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Brouwer-Brolsma, Elske M; Brennan, Lorraine; Drevon, Christian A
food metabolomics techniques that allow the quantification of up to thousands of metabolites simultaneously, which may be applied in intervention and observational studies. As biomarkers are often influenced by various other factors than the food under investigation, FoodBAll developed a food intake...... in these metabolomics studies, knowledge about available electronic metabolomics resources is necessary and further developments of these resources are essential. Ultimately, present efforts in this research area aim to advance quality control of traditional dietary assessment methods, advance compliance evaluation...
Roessner, U.; Rolin, D.; Rijswijk, van M.E.C.; Hall, R.D.; Hankemeier, T.
In 2012 the Metabolomics Society established a more formal system for national and regional metabolomics initiatives, interest groups, societies and networks to become an International Affiliate of the Society. A number of groups (http://metabolomicssociety.org/international-affilia
Gorbalenya Alexander E
Background A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which may be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation, and efficiency of communication and knowledge dissemination among researchers.
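The template-directed conversion described in this abstract can be illustrated with a minimal sketch. It is not SNAD's implementation: the annotation records, the field names (organism, acronym, protein), and the function rename are hypothetical, and a real tool would pull the fields from the cognate database entries rather than from a hard-coded dictionary.

```python
from string import Template

# Hypothetical annotation records keyed by sequence UID (assumption: a real
# tool would fetch these fields from entries of an external database).
ANNOTATIONS = {
    "UID0001": {"organism": "Example virus A", "acronym": "EVA", "protein": "polymerase"},
    "UID0002": {"organism": "Example virus B", "acronym": "EVB", "protein": "capsid"},
}


def rename(uids, template, annotations=ANNOTATIONS):
    """Map each UID to a name built from the template.

    UIDs without an annotation record are returned unchanged so that
    nothing is silently dropped from an alignment or tree.
    """
    pattern = Template(template)
    names = {}
    for uid in uids:
        fields = annotations.get(uid)
        names[uid] = pattern.substitute(fields, uid=uid) if fields else uid
    return names


if __name__ == "__main__":
    # A user-defined template combining two annotation fields.
    print(rename(["UID0001", "UID0002", "UID9999"], "${acronym}_${protein}"))
```

Keeping the template as a plain string means the naming scheme can be changed without touching the code, which is the point of a template-driven design.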
Martinez Alonso, Hector
Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...... and metonymic. We have conducted an analysis in English, Danish and Spanish. Later on, we have tried to replicate the human judgments by means of unsupervised and semi-supervised sense prediction. The automatic sense-prediction systems have been unable to find empiric evidence for the underspecified sense, even...
Uziel, M.S.; Hannon, E.H.
This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title
Wolahan, Stephanie M.; Hirt, Daniel; Braas, Daniel; Glenn, Thomas C.
Synopsis Metabolomics is an important member of the omics community in that it defines which small molecules may be responsible for disease states. This article reviews the essential principles of metabolomics from specimen preparation, chemical analysis, and advanced statistical methods. Metabolomics in TBI has so far been underutilized. Future metabolomics based studies focused on the diagnoses, prognoses, and treatment effects, need to be conducted across all types of TBI. PMID:27637396
Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier
High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotations. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...
Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.
We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR” suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven
Macel, M.; Van Dam, N.M.; Keurentjes, J.J.B.
Metabolomics is a fast developing field of comprehensive untargeted chemical analyses. It has many applications and can in principle be used on any organism without prior knowledge of the metabolome or genome. The amount of functional information that is acquired with metabolomics largely depends on
Background The genome sequencing projects have shown our limited knowledge regarding gene function, e.g. S. cerevisiae has 5–6,000 genes of which nearly 1,000 have an uncertain function. Their gross influence on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions. Description MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML) are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/. Conclusion The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.
Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E
It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.
Hamed Hassanzadeh; MohammadReza Keyvanpour
The Semantic Web is an extension of the current web in which information is given well-defined meaning. The perspective of Semantic Web is to promote the quality and intelligence of the current web by changing its contents into machine understandable form. Therefore, semantic level information is one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles against the Semantic Annotation, such as ...
Kalim, Sahir; Rhee, Eugene P
The high-throughput, high-resolution phenotyping enabled by metabolomics has been applied increasingly to a variety of questions in nephrology research. This article provides an overview of current metabolomics methodologies and nomenclature, citing specific considerations in sample preparation, metabolite measurement, and data analysis that investigators should understand when examining the literature or designing a study. Furthermore, we review several notable findings that have emerged in the literature that both highlight some of the limitations of current profiling approaches, as well as outline specific strengths unique to metabolomics. More specifically, we review data on the following: (i) tryptophan metabolites and chronic kidney disease onset, illustrating the interpretation of metabolite data in the context of established biochemical pathways; (ii) trimethylamine-N-oxide and cardiovascular disease in chronic kidney disease, illustrating the integration of exogenous and endogenous inputs to the blood metabolome; and (iii) renal mitochondrial function in diabetic kidney disease and acute kidney injury, illustrating the potential for rapid translation of metabolite data for diagnostic or therapeutic aims. Finally, we review future directions, including the need to better characterize interperson and intraperson variation in the metabolome, pool existing data sets to identify the most robust signals, and capitalize on the discovery potential of emerging nontargeted methods. Copyright © 2016 International Society of Nephrology. Published by Elsevier Inc. All rights reserved.
Cuperlovic-Culf, M; Culf, A S
The metabolic profile is a direct signature of phenotype and biochemical activity following any perturbation. Metabolites are small molecules present in a biological system including natural products as well as drugs and their metabolism by-products depending on the biological system studied. Metabolomics can provide activity information about possible novel drugs and drug scaffolds, indicate interesting targets for drug development and suggest binding partners of compounds. Furthermore, metabolomics can be used for the discovery of novel natural products and in drug development. Metabolomics can enhance the discovery and testing of new drugs and provide insight into the on- and off-target effects of drugs. This review focuses primarily on the application of metabolomics in the discovery of active drugs from natural products and the analysis of chemical libraries and the computational analysis of metabolic networks. Metabolomics methodology, both experimental and analytical is fast developing. At the same time, databases of compounds are ever growing with the inclusion of more molecular and spectral information. An increasing number of systems are being represented by very detailed metabolic network models. Combining these experimental and computational tools with high throughput drug testing and drug discovery techniques can provide new promising compounds and leads.
Yu, Guoxian; Lu, Chang; Wang, Jun
Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Owing to limited resources, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, identifying noisy annotations is an important yet seldom-studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. First, NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Secondly, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations.
Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA.
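The neighborhood-voting step mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the NoGOA code: cosine similarity on a toy binary gene-term matrix stands in for the sparse representation coefficients, the evidence-code weighting and GO hierarchy propagation are omitted, and the names cosine and neighbour_votes are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbour_votes(matrix, gene, k=2):
    """Score each term annotated to `gene` by neighbour support.

    matrix maps gene name -> binary list (1 = gene annotated with term).
    The score is the similarity-weighted fraction of the k most similar
    genes that carry the same term; a low score flags a candidate noisy
    annotation.
    """
    target = matrix[gene]
    sims = sorted(
        ((cosine(target, row), name) for name, row in matrix.items() if name != gene),
        reverse=True,
    )[:k]
    total = sum(s for s, _ in sims)
    votes = {}
    for term, flag in enumerate(target):
        if flag:
            support = sum(s for s, name in sims if matrix[name][term])
            votes[term] = support / total if total else 0.0
    return votes


if __name__ == "__main__":
    # Toy data (assumption): term index 3 is carried by g1 only.
    toy = {
        "g1": [1, 1, 0, 1],
        "g2": [1, 1, 0, 0],
        "g3": [1, 1, 0, 0],
        "g4": [0, 0, 1, 0],
    }
    print(neighbour_votes(toy, "g1"))
```

In the toy matrix the annotation of g1 with term 3 receives no support from its nearest neighbours, so it would be the one flagged for review.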
Kwon, Hyuk Nam; Phan, Hong-Duc; Xu, Wen Jun; Ko, Yoon-Joo; Park, Sunghyouk
Herbal medicines have been used for a long time all around the world. Since the quality of herbal preparations depends on the source of herbal materials, there has been a strong need to develop methods to correctly identify the origin of materials. The objective was to develop a smartphone metabolomics platform as a simpler and low-cost alternative for the identification of herbal material source. Schisandra sinensis extracts from Korea and China were prepared. The visible spectra of all samples were measured by a smartphone spectrometer platform. This platform included all the necessary measures built-in for the metabolomics research: data acquisition, processing, chemometric analysis and visualisation of the results. The result of the smartphone metabolomics platform was compared to that of NMR-based metabolomics, suggesting the feasibility of the smartphone platform in metabolomics research. The smartphone metabolomics platform gave similar results to the NMR method, showing good separation between Korean and Chinese materials and correct predictability for all test samples. With its accuracy and advantages of affordability, user-friendliness, and portability, the smartphone metabolomics platform could be applied to the authentication of other medicinal plants. Copyright © 2016 John Wiley & Sons, Ltd.
Grossman, Arthur R
Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual
Kim, Sooah; Kim, Jungyeon; Yun, Eun Ju; Kim, Kyoung Heon
Metabolomics, one of the latest components in the suite of systems biology, has been used to understand the metabolism and physiology of living systems, including microorganisms, plants, animals and humans. Food metabolomics can be defined as the application of metabolomics in food systems, including food resources, food processing and diet for humans. The study of food metabolomics has increased gradually in recent years, because food systems are directly related to nutrition and human health. This review describes the recent trends and applications of metabolomics to food systems, from farm to human, including food resource production, industrial food processing and food intake by humans. Copyright © 2015 Elsevier Ltd. All rights reserved.
Boot, P.; Boot, P.; Stronks, E.
From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material
ENGLISH TEACHER ANNOTATIONS WERE STUDIED TO DETERMINE THE DIMENSIONS AND PROPERTIES OF THE ENTIRE SYSTEM FOR WRITING CORRECTIONS AND CRITICISMS ON COMPOSITIONS. FOUR SETS OF COMPOSITIONS WERE WRITTEN BY STUDENTS IN GRADES 9 THROUGH 13. TYPESCRIPTS OF THE COMPOSITIONS WERE ANNOTATED BY CLASSROOM ENGLISH TEACHERS. THEN, 32 ENGLISH TEACHERS JUDGED…
Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A
SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of that in SWISS-PROT. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to develop and apply computer methods to enhance the functional information attached to TREMBL entries.
Kovalchuk, Anna; Nersisyan, Lilit; Mandal, Rupasri; Wishart, David; Mancini, Maria; Sidransky, David; Kolb, Bryan; Kovalchuk, Olga
Cancer survivors experience numerous treatment side effects that negatively affect their quality of life. Cognitive side effects are especially insidious, as they affect memory, cognition, and learning. Neurocognitive deficits occur prior to cancer treatment, arising even before cancer diagnosis, and we refer to them as “tumor brain.” Metabolomics is a new area of research that focuses on metabolome profiles and provides important mechanistic insights into various human diseases, including cancer, neurodegenerative diseases, and aging. Many neurological diseases and conditions affect metabolic processes in the brain. However, the tumor brain metabolome has never been analyzed. In our study we used direct flow injection/mass spectrometry (DI-MS) analysis to establish the effects of the growth of lung cancer, pancreatic cancer, and sarcoma on the brain metabolome of TumorGraft™ mice. We found that the growth of malignant non-CNS tumors impacted metabolic processes in the brain, affecting protein biosynthesis, and amino acid and sphingolipid metabolism. The observed metabolic changes were similar to those reported for neurodegenerative diseases and brain aging, and may have potential mechanistic value for future analysis of the tumor brain phenomenon. PMID:29515623
modeling components are not directly bound to framework by the use of specific APIs and/or data types they can more easily be reused both within the framework as well as outside. While providing all those capabilities, a significant reduction in the size of the model source code was achieved. To support the benefit of annotations for a modeler, studies were conducted to evaluate the effectiveness of an annotation based framework approach with other modeling frameworks and libraries, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A typical hydrological model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. Experience to date has demonstrated the multi-purpose value of using annotations. Annotations are also a feasible and practical method to enable interoperability among models and modeling frameworks.
Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie
We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system, and documentation is available at http://www.gnpannot.org/content/chado-controller-doc. The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form. Supplementary data are available at Bioinformatics online.
Mondul, Alison M; Weinstein, Stephanie J; Albanes, Demetrius
How micronutrients might influence risk of developing adenocarcinoma of the prostate has been the focus of a large body of research (especially regarding vitamins E, A, and D). Metabolomic profiling has the potential to discover molecular species relevant to prostate cancer etiology, early detection, and prevention, and may help elucidate the biologic mechanisms through which vitamins influence prostate cancer risk. Prostate cancer risk data related to vitamins E, A, and D and metabolomic profiling from clinical, cohort, and nested case-control studies, along with randomized controlled trials, are examined and summarized, along with recent metabolomic data of the vitamin phenotypes. Higher vitamin E serologic status is associated with lower prostate cancer risk, and vitamin E genetic variant data support this. By contrast, controlled vitamin E supplementation trials have had mixed results based on differing designs and dosages. Beta-carotene supplementation (in smokers) and higher circulating retinol and 25-hydroxy-vitamin D concentrations appear related to elevated prostate cancer risk. Our prospective metabolomic profiling of fasting serum collected 1-20 years prior to clinical diagnoses found reduced lipid and energy/TCA cycle metabolites, including inositol-1-phosphate, lysolipids, alpha-ketoglutarate, and citrate, significantly associated with lower risk of aggressive disease. Several active leads exist regarding the role of micronutrients and metabolites in prostate cancer carcinogenesis and risk. How vitamins D and A may adversely impact risk, and whether low-dose vitamin E supplementation remains a viable preventive approach, require further study.
Fiehn, O.; Robertson, D.; Griffin, J.; Werf, M. van der; Nikolau, B.; Morrison, N.; Sumner, L.W.; Goodacre, R.; Hardy, N.W.; Taylor, C.; Fostel, J.; Kristal, B.; Kaddurah-Daouk, R.; Mendes, P.; Ommen, B. van; Lindon, J.C.; Sansone, S.-A.
In 2005, the Metabolomics Standards Initiative was formed. An outline and general introduction is provided to inform about the history, structure, working plan and intentions of this initiative. Comments on any of the suggested minimal reporting standards are welcome to be sent to the open
Gede, Mátyás; Farbinger, Anna
Thanks to the efforts of the various globe digitising projects, nowadays there are plenty of old globes that can be examined as 3D models on the computer screen. These globes usually contain a lot of interesting details that an average observer would not fully discover at first glance. The authors developed a website that can display annotations for such digitised globes. These annotations help observers of the globe to discover all the important, interesting details. Annotations consist of a plain text title, an HTML-formatted descriptive text and a corresponding polygon, and are stored in KML format. The website is powered by the Cesium virtual globe engine.
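An annotation of the kind described here (title, HTML description, polygon) might be serialised to KML along the following lines. This is a minimal sketch using only the Python standard library; the function name annotation_to_kml and the sample title and coordinates are hypothetical, and the authors' actual file layout is not given in the abstract.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def annotation_to_kml(title, html_description, ring):
    """Return a KML document string holding one Placemark with a polygon.

    ring is a list of (longitude, latitude) pairs; the ring is closed
    automatically if its first and last points differ. The HTML text is
    XML-escaped on output rather than wrapped in CDATA.
    """
    ET.register_namespace("", KML_NS)
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = title
    ET.SubElement(placemark, f"{{{KML_NS}}}description").text = html_description
    polygon = ET.SubElement(placemark, f"{{{KML_NS}}}Polygon")
    outer = ET.SubElement(polygon, f"{{{KML_NS}}}outerBoundaryIs")
    linear = ET.SubElement(outer, f"{{{KML_NS}}}LinearRing")
    coords = ET.SubElement(linear, f"{{{KML_NS}}}coordinates")
    # KML orders each tuple as longitude,latitude,altitude.
    coords.text = " ".join(f"{lon},{lat},0" for lon, lat in ring)
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical annotation: a small triangle with a short HTML note.
    print(annotation_to_kml("Sea monster", "<b>Decorative</b> figure",
                            [(10.0, 40.0), (12.0, 40.0), (11.0, 42.0)]))
```

Because the output is plain KML, a virtual globe engine that reads KML should be able to load it without a custom parser.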
Tie Hua Zhou
The ever-increasing quantities of digital photo resources are annotated with enriching vocabularies to form semantic annotations. Photo-sharing social networks have boosted the need for efficient and intuitive querying to respond to user requirements in large-scale image collections. To help users formulate efficient and effective image retrieval, we present a novel integration of a probabilistic model, based on a keyword query architecture, that models the probability distribution of image annotations, allowing users to obtain satisfactory results from image retrieval via the integration of multiple annotations. We focus on the annotation integration step in order to specify the meaning of each image annotation, thus leading to the annotations most representative of the intent of a keyword search. For this demonstration, we show how a probabilistic model has been integrated with semantic annotations to allow users to intuitively define explicit and precise keyword queries in order to retrieve satisfactory image results distributed in heterogeneous large data sources. Our experiments on the SBU database (collected by Stony Brook University) show that (i) our integrated annotation contains higher-quality representatives and semantic matches; and (ii) annotation integration can indeed improve image search result quality.
Ida, Megumi; Kosaka, Reia; Miura, Daisuke; Wariishi, Hiroyuki; Maeda-Yamamoto, Mari; Nesumi, Atsushi; Saito, Takeshi; Kanda, Tomomasa; Yamada, Koji; Tachibana, Hirofumi
Background Green tea has various health promotion effects. Although there are numerous tea cultivars, little is known about the differences in their nutraceutical properties. Metabolic profiling techniques can provide information on the relationship between the metabolome and factors such as phenotype or quality. Here, we performed metabolomic analyses to explore the relationship between the metabolome and health-promoting attributes (bioactivity) of diverse Japanese green tea cultivars. Methodology/Principal Findings We investigated the ability of leaf extracts from 43 Japanese green tea cultivars to inhibit thrombin-induced phosphorylation of myosin regulatory light chain (MRLC) in human umbilical vein endothelial cells (HUVECs). This thrombin-induced phosphorylation is a potential hallmark of vascular endothelial dysfunction. Among the tested cultivars, Cha Chuukanbohon Nou-6 (Nou-6) and Sunrouge (SR) strongly inhibited MRLC phosphorylation. To evaluate the bioactivity of green tea cultivars using a metabolomics approach, the metabolite profiles of all tea extracts were determined by high-performance liquid chromatography-mass spectrometry (LC-MS). Multivariate statistical analyses, principal component analysis (PCA) and orthogonal partial least-squares-discriminant analysis (OPLS-DA), revealed differences among green tea cultivars with respect to their ability to inhibit MRLC phosphorylation. In the SR cultivar, polyphenols were associated with its unique metabolic profile and its bioactivity. In addition, using partial least-squares (PLS) regression analysis, we succeeded in constructing a reliable bioactivity-prediction model to predict the inhibitory effect of tea cultivars based on their metabolome. This model was based on certain identified metabolites that were associated with bioactivity. When added to an extract from the non-bioactive cultivar Yabukita, several metabolites enriched in SR were able to transform the extract into a bioactive extract
Mayorga Gross, Ana Lucía; Quirós Guerrero, Luis Manuel; Fourny, G.; Vaillant Barka, Fabrice
Fermentation is a critical step in the processing of high-quality cocoa; however, the biochemistry behind it is still not well understood at the molecular level. In this research, the main metabolomic changes that occur throughout the fermentation process were explored using a non-targeted approach. Genetically undefined cocoa varieties from Trinidad and Tobago (n = 3), Costa Rica (n = 1) and one clone IMC-67 (n = 3) were subjected to spontaneous fermentation using farm-based and pilot plant cont...
Enteric (or typhoid) fever is a systemic infection mainly caused by Salmonella Typhi and Salmonella Paratyphi A. The disease is common in areas with poor water quality and insufficient sanitation. Humans are the only reservoir for transmission of the disease. The presence of asymptomatic chronic carriers is a complicating factor for the transmission. There are major limitations regarding the current diagnostic methods both for acute infection and chronic carriage. Metabolomics is a methodolog...
Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and from highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other
Matsuda, Fumio; Nakabayashi, Ryo; Sawada, Yuji; Suzuki, Makoto; Hirai, Masami Y.; Kanaya, Shigehiko; Saito, Kazuki
A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometer metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method. PMID:22645535
Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Corthaut, Nik; Lippens, Stefaan; Govaerts, Sten; Duval, Erik; Martens, Jean-Pierre
In the MuziK project we try to automate the typically hard manual task of annotating music files. This annotation is used for music recommendation and for automated playlist creation. The music experts of Aristo Music (http://www.aristomusic.com) defined the data fields. High-quality annotations are required since the results, playlists, are used in commercial live settings and the cost of a wrong selection is high.
Background: Generating good training datasets is essential for machine learning-based nuclei detection methods. However, creating exhaustive nuclei contour annotations, from which to derive optimal training data, is often infeasible. Methods: We compared different approaches for training nuclei detection methods solely based on nucleus center markers. Such markers contain less accurate information, especially with regard to nuclear boundaries, but can be produced much more easily and in greater quantities. The approaches use different automated sample extraction methods to derive image positions and class labels from nucleus center markers. In addition, the approaches use different automated sample selection methods to improve the detection quality of the classification algorithm and reduce the run time of the training process. We evaluated the approaches based on a previously published generic nuclei detection algorithm and a set of Ki-67-stained breast cancer images. Results: A Voronoi tessellation-based sample extraction method produced the best performing training sets. However, subsampling of the extracted training samples was crucial. Even simple class balancing improved the detection quality considerably. The incorporation of active learning led to a further increase in detection quality. Conclusions: With appropriate sample extraction and selection methods, nuclei detection algorithms trained on the basis of simple center marker annotations can produce comparable quality to algorithms trained on conventionally created training sets.
Nielsen, Jens; Oliver, S.
The metabolome of a cell represents the amplification and integration of signals from other functional genomic levels, such as the transcriptome and the proteome. Although this makes metabolomics a useful tool for the high-throughput analysis of phenotypes, the lack of a direct connection to the genome makes it difficult to interpret metabolomic data. Nevertheless, functional genomics has produced examples of the use of metabolomics to elucidate the phenotypes of otherwise silent mutations. Despite several successes, we believe that future metabolomic studies must focus on the accurate measurement of the concentrations of unambiguously identified metabolites. The research community must develop databases of metabolite concentrations in cells that are grown in several well-defined conditions if metabolomic data are to be integrated meaningfully with data from the other levels of functional...
Wang, Lei; Sun, Xiaoliang; Weiszmann, Jakob; Weckwerth, Wolfram
Grapevine is a fruit crop with worldwide economic importance. The grape berry undergoes complex biochemical changes from fruit set until ripening. This ripening process and production processes define the wine quality. Thus, a thorough understanding of berry ripening is crucial for the prediction of wine quality. For a systemic analysis of grape berry development we applied mass spectrometry based platforms to analyse the metabolome and proteome of Early Campbell at 12 stages covering major developmental phases. Primary metabolites involved in central carbon metabolism, such as sugars, organic acids and amino acids together with various bioactive secondary metabolites like flavonols, flavan-3-ols and anthocyanins were annotated and quantified. At the same time, the proteomic analysis revealed the protein dynamics of the developing grape berries. Multivariate statistical analysis of the integrated metabolomic and proteomic dataset revealed the growth trajectory and corresponding metabolites and proteins contributing most to the specific developmental process. K-means clustering analysis revealed 12 highly specific clusters of co-regulated metabolites and proteins. Granger causality network analysis allowed for the identification of time-shift correlations between metabolite-metabolite, protein- protein and protein-metabolite pairs which is especially interesting for the understanding of developmental processes. The integration of metabolite and protein dynamics with their corresponding biochemical pathways revealed an energy-linked metabolism before veraison with high abundances of amino acids and accumulation of organic acids, followed by protein and secondary metabolite synthesis. Anthocyanins were strongly accumulated after veraison whereas other flavonoids were in higher abundance at early developmental stages and decreased during the grape berry developmental processes. A comparison of the anthocyanin profile of Early Campbell to other cultivars revealed
Fanos, Vassilios; Atzori, Luigi; Makarenko, Karina; Melis, Gian Benedetto; Ferrazzi, Enrico
Metabolomics in maternal-fetal medicine is still an “embryonic” science. However, there is already an increasing interest in metabolome of normal and complicated pregnancies, and neonatal outcomes. Tissues used for metabolomics interrogations of pregnant women, fetuses and newborns are amniotic fluid, blood, plasma, cord blood, placenta, urine, and vaginal secretions. All published papers highlight the strong correlation between biomarkers found in these tissues and fetal malformations, prete...
Ramirez, Tzutzuy; Daneshian, Mardas; Kamp, Hennicke; Bois, Frederic Y.; Clench, Malcolm R.; Coen, Muireann; Donley, Beth; Fischer, Steven M.; Ekman, Drew R.; Fabian, Eric; Guillou, Claude; Heuer, Joachim; Hogberg, Helena T.; Jungnickel, Harald; Keun, Hector C.; Krennrich, Gerhard; Krupp, Eckart; Luch, Andreas; Noor, Fozia; Peter, Erik; Riefke, Bjoern; Seymour, Mark; Skinner, Nigel; Smirnova, Lena; Verheij, Elwin; Wagner, Silvia; Hartung, Thomas; van Ravenzwaay, Bennard; Leist, Marcel
Summary Metabolomics, the comprehensive analysis of metabolites in a biological system, provides detailed information about the biochemical/physiological status of a biological system, and about the changes caused by chemicals. Metabolomics analysis is used in many fields, ranging from the analysis of the physiological status of genetically modified organisms in safety science to the evaluation of human health conditions. In toxicology, metabolomics is the -omics discipline that is most closely related to classical knowledge of disturbed biochemical pathways. It allows rapid identification of the potential targets of a hazardous compound. It can give information on target organs and often can help to improve our understanding regarding the mode-of-action of a given compound. Such insights aid the discovery of biomarkers that either indicate pathophysiological conditions or help the monitoring of the efficacy of drug therapies. The first toxicological applications of metabolomics were for mechanistic research, but different ways to use the technology in a regulatory context are being explored. Ideally, further progress in that direction will position the metabolomics approach to address the challenges of toxicology of the 21st century. To address these issues, scientists from academia, industry, and regulatory bodies came together in a workshop to discuss the current status of applied metabolomics and its potential in the safety assessment of compounds. We report here on the conclusions of three working groups addressing questions regarding 1) metabolomics for in vitro studies 2) the appropriate use of metabolomics in systems toxicology, and 3) use of metabolomics in a regulatory context. PMID:23665807
Tokarz, Janina; Haid, Mark; Cecil, Alexander; Prehn, Cornelia; Artati, Anna; Möller, Gabriele; Adamski, Jerzy
The metabolome, although very dynamic, is sufficiently stable to provide specific quantitative traits related to health and disease. Metabolomics requires balanced use of state-of-the-art study design, chemical analytics, biostatistics, and bioinformatics to deliver meaningful answers to contemporary questions in human disease research. The technology is now frequently employed for biomarker discovery and for elucidating the mechanisms underlying endocrine-related diseases. Metabolomics has also enriched genome-wide association studies (GWAS) in this area by providing functional data. The contributions of rare genetic variants to metabolome variance and to the human phenotype have been underestimated until now. Copyright © 2017 Elsevier Ltd. All rights reserved.
Endogenous mechanisms for successful resolution of an acute inflammatory response and the local return to homeostasis are of interest because excessive inflammation underlies many human diseases. In this review, we provide an update and overview of functional metabolomics that identified a new bioactive metabolome of docosahexaenoic acid (DHA). Systematic studies revealed that DHA was converted to DHEA-derived novel bioactive products as well as aspirin-triggered (AT) forms of protectins. The new oxygenated DHEA-derived products blocked PMN chemotaxis, reduced P-selectin expression and platelet-leukocyte adhesion, and showed organ protection in ischemia/reperfusion injury. These products activated cannabinoid (CB2) receptors and not CB1 receptors. The AT-PD1 reduced neutrophil (PMN) recruitment in murine peritonitis. With human cells, AT-PD1 decreased transendothelial PMN migration as well as enhanced efferocytosis of apoptotic human PMN by macrophages. The recent findings reviewed here indicate that DHEA oxidative metabolism and aspirin-triggered conversion of DHA produce potent novel molecules with anti-inflammatory and organ-protective properties, opening the DHA metabolome functional roles.
Background Detection of low abundance metabolites is important for de novo mapping of metabolic pathways related to diet, microbiome or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data in a conservative manner, which tends to preclude detection of low abundance chemicals and chemicals found in small subsets of samples. The present study provides software to enhance such algorithms for feature detection, quality assessment, and annotation. Results xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules to: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings and data merger to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and shown to more than double the number of features extracted while improving quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were exclusively detected using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites. Conclusions xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.
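The fourth module above, matching high-resolution m/z values against chemical databases, can be illustrated with a generic mass-tolerance lookup. The sketch below is not xMSanalyzer code and does not use its API; the function names, the 10 ppm default, the choice to centre the window on the measured value, and the database entries are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a +/- ppm tolerance around mz."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def annotate(features, database, ppm=10.0):
    """Map each measured m/z to the database entries within the tolerance.

    features: iterable of measured m/z values.
    database: iterable of (name, theoretical_mz) pairs (hypothetical entries).
    """
    entries = sorted(database, key=lambda entry: entry[1])
    masses = [mz for _, mz in entries]
    hits = {}
    for mz in features:
        low, high = ppm_window(mz, ppm)
        # Binary search keeps the lookup fast for large databases.
        start, stop = bisect_left(masses, low), bisect_right(masses, high)
        hits[mz] = [entries[k][0] for k in range(start, stop)]
    return hits


if __name__ == "__main__":
    db = [("compound_A", 181.0707), ("compound_B", 181.0900), ("compound_C", 191.0197)]
    for mz, names in annotate([181.0712, 250.0000], db).items():
        print(mz, names)
```

A wider tolerance or a larger database returns more candidates per feature, which is why unmatched and multiply matched features both need to be reported rather than silently dropped.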
Holt, Carson; Yandell, Mark
Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.
Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua
Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so that they are inevitably trapped into suboptimal performance of these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems based on which a variety of loss functions with respect to objective-guided measures are defined. And then, we formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. According to the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper, we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
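The contrast drawn above between macro-averaged, Hamming and example-based measures can be made concrete with a small calculation. This is a plain-Python sketch of the standard definitions, not the authors' framework; the toy label matrix is invented, and scoring an empty label or example as F1 = 0 is an assumed convention.

```python
def f1(tp, fp, fn):
    """F1 from counts; returns 0.0 when there is nothing to score (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def counts(true_bits, pred_bits):
    """Return (tp, fp, fn) for two equal-length 0/1 sequences."""
    tp = sum(t == 1 and p == 1 for t, p in zip(true_bits, pred_bits))
    fp = sum(t == 0 and p == 1 for t, p in zip(true_bits, pred_bits))
    fn = sum(t == 1 and p == 0 for t, p in zip(true_bits, pred_bits))
    return tp, fp, fn


def hamming_loss(y_true, y_pred):
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / sum(len(row) for row in y_true)


def example_f1(y_true, y_pred):
    """Average of per-image (row-wise) F1 scores."""
    return sum(f1(*counts(t, p)) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Average of per-keyword (column-wise) F1 scores."""
    cols_true, cols_pred = list(zip(*y_true)), list(zip(*y_pred))
    return sum(f1(*counts(t, p)) for t, p in zip(cols_true, cols_pred)) / len(cols_true)


if __name__ == "__main__":
    # Rows are images, columns are keywords; the third keyword is rare.
    y_true = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
    y_pred = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0]]
    print("Hamming loss:", hamming_loss(y_true, y_pred))
    print("Example-based F1:", example_f1(y_true, y_pred))
    print("Macro F1:", macro_f1(y_true, y_pred))
```

In this toy case a single missed occurrence of the rare keyword barely moves the Hamming loss or the example-based F1 but removes a third of the macro F1, which is the sensitivity to infrequent keywords that the abstract describes.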
Chen, Yujie; Zhao, Zhongzhen; Chen, Hubiao; Yi, Tao; Qin, Minjian; Liang, Zhitao
Asian and American ginsengs are widely used medicinal materials and are increasingly used in health products. The two materials look alike but function differently. Various forms of both types of ginseng are found in the market, causing confusion for consumers in their choice. This study aimed to evaluate the overall quality of commercial Asian and American ginsengs and to investigate the characteristic chemical markers for differentiating between them. It investigated 17 Asian and 21 American ginseng samples using an ultra-HPLC combined with quadrupole time-of-flight MS/MS technique. The data were processed by principal component analysis and orthogonal partial least squares discriminant analysis. In the chromatograms, a total of 40 peaks were detected. Among them, six were positively identified, and all of the remainder were tentatively identified. According to statistical results, ginsenosides Rf, Rb2 and Rc together with their isomers and derivatives were more likely to be present in Asian ginsengs, whereas ginsenoside Rb1, pseudoginsenoside F11 and ginsenoside Rd together with their isomers and derivatives tended to be present in American ginsengs. For Asian ginsengs, ginsenoside Ra3 and 20-β-D-glucopyranosyl-ginsenoside-Rf were more likely to be present in forest samples, whereas contents of floralquinquenoside B, ginsenosides Ro and Rc, and zingibroside R1 were higher in sun-dried ginsengs. For American ginseng, wild samples often had more of the notoginsenosides R1 and Rw2 and less of the ginsenosides Rd, Rd isomer and 20 (S)-Rg3 than cultivated samples. The method provided important fingerprint information for authentication and evaluation of Asian and American ginsengs from various commercial products. Copyright © 2014 John Wiley & Sons, Ltd.
Wu, Tao; Qiao, Shuxuan; Shi, Chenze; Wang, Shuya; Ji, Guang
Diabetes has become a major global health problem. The elucidation of characteristic metabolic alterations during the diabetic progression is critical for better understanding its pathogenesis, and identifying potential biomarkers and drug targets. Metabolomics is a promising tool to reveal the metabolic changes and the underlying mechanism involved in the pathogenesis of diabetic complications. The present review provides an update on the application of metabolomics in diabetic complications, including diabetic coronary artery disease, diabetic nephropathy, diabetic retinopathy and diabetic neuropathy, and this review provides notes on the prevention and prediction of diabetic complications. © 2017 The Authors. Journal of Diabetes Investigation published by Asian Association for the Study of Diabetes (AASD) and John Wiley & Sons Australia, Ltd.
Lv, Mengying; Huang, Wanqiu; Chen, Zhipeng; Jiang, Hulin; Chen, Jiaqing; Tian, Yuan; Zhang, Zunjian; Xu, Fengguo
Nanomaterials are commonly defined as engineered structures with at least one dimension of 100 nm or less. Investigations of their potential toxicological impact on biological systems and the environment have yet to catch up with the rapid development of nanotechnology and extensive production of nanoparticles. High-throughput methods are necessary to assess the potential toxicity of nanoparticles. The omics techniques are well suited to evaluate toxicity in both in vitro and in vivo systems. Besides genomic, transcriptomic and proteomic profiling, metabolomics holds great promise for globally evaluating and understanding the molecular mechanism of nanoparticle-organism interaction. This manuscript presents a general overview of metabolomics techniques, summarizes their early application in nanotoxicology and finally discusses opportunities and challenges faced in nanotoxicology.
A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.
Vorkas, Panagiotis A; Abellona U, M R; Li, Jia V
The use of tissue as a matrix to elucidate disease pathology or explore intervention comes with several advantages. It allows investigation of the target alteration directly at the focal location and facilitates the detection of molecules that could become elusive after secretion into biofluids. However, tissue metabolomics/metabonomics comes with challenges not encountered in biofluid analyses. Furthermore, tissue heterogeneity does not allow for tissue aliquoting. Here we describe a multiplatform, multi-method workflow that enables metabolic profiling analysis of tissue samples while delivering enhanced metabolome coverage. After applying a dual consecutive extraction (organic followed by aqueous), tissue extracts are analyzed by reversed-phase (RP-) and hydrophilic interaction liquid chromatography (HILIC-) ultra-performance liquid chromatography coupled to mass spectrometry (UPLC-MS) and nuclear magnetic resonance (NMR) spectroscopy. This pipeline incorporates the required quality control features, enhances versatility, allows provisional aliquoting of tissue extracts for future guided analyses, expands the range of metabolites robustly detected, and supports data integration. It has been successfully employed for the analysis of a wide range of tissue types.
van Rijswijk, Merlijn; Beirnaert, Charlie; Caron, Christophe; Cascante, Marta; Dominguez, Victoria; Dunn, Warwick B.; Ebbels, Timothy M. D.; Giacomoni, Franck; Gonzalez-Beltran, Alejandra; Hankemeier, Thomas; Haug, Kenneth; Izquierdo-Garcia, Jose L.; Jimenez, Rafael C.; Jourdan, Fabien; Kale, Namrata; Klapa, Maria I.; Kohlbacher, Oliver; Koort, Kairi; Kultima, Kim; Le Corguillé, Gildas; Moreno, Pablo; Moschonas, Nicholas K.; Neumann, Steffen; O’Donovan, Claire; Reczko, Martin; Rocca-Serra, Philippe; Rosato, Antonio; Salek, Reza M.; Sansone, Susanna-Assunta; Satagopam, Venkata; Schober, Daniel; Shimmo, Ruth; Spicer, Rachel A.; Spjuth, Ola; Thévenot, Etienne A.; Viant, Mark R.; Weber, Ralf J. M.; Willighagen, Egon L.; Zanetti, Gianluigi; Steinbeck, Christoph
Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the “Future of metabolomics in ELIXIR” was organised at Frankfurt Airport in Germany. This one-day strategic workshop involved representatives of ELIXIR Nodes, members of the PhenoMeNal consortium developing an e-infrastructure that supports workflow-based metabolomics analysis pipelines, and experts from the international metabolomics community. The workshop established metabolite identification as the critical area, where a maximal impact of computational metabolomics and data management on other fields could be achieved. In particular, the existing four ELIXIR Use Cases, where the metabolomics community - both industry and academia - would benefit most, and which could be exhaustively mapped onto the current five ELIXIR Platforms were discussed. This opinion article is a call for support for a new ELIXIR metabolomics Use Case, which aligns with and complements the existing and planned ELIXIR Platforms and Use Cases. PMID:29043062
van der Greef, J.; Smilde, A. K.
Metabolomics is a growing area in the field of systems biology. Metabolomics already has a long history, and its connection with chemometrics also goes back some time. This review discusses the symbiosis of metabolomics and chemometrics with emphasis on the medical domain, puts the
Gaston K Mazandu
With the advancement of new high throughput sequencing technologies, there has been an increase in the number of genome sequencing projects worldwide, which has yielded complete genome sequences of human, animals and plants. Subsequently, several labs have focused on genome annotation, consisting of assigning functions to gene products, mostly using Gene Ontology (GO) terms. As a consequence, there is an increased heterogeneity in annotations across genomes due to different approaches used by different pipelines to infer these annotations and also due to the nature of the GO structure itself. This makes a curator's task difficult, even if they adhere to the established guidelines for assessing these protein annotations. Here we develop a genome-scale approach for integrating GO annotations from different pipelines using semantic similarity measures. We used this approach to identify inconsistencies and similarities in functional annotations between orthologs of human and Drosophila melanogaster, to assess the quality of GO annotations derived from InterPro2GO mappings compared to manually annotated GO annotations for the Drosophila melanogaster proteome from a FlyBase dataset and human, and to filter GO annotation data for these proteomes. Results obtained indicate that an efficient integration of GO annotations eliminates redundancy up to 27.08 and 22.32% in the Drosophila melanogaster and human GO annotation datasets, respectively. Furthermore, we identified lack of and missing annotations for some orthologs, and annotation mismatches between InterPro2GO and manual pipelines in these two proteomes, thus requiring further curation. This simplifies and facilitates tasks of curators in assessing protein annotations, reduces redundancy and eliminates inconsistencies in large annotation datasets for ease of comparative functional genomics.
Balhoff, James P; Dahdul, Wasila M; Kothari, Cartik R; Lapp, Hilmar; Lundberg, John G; Mabee, Paula; Midford, Peter E; Westerfield, Monte; Vision, Todd J
Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.
Previous studies have shown that calcium stressed Saccharomyces cerevisiae, challenged with immunosuppressant drugs FK506 and Cyclosporin A, responds with comprehensive gene expression changes and attenuation of the generalized calcium stress response. Here, we describe a global metabolomics workflow for investigating the utility of tracking corresponding phenotypic changes. This was achieved by efficiently analyzing relative abundance differences between intracellular metabolite pools from wild-type and calcium stressed cultures, with and without prior immunosuppressant drug exposure. We used pathway database content from WikiPathways and YeastCyc to facilitate the projection of our metabolomics profiling results onto biological pathways. A key challenge was to increase the coverage of the detected metabolites. This was achieved by applying both reverse phase (RP) and aqueous normal phase (ANP) chromatographic separations, as well as electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI) sources for detection in both ion polarities. Unsupervised principal component analysis (PCA) and ANOVA results revealed differentiation between wild-type controls, calcium stressed and immunosuppressant/calcium challenged cells. Untargeted data mining resulted in 247 differentially expressed, annotated metabolites, across at least one pair of conditions. A separate, targeted data mining strategy identified 187 differential, annotated metabolites. All annotated metabolites were subsequently mapped onto curated pathways from YeastCyc and WikiPathways for interactive pathway analysis and visualization. Dozens of pathways showed differential responses to stress conditions based on one or more matches to the list of annotated metabolites or to metabolites that had been identified further by MS/MS. The purine salvage, pantothenate and sulfur amino acid pathways were flagged as being enriched, which is consistent with previously published
Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas
Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Metabolomics is an “omic” science that is now emerging with the purpose of elaborating a comprehensive analysis of the metabolome, which is the complete set of metabolites (i.e., small-molecule intermediates) in an organism, tissue, cell, or biofluid. In the past decade, metabolomics has already proved to be useful for the characterization of several pathological conditions and offers promise as a clinical tool. A metabolomics investigation of coeliac disease (CD) revealed that a metabolic fingerprint for CD can be defined, which accounts for three different but complementary components: malabsorption, energy metabolism, and alterations in gut microflora and/or intestinal permeability. In this review, we will discuss the major advancements in metabolomics of CD, in particular with respect to the role of gut microbiome and energy metabolism. PMID:24665364
Peng, Jun; Chen, Yi-Ting; Chen, Chien-Lun; Li, Liang
Large-scale metabolomics studies require a quantitative method to generate metabolome data over an extended period with high technical reproducibility. We report a universal metabolome-standard (UMS) method, in conjunction with chemical isotope labeling liquid chromatography-mass spectrometry (LC-MS), to provide long-term analytical reproducibility and facilitate metabolome comparison among different data sets. In this method, UMS of a specific type of sample labeled by an isotope reagent is prepared a priori. The UMS is spiked into any individual samples labeled by another form of the isotope reagent in a metabolomics study. The resultant mixture is analyzed by LC-MS to provide relative quantification of the individual sample metabolome to UMS. UMS is independent of a study undertaking as well as the time of analysis and useful for profiling the same type of samples in multiple studies. In this work, the UMS method was developed and applied for a urine metabolomics study of bladder cancer. UMS of human urine was prepared by (13)C2-dansyl labeling of a pooled sample from 20 healthy individuals. This method was first used to profile the discovery samples to generate a list of putative biomarkers potentially useful for bladder cancer detection and then used to analyze the verification samples about one year later. Within the discovery sample set, three-month technical reproducibility was examined using a quality control sample, which gave a mean CV of 13.9% and a median CV of 9.4% for all the quantified metabolites. Statistical analysis of the urine metabolome data showed a clear separation between the bladder cancer group and the control group from the discovery samples, which was confirmed by the verification samples. Receiver operating characteristic (ROC) test showed that the area under the curve (AUC) was 0.956 in the discovery data set and 0.935 in the verification data set. These results demonstrated the utility of the UMS method for long-term metabolomics and
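The mean and median CV figures quoted in this abstract summarize per-metabolite coefficients of variation across repeated quality-control injections. As a minimal sketch of that summary, assuming the usual definition CV = 100 × sample SD / mean (the abstract does not spell out its formula), and using invented metabolite names and readings that are not data from the study:

```python
from statistics import mean, median, stdev

def coefficient_of_variation(values):
    """Return the CV (%) of repeated measurements: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m

# Hypothetical peak ratios (sample / standard) for three metabolites
# across four QC injections; names and numbers are illustrative only.
qc_runs = {
    "metabolite_A": [1.00, 1.10, 0.90, 1.00],
    "metabolite_B": [2.00, 2.00, 2.00, 2.00],
    "metabolite_C": [0.50, 0.70, 0.60, 0.60],
}

# One CV per metabolite, then the two summary statistics.
cvs = {name: coefficient_of_variation(v) for name, v in qc_runs.items()}
mean_cv = mean(list(cvs.values()))
median_cv = median(list(cvs.values()))
```

Reporting the median alongside the mean is useful because a few poorly reproducible metabolites can pull the mean CV upward, which may be why the two figures in the abstract differ.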
customers, Fortune, 122, 38-48. Key words: Consumer preferences, customer expectations Abstract: Rice presents a profile of the 1990 U.S. consumers...business process, 16 competitive advantage, 6, 10 consumer, 5 consumer affairs department, 19 consumer preferences, 30 consumer research, 10, 24
Castro, Juan C; Maddox, J Dylan; Cobos, Marianela; Requena, David; Zimic, Mirko; Bombarely, Aureliano; Imán, Sixto A; Cerdeira, Luis A; Medina, Andersson E
Myrciaria dubia is an Amazonian fruit shrub that produces numerous bioactive phytochemicals, but is best known for its high L-ascorbic acid (AsA) content in fruits. Pronounced variation in AsA content has been observed both within and among individuals, but the genetic factors responsible for this variation are largely unknown. The goals of this research, therefore, were to assemble, characterize, and annotate the fruit transcriptome of M. dubia in order to reconstruct metabolic pathways and determine if multiple pathways contribute to AsA biosynthesis. In total 24,551,882 high-quality sequence reads were de novo assembled into 70,048 unigenes (mean length = 1150 bp, N50 = 1775 bp). Assembled sequences were annotated using BLASTX against public databases such as TAIR, GR-protein, FB, MGI, RGD, ZFIN, SGN, WB, TIGR_CMR, and JCVI-CMR with 75.2 % of unigenes having annotations. Of the three core GO annotation categories, biological processes comprised 53.6 % of the total assigned annotations, whereas cellular components and molecular functions comprised 23.3 and 23.1 %, respectively. Based on the KEGG pathway assignment of the functionally annotated transcripts, five metabolic pathways for AsA biosynthesis were identified: animal-like pathway, myo-inositol pathway, L-gulose pathway, D-mannose/L-galactose pathway, and uronic acid pathway. All transcripts coding enzymes involved in the ascorbate-glutathione cycle were also identified. Finally, we used the assembly to identify 6314 genic microsatellites and 23,481 high quality SNPs. This study describes the first next-generation sequencing effort and transcriptome annotation of a non-model Amazonian plant that is relevant for AsA production and other bioactive phytochemicals. Genes encoding key enzymes were successfully identified and metabolic pathways involved in biosynthesis of AsA, anthocyanins, and other metabolic pathways have been reconstructed. The identification of these genes and pathways is in agreement with
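The assembly statistics in this abstract (mean length and N50) can be illustrated with a short sketch. It assumes the common definition of N50 as the length L such that sequences of length L or longer account for at least half of all assembled bases; the abstract does not say which tool or exact convention was used, and the lengths below are invented.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths: the length L such that
    sequences of length >= L together cover at least half of the total bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Accumulate from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical unigene lengths in bp (illustrative only).
unigene_lengths = [200, 300, 400, 500, 600, 1000]
mean_length = sum(unigene_lengths) / len(unigene_lengths)
assembly_n50 = n50(unigene_lengths)
```

N50 is typically larger than the mean length, as in the figures reported above, because it weights each sequence by the number of bases it contributes.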
Adkins, Daniel E.; McClay, Joseph L.; Vunck, Sarah A.; Batman, Angela M.; Vann, Robert E.; Clark, Shaunna L.; Souza, Renan P.; Crowley, James J.; Sullivan, Patrick F.; van den Oord, Edwin J.C.G.; Beardsley, Patrick M.
Behavioral sensitization has been widely studied in animal models and is theorized to reflect neural modifications associated with human psychostimulant addiction. While the mesolimbic dopaminergic pathway is known to play a role, the neurochemical mechanisms underlying behavioral sensitization remain incompletely understood. In the present study, we conducted the first metabolomics analysis to globally characterize neurochemical differences associated with behavioral sensitization. Methamphetamine-induced sensitization measures were generated by statistically modeling longitudinal activity data for eight inbred strains of mice. Subsequent to behavioral testing, nontargeted liquid and gas chromatography-mass spectrometry profiling was performed on 48 brain samples, yielding 301 metabolite levels per sample after quality control. Association testing between metabolite levels and three primary dimensions of behavioral sensitization (total distance, stereotypy and margin time) showed four robust, significant associations at a stringent metabolome-wide significance threshold (false discovery rate < 0.05). Results implicated homocarnosine, a dipeptide of GABA and histidine, in total distance sensitization, GABA metabolite 4-guanidinobutanoate and pantothenate in stereotypy sensitization, and myo-inositol in margin time sensitization. Secondary analyses indicated that these associations were independent of concurrent methamphetamine levels and, with the exception of the myo-inositol association, suggest a mechanism whereby strain-based genetic variation produces specific baseline neurochemical differences that substantially influence the magnitude of MA-induced sensitization. These findings demonstrate the utility of mouse metabolomics for identifying novel biomarkers, and developing more comprehensive neurochemical models, of psychostimulant sensitization. PMID:24034544
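The metabolome-wide threshold mentioned here (false discovery rate < 0.05) is applied across all metabolite tests at once. The abstract does not name the procedure used; the sketch below assumes the Benjamini-Hochberg step-up adjustment, one widely used way to control FDR, and runs it on made-up p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five metabolite-trait association tests.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
# Tests whose adjusted value falls below the 0.05 FDR threshold.
significant = [i for i, qi in enumerate(q) if qi < 0.05]
```

In this toy example the third and fourth tests have raw p-values under 0.05 but do not survive the adjustment, which shows why a metabolome-wide threshold is more stringent than per-test significance.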
Waaijenborg, Sandra; Korobko, Oksana; Willems van Dijk, Ko; Lips, Mirjam; Hankemeier, Thomas; Wilderjans, Tom F.; Smilde, Age K.
Combining different metabolomics platforms can contribute significantly to the discovery of complementary processes expressed under different conditions. However, analysing the fused data might be hampered by the difference in their quality. In metabolomics data, one often observes that measurement errors increase with increasing measurement level and that different platforms have different measurement error variance. In this paper we compare three different approaches to correct for the measurement error heterogeneity: transformation of the raw data, weighted filtering before modelling, and a modelling approach using a weighted sum of residuals. For an illustration of these different approaches we analyse data from healthy obese and diabetic obese individuals, obtained from two metabolomics platforms. In conclusion, the filtering and modelling approaches that both estimate a model of the measurement error did not outperform the data transformation approaches for this application. This is probably due to the limited difference in measurement error and the fact that estimation of measurement error models is unstable due to the small number of repeats available. A transformation of the data improves the classification of the two groups. PMID:29698490
Young, Jasmine Y.; Feng, Zukang; Dimitropoulos, Dimitris; Sala, Raul; Westbrook, John; Zhuravleva, Marina; Shao, Chenghua; Quesada, Martha; Peisach, Ezra; Berman, Helen M.
Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality. Database URL: http://wwpdb.org PMID:24291661
Lee, Sang-Hyun; Kim, Sooah; Kwon, Min-A; Jung, Young Hoon; Shin, Yong-An; Kim, Kyoung Heon
Well-established metabolome sample preparation is a prerequisite for reliable metabolomic data. For metabolome sampling of a Gram-positive strict anaerobe, Clostridium acetobutylicum, fast filtration and metabolite extraction with acetonitrile/methanol/water (2:2:1, v/v/v) at -20°C under anaerobic conditions have been commonly used. This anaerobic metabolite processing method is laborious and time-consuming since it is conducted in an anaerobic chamber. Also, there has not been any systematic evaluation and development of metabolome sample preparation methods for strict anaerobes and Gram-positive bacteria. In this study, metabolome sampling and extraction methods were rigorously evaluated and optimized for C. acetobutylicum by using gas chromatography/time-of-flight mass spectrometry-based metabolomics, in which a total of 116 metabolites were identified. When comparing the atmospheric (i.e., in air) and anaerobic (i.e., in an anaerobic chamber) processing of metabolome sample preparation, there was no significant difference in the quality and quantity of the metabolomic data. For metabolite extraction, pure methanol at -20°C was a better solvent than acetonitrile/methanol/water (2:2:1, v/v/v) at -20°C that is frequently used for C. acetobutylicum, and metabolite profiles were significantly different depending on extraction solvents. This is the first evaluation of metabolite sample preparation under aerobic processing conditions for an anaerobe. This method could be applied conveniently, efficiently, and reliably to metabolome analysis for strict anaerobes in air. © 2014 Wiley Periodicals, Inc.
Designed for students and practitioners of public relations (PR), this annotated bibliography focuses on recent journal articles and ERIC documents. The 34 citations include the following: (1) surveys of public relations professionals on career-related education; (2) literature reviews of research on measurement and evaluation of PR and…
McDermott, Steven T.
Designed to reflect the diversity of approaches to persuasion, this annotated bibliography cites materials selected for their contribution to that diversity as well as for being relatively current and/or especially significant representatives of particular approaches. The bibliography starts with a list of 17 general textbooks on approaches to…
Welfare Pharmacy contains medical formulas documented by the government and official prescriptions used by the official pharmacy in the pharmaceutical process. In the last years of the Southern Song Dynasty, anonymous authors added many prescription annotations, conducted textual research on the names, sources, composition and origins of the prescriptions, and supplemented them with important historical data from medical cases and researched historical facts. The annotations of Welfare Pharmacy gathered the essence of medical theory and can serve as precious materials for correctly understanding the syndrome differentiation, compatibility regularity and clinical application of prescriptions. This article investigates in depth the style and form of the prescription annotations in Welfare Pharmacy; the names of prescriptions and the evolution of terminology; the major functions of the prescriptions; processing methods; instructions for taking medicine and taboos of prescriptions; the medical cases and clinical efficacy of prescriptions; and the backgrounds, sources, composition and cultural meanings of prescriptions. It proposes that the prescription annotations played an active role in the textual dissemination, patent medicine production and clinical diagnosis and treatment associated with Welfare Pharmacy. This not only helps in understanding the changes in the names and terms of traditional Chinese medicines in Welfare Pharmacy, but also provides a basis for understanding the knowledge sources, compatibility regularity, important drug innovations and clinical medications of prescriptions in Welfare Pharmacy. Copyright© by the Chinese Pharmaceutical Association.
We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,
Covington, William G., Jr.
This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)
Seyler, L. M.; Rempfert, K. R.; Kraus, E. A.; Spear, J. R.; Templeton, A. S.; Schrenk, M. O.
Environmental metabolomics is an emerging approach used to study ecosystem properties. Through bioinformatic comparisons to metagenomic data sets, metabolomics can be used to study microbial adaptations and responses to varying environmental conditions. Since the techniques are highly parallel to organic geochemistry approaches, metabolomics can also provide insight into biogeochemical processes. These analyses are a reflection of metabolic potential and intersection with other organisms and environmental components. Here, we used an untargeted metabolomics approach to characterize dissolved organic carbon and aqueous metabolites from groundwater obtained from an actively serpentinizing habitat. Serpentinites are known to support microbial communities that feed off of the products of serpentinization (such as methane and H2 gas), while being adapted to harsh environmental conditions such as high pH and low DIC availability. However, the biochemistry of microbial populations that inhabit these environments is understudied and is complicated by overlapping biotic and abiotic processes. The aim of this study was to identify potential sources of carbon in an environment that is depleted of soluble inorganic carbon, and to characterize the flow of metabolites and describe overlapping biogenic and abiogenic processes impacting carbon cycling in serpentinizing rocks. We applied untargeted metabolomics techniques to groundwater taken from a series of wells drilled into the Semail Ophiolite in Oman. Samples were analyzed via quadrupole time-of-flight liquid chromatography tandem mass spectrometry (QToF-LC/MS/MS). Metabolomes and metagenomic data were imported into Progenesis QI software for statistical analysis and correlation, and metabolic networks were constructed using the Genome-Linked Application for Metabolic Maps (GLAMM), a web interface tool. Further multivariate statistical analyses and quality control were performed using EZinfo. Pools of dissolved organic carbon could
Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.
Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search
Background Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model. Results Here, probabilistic principal component analysis (PPCA), which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data. Conclusions The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight into the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
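For readers unfamiliar with PPCA, the standard formulation can be written out as below. This is a sketch of the textbook model, not of the PPCCA extension or the modified BIC described in the abstract, and the symbols are assumptions introduced here: $x_i \in \mathbb{R}^p$ is one sample, $q$ the number of components, $\lambda_1 \ge \dots \ge \lambda_p$ the eigenvalues of the sample covariance matrix, $U_q$ its leading $q$ eigenvectors, $\Lambda_q$ the diagonal matrix of the leading $q$ eigenvalues, and $R$ an arbitrary $q \times q$ orthogonal matrix.

```latex
\begin{aligned}
x_i &= W z_i + \mu + \varepsilon_i, \qquad
  z_i \sim \mathcal{N}(0, I_q), \quad
  \varepsilon_i \sim \mathcal{N}(0, \sigma^2 I_p), \\
\hat{\sigma}^2 &= \frac{1}{p - q} \sum_{j = q + 1}^{p} \lambda_j, \\
\hat{W} &= U_q \left( \Lambda_q - \hat{\sigma}^2 I_q \right)^{1/2} R .
\end{aligned}
```

Because the model defines a likelihood, likelihood-based tools such as information criteria and mixture models become available, which is the limitation of ordinary PCA that the abstract points to.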
Johnson, Caroline H.; Ivanisevic, Julijana; Siuzdak, Gary
Metabolomics, which is the profiling of metabolites in biofluids, cells and tissues, is routinely applied as a tool for biomarker discovery. Owing to innovative developments in informatics and analytical technologies, and the integration of orthogonal biological approaches, it is now possible to expand metabolomic analyses to understand the systems-level effects of metabolites. Moreover, because of the inherent sensitivity of metabolomics, subtle alterations in biological pathways can be detected to provide insight into the mechanisms that underlie various physiological conditions and aberrant processes, including diseases. PMID:26979502
Gooding, Jessica R; Jensen, Mette V; Newgard, Christopher B
Metabolomics, the characterization of the set of small molecules in a biological system, is advancing research in multiple areas of islet biology. Measuring a breadth of metabolites simultaneously provides a broad perspective on metabolic changes as the islets respond dynamically to metabolic fuels, hormones, or environmental stressors. As a result, metabolomics has the potential to provide new mechanistic insights into islet physiology and pathophysiology. Here we summarize advances in our understanding of islet physiology and the etiologies of type-1 and type-2 diabetes gained from metabolomics studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Fürtauer, Lisa; Weiszmann, Jakob; Weckwerth, Wolfram; Nägele, Thomas
The experimental analysis of a plant metabolome typically results in a comprehensive and multidimensional data set. To interpret metabolomics data in the context of biochemical regulation and environmental fluctuation, various approaches of mathematical modeling have been developed and have proven useful. In this chapter, a general introduction to mathematical modeling is presented and discussed in context of plant metabolism. A particular focus is laid on the suitability of mathematical approaches to functionally integrate plant metabolomics data in a metabolic network and combine it with other biochemical or physiological parameters.
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were
Amberg, Alexander; Barrett, Dave; Beale, Michael H.; Beger, Richard; Daykin, Clare A.; Fan, Teresa W.-M.; Fiehn, Oliver; Goodacre, Royston; Griffin, Julian L.; Hankemeier, Thomas; Hardy, Nigel; Harnly, James; Higashi, Richard; Kopka, Joachim; Lane, Andrew N.; Lindon, John C.; Marriott, Philip; Nicholls, Andrew W.; Reily, Michael D.; Thaden, John J.; Viant, Mark R.
There is a general consensus that supports the need for standardized reporting of metadata or information describing large-scale metabolomics and other functional genomics data sets. Reporting of standard metadata provides a biological and empirical context for the data, facilitates experimental replication, and enables the re-interrogation and comparison of data by others. Accordingly, the Metabolomics Standards Initiative is building a general consensus concerning the minimum reporting standards for metabolomics experiments, and the Chemical Analysis Working Group (CAWG) is a member of this community effort. This article proposes the minimum reporting standards related to the chemical analysis aspects of metabolomics experiments, including sample preparation, experimental analysis, quality control, metabolite identification, and data pre-processing. These minimum standards currently focus mostly upon mass spectrometry and nuclear magnetic resonance spectroscopy due to the popularity of these techniques in metabolomics. However, additional input concerning other techniques is welcomed and can be provided via the CAWG on-line discussion forum at http://msi-workgroups.sourceforge.net/ or http://Msifirstname.lastname@example.org. Further, community input related to this document can also be provided via this electronic forum. PMID:24039616
Brooke N. Dulka
Acute social defeat represents a naturalistic form of conditioned fear and is an excellent model in which to investigate the biological basis of stress resilience. While there is growing interest in identifying biomarkers of stress resilience, until recently, it has not been feasible to associate levels of large numbers of neurochemicals and metabolites to stress-related phenotypes. The objective of the present study was to use an untargeted metabolomics approach to identify known and unknown neurochemicals in select brain regions that distinguish susceptible and resistant individuals in two rodent models of acute social defeat. In the first experiment, male mice were first phenotyped as resistant or susceptible. Then, mice were subjected to acute social defeat, and tissues were immediately collected from the ventromedial prefrontal cortex (vmPFC), basolateral/central amygdala (BLA/CeA), nucleus accumbens (NAc), and dorsal hippocampus (dHPC). Ultra-high performance liquid chromatography coupled with high resolution mass spectrometry (UPLC-HRMS) was used for the detection of water-soluble neurochemicals. In the second experiment, male Syrian hamsters were paired in daily agonistic encounters for 2 weeks, during which they formed stable dominant-subordinate relationships. Then, 24 h after the last dominance encounter, animals were exposed to acute social defeat stress. Immediately after social defeat, tissue was collected from the vmPFC, BLA/CeA, NAc, and dHPC for analysis using UPLC-HRMS. Although no single biomarker characterized stress-related phenotypes in both species, commonalities were found. For instance, in both model systems, animals resistant to social defeat stress also show increased concentration of molecules to protect against oxidative stress in the NAc and vmPFC. Additionally, in both mice and hamsters, unidentified spectral features were preliminarily annotated as potential targets for future experiments. Overall, these findings
David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.
The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0 framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively such as implicit multithreading, and auto-documenting capabilities while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to framework by the use of specific APIs and/or data types they can more easily be reused both within the framework as well as outside of it. To study the effectiveness of an annotation based framework approach with other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. In a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the
McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo
Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.
Victoria Dominguez Del Angel
As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).
Manach, Claudine; Brennan, Lorraine; Dragsted, Lars Ove
Improving dietary assessment is essential for modern nutritional epidemiology. This chapter discusses the potential of metabolomics for the identification of new biomarkers of intake and presents the first candidate biomarkers discovered using this approach. It then describes the challenges...
Metabolomics is the study of small molecules of both endogenous and exogenous origin, such as metabolic substrates and their products, lipids, small peptides, vitamins and other protein cofactors generated by metabolism, which are downstream from genes.
Issaq, Haleem J
Proteomic and Metabolomic Approaches to Biomarker Discovery demonstrates how to leverage biomarkers to improve accuracy and reduce errors in research. Disease biomarker discovery is one of the most vibrant and important areas of research today, as the identification of reliable biomarkers has an enormous impact on disease diagnosis, selection of treatment regimens, and therapeutic monitoring. Various techniques are used in the biomarker discovery process, including techniques used in proteomics, the study of the proteins that make up an organism, and metabolomics, the study of chemical fingerprints created from cellular processes. Proteomic and Metabolomic Approaches to Biomarker Discovery is the only publication that covers techniques from both proteomics and metabolomics and includes all steps involved in biomarker discovery, from study design to study execution. The book describes methods, and presents a standard operating procedure for sample selection, preparation, and storage, as well as data analysis...
Metabolomics in maternal-fetal medicine is still an “embryonic” science. However, there is already an increasing interest in the metabolome of normal and complicated pregnancies, and in neonatal outcomes. Tissues used for metabolomics interrogations of pregnant women, fetuses and newborns are amniotic fluid, blood, plasma, cord blood, placenta, urine, and vaginal secretions. All published papers highlight the strong correlation between biomarkers found in these tissues and fetal malformations, preterm delivery, premature rupture of membranes, gestational diabetes mellitus, preeclampsia, neonatal asphyxia, and hypoxic-ischemic encephalopathy. The aim of this review is to summarize and comment on original data available in relevant published works in order to emphasize the clinical potential of metabolomics in obstetrics in the immediate future.
Nicotinamide phosphoribosyltransferase (NAMPT) plays an important role in cellular bioenergetics. It is responsible for converting nicotinamide to nicotinamide adenine dinucleotide, an essential molecule in cellular metabolism. NAMPT has been extensively studied over the past decade due to its role as a key regulator of nicotinamide adenine dinucleotide-consuming enzymes. NAMPT is also known as a potential target for therapeutic intervention due to its involvement in disease. In the current study, we used a global mass spectrometry-based metabolomic approach to investigate the effects of FK866, a small molecule inhibitor of NAMPT currently in clinical trials, on metabolic perturbations in human cancer cells. We treated A2780 (ovarian cancer) and HCT-116 (colorectal cancer) cell lines with FK866 in the presence and absence of nicotinic acid. Significant changes were observed in amino acid metabolism and in purine and pyrimidine metabolism. We also observed metabolic alterations in glycolysis, the citric acid cycle (TCA), and the pentose phosphate pathway. To expand the range of the detected polar metabolites and improve data confidence, we applied a global metabolomics profiling platform by using both non-targeted and targeted hydrophilic (HILIC)-LC-MS and GC-MS analysis. We used Ingenuity Knowledge Base to facilitate the projection of metabolomics data onto metabolic pathways. Several metabolic pathways showed differential responses to FK866 based on several matches to the list of annotated metabolites. This study suggests that global metabolomics can be a useful tool in pharmacological studies of the mechanism of action of drugs at a cellular level.
Daria A Kokova
Opisthorchiasis is a parasitic infection caused by the liver flukes of the Opisthorchiidae family. Both experimental and epidemiological data strongly support a role of these parasites in the etiology of hepatobiliary pathologies and an increased risk of intrahepatic cholangiocarcinoma. Understanding the functional link between the infection and hepatobiliary pathologies requires a detailed description of the host-parasite interaction on different levels of biological regulation, including the metabolic response to the infection. The latter, however, remains practically undocumented. Here we describe the host response to Opisthorchiidae infection using a metabolomics approach and present the first exploratory metabolomics study of an experimental model of O. felineus infection. We conducted a Nuclear Magnetic Resonance (NMR)-based longitudinal metabolomics study involving a cohort of 30 animals with two degrees of infection and a control group. An exploratory analysis shows that the most noticeable trend (30% of total variance) in the data was related to the gender differences. Therefore, further analysis was done on each gender group separately, applying a multivariate extension of ANOVA, namely ASCA (ANOVA simultaneous component analysis). We show that in the males the infection-specific time trends are present in the main component (43.5% variance), while in the females they are present only in the second component and cover 24% of the variance. We have selected and annotated 24 metabolites associated with the observed effects and provided a physiological interpretation of the findings. The first exploratory metabolomics study of an experimental model of O. felineus infection is presented. Our data show that at an early stage of infection the response of the organism unfolds in a gender-specific manner. Also, the main physiological mechanisms affected appear rather nonspecific (a status of metabolic stress). The data provide a set of hypotheses for a search
In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our
Wu, Yiman; Li, Liang
To reveal metabolomic changes caused by a biological event in quantitative metabolomics, it is critical to use an analytical tool that can perform accurate and precise quantification to examine the true concentration differences of individual metabolites found in different samples. A number of steps are involved in metabolomic analysis including pre-analytical work (e.g., sample collection and storage), analytical work (e.g., sample analysis) and data analysis (e.g., feature extraction and quantification). Each one of them can influence the quantitative results significantly and thus should be performed with great care. Among them, the total sample amount or concentration of metabolites can be significantly different from one sample to another. Thus, it is critical to reduce or eliminate the effect of total sample amount variation on quantification of individual metabolites. In this review, we describe the importance of sample normalization in the analytical workflow with a focus on mass spectrometry (MS)-based platforms, discuss a number of methods recently reported in the literature and comment on their applicability in real world metabolomics applications. Sample normalization has been sometimes ignored in metabolomics, partially due to the lack of a convenient means of performing sample normalization. We show that several methods are now available and sample normalization should be performed in quantitative metabolomics where the analyzed samples have significant variations in total sample amounts. Copyright © 2015 Elsevier B.V. All rights reserved.
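As a hedged illustration of what sample normalization means in practice, the sketch below applies total-sum scaling, one simple post-acquisition scheme. It is not claimed to be among the specific methods the review compares; the function name `total_sum_normalize` and the toy intensities are hypothetical.

```python
def total_sum_normalize(samples, target=1.0):
    """Scale each sample so that its metabolite intensities sum to `target`.

    `samples` is a list of per-sample intensity lists (hypothetical layout).
    """
    normalized = []
    for intensities in samples:
        total = sum(intensities)
        if total <= 0:
            raise ValueError("each sample needs a positive total intensity")
        normalized.append([target * x / total for x in intensities])
    return normalized

# Two samples with identical composition, one loaded at ten times the amount.
samples = [[10.0, 30.0, 60.0], [1.0, 3.0, 6.0]]
normalized = total_sum_normalize(samples)
```

After scaling, the two samples have the same profile, so the tenfold difference in total amount no longer masquerades as a concentration difference in each individual metabolite. Total-sum scaling assumes that most of the signal is unchanged between samples, which will not hold for every study design.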
Stubbs, Amber; Uzuner, Özlem
The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 296 patients for risk factors and the times they were present. We designed the annotation task for this track with the goal of balancing annotation load and time with quality, so as to generate a gold standard corpus that can benefit a clinically-relevant task. We applied light annotation procedures and determined the gold standard using majority voting. On average, the agreement of annotators with the gold standard was above 0.95, indicating high reliability. The resulting document-level annotations generated for each record in each longitudinal EMR in this corpus provide information that can support studies of progression of heart disease risk factors in the included patients over time. These annotations were used in the Risk Factor track of the 2014 i2b2/UTHealth shared task. Participating systems achieved a mean micro-averaged F1 measure of 0.815 and a maximum F1 measure of 0.928 for identifying these risk factors in patient records. Copyright © 2015 Elsevier Inc. All rights reserved.
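The two quantitative ideas in this abstract, a majority-vote gold standard and a micro-averaged F1 measure, can be sketched as follows. This is a minimal illustration and not the shared task's evaluation script; the function names, the strict-majority rule for ties, and the toy labels are assumptions.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label chosen by a strict majority of annotators, else None."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes > len(labels) / 2 else None

def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-document label sets.

    True positives, false positives and false negatives are pooled across
    all documents before precision and recall are computed.
    """
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)
        fp += len(predicted_labels - gold_labels)
        fn += len(gold_labels - predicted_labels)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

consensus = majority_vote(["present", "present", "absent"])
gold = [{"diabetes", "hypertension"}, {"smoker"}]
predicted = [{"diabetes"}, {"smoker", "obesity"}]
score = micro_f1(gold, predicted)
```

In the toy data there are two true positives, one false positive and one false negative, so precision and recall are both 2/3 and the micro-averaged F1 is 2/3.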
Berardini, Tanya Z; Li, Donghui; Muller, Robert; Chetty, Raymond; Ploetz, Larry; Singh, Shanker; Wensel, April; Huala, Eva
As the scientific literature grows, leading to an increasing volume of published experimental data, so does the need to access and analyze this data using computational tools. The most commonly used method to convert published experimental data on gene function into controlled vocabulary annotations relies on a professional curator, employed by a model organism database or a more general resource such as UniProt, to read published articles and compose annotation statements based on the articles' contents. A more cost-effective and scalable approach capable of capturing gene function data across the whole range of biological research organisms in computable form is urgently needed. We have analyzed a set of ontology annotations generated through collaborations between the Arabidopsis Information Resource and several plant science journals. Analysis of the submissions entered using the online submission tool shows that most community annotations were well supported and the ontology terms chosen were at an appropriate level of specificity. Of the 503 individual annotations that were submitted, 97% were approved and community submissions captured 72% of all possible annotations. This new method for capturing experimental results in a computable form provides a cost-effective way to greatly increase the available body of annotations without sacrificing annotation quality. Database URL: www.arabidopsis.org.
Jovanović, Jelena; Bagheri, Ebrahim
The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
Background: Metabolomics is a promising tool of cardiovascular biomarker discovery. We systematically reviewed the literature on comprehensive metabolomic profiling in association with incident cardiovascular disease (CVD). Methods and Results: We searched MEDLINE and EMBASE from inception to Janua...
Hall, R.D.; Beale, M.; Fiehn, O.; Hardy, N.; Summer, L.; Bino, R.
After the establishment of technologies for high-throughput DNA sequencing (genomics), gene expression analysis (transcriptomics), and protein analysis (proteomics), the remaining functional genomics challenge is that of metabolomics. Metabolomics is the term coined for essentially comprehensive,
Hooft, van der J.J.J.; Vervoort, J.J.M.; Bino, R.J.; Vos, de C.H.
The identification of large series of metabolites detectable by mass spectrometry (MS) in crude extracts is a challenging task. In order to test and apply the so-called multistage mass spectrometry (MSn) spectral tree approach as a tool in metabolite identification in complex sample extracts, we
fusion where data from different platforms can be combined. Complex data are obtained when samples are analysed using NMR, fluorescence and GC-MS. Chemometrics methods which can be used to extract the relevant information from the obtained data are presented. Focus has been on principal component...... based on NMR data with RRV and known risk markers. The sensitivity and specificity values are 0.80 and 0.79, respectively, for a test set validated model. The second case study is based on plasma samples with verified colorectal cancer and three types of control samples analysed by fluorescence...
Jacob, Minnie; Malkawi, Abeer; Albast, Nour; Al Bougha, Salam; Lopata, Andreas; Dasouki, Majed; Abdel Rahman, Anas M
Metabolome, the ultimate functional product of the genome, can be studied through identification and quantification of small molecules. The global metabolome influences the individual phenotype through clinical and environmental interventions. Metabolomics has become an integral part of clinical research and allowed for another dimension of better understanding of disease pathophysiology and mechanism. More than 95% of the clinical biochemistry laboratory routine workload is based on small molecular identification, which can potentially be analyzed through metabolomics. However, multiple challenges in clinical metabolomics impact the entire workflow and data quality, thus the biological interpretation needs to be standardized for a reproducible outcome. Herein, we introduce the establishment of a comprehensive targeted metabolomics method for a panel of 220 clinically relevant metabolites using Liquid chromatography-tandem mass spectrometry (LC-MS/MS) standardized for clinical research. The sensitivity, reproducibility and molecular stability of each targeted metabolite (amino acids, organic acids, acylcarnitines, sugars, bile acids, neurotransmitters, polyamines, and hormones) were assessed under multiple experimental conditions. The metabolic tissue distribution was determined in various rat organs. Furthermore, the method was validated in dry blood spot (DBS) samples collected from patients known to have various inborn errors of metabolism (IEMs). Using this approach, our panel appears to be sensitive and robust as it demonstrated differential and unique metabolic profiles in various rat tissues. Also, as a prospective screening method, this panel of diverse metabolites has the ability to identify patients with a wide range of IEMs who otherwise may need multiple, time-consuming and expensive biochemical assays causing a delay in clinical management. Copyright © 2018 Elsevier B.V. All rights reserved.
Yuliya V Karpievitch
Liquid chromatography mass spectrometry has become one of the analytical platforms of choice for metabolomics studies. However, LC-MS metabolomics data can suffer from the effects of various systematic biases. These include batch effects, day-to-day variations in instrument performance, signal intensity loss due to time-dependent effects of the LC column performance, accumulation of contaminants in the MS ion source and MS sensitivity among others. In this study we aimed to test a singular value decomposition-based method, called EigenMS, for normalization of metabolomics data. We analyzed a clinical human dataset where LC-MS serum metabolomics data and physiological measurements were collected from thirty-nine healthy subjects and forty with type 2 diabetes and applied EigenMS to detect and correct for any systematic bias. EigenMS works in several stages. First, EigenMS preserves the treatment group differences in the metabolomics data by estimating treatment effects with an ANOVA model (multiple fixed effects can be estimated). Singular value decomposition of the residuals matrix is then used to determine bias trends in the data. The number of bias trends is then estimated via a permutation test and the effects of the bias trends are eliminated. EigenMS removed bias of unknown complexity from the LC-MS metabolomics data, allowing for increased sensitivity in differential analysis. Moreover, normalized samples better correlated with both other normalized samples and corresponding physiological data, such as blood glucose level, glycated haemoglobin, exercise central augmentation pressure normalized to heart rate of 75, and total cholesterol. We were able to report 2578 discriminatory metabolite peaks in the normalized data (p<0.05) as compared to only 1840 metabolite signals in the raw data. Our results support the use of singular value decomposition-based normalization for metabolomics data.
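The stages described in this abstract (fit group effects, decompose the residuals, remove the bias trend) can be illustrated with a minimal sketch. It is not the EigenMS implementation: it assumes a single treatment factor, removes exactly one bias trend instead of estimating the number of trends by a permutation test, and finds the leading singular vector by power iteration so that only the Python standard library is needed. All function names are hypothetical.

```python
import math

def group_mean_residuals(data, groups):
    """Subtract each row's per-group mean (a one-way ANOVA fit) from its values."""
    residuals = []
    for row in data:
        sums, counts = {}, {}
        for value, g in zip(row, groups):
            sums[g] = sums.get(g, 0.0) + value
            counts[g] = counts.get(g, 0) + 1
        residuals.append([value - sums[g] / counts[g] for value, g in zip(row, groups)])
    return residuals

def leading_right_singular_vector(matrix, iterations=200):
    """Power iteration on M^T M; returns a unit vector over the columns (samples)."""
    n = len(matrix[0])
    # Non-uniform start: a constant vector is orthogonal to rows of group-centred residuals.
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in matrix]      # M v
        w = [sum(s * row[j] for s, row in zip(scores, matrix)) for j in range(n)]  # M^T (M v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # residuals carry no trend along v; keep the current vector
            break
        v = [x / norm for x in w]
    return v

def remove_one_bias_trend(data, groups):
    """Remove the leading residual trend from every metabolite row, keeping group means."""
    residuals = group_mean_residuals(data, groups)
    trend = leading_right_singular_vector(residuals)
    cleaned = []
    for row, res in zip(data, residuals):
        loading = sum(a * b for a, b in zip(res, trend))  # projection of the residual on the trend
        cleaned.append([x - loading * t for x, t in zip(row, trend)])
    return cleaned

# Rows are metabolite peaks, columns are samples; an alternating run-order bias is added.
groups = ["control", "control", "case", "case"]
data = [[12.0, 8.0, 22.0, 18.0],
        [4.0, 6.0, 4.0, 6.0],
        [1.5, 0.5, 3.5, 2.5]]
for row in remove_one_bias_trend(data, groups):
    print([round(x, 6) for x in row])
```

Because the trend is estimated from residuals only, the group differences themselves are left in the data, which is the property the abstract attributes to the first stage.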
López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F
Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Zhen, Shoumin; Dong, Kun; Deng, Xiong; Zhou, Jiaxing; Xu, Xuexin; Han, Caixia; Zhang, Wenying; Xu, Yanhao; Wang, Zhimin; Yan, Yueming
Metabolites in wheat grains greatly influence nutritional values. Wheat provides proteins, minerals, B-group vitamins and dietary fiber to humans. These metabolites are important to human health. However, the metabolome of the grain during the development of bread wheat has not been studied so far. In this work the first dynamic metabolome of the developing grain of the elite Chinese bread wheat cultivar Zhongmai 175 was analyzed, using non-targeted gas chromatography/mass spectrometry (GC/MS) for metabolite profiling. In total, 74 metabolites were identified over the grain developmental stages. Metabolite-metabolite correlation analysis revealed that the metabolism of amino acids, carbohydrates, organic acids, amines and lipids was interrelated. An integrated metabolic map revealed a distinct regulatory profile. The results provide information that can be used by metabolic engineers and molecular breeders to improve wheat grain quality. The present metabolome approach identified dynamic changes in metabolite levels, and correlations among such levels, in developing seeds. The comprehensive metabolic map may be useful when breeding programs seek to improve grain quality. The work highlights the utility of GC/MS-based metabolomics, in conjunction with univariate and multivariate data analysis, when it is sought to understand metabolic changes in developing seeds. © 2015 Society of Chemical Industry.
Lee, Jang-Eun; Lee, Bum-Jin; Chung, Jin-Oh; Kim, Hak-Nam; Kim, Eun-Hee; Jung, Sungheuk; Lee, Hyosang; Lee, Sang-Jun; Hong, Young-Shick
Numerous factors such as geographical origin, cultivar, climate, cultural practices, and manufacturing processes influence the chemical compositions of tea, in the same way as growing conditions and grape variety affect wine quality. However, the relationships between these factors and tea chemical compositions are not well understood. In this study, a new approach for non-targeted or global analysis, i.e., metabolomics, which is highly reproducible and statistically effective in analysing a diverse range of compounds, was used to better understand the metabolome of Camellia sinensis and determine the influence of environmental factors, including geography, climate, and cultural practices, on tea-making. We found a strong correlation between environmental factors and the metabolome of green, white, and oolong teas from China, Japan, and South Korea. In particular, multivariate statistical analysis revealed strong inter-country and inter-city relationships in the levels of theanine and catechin derivatives found in green and white teas. This information might be useful for assessing tea quality or producing distinct tea products across different locations, and highlights simultaneous identification of diverse tea metabolites through an NMR-based metabolomics approach. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pfaff, Claas-Thido; Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian
Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines. PMID:29023519
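The interplay of full text search and facet filters described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' framework: annotations are held as plain dictionaries rather than in the XML schema the abstract mentions, and the facet names (`habitat`, `year`), the sample records, and every function name are hypothetical.

```python
from collections import Counter

def matches(record, text=None, categorical=None, numeric=None):
    """Return True if a record passes the text query and every selected facet filter."""
    if text:
        haystack = (record.get("title", "") + " " + record.get("description", "")).lower()
        if text.lower() not in haystack:
            return False
    # Categorical facets: the record's value must be one of the selected values.
    for facet, allowed in (categorical or {}).items():
        if record.get(facet) not in allowed:
            return False
    # Numeric facets: the record's value must lie in the inclusive range.
    for facet, (low, high) in (numeric or {}).items():
        value = record.get(facet)
        if value is None or not (low <= value <= high):
            return False
    return True

def faceted_search(records, **filters):
    """Apply the text query and facet filters together."""
    return [r for r in records if matches(r, **filters)]

def facet_counts(records, facet):
    """Count how many of the given records fall under each value of a facet."""
    return Counter(r[facet] for r in records if facet in r)

# Hypothetical annotated datasets.
DATASETS = [
    {"title": "Tree biomass survey", "description": "Aboveground biomass per plot",
     "habitat": "forest", "year": 2012},
    {"title": "Grassland biomass", "description": "Clipped biomass samples",
     "habitat": "grassland", "year": 2013},
    {"title": "Forest soil carbon", "description": "Soil cores",
     "habitat": "forest", "year": 2011},
    {"title": "Old biomass inventory", "description": "Historic forest biomass records",
     "habitat": "forest", "year": 1998},
]

hits = faceted_search(DATASETS, text="biomass",
                      categorical={"habitat": {"forest"}},
                      numeric={"year": (2010, 2015)})
print([h["title"] for h in hits])
print(facet_counts(faceted_search(DATASETS, text="biomass"), "habitat"))
```

Counting facet values over the current result set, as `facet_counts` does, is what lets an interface show how many results each further filter would leave.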
Smilde, Age K.; van der Werf, Mariët J.; Bijlsma, Sabina; van der Werff-van der Vat, Bianca J. C.; Jellema, Renger H.
A general method is presented for combining mass spectrometry-based metabolomics data. Such data are becoming more and more abundant, and proper tools for fusing these types of data sets are needed. Fusion of metabolomics data leads to a comprehensive view on the metabolome of an organism or
Tracking occluded objects at different depths has become an extremely important component of study for any video sequence, with wide applications in object tracking, scene recognition, coding, video editing and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth, further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied on 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60 cm, 80 cm and 100 cm respectively. Another experiment is performed on tracking humans at similar depths to authenticate the results. The paper also computes the frame-by-frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and to correct that error, leading to flawless tracking. This can be of great interest to computer scientists designing surveillance systems.
Fazelzadeh, Parastoo; Hangelbroek, Roland W J; Tieland, Michael; de Groot, Lisette C P G M; Verdijk, Lex B; van Loon, Luc J C; Smilde, Age K; Alves, Rodrigo D A M; Vervoort, Jacques; Müller, Michael; van Duynhoven, John P M; Boekschoten, Mark V
Populations around the world are aging rapidly. Age-related loss of physiological functions negatively affects quality of life. A major contributor to the frailty syndrome of aging is loss of skeletal muscle. In this study we assessed the skeletal muscle biopsy metabolome of healthy young, healthy older and frail older subjects to determine the effect of age and frailty on the metabolic signature of skeletal muscle tissue. In addition, the effects of prolonged whole-body resistance-type exercise training on the muscle metabolome of older subjects were examined. The baseline metabolome was measured in muscle biopsies collected from 30 young, 66 healthy older subjects, and 43 frail older subjects. Follow-up samples from frail older (24 samples) and healthy older subjects (38 samples) were collected after 6 months of prolonged resistance-type exercise training. Young subjects were included as a reference group. Primary differences in skeletal muscle metabolite levels between young and healthy older subjects were related to mitochondrial function, muscle fiber type, and tissue turnover. Similar differences were observed when comparing frail older subjects with healthy older subjects at baseline. Prolonged resistance-type exercise training resulted in an adaptive response of amino acid metabolism, especially reflected in branched chain amino acids and genes related to tissue remodeling. The effect of exercise training on branched-chain amino acid-derived acylcarnitines in older subjects points to a downward shift in branched-chain amino acid catabolism upon training. We observed only modest correlations between muscle and plasma metabolite levels, which pleads against the use of plasma metabolites as a direct read-out of muscle metabolism and stresses the need for direct assessment of metabolites in muscle tissue biopsies.
Robert A. Dromms
The goals of metabolic engineering are well-served by the biological information provided by metabolomics: information on how the cell is currently using its biochemical resources is perhaps one of the best ways to inform strategies to engineer a cell to produce a target compound. Using the analysis of extracellular or intracellular levels of the target compound (or a few closely related molecules) to drive metabolic engineering is quite common. However, there is surprisingly little systematic use of metabolomics datasets, which simultaneously measure hundreds of metabolites rather than just a few, for that same purpose. Here, we review the most common systematic approaches to integrating metabolite data with metabolic engineering, with emphasis on existing efforts to use whole-metabolome datasets. We then review some of the most common approaches for computational modeling of cell-wide metabolism, including constraint-based models, and discuss current computational approaches that explicitly use metabolomics data. We conclude with discussion of the broader potential of computational approaches that systematically use metabolomics data to drive metabolic engineering.
Swain-Lenz, Devjanee; Nikolskiy, Igor; Cheng, Jiye; Sudarsanam, Priya; Nayler, Darcy; Staller, Max V; Cohen, Barak A
An ongoing challenge in biology is to predict the phenotypes of individuals from their genotypes. Genetic variants that cause disease often change an individual's total metabolite profile, or metabolome. In light of our extensive knowledge of metabolic pathways, genetic variants that alter the metabolome may help predict novel phenotypes. To link genetic variants to changes in the metabolome, we studied natural variation in the yeast Saccharomyces cerevisiae. We used an untargeted mass spectrometry method to identify dozens of metabolite Quantitative Trait Loci (mQTL), genomic regions containing genetic variation that control differences in metabolite levels between individuals. We mapped differences in urea cycle metabolites to genetic variation in specific genes known to regulate amino acid biosynthesis. Our functional assays reveal that genetic variation in two genes, AUA1 and ARG81, causes the differences in the abundance of several urea cycle metabolites. Based on knowledge of the urea cycle, we predicted and then validated a new phenotype: sensitivity to a particular class of amino acid isomers. Our results are a proof-of-concept that untargeted mass spectrometry can reveal links between natural genetic variants and metabolome diversity. The interpretability of our results demonstrates the promise of using genetic variants underlying natural differences in the metabolome to predict novel phenotypes from genotype. Copyright © 2017 by the Genetics Society of America.
Thyroid cancer is the most common endocrine malignancy with four major types distinguished on the basis of histopathological features: papillary, follicular, medullary, and anaplastic. Classification of thyroid cancer is the primary step in the assessment of prognosis and selection of the treatment. However, in some cases, cytological and histological patterns are inconclusive; hence, classification based on histopathology could be supported by molecular biomarkers, including markers identified with the use of high-throughput “omics” techniques. Besides genomics, transcriptomics, and proteomics, metabolomics emerges as the most downstream approach, reflecting phenotypic changes and alterations in pathophysiological states of biological systems. Metabolomics using mass spectrometry and magnetic resonance spectroscopy techniques allows qualitative and quantitative profiling of small molecules present in biological systems. This approach can be applied to reveal metabolic differences between different types of thyroid cancer and to identify new potential candidates for molecular biomarkers. In this review, we consider current results concerning application of metabolomics in the field of thyroid cancer research. Recent studies show that metabolomics can provide significant information about the discrimination between different types of thyroid lesions. In the near future, one could expect further progress in thyroid cancer metabolomics, leading to the development of molecular markers and improved classification and diagnosis of tumor types.
Amberg, Alexander; Riefke, Björn; Schlotterbeck, Götz; Ross, Alfred; Senn, Hans; Dieterle, Frank; Keck, Matthias
Metabolomics, often also referred to as "metabolic profiling," is the systematic profiling of metabolites in biofluids or tissues of organisms and their temporal changes. In the last decade, metabolomics has become more and more popular in drug development, molecular medicine, and other biotechnology fields, since it directly profiles the phenotype and changes thereof, in contrast to other "-omics" technologies. The increasing popularity of metabolomics has been possible only due to the enormous development in the technology and bioinformatics fields. In particular, the analytical technologies supporting metabolomics, i.e., NMR, UPLC-MS, and GC-MS, have evolved into sensitive and highly reproducible platforms allowing the determination of hundreds of metabolites in parallel. This chapter describes the best practices of metabolomics as seen today. All important steps of metabolic profiling in drug development and molecular medicine are described in great detail, starting from sample preparation to determining the measurement details of all analytical platforms, and finally to discussing the corresponding specific steps of data analysis.
Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.
There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework, as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosemantic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous resources to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have a web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use existing resources based on standard vocabularies that are encoded and published using semantic technologies.
Roullier-Gall, Chloé; Hemmler, Daniel; Gonsior, Michael; Li, Yan; Nikolantonaki, Maria; Aron, Alissa; Coelho, Christian; Gougeon, Régis D; Schmitt-Kopplin, Philippe
In a context of societal concern about food preservation, the reduction of sulfite input plays a major role in the wine industry. To improve the understanding of the chemistry involved in SO2 protection, a series of bottle-aged Chardonnay wines made from the same must, but with different concentrations of SO2 added at pressing, were analyzed by ultrahigh resolution mass spectrometry (FT-ICR-MS) and excitation emission matrix fluorescence (EEMF). Metabolic fingerprints from FT-ICR-MS data could discriminate wines according to the concentration added to the must, but they also revealed chemistry-related differences according to the type of stopper, providing a wine metabolomics picture of the impact of distinct stopping strategies. Spearman rank correlation was applied to link the statistically modeled EEMF components (parallel factor analysis (PARAFAC)) and the exact mass information from FT-ICR-MS, thus revealing the extent of sulfur-containing compounds which could show some correlation with fluorescence fingerprints. Copyright © 2017 Elsevier Ltd. All rights reserved.
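As an illustration of the statistic named in this abstract, Spearman rank correlation can be computed as the Pearson correlation of the ranked values, with tied values sharing their average rank. The sketch below uses only the standard library; the variable names and the numbers in the example (a fluorescence component score and a mass peak intensity across five wines) are made up for illustration and are not taken from the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for five wines.
component_score = [0.12, 0.35, 0.31, 0.80, 0.55]
peak_intensity = [1.1e5, 2.9e5, 3.4e5, 9.0e5, 4.2e5]
print(round(spearman(component_score, peak_intensity), 3))
```

Working on ranks makes the measure insensitive to the very different scales of the two data types, which is one reason it suits linking fluorescence components to mass signals.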
Mardanbeigi, Diako; Qvarfordt, Pernilla
To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation...
This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...
Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria
We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods...
This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…
Lin, Jian-Wei; Lai, Yuan-Cheng
This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…
Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana
Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.
Klassen, Aline; Faccio, Andréa Tedesco; Canuto, Gisele André Baptista; da Cruz, Pedro Luis Rocha; Ribeiro, Henrique Caracho; Tavares, Marina Franco Maggi; Sussulini, Alessandra
Nowadays, there is a growing interest in deeply understanding biological mechanisms not only at the molecular level (biological components) but also the effects of an ongoing biological process in the organism as a whole (biological functionality), as established by the concept of systems biology. Within this context, metabolomics is one of the most powerful bioanalytical strategies that allow obtaining a picture of the metabolites of an organism in the course of a biological process, being considered as a phenotyping tool. Briefly, metabolomics approach consists in identifying and determining the set of metabolites (or specific metabolites) in biological samples (tissues, cells, fluids, or organisms) under normal conditions in comparison with altered states promoted by disease, drug treatment, dietary intervention, or environmental modulation. The aim of this chapter is to review the fundamentals and definitions used in the metabolomics field, as well as to emphasize its importance in systems biology and clinical studies.
Inflammatory Bowel Disease (IBD) is a multifactorial disorder that conceptually occurs as a result of altered immune responses to commensal and/or pathogenic gut microbes in individuals most susceptible to the disease. During Crohn’s Disease (CD) or Ulcerative Colitis (UC), two components of the human IBD, distinct stages define the disease onset, severity, progression and remission. Epigenetic, environmental (microbiome, metabolome) and nutritional factors are important in IBD pathogenesis. While the dysbiotic microbiota has been proposed to play a role in disease pathogenesis, the data on IBD and diet are still less convincing. Nonetheless, studies are ongoing to examine the effect of pre/probiotics and/or FODMAP reduced diets on both the gut microbiome and its metabolome in an effort to define the healthy diet in patients with IBD. Knowledge of a unique metabolomic fingerprint in IBD could be useful for diagnosis, treatment and detection of disease pathogenesis.
Dairy products are an important component in the Western diet and represent a valuable source of nutrients for humans. However, a reliable dairy intake assessment in nutrition research is crucial to correctly elucidate the link between dairy intake and human health. Metabolomics is considered a potential tool for assessment of dietary intake instead of traditional methods, such as food frequency questionnaires, food records, and 24-h recalls. Metabolomics has been successfully applied to discriminate between consumption of different dairy products under different experimental conditions. Moreover, potential metabolites related to dairy intake were identified, although these metabolites need to be further validated in other intervention studies before they can be used as valid biomarkers of dairy consumption. Therefore, this review provides an overview of metabolomics for assessment of dairy intake in order to better clarify the role of dairy products in human nutrition and health.
Li, Shuzhao; Todor, Andrei; Luo, Ruiyan
Molecular analysis of blood samples is pivotal to clinical diagnosis and has been intensively investigated since the rise of systems biology. Recent developments have opened new opportunities to utilize transcriptomics and metabolomics for personalized and precision medicine. Efforts from human immunology have infused into this area exquisite characterizations of subpopulations of blood cells. It is now possible to infer from blood transcriptomics, with fine accuracy, the contribution of immune activation and of cell subpopulations. In parallel, high-resolution mass spectrometry has brought revolutionary analytical capability, detecting > 10,000 metabolites, together with environmental exposure, dietary intake, microbial activity, and pharmaceutical drugs. Thus, the re-examination of blood chemicals by metabolomics is in order. Transcriptomics and metabolomics can be integrated to provide a more comprehensive understanding of the human biological states. We will review these new data and methods and discuss how they can contribute to personalized medicine.
Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J
Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.
ADRIANO, C. M.
Digital annotation systems are usually based on partial scenarios and arbitrary requirements. Accidental and essential characteristics are usually mixed in non-explicit models. Documents and annotations are linked together accidentally according to the current technology, which allows for the development of disposable prototypes but does not support non-functional requirements such as extensibility, robustness and interactivity. In this paper we perform a careful analysis of the concept of annotation, studying the scenarios supported by digital annotation tools. We also derive essential requirements based on a classification of annotation systems applied to existing tools. The analysis performed and the proposed classification can be applied and extended to other types of collaborative systems.
Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.
The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies.
The comprehensive experimental analysis of a metabolic constitution plays a central role in approaches of organismal systems biology. Quantifying the impact of a changing environment on the homeostasis of cellular metabolism has been the focus of numerous studies applying various metabolomics techniques. It has been proven that approaches which integrate different analytical techniques, e.g. LC-MS, GC-MS, CE-MS and H-NMR, can provide a comprehensive picture of a certain metabolic homeostasis. Identification of metabolic compounds and quantification of metabolite levels represent the groundwork for the analysis of regulatory strategies in cellular metabolism. This significantly promotes our current understanding of the molecular organization and regulation of cells, tissues and whole organisms. Nevertheless, it is demanding to elicit the pertinent information which is contained in metabolomics data sets. Based on the central dogma of molecular biology, metabolite levels and their fluctuations are the result of a directed flux of information from gene activation over transcription to translation and posttranslational modification. Hence, metabolomics data represent the summed output of a metabolic system comprising various levels of molecular organization. As a consequence, the inverse assignment of metabolomics data to underlying regulatory processes should yield information which, if deciphered correctly, provides comprehensive insight into a metabolic system. Yet, the deduction of regulatory principles is complex not only due to the high number of metabolic compounds, but also because of a high level of cellular compartmentalization and differentiation. Motivated by the question how metabolomics approaches can provide a representative view on regulatory biochemical processes, this article intends to present and discuss current metabolomics applications, strategies of data analysis and their limitations with respect to the interpretability in context of
Ting, R.N.; Subramanyam, K.
Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping such as thermal diffusion and epitaxy, in view of its advantages such as high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented have also been listed separately
An increase in the number and size of GO groups without any noticeable decrease of the link density within the groups indicated that this expansion significantly broadens the public GO annotation without diluting its quality. We revealed that functional GO annotation correlates mostly with clustering in a physical interaction protein network, while its overlap with indirect regulatory network communities is two to three times smaller. Conclusion Protein functional annotations extracted by the NLP technology expand and enrich the existing GO annotation system. The GO functional modularity correlates mostly with the clustering in the physical interaction network, suggesting an essential role for the structural organization maintained by these interactions. Reciprocally, clustering of proteins in physical interaction networks can serve as evidence for their functional similarity.
Tulipani, Sara; Mora-Cubillos, Ximena; Jáuregui, Olga; Llorach, Rafael; García-Fuentes, Eduardo; Tinahones, Francisco J; Andres-Lacueva, Cristina
Although LC-MS untargeted metabolomics continues to expand into exciting research domains, methodological issues have not yet been solved by the definition of unbiased, standardized and globally accepted analytical protocols. In the present study, the response of the plasma metabolome coverage to specific methodological choices of the sample preparation (two SPE technologies, three sample-to-solvent dilution ratios) and the LC-ESI-MS data acquisition steps of the metabolomics workflow (four RP columns, four elution solvent combinations, two solvent quality grades, postcolumn modification of the mobile phase) was investigated in a pragmatic and decision tree-like performance evaluation strategy. Quality control samples, reference plasma and human plasma from a real nutrimetabolomic study were used for intermethod comparisons. Uni- and multivariate data analysis approaches were independently applied. The highest method performance was obtained by combining the plasma hybrid extraction with the highest solvent proportion during sample preparation, the use of an RP column compatible with 100% aqueous polar phase (Atlantis T3), and the ESI enhancement by using UHPLC-MS purity grade methanol as both organic phase and postcolumn modifier. Results led to the following considerations: submit plasma samples to hybrid extraction for removal of interfering components to minimize the major sample-dependent matrix effects; avoid solvent evaporation following sample extraction if loss in detection and peak shape distortion of early eluting metabolites are not noticed; opt for an RP column for superior retention of highly polar species when analysis fractionation is not feasible; use ultrahigh quality grade solvents and "vintage" analytical tricks such as postcolumn organic enrichment of the mobile phase to enhance ESI efficiency. The final proposed protocol offers an example of how novel and old-fashioned analytical solutions may fruitfully cohabit in untargeted metabolomics
van der Pluijm, B.
What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide a new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and to identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to posted lecture slides, supporting real-time notetaking
Xia, Jianguo; Sinelnikov, Igor V; Han, Beomsoo; Wishart, David S
MetaboAnalyst (www.metaboanalyst.ca) is a web server designed to permit comprehensive metabolomic data analysis, visualization and interpretation. It supports a wide range of complex statistical calculations and high quality graphical rendering functions that require significant computational resources. First introduced in 2009, MetaboAnalyst has experienced more than a 50X growth in user traffic (>50 000 jobs processed each month). In order to keep up with the rapidly increasing computational demands and a growing number of requests to support translational and systems biology applications, we performed a substantial rewrite and major feature upgrade of the server. The result is MetaboAnalyst 3.0. By completely re-implementing the MetaboAnalyst suite using the latest web framework technologies, we have been able to substantially improve its performance, capacity and user interactivity. Three new modules have also been added including: (i) a module for biomarker analysis based on the calculation of receiver operating characteristic curves; (ii) a module for sample size estimation and power analysis for improved planning of metabolomics studies and (iii) a module to support integrative pathway analysis for both genes and metabolites. In addition, popular features found in existing modules have been significantly enhanced by upgrading the graphical output, expanding the compound libraries and by adding support for more diverse organisms. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Pannkuk, Evan L; Fornace, Albert J; Laiakis, Evagelia C
Exposure of the general population to ionizing radiation has increased in the past decades, primarily due to long distance travel and medical procedures. On the other hand, accidental exposures, nuclear accidents, and elevated threats of terrorism with the potential detonation of a radiological dispersal device or improvised nuclear device in a major city, all have led to increased needs for rapid biodosimetry and assessment of exposure to different radiation qualities and scenarios. Metabolomics, the qualitative and quantitative assessment of small molecules in a given biological specimen, has emerged as a promising technology to allow for rapid determination of an individual's exposure level and metabolic phenotype. Advancements in mass spectrometry techniques have led to untargeted (discovery phase, global assessment) and targeted (quantitative phase) methods not only to identify biomarkers of radiation exposure, but also to assess general perturbations of metabolism with potential long-term consequences, such as cancer, cardiovascular, and pulmonary disease. Metabolomics of radiation exposure has provided a highly informative snapshot of metabolic dysregulation. Biomarkers in easily accessible biofluids and biospecimens (urine, blood, saliva, sebum, fecal material) from mouse, rat, and minipig models, to non-human primates and humans have provided the basis for determination of a radiation signature to assess the need for medical intervention. Here we provide a comprehensive description of the current status of radiation metabolomic studies for the purpose of rapid high-throughput radiation biodosimetry in easily accessible biofluids and discuss future directions of radiation metabolomics research.
Metabolic pathway disturbances associated with drug-induced liver injury remain unsatisfactorily characterized. Diagnostic biomarkers for hepatotoxicity have been used to minimize drug-induced liver injury and to increase clinical safety. A metabolomics strategy using rapid-resolution liquid chromatography/tandem mass spectrometry (RRLC-MS/MS) analyses and multivariate statistics was implemented to identify potential biomarkers for hydrazine-induced hepatotoxicity. The global serum and urine metabolomics of 30 hydrazine-treated rats at 24 or 48 h postdosing and 24 healthy rats were characterized by a metabolomics approach. Multivariate statistical data analyses and receiver operating characteristic (ROC) curves were used to identify the most significantly altered metabolites. The 16 most significant potential biomarkers were identified to be closely related to hydrazine-induced liver injury. The combination of these biomarkers had an area under the curve (AUC) > 0.85, with 100% specificity and sensitivity, respectively. This high-quality classification group included amino acids and their derivatives, glutathione metabolites, vitamins, fatty acids, intermediates of pyrimidine metabolism, and lipids. Additionally, metabolomics pathway analyses confirmed that phenylalanine, tyrosine, and tryptophan biosynthesis as well as tyrosine metabolism had great interactions with hydrazine-induced liver injury in rats. These discriminating metabolites might be useful in understanding the pathogenesis mechanisms of liver injury and provide good prospects for clinical diagnosis of drug-induced liver injury.
We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements....... The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine onset of head movements and to suggest which kind of head movement is taking place....
Salek, R.M.; Neumann, S.; Schober, D.; Hummel, J.; Billiau, K.; Kopka, J.; Correa, E.; Reijmers, T.; Rosato, A.; Tenori, L.; Turano, P.; Marin, S.; Deborde, C.; Jacob, D.; Rolin, D.; Dartigues, B.; Conesa, P.; Haug, K.; Rocca-Serra, P.; O’Hagan, S.; Hao, J.; Vliet, M. van; Sysi-Aho, M.; Ludwig, C.; Bouwman, J.; Cascante, M.; Ebbels, T.; Griffin, J.L.; Moing, A.; Nikolski, M.; Oresic, M.; Sansone, S.A.; Viant, M.R.; Goodacre, R.; Günther, U.L.; Hankemeier, T.; Luchinat, C.; Walther, D.; Steinbeck, C.
Metabolomics has become a crucial phenotyping technique in a range of research fields including medicine, the life sciences, biotechnology and the environmental sciences. This necessitates the transfer of experimental information between research groups, as well as potentially to publishers and
Thonusin, Chanisa; IglayReger, Heidi B; Soni, Tanu; Rothberg, Amy E; Burant, Charles F; Evans, Charles R
In recent years, mass spectrometry-based metabolomics has increasingly been applied to large-scale epidemiological studies of human subjects. However, the successful use of metabolomics in this context is subject to the challenge of detecting biologically significant effects despite substantial intensity drift that often occurs when data are acquired over a long period or in multiple batches. Numerous computational strategies and software tools have been developed to aid in correcting for intensity drift in metabolomics data, but most of these techniques are implemented using command-line driven software and custom scripts which are not accessible to all end users of metabolomics data. Further, it has not yet become routine practice to assess the quantitative accuracy of drift correction against techniques which enable true absolute quantitation such as isotope dilution mass spectrometry. We developed an Excel-based tool, MetaboDrift, to visually evaluate and correct for intensity drift in a multi-batch liquid chromatography - mass spectrometry (LC-MS) metabolomics dataset. The tool enables drift correction based on either quality control (QC) samples analyzed throughout the batches or using QC-sample independent methods. We applied MetaboDrift to an original set of clinical metabolomics data from a mixed-meal tolerance test (MMTT). The performance of the method was evaluated for multiple classes of metabolites by comparison with normalization using isotope-labeled internal standards. QC sample-based intensity drift correction significantly improved correlation with IS-normalized data, and resulted in detection of additional metabolites with significant physiological response to the MMTT. The relative merits of different QC-sample curve fitting strategies are discussed in the context of batch size and drift pattern complexity. Our drift correction tool offers a practical, simplified approach to drift correction and batch combination in large metabolomics studies
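The QC-sample-based correction described above can be illustrated with a minimal sketch. It assumes a simple linear trend fitted to the QC injections over injection order; MetaboDrift itself is an Excel-based tool and the abstract mentions several curve fitting strategies, so the function names `fit_line` and `qc_drift_correct` and the choice of a straight line are illustrative assumptions, not the tool's implementation.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def qc_drift_correct(intensities, orders, is_qc):
    """Correct one metabolite's intensities for drift using QC injections.

    intensities: measured peak intensities, one per injection
    orders:      injection order of each measurement
    is_qc:       True where the injection is a pooled QC sample
    """
    qc_x = [o for o, q in zip(orders, is_qc) if q]
    qc_y = [v for v, q in zip(intensities, is_qc) if q]
    # Assumption: drift is approximated by a straight line through the QCs.
    intercept, slope = fit_line(qc_x, qc_y)
    # Rescale so that corrected QC injections sit at the QC median.
    target = median(qc_y)
    return [v / (intercept + slope * o) * target
            for v, o in zip(intensities, orders)]
```

In this sketch each intensity is divided by the fitted QC trend at its injection position and rescaled to the QC median, so a metabolite whose QC signal decays over the run is flattened while between-sample differences are preserved.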
Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina
Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most
Michaelis, J.; Zednik, S.; West, P.; Fox, P. A.; McGuinness, D. L.
eScience based systems generate provenance of their data products, related to such things as: data processing, data collection conditions, expert evaluation, and data product quality. Recent advances in web-based technology offer users the possibility of making annotations to both data products and steps in accompanying provenance traces, thereby expanding the utility of such provenance for others. These contributing users may have varying backgrounds, ranging from system experts to outside domain experts to citizen scientists. Furthermore, such users may wish to make varying types of annotations - ranging from documenting the purpose of a provenance step to raising concerns about the quality of data dependencies. Semantic Web technologies allow for such kinds of rich annotations to be made to provenance through the use of ontology vocabularies for (i) organizing provenance, and (ii) organizing user/annotation classifications. Furthermore, through Linked Data practices, Semantic linkages may be made from provenance steps to external data of interest. A desire for Semantically-annotated provenance has been motivated by data management issues in the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photometer-based readings are taken of solar activity and subsequently processed into final data products consumable by end users. At intermediate stages of ACOS processing, factors such as evaluations by human experts and weather conditions are logged, which could impact data product quality. If such factors are linked via user-submitted annotations to provenance, it could be significantly beneficial for other users. Likewise, the background of a user could impact the credibility of their annotations. For example, an annotation made by a citizen scientist describing the purpose of a provenance step may not be as reliable as a similar annotation made by an ACOS project member. For this work, we have developed a software package that
Finnegan, Tarryn; Steenkamp, Paul A.; Piater, Lizelle A.
Lipopolysaccharides (LPSs), as MAMP molecules, trigger the activation of signal transduction pathways involved in defence. Currently, plant metabolomics is providing new dimensions into understanding the intracellular adaptive responses to external stimuli. The effect of LPS on the metabolomes of Arabidopsis thaliana cells and leaf tissue was investigated over a 24 h period. Cellular metabolites and those secreted into the medium were extracted with methanol and liquid chromatography coupled to mass spectrometry was used for quantitative and qualitative analyses. Multivariate statistical data analyses were used to extract interpretable information from the generated multidimensional LC-MS data. The results show that LPS perception triggered differential changes in the metabolomes of cells and leaves, leading to variation in the biosynthesis of specialised secondary metabolites. Time-dependent changes in metabolite profiles were observed and biomarkers associated with the LPS-induced response were tentatively identified. These include the phytohormones salicylic acid and jasmonic acid, and also the associated methyl esters and sugar conjugates. The induced defensive state resulted in increases in indole- and other glucosinolates, indole derivatives, camalexin as well as cinnamic acid derivatives and other phenylpropanoids. These annotated metabolites indicate dynamic reprogramming of metabolic pathways that are functionally related towards creating an enhanced defensive capacity. The results reveal new insights into the mode of action of LPS as an activator of plant innate immunity, broaden knowledge about the defence metabolite pathways involved in Arabidopsis responses to LPS, and identify specialised metabolites of functional importance that can be employed to enhance immunity against pathogen infection. PMID:27656890
Cardoso, Silvio Domingos; Chantal, Reynaud-Delaître; Da Silveira, Marcos; Pruski, Cédric
Knowledge Organization Systems (KOS) play a key role in enriching biomedical information in order to make it machine-understandable and shareable. This is done by annotating medical documents, or more specifically, associating concept labels from KOS with pieces of digital information, e.g., images or texts. However, the dynamic nature of KOS may impact the annotations, thus creating a mismatch between the evolved concept and the associated information. To solve this problem, methods to maintain the quality of the annotations are required. In this paper, we define a framework based on rules, background knowledge and change patterns to drive the annotation adaptation process. We experimentally evaluate the proposed approach in realistic case studies and demonstrate the overall performance of our approach in different KOS considering the precision, recall, F1-score and AUC value of the system.
Sanderson, Robert [Los Alamos National Laboratory]; Van De Sompel, Herbert [Los Alamos National Laboratory]
As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
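The idea of resolving an annotation against the archived version that matches its creation time can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the list of archived versions is already available as (timestamp, URI) pairs, and it assumes the rule "latest version archived at or before the annotation time". The function names and the `archive.example` URIs are hypothetical. The only protocol detail used is the `Accept-Datetime` request header that Memento defines for datetime negotiation.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(annotated_at):
    """Build the Accept-Datetime header for a timezone-aware annotation time."""
    as_utc = annotated_at.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


def pick_memento(mementos, annotated_at):
    """Return the URI of the latest archived version at or before annotated_at.

    mementos: iterable of (archived_at, uri) pairs with timezone-aware datetimes.
    Returns None when no version had been archived by that time.
    """
    ordered = sorted(mementos)
    times = [archived_at for archived_at, _ in ordered]
    idx = bisect_right(times, annotated_at)
    return ordered[idx - 1][1] if idx else None


# Hypothetical archive listing for a single annotated resource.
versions = [
    (datetime(2010, 1, 1, tzinfo=timezone.utc), "http://archive.example/v1"),
    (datetime(2010, 3, 1, tzinfo=timezone.utc), "http://archive.example/v2"),
    (datetime(2010, 6, 1, tzinfo=timezone.utc), "http://archive.example/v3"),
]
annotation_time = datetime(2010, 4, 1, 12, 0, tzinfo=timezone.utc)
chosen = pick_memento(versions, annotation_time)
header = accept_datetime_header(annotation_time)
```

In a deployed system the selection would normally be delegated to a Memento TimeGate, which receives the header built above and redirects to an appropriate archived version; the local selection function only makes the intended behaviour explicit.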
Bracewell-Milnes, Timothy; Saso, Srdjan; Abdalla, Hossam; Nikolau, Dimitrios; Norman-Taylor, Julian; Johnson, Mark; Holmes, Elaine; Thum, Meen-Yau
Infertility is a complex disorder with significant medical, psychological and financial consequences for patients. With live-birth rates per cycle below 30% and a drive from the Human Fertilisation and Embryology Authority (HFEA) to encourage single embryo transfer, there is significant research in different areas aiming to improve success rates of fertility treatments. One such area is investigating the causes of infertility at a molecular level, and metabolomics techniques provide a platform for studying relevant biofluids in the reproductive tract. The aim of this systematic review is to examine the recent findings for the potential application of metabolomics to female reproduction, specifically to the metabolomics of follicular fluid (FF), embryo culture medium (ECM) and endometrial fluid. To our knowledge no other systematic review has investigated this topic. English peer-reviewed journals on PubMed, Science Direct and SciFinder were systematically searched for studies investigating metabolomics and the female reproductive tract, with no time restriction set for publications. Studies were assessed for quality using the risk of bias assessment and ROBINS-I. There were 21 studies that met the inclusion criteria and were included in the systematic review. Metabolomic studies have been employed for the compositional analysis of various biofluids in the female reproductive tract, including FF, ECM, blastocoele fluid and endometrial fluid. There is some weak evidence that metabolomics technologies studying ECM might be able to predict the viability of individual embryos and implantation rate better than standard embryo morphology. However, these data were not supported by randomized controlled trials (RCTs), which showed no evidence that using metabolomics is able to improve the most important reproductive outcomes, such as clinical pregnancy and live-birth rates. This systematic review provides guidance for future metabolomic studies on biofluids of the female
Grison, S.; Grandcolas, L.; Martin, J.C.
Reports have described apparent biological effects of 137Cs (the most persistent dispersed radionuclide) irradiation in people living in Chernobyl-contaminated territory. The sensitive analytical technology described here should now help assess the relation of this contamination to the observed effects. A rat model chronically exposed to 137Cs through drinking water was developed to identify biomarkers of radiation-induced metabolic disorders, and the biological impact was evaluated by a metabolomic approach that allowed us to detect several hundred metabolites in biofluids and assess their association with disease states. After collection of plasma and urine from contaminated and non-contaminated rats at the end of the 9-month contamination period, analysis with a liquid chromatography coupled to mass spectrometry (LC-MS) system detected 742 features in urine and 1309 in plasma. Biostatistical discriminant analysis extracted a subset of 26 metabolite signals (2 urinary, 4 plasma non-polar, and 19 plasma polar metabolites) that in combination were able to predict from 68 up to 94% of the contaminated rats, depending on the prediction method used, with a misclassification rate as low as 5.3%. The difference in this metabolic score between the contaminated and non-contaminated rats was highly significant (P=0.019 after ANOVA cross-validation). In conclusion, our proof-of-principle study demonstrated for the first time the usefulness of a metabolomic approach for addressing biological effects of chronic low-dose contamination. We can conclude that a metabolomic signature discriminated 137Cs-contaminated from control animals in our model. Further validation is nevertheless required together with full annotation of the metabolic indicators.
Miyamoto, Licht; Egawa, Tatsuro; Oshima, Rieko; Kurogi, Eriko; Tomida, Yosuke; Tsuchiya, Koichiro; Hayashi, Tatsuya
Physical exercise has potent therapeutic and preventive effects against metabolic disorders. A number of studies have suggested that 5'-AMP-activated protein kinase (AMPK) plays a pivotal role in regulating carbohydrate and lipid metabolism in contracting skeletal muscles, while several genetically manipulated animal models revealed the significance of AMPK-independent pathways. To elucidate the significance of AMPK and AMPK-independent signals in contracting skeletal muscles, we conducted a metabolomic analysis that compared the metabolic effects of 5-aminoimidazole-4-carboxamide-1-β-D-ribonucleoside (AICAR) stimulation with those of electrical contraction ex vivo in isolated rat epitrochlearis muscles, in which both α1- and α2-isoforms of AMPK and glucose uptake were equally activated. The metabolomic analysis using capillary electrophoresis time-of-flight mass spectrometry detected 184 peaks and successfully annotated 132 small molecules. AICAR stimulation exhibited high similarity to the electrical contraction in overall metabolites. Principal component analysis (PCA) demonstrated that the major principal component characterized common effects whereas the minor principal component distinguished the difference. PCA and a factor analysis suggested a substantial change in redox status as a result of AMPK activation. We also found a decrease in reduced glutathione levels in both AICAR-stimulated and contracting muscles. The muscle contraction-evoked influences related to the metabolism of amino acids, in particular, aspartate, alanine, or lysine, are supposed to be independent of AMPK activation. Our results substantiate the significance of AMPK activation in contracting skeletal muscles and provide novel evidence that AICAR stimulation closely mimics the metabolomic changes in the contracting skeletal muscles.
Good, Benjamin M; Nanis, Max; Wu, Chunlei; Su, Andrew I
Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses. Many biological natural language processing (BioNLP) projects attempt to address this challenge, but the state of the art still leaves much room for improvement. Progress in BioNLP research depends on large, annotated corpora for evaluating information extraction systems and training machine learning models. Traditionally, such corpora are created by small numbers of expert annotators often working over extended periods of time. Recent studies have shown that workers on microtask crowdsourcing platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of the AMT in capturing disease mentions in PubMed abstracts. We used the NCBI Disease corpus as a gold standard for refining and benchmarking our crowdsourcing protocol. After several iterations, we arrived at a protocol that reproduced the annotations of the 593 documents in the 'training set' of this gold standard with an overall F measure of 0.872 (precision 0.862, recall 0.883). The output can also be tuned to optimize for precision (max = 0.984 when recall = 0.269) or recall (max = 0.980 when precision = 0.436). Each document was completed by 15 workers, and their annotations were merged based on a simple voting method. In total 145 workers combined to complete all 593 documents in the span of 9 days at a cost of $.066 per abstract per worker. The quality of the annotations, as judged with the F measure, increases with the number of workers assigned to each task; however minimal performance gains were observed beyond 8 workers per task. These results add further evidence that microtask crowdsourcing can be a valuable tool for generating well-annotated corpora in BioNLP. Data produced for this analysis are available at http://figshare.com/articles/Disease_Mention_Annotation_with_Mechanical_Turk/1126402.
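The reported F measure can be reproduced from the stated precision and recall, and the "simple voting method" can be illustrated with a threshold vote. The following is a minimal sketch, not the authors' protocol: the function names, the span representation, and the `min_votes` threshold rule are assumptions made for illustration, and the toy worker data are hypothetical.

```python
from collections import Counter


def f_measure(precision: float, recall: float) -> float:
    """Balanced F measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def merge_by_vote(worker_annotations, min_votes):
    """Keep every span marked by at least `min_votes` workers.

    worker_annotations: one set of (start, end) character spans per worker.
    The threshold rule is an assumption; the abstract only says "simple voting".
    """
    votes = Counter(span for spans in worker_annotations for span in set(spans))
    return {span for span, n in votes.items() if n >= min_votes}


# Operating point reported in the abstract (precision 0.862, recall 0.883).
print(round(f_measure(0.862, 0.883), 3))  # 0.872

# Hypothetical toy data: three workers annotating one abstract.
workers = [{(0, 6), (10, 18)}, {(0, 6)}, {(0, 6), (10, 18), (30, 35)}]
print(sorted(merge_by_vote(workers, min_votes=2)))  # [(0, 6), (10, 18)]
```

Raising `min_votes` trades recall for precision, which is one plausible way to reach the precision-tuned and recall-tuned settings the abstract mentions.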
Thrane, Ulf; Andersen, Birgitte; Frisvad, Jens Christian
Filamentous fungi are a diverse group of eukaryotic microorganisms that have a significant impact on human life as spoilers of food and feed by degradation and toxin production. They are also most useful as a source of bulk and fine chemicals and pharmaceuticals. This chapter focuses on the exo-metabolome...
An, Phan Nguyen Thuy; Yamaguchi, Masamitsu; Bamba, Takeshi; Fukusaki, Eiichiro
The Drosophila melanogaster embryo has been widely utilized as a model for genetics and developmental biology due to its small size, short generation time, and large brood size. Information on embryonic metabolism during developmental progression is important for further understanding the mechanisms of Drosophila embryogenesis. Therefore, the aim of this study is to assess the changes in embryos' metabolome that occur at different stages of the Drosophila embryonic development. Time course samples of Drosophila embryos were subjected to GC/MS-based metabolome analysis for profiling of low molecular weight hydrophilic metabolites, including sugars, amino acids, and organic acids. The results showed that the metabolic profiles of Drosophila embryo varied during the course of development and there was a strong correlation between the metabolome and different embryonic stages. Using the metabolome information, we were able to establish a prediction model for developmental stages of embryos starting from their high-resolution quantitative metabolite composition. Among the important metabolites revealed from our model, we suggest that different amino acids appear to play distinct roles in different developmental stages and an appropriate balance in trehalose-glucose ratio is crucial to supply the carbohydrate source for the development of Drosophila embryo.
Kenneth A. Dyar
Circadian rhythms are widely known to govern human health and disease, but specific pathogenic mechanisms linking circadian disruption to metabolic diseases are just beginning to come to light. This is thanks in part to the development and application of various “omics”-based tools in biology and medicine. Current high-throughput technologies allow for the simultaneous monitoring of multiple dynamic cellular events over time, ranging from gene expression to metabolite abundance and sub-cellular localization. These fundamental temporal and spatial perspectives have allowed for a more comprehensive understanding of how various dynamic cellular events and biochemical processes are related in health and disease. With advances in technology, metabolomics has become a more routine “omics” approach for studying metabolism, and “circadian metabolomics” (i.e., studying the 24-h metabolome) has recently been undertaken by several groups. To date, circadian metabolomes have been reported for human serum, saliva, breath, and urine, as well as tissues from several species under specific disease or mutagenesis conditions. Importantly, these studies have consistently revealed that 24-h rhythms are prevalent in almost every tissue and metabolic pathway. Furthermore, these circadian rhythms in tissue metabolism are ultimately linked to and directed by internal 24-h biological clocks. In this review, we will attempt to put these data-rich circadian metabolomics experiments into perspective to find out what they can tell us about metabolic health and disease, and what additional biomarker potential they may reveal.
Jorge L Salinas
Metabolomics uses high-resolution mass spectrometry to provide a chemical fingerprint of thousands of metabolites present in cells, tissues or body fluids. Such metabolic phenotyping has been successfully used to study various biologic processes and disease states. High-resolution metabolomics can shed new light on the intricacies of host-parasite interactions in each stage of the Plasmodium life cycle and the downstream ramifications on the host’s metabolism, pathogenesis and disease. Such data can become integrated with other large datasets generated using top-down systems biology approaches and be utilised by computational biologists to develop and enhance models of malaria pathogenesis relevant for identifying new drug targets or intervention strategies. Here, we focus on the promise of metabolomics to complement systems biology approaches in the quest for novel interventions in the fight against malaria. We introduce the Malaria Host-Pathogen Interaction Center (MaHPIC), a new systems biology research coalition. A primary goal of the MaHPIC is to generate systems biology datasets relating to human and non-human primate (NHP) malaria parasites and their hosts, making these openly available from an online relational database. Metabolomic data from NHP infections and clinical malaria infections from around the world will comprise a unique global resource.
Gonzalez-Riano, Carolina; Garcia, Antonia; Barbas, Coral
The brain is still an organ with a composition to be discovered but, beyond that, mental disorders and especially all diseases that course with dementia are devastating for the patient, the family and society. Metabolomics can offer an alternative tool for unveiling new insights in the discovery of new treatments and biomarkers of mental disorders. Until now, most metabolomic studies have been based on biofluids (serum/plasma or urine), because brain tissue accessibility is limited to animal models or post mortem studies, but even so it is crucial for understanding the pathological processes. Metabolomics studies of brain tissue imply several challenges due to sample extraction, along with brain heterogeneity, sample storage, and sample treatment for a wide coverage of metabolites with a wide range of concentrations of many lipophilic and some polar compounds. In this review, the current analytical practices for targeted and non-targeted metabolomics are described and discussed with emphasis on critical aspects: sample treatment (quenching, homogenization, filtration, centrifugation and extraction), analytical methods, as well as findings considering the strategies used. Besides that, the altered analytes in the different brain regions have been associated with their corresponding pathways to obtain a global overview of their dysregulation, trying to establish the link between altered biological pathways and pathophysiological conditions.
Barkal, Layla J.; Theberge, Ashleigh B.; Guo, Chun-Jun; Spraker, Joe; Rappert, Lucas; Berthier, Jean; Brakke, Kenneth A.; Wang, Clay C. C.; Beebe, David J.; Keller, Nancy P.; Berthier, Erwin
The microbial secondary metabolome encompasses great synthetic diversity, empowering microbes to tune their chemical responses to changing microenvironments. Traditional metabolomics methods are ill-equipped to probe a wide variety of environments or environmental dynamics. Here we introduce a class of microscale culture platforms to analyse chemical diversity of fungal and bacterial secondary metabolomes. By leveraging stable biphasic interfaces to integrate microculture with small molecule isolation via liquid–liquid extraction, we enable metabolomics-scale analysis using mass spectrometry. This platform facilitates exploration of culture microenvironments (including rare media typically inaccessible using established methods), unusual organic solvents for metabolite isolation and microbial mutants. Utilizing Aspergillus, a fungal genus known for its rich secondary metabolism, we characterize the effects of culture geometry and growth matrix on secondary metabolism, highlighting the potential use of microscale systems to unlock unknown or cryptic secondary metabolites for natural products discovery. Finally, we demonstrate the potential for this class of microfluidic systems to study interkingdom communication between fungi and bacteria. PMID:26842393
Verouden, M.P.H.; Westerhuis, J.A.; van der Werf, M.J.; Smilde, A.K.
In metabolomics research a large number of metabolites are measured that reflect the cellular state under the experimental conditions studied. In many occasions the experiments are performed according to an experimental design to make sure that sufficient variation is induced in the metabolite
Draisma, Hermanus Henricus Maria
Metabolomics is the comprehensive analysis of small molecules involved in metabolism, on the basis of samples that have been obtained from organisms in a given physiological state. Data obtained from measurements of trait levels in twin families can be used to elucidate the importance of genetic and
Hendriks, M.M.W.B.; Eeuwijk, van F.A.; Jellema, R.H.; Westerhuis, J.A.; Reijmers, T.H.; Hoefsloot, H.C.J.; Smilde, A.K.
Metabolomics studies aim at a better understanding of biochemical processes by studying relations between metabolites and between metabolites and other types of information (e.g., sensory and phenotypic features). The objectives of these studies are diverse, but the types of data generated and the
Wright, T.; Tsao, H.J.
The success or failure of any sample survey of a finite population is largely dependent upon the condition and adequacy of the list or frame from which the probability sample is selected. Much of the published survey sampling related work has focused on the measurement of sampling errors and, more recently, on nonsampling errors to a lesser extent. Recent studies on data quality for various types of data collection systems have revealed that the extent of the nonsampling errors far exceeds that of the sampling errors in many cases. While much of this nonsampling error, which is difficult to measure, can be attributed to poor frames, relatively little effort or theoretical work has focused on this contribution to total error. The objective of this paper is to present an annotated bibliography on frames with the hope that it will bring together, for experimenters, a number of suggestions for action when sampling from imperfect frames and that more attention will be given to this area of survey methods research
Luan, Hemi; Meng, Nan; Liu, Ping
Background: Metabolomics has the potential to be a powerful and sensitive approach for investigating the low molecular weight metabolite profiles present in maternal fluids and their role in pregnancy. Findings: In this Data Note, LC-MS metabolome, lipidome and carnitine profiling data were collected from 180 healthy pregnant women, representing six time points spanning all three trimesters, and providing sufficient coverage to model the progression of normal pregnancy. Conclusions: As a relatively large scale, real-world dataset with robust numbers of quality control samples, the data...
Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver
The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https
Boysen, Angela K; Heal, Katherine R; Carlson, Laura T; Ingalls, Anitra E
The goal of metabolomics is to measure the entire range of small organic molecules in biological samples. In liquid chromatography-mass spectrometry-based metabolomics, formidable analytical challenges remain in removing the nonbiological factors that affect chromatographic peak areas. These factors include sample matrix-induced ion suppression, chromatographic quality, and analytical drift. The combination of these factors is referred to as obscuring variation. Some metabolomics samples can exhibit intense obscuring variation due to matrix-induced ion suppression, rendering large amounts of data unreliable and difficult to interpret. Existing normalization techniques have limited applicability to these sample types. Here we present a data normalization method to minimize the effects of obscuring variation. We normalize peak areas using a batch-specific normalization process, which matches measured metabolites with isotope-labeled internal standards that behave similarly during the analysis. This method, called best-matched internal standard (B-MIS) normalization, can be applied to targeted or untargeted metabolomics data sets and yields relative concentrations. We evaluate and demonstrate the utility of B-MIS normalization using marine environmental samples and laboratory grown cultures of phytoplankton. In untargeted analyses, B-MIS normalization allowed for inclusion of mass features in downstream analyses that would have been considered unreliable without normalization due to obscuring variation. B-MIS normalization for targeted or untargeted metabolomics is freely available at https://github.com/IngallsLabUW/B-MIS-normalization .
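The matching step can be pictured as choosing, for each analyte, the internal standard whose ratio to the analyte is most stable across repeated injections. The following is a minimal sketch under stated assumptions, not the released B-MIS code: the abstract does not spell out the matching criterion, so the use of relative standard deviation (RSD) across repeated quality-control injections, the function names, and the toy peak areas are all assumptions made for illustration.

```python
from statistics import mean, stdev


def rsd(values):
    """Relative standard deviation (coefficient of variation)."""
    return stdev(values) / mean(values)


def best_matched_is(analyte_qc, standards_qc):
    """Pick the internal standard that most lowers the analyte's RSD.

    analyte_qc: peak areas of one analyte across repeated QC injections.
    standards_qc: dict mapping a standard's name to its peak areas in the
        same injections, in the same order.
    Returns (name, rsd); name is None when no standard improves on the
    raw, un-normalized RSD.
    """
    best_name, best_rsd = None, rsd(analyte_qc)
    for name, areas in standards_qc.items():
        normalized = [a / s for a, s in zip(analyte_qc, areas)]
        candidate = rsd(normalized)
        if candidate < best_rsd:
            best_name, best_rsd = name, candidate
    return best_name, best_rsd


# Hypothetical peak areas: IS_A is suppressed in step with the analyte,
# IS_B is unaffected by the matrix and therefore cannot correct it.
analyte = [100.0, 80.0, 60.0, 90.0]
standards = {
    "IS_A": [50.0, 40.0, 30.0, 45.0],
    "IS_B": [50.0, 50.0, 50.0, 50.0],
}
name, value = best_matched_is(analyte, standards)
print(name, value)  # IS_A 0.0
```

Dividing by the chosen standard yields relative rather than absolute concentrations, which matches the abstract's statement about what the method reports.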
Fromreide, Hege; Hovy, Dirk; Søgaard, Anders
We present two new NER datasets for Twitter: a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets, (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible...
Kleinstreuer, N.C.; Smith, A.M.; West, P.R.; Conard, K.R.; Fontaine, B.R.; Weir-Hauptman, A.M.; Palmer, J.A.; Knudsen, T.B.; Dix, D.J.; Donley, E.L.R.; Cezar, G.G.
Metabolomics analysis was performed on the supernatant of human embryonic stem (hES) cell cultures exposed to a blinded subset of 11 chemicals selected from the chemical library of EPA's ToxCast™ chemical screening and prioritization research project. Metabolites from hES cultures were evaluated for known and novel signatures that may be indicative of developmental toxicity. Significant fold changes in endogenous metabolites were detected for 83 putatively annotated mass features in response to the subset of ToxCast chemicals. The annotations were mapped to specific human metabolic pathways. This revealed strong effects on pathways for nicotinate and nicotinamide metabolism, pantothenate and CoA biosynthesis, glutathione metabolism, and arginine and proline metabolism pathways. Predictivity for adverse outcomes in mammalian prenatal developmental toxicity studies used ToxRefDB and other sources of information, including Stemina Biomarker Discovery's predictive DevTox® model trained on 23 pharmaceutical agents of known developmental toxicity and differing potency. The model initially predicted developmental toxicity from the blinded ToxCast compounds in concordance with animal data with 73% accuracy. Retraining the model with data from the unblinded test compounds at one concentration level increased the predictive accuracy for the remaining concentrations to 83%. These preliminary results on a 11-chemical subset of the ToxCast chemical library indicate that metabolomics analysis of the hES secretome provides information valuable for predictive modeling and mechanistic understanding of mammalian developmental toxicity. -- Highlights: ► We tested 11 environmental compounds in a hESC metabolomics platform. ► Significant changes in secreted small molecule metabolites were observed. ► Perturbed mass features map to pathways critical for normal development and pregnancy. ► Arginine, proline, nicotinate, nicotinamide and glutathione pathways were affected.
Koek, M.M.; Jellema, R.H.; Greef, J. van der; Tas, A.C.; Hankemeier, T.
Metabolomics involves the unbiased quantitative and qualitative analysis of the complete set of metabolites present in cells, body fluids and tissues (the metabolome). By analyzing differences between metabolomes using biostatistics (multivariate data analysis; pattern recognition), metabolites
Seyed, P.; Chastain, K.; McGuinness, D. L.
Use of Semantic Web technologies for data management in the Earth sciences (and beyond) has great potential but is still in its early stages, since the challenges of translating data into a more explicit or semantic form for immediate use within applications has not been fully addressed. In this abstract we help address this challenge by introducing the SemantEco Annotator, which enables anyone, regardless of expertise, to semantically annotate tabular Earth Science data and translate it into linked data format, while applying the logic inherent in community-standard vocabularies to guide the process. The Annotator was conceived under a desire to unify dataset content from a variety of sources under common vocabularies, for use in semantically-enabled web applications. Our current use case employs linked data generated by the Annotator for use in the SemantEco environment, which utilizes semantics to help users explore, search, and visualize water or air quality measurement and species occurrence data through a map-based interface. The generated data can also be used immediately to facilitate discovery and search capabilities within 'big data' environments. The Annotator provides a method for taking information about a dataset, that may only be known to its maintainers, and making it explicit, in a uniform and machine-readable fashion, such that a person or information system can more easily interpret the underlying structure and meaning. Its primary mechanism is to enable a user to formally describe how columns of a tabular dataset relate and/or describe entities. For example, if a user identifies columns for latitude and longitude coordinates, we can infer the data refers to a point that can be plotted on a map. Further, it can be made explicit that measurements of 'nitrate' and 'NO3-' are of the same entity through vocabulary assignments, thus more easily utilizing data sets that use different nomenclatures. The Annotator provides an extensive and searchable
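The column-annotation idea described above can be illustrated with a small sketch. This is not the SemantEco Annotator's interface: the vocabulary terms, predicate names, column names, and sample rows are all hypothetical, chosen only to show how mapping 'nitrate' and 'NO3-' to one shared term makes rows from differently labelled sources comparable.

```python
# Hypothetical vocabulary: different source labels map to one shared term.
VOCABULARY = {"nitrate": "chem:Nitrate", "NO3-": "chem:Nitrate"}


def annotate_row(row, lat_col, lon_col, measure_col, value_col):
    """Turn one tabular row into (subject, predicate, object) triples.

    The caller states which columns hold latitude, longitude, the measured
    entity, and the measured value; that declaration is the "annotation".
    """
    subject = f"obs:{row['id']}"
    return [
        (subject, "geo:lat", float(row[lat_col])),
        (subject, "geo:long", float(row[lon_col])),
        (subject, "ex:measures", VOCABULARY[row[measure_col]]),
        (subject, "ex:value", float(row[value_col])),
    ]


# Two hypothetical rows that use different nomenclature for the same analyte.
rows = [
    {"id": "1", "lat": "42.73", "lon": "-73.68", "analyte": "nitrate", "val": "1.2"},
    {"id": "2", "lat": "42.70", "lon": "-73.70", "analyte": "NO3-", "val": "0.9"},
]
entities = {annotate_row(r, "lat", "lon", "analyte", "val")[2][2] for r in rows}
print(entities)  # {'chem:Nitrate'}
```

Because latitude and longitude are declared explicitly, a consumer of these triples can infer that each observation is a point that can be plotted on a map.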
This book is a rewritten and annotated version of Leo P. Kadanoff and Gordon Baym's lectures that were presented in the book Quantum Statistical Mechanics: Green's Function Methods in Equilibrium and Nonequilibrium Problems. The lectures were devoted to a discussion on the use of thermodynamic Green's functions in describing the properties of many-particle systems. The functions provided a method for discussing finite-temperature problems with no more conceptual difficulty than ground-state problems, and the method was equally applicable to boson and fermion systems and equilibrium and nonequilibrium problems. The lectures also explained nonequilibrium statistical physics in a systematic way and contained essential concepts on statistical physics in terms of Green's functions with sufficient and rigorous details. In-Gee Kim thoroughly studied the lectures during one of his research projects but found that the unspecialized method used to present them in the form of a book reduced their readability. He st...
Kronk, Gary W
Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...
Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent
Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By only relying on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion instead of one score per concept.
Rabinowitz, Joshua D [Princeton Univ., NJ (United States); Aristilde, Ludmilla [Cornell Univ., Ithaca, NY (United States); Amador-Noguez, Daniel [Univ. of Wisconsin, Madison, WI (United States)
Members of the genus Clostridium collectively have the ideal set of the metabolic capabilities for fermentative biofuel production: cellulose degradation, hydrogen production, and solvent excretion. No single organism, however, can effectively convert cellulose into biofuels. Here we developed, using metabolomics and isotope tracers, basic science knowledge of Clostridial metabolism of utility for future efforts to engineer such an organism. In glucose fermentation carried out by the biofuel producer Clostridium acetobutylicum, we observed a remarkably ordered series of metabolite concentration changes as the fermentation progressed from acidogenesis to solventogenesis. In general, high-energy compounds decreased while low-energy species increased during solventogenesis. These changes in metabolite concentrations were accompanied by large changes in intracellular metabolic fluxes, with pyruvate directed towards acetyl-CoA and solvents instead of oxaloacetate and amino acids. Thus, the solventogenic transition involves global remodeling of metabolism to redirect resources from biomass production into solvent production. In contrast to C. acetobutylicum, which is an avid fermenter, C. cellulolyticum metabolizes glucose only slowly. We find that glycolytic intermediate concentrations are radically different from fast fermenting organisms. Associated thermodynamic and isotope tracer analysis revealed that the full glycolytic pathway in C. cellulolyticum is reversible. This arises from changes in cofactor utilization for phosphofructokinase and an alternative pathway from phosphoenolpyruvate to pyruvate. The net effect is to increase the high-energy phosphate bond yield of glycolysis by 150% (from 2 to 5) at the expense of lower net flux. Thus, C. cellulolyticum prioritizes glycolytic energy efficiency over speed. Degradation of cellulose results in other sugars in addition to glucose. Simultaneous feeding of stable isotope-labeled glucose and unlabeled pentose sugars
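The 150% figure follows directly from the two yields quoted above (2 and 5 high-energy phosphate bonds per glucose); as a quick check of the arithmetic:

```latex
\frac{5 - 2}{2} \times 100\% = 150\%
```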
Bezdan, Eniko; Kester, Liesbeth; Kirschner, Paul A.
Bezdan, E., Kester, L., & Kirschner, P. A. (2012, 29-31 August). The influence of annotation in graphical organizers. Poster presented at the biannual meeting of the EARLI Special Interest Group Comprehension of Text and Graphics, Grenoble, France.
This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…
Rarig, Emory W., Jr., Ed.
This annotated bibliography on the junior college is arranged by topic: research tools, history, functions and purposes, organization and administration, students, programs, personnel, facilities, and research. It covers publications through the fall of 1965 and has an author index. (HH)
Pararas-Carayannis, G.; Dong, B.; Farmer, R.
This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts
HEISS, ANN M.; AND OTHERS
THIS ANNOTATED BIBLIOGRAPHY CONTAINS REFERENCES TO GENERAL GRADUATE EDUCATION AND TO EDUCATION FOR THE FOLLOWING PROFESSIONAL FIELDS--ARCHITECTURE, BUSINESS, CLINICAL PSYCHOLOGY, DENTISTRY, ENGINEERING, LAW, LIBRARY SCIENCE, MEDICINE, NURSING, SOCIAL WORK, TEACHING, AND THEOLOGY. (HW)
Kalkatawi, Manal M.
Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered as a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered as a functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and timeconsuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameters calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results. Finally
Abstract Background In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. Results In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. Conclusions This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.
Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning
Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required....
Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.
Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832
Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L
Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows comparisons to be made between disease-containing databases and increases accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF dramatically increases the coverage of disease annotation of the human genome. PMID:19594883
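The recall and precision figures quoted above follow the standard set-based definitions. As a minimal sketch, assuming predicted and reference annotations are available as sets of (gene, disease) pairs, they could be computed as below; the function name and the toy pairs are hypothetical illustrations and are not taken from the study.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two sets of (gene, disease) pairs."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
reference = {
    ("BRCA1", "breast cancer"),
    ("CFTR", "cystic fibrosis"),
    ("HTT", "Huntington disease"),
    ("TP53", "Li-Fraumeni syndrome"),
}
predicted = {
    ("BRCA1", "breast cancer"),
    ("CFTR", "cystic fibrosis"),
    ("TP53", "lung cancer"),
}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A source with few but reliable annotations scores high precision and low recall under these definitions, which is the pattern the abstract reports for OMIM.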
Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor
Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently of the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh
Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L
The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
In recent years, the multimedia annotation problem has attracted significant research attention in the multimedia and computer vision areas, especially automatic image annotation, whose purpose is to provide an efficient and effective searching environment in which users can query their images more easily. In this paper, a semi-supervised learning based probabilistic latent semantic analysis (PLSA) model for automatic image annotation is presented. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. Then, different image features with different magnitudes will result in different performance for automatic image annotation. To this end, a Gaussian normalization method is utilized to normalize different features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model can significantly improve the performance of traditional PLSA for the task of automatic image annotation.
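The abstract does not spell out its Gaussian normalization step. A minimal sketch, assuming it means per-feature standardization to zero mean and unit variance (the paper's exact variant, for instance a 3-sigma rescaling, may differ), shows how features of very different magnitudes end up on a common scale. The function name and the toy feature matrix are hypothetical.

```python
from statistics import mean, pstdev


def gaussian_normalize(features):
    """Standardize each column of a list-of-rows feature matrix.

    Assumption: "Gaussian normalization" is taken to mean
    (x - column mean) / column standard deviation.
    """
    columns = list(zip(*features))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in features
    ]


# Hypothetical region features: column 0 is small-valued, column 1 large-valued.
features = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
normalized = gaussian_normalize(features)
for row in normalized:
    print([round(v, 3) for v in row])
```

After this step both columns have the same spread, so neither dominates a distance or likelihood computation merely because of its units.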
Metabolomics is a powerful technology with broad applications in life science that, like other -omics approaches, requires high-quality samples to achieve reliable results and ensure reproducibility. Therefore, along with quality assurance, methods to assess sample quality regarding pre-analytical confounders are urgently needed. In this study, we analyzed the response of the human serum metabolome to pre-analytical variations comprising prolonged blood incubation and extended serum storage at room temperature by using gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based metabolomics. We found that the prolonged incubation of blood results in a statistically significant 20% increase and 4% decrease of 225 tested serum metabolites. Extended serum storage affected 21% of the analyzed metabolites (14% increased, 7% decreased). Amino acids and nucleobases showed the highest percentage of changed metabolites in both confounding conditions, whereas lipids were remarkably stable. Interestingly, the amounts of taurine and O-phosphoethanolamine, which have both been discussed as biomarkers for various diseases, were 1.8- and 2.9-fold increased after 6 h of blood incubation. Since we found that both are more stable in ethylenediaminetetraacetic acid (EDTA) blood, EDTA plasma should be the preferred metabolomics matrix.
Rácz, Anita; Andrić, Filip; Bajusz, Dávid; Héberger, Károly
Contemporary metabolomic fingerprinting is based on multiple spectrometric and chromatographic signals, used either alone or combined with structural and chemical information of metabolic markers at the qualitative and semiquantitative level. However, signal shifting, convolution, and matrix effects may compromise metabolomic patterns. Recent increase in the use of qualitative metabolomic data, described by the presence (1) or absence (0) of particular metabolites, demonstrates great potential in the field of metabolomic profiling and fingerprint analysis. The aim of this study is a comprehensive evaluation of binary similarity measures for the elucidation of patterns among samples of different botanical origin and various metabolomic profiles. Nine qualitative metabolomic data sets covering a wide range of natural products and metabolomic profiles were applied to assess 44 binary similarity measures for the fingerprinting of plant extracts and natural products. The measures were analyzed by the novel sum of ranking differences method (SRD), searching for the most promising candidates. Baroni-Urbani-Buser (BUB) and Hawkins-Dotson (HD) similarity coefficients were selected as the best measures by SRD and analysis of variance (ANOVA), while Dice (Di1), Yule, Russel-Rao, and Consonni-Todeschini 3 ranked the worst. ANOVA revealed that concordantly and intermediately symmetric similarity coefficients are better candidates for metabolomic fingerprinting than the asymmetric and correlation based ones. The fingerprint analysis based on the BUB and HD coefficients and qualitative metabolomic data performed equally well as the quantitative metabolomic profile analysis. Fingerprint analysis based on the qualitative metabolomic profiles and binary similarity measures proved to be a reliable way in finding the same/similar patterns in metabolomic data as that extracted from quantitative data.
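To make the presence/absence comparison concrete, here is a minimal sketch of three of the coefficients named above, written from their commonly cited forms in terms of the 2x2 contingency counts a (both present), b and c (present in only one sample) and d (both absent). These formulas are an assumption about the definitions used in the study, and the two fingerprints are made-up examples.

```python
from math import sqrt


def contingency(x, y):
    """Count a (1,1), b (1,0), c (0,1) and d (0,0) positions of two binary vectors."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d


def baroni_urbani_buser(x, y):
    """BUB = (sqrt(a*d) + a) / (sqrt(a*d) + a + b + c)."""
    a, b, c, d = contingency(x, y)
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 0.0


def hawkins_dotson(x, y):
    """HD = 0.5 * (a / (a + b + c) + d / (b + c + d))."""
    a, b, c, d = contingency(x, y)
    first = a / (a + b + c) if (a + b + c) else 0.0
    second = d / (b + c + d) if (b + c + d) else 0.0
    return 0.5 * (first + second)


def russell_rao(x, y):
    """RR = a / (a + b + c + d); shared absences only enlarge the denominator."""
    a, b, c, d = contingency(x, y)
    total = a + b + c + d
    return a / total if total else 0.0


# Hypothetical metabolite fingerprints (1 = detected, 0 = not detected).
sample_1 = [1, 1, 0, 0, 1, 0]
sample_2 = [1, 0, 0, 0, 1, 1]
print(baroni_urbani_buser(sample_1, sample_2))
print(hawkins_dotson(sample_1, sample_2))
print(russell_rao(sample_1, sample_2))
```

BUB and HD give credit for shared absences (d), whereas Russell-Rao does not, which illustrates the symmetric versus asymmetric distinction the abstract draws.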
Goede, Patricia A.; Lauman, Jason R.; Cochella, Christopher; Katzman, Gregory L.; Morton, David A.; Albertine, Kurt H.
Use of digital medical images has become common over the last several years, coincident with the release of inexpensive, mega-pixel quality digital cameras and the transition to digital radiology operation by hospitals. One problem that clinicians, medical educators, and basic scientists encounter when handling images is the difficulty of using business and graphic arts commercial-off-the-shelf (COTS) software in multicontext authoring and interactive teaching environments. The authors investigated and developed software-supported methodologies to help clinicians, medical educators, and basic scientists become more efficient and effective in their digital imaging environments. The software that the authors developed provides the ability to annotate images based on a multispecialty methodology for annotation and visual knowledge representation. This annotation methodology is designed by consensus, with contributions from the authors and physicians, medical educators, and basic scientists in the Departments of Radiology, Neurobiology and Anatomy, Dermatology, and Ophthalmology at the University of Utah. The annotation methodology functions as a foundation for creating, using, reusing, and extending dynamic annotations in a context-appropriate, interactive digital environment. The annotation methodology supports the authoring process as well as output and presentation mechanisms. The annotation methodology is the foundation for a Windows implementation that allows annotated elements to be represented as structured eXtensible Markup Language and stored separate from the image(s). PMID:14527971
Abstract Background The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets pose a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non
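The "parse BLAST results, then look up annotations" step can be sketched as follows. This is an illustration only and not annot8r's implementation: it assumes 12-column tab-separated BLAST output (query, subject, ..., e-value, bit score), an in-memory dictionary standing in for the reference database, and a hypothetical e-value cutoff. The sequence identifiers and the accession-to-GO mapping are invented for the example.

```python
import csv
import io


def annotate_best_hits(blast_tabular, annotations, max_evalue=1e-5):
    """Map each query to the annotations of its highest-scoring subject."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row:
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # discard weak hits (cutoff is an assumption)
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {q: annotations.get(s, []) for q, (s, _) in best.items()}


# Hypothetical BLAST rows and reference annotations.
blast_output = "\n".join([
    "\t".join(["EST1", "P12345", "85.0", "100", "15", "0",
               "1", "100", "1", "100", "1e-30", "180"]),
    "\t".join(["EST1", "Q99999", "60.0", "90", "36", "0",
               "1", "90", "5", "94", "1e-10", "90"]),
    "\t".join(["EST2", "Q99999", "40.0", "50", "30", "0",
               "1", "50", "1", "50", "0.5", "30"]),
])
reference_db = {"P12345": ["GO:0003824"], "Q99999": ["GO:0005515"]}

result = annotate_best_hits(blast_output, reference_db)
print(result)
```

Here EST1 inherits the annotation of its best-scoring subject, while EST2 is left unannotated because its only hit fails the cutoff.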
Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products from article abstracts in PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate
Systematic analysis and interpretation of the large number of tandem mass spectra (MS/MS) obtained in metabolomics experiments is a bottleneck in discovery-driven research. MS/MS mass spectral libraries are small compared to all known small molecule structures and are often not freely available. MS2Analyzer was therefore developed to enable user-defined searches of thousands of spectra for mass spectral features such as neutral losses, m/z differences, and product and precursor ions from MS/MS spectra in MSP/MGF files. The software is freely available at http://fiehnlab.ucdavis.edu/projects/MS2Analyzer/. As the reference query set, 147 literature-reported neutral losses and their corresponding substructures were collected. This set was tested for accuracy of linking neutral loss analysis to substructure annotations using 19 329 accurate mass tandem mass spectra of structurally known compounds from the NIST11 MS/MS library. Validation studies showed that 92.1 ± 6.4% of 13 typical neutral losses such as acetylations, cysteine conjugates, or glycosylations correctly annotated the associated substructures, while the absence of mass spectral features does not necessarily imply the absence of such substructures. Use of this tool has been successfully demonstrated for complex lipids in microalgae. PMID:25263576
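A neutral loss search of the kind described can be sketched in a few lines. This is not MS2Analyzer's code: it assumes a singly charged precursor, an absolute m/z tolerance chosen for illustration, and a three-entry loss table built from standard monoisotopic masses. The spectrum values are hypothetical.

```python
# Monoisotopic masses of a few common neutral losses (Da).
KNOWN_LOSSES = {
    "H2O": 18.0106,
    "CO2": 43.9898,
    "hexose (C6H10O5)": 162.0528,
}


def find_neutral_losses(precursor_mz, product_mzs, known_losses, tol=0.005):
    """Return (loss name, product m/z, observed difference) for matching losses.

    Assumes a singly charged precursor, so the m/z difference equals the
    mass of the lost neutral fragment.
    """
    hits = []
    for mz in product_mzs:
        observed = precursor_mz - mz
        for name, mass in known_losses.items():
            if abs(observed - mass) <= tol:
                hits.append((name, mz, round(observed, 4)))
    return hits


# Hypothetical MS/MS spectrum: one precursor and three product ions.
hits = find_neutral_losses(341.1084, [323.0978, 179.0556, 100.0], KNOWN_LOSSES)
for name, mz, diff in hits:
    print(f"{name}: product {mz} (difference {diff})")
```

As the abstract cautions, a match suggests a substructure, but the absence of a match does not rule one out.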
Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C
The initial outcome of genome sequencing is the creation of long text strings written in a four-letter alphabet. The role of in silico sequence analysis is to assist biologists in associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement to automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improving the quality of
Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W
An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. It is therefore of significant value not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate the reads. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The output contains the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. TASE is specially designed to translate sequence data
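The tag-count idea, counting how many reads fall within the genomic range of each annotated gene, can be illustrated with a small sketch. TASE itself is described as a Java tool with a SQL Server backend; the Python below is not its implementation, and the gene names, coordinates and read positions are hypothetical.

```python
from collections import Counter


def count_tags(read_positions, gene_ranges):
    """Count reads whose start position lies within each gene's range.

    read_positions: iterable of (chromosome, position) tuples.
    gene_ranges: dict mapping gene name -> (chromosome, start, end), inclusive.
    """
    counts = Counter()
    for chrom, pos in read_positions:
        for gene, (g_chrom, start, end) in gene_ranges.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[gene] += 1
    return counts


# Hypothetical annotation and aligned read positions.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 500, 900)}
reads = [("chr1", 150), ("chr1", 199), ("chr1", 300),
         ("chr1", 650), ("chr2", 150)]

tag_counts = count_tags(reads, genes)
print(dict(tag_counts))
```

The nested loop is quadratic and only suitable for toy data; at the data volumes the abstract mentions, an indexed lookup such as a database range query or an interval tree would be the natural choice.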
Sakai, Hiroaki; Naito, Ken; Takahashi, Yu; Sato, Toshiyuki; Yamamoto, Toshiya; Muto, Isamu; Itoh, Takeshi; Tomooka, Norihiko
The genus Vigna includes legume crops such as cowpea, mungbean and azuki bean, as well as >100 wild species. A number of the wild species are highly tolerant to severe environmental conditions including high-salinity, acid or alkaline soil; drought; flooding; and pests and diseases. These features of the genus Vigna make it a good target for investigation of genetic diversity in adaptation to stressful environments; however, a lack of genomic information has hindered such research in this genus. Here, we present a genome database of the genus Vigna, Vigna Genome Server ('VigGS', http://viggs.dna.affrc.go.jp), based on the recently sequenced azuki bean genome, which incorporates annotated exon-intron structures, along with evidence for transcripts and proteins, visualized in GBrowse. VigGS also facilitates user construction of multiple alignments between azuki bean genes and those of six related dicot species. In addition, the database displays sequence polymorphisms between azuki bean and its wild relatives and enables users to design primer sequences targeting any variant site. VigGS offers a simple keyword search in addition to sequence similarity searches using BLAST and BLAT. To incorporate up to date genomic information, VigGS automatically receives newly deposited mRNA sequences of pre-set species from the public database once a week. Users can refer to not only gene structures mapped on the azuki bean genome on GBrowse but also relevant literature of the genes. VigGS will contribute to genomic research into plant biotic and abiotic stresses and to the future development of new stress-tolerant crops. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved.
Saber A Akhondi
Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups, each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.
This work elaborates the semi-semantic part-of-speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label parts of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part-of-speech labels. The corpus contains text on local and international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise contributed a part-of-speech annotation to the URDU.KON-TB treebank. Twenty-two main part-of-speech categories are divided into subcategories, which capture the morphological and semantic information encoded in them. This article mainly reports the annotation guidelines; however, it also briefly describes the development of the URDU.KON-TB treebank, which includes the raw corpus collection, the design and application of the annotation scheme and, finally, its statistical evaluation and results. The guidelines presented will be useful to the linguistic community for annotating sentences not only in the national language Urdu but also in other indigenous languages such as Punjabi, Sindhi, Pashto, etc.
The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and to output a Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods that lack good built-in visualization tools, for example MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results containing human errors, because colors must be added to each node manually, or may have other limitations, for example coloring only by a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy to use, while still allowing the user full control over the coloring and annotating process.
Sud, Manish; Fahy, Eoin; Cotter, Dawn; Azam, Kenan; Vadivelu, Ilango; Burant, Charles; Edison, Arthur; Fiehn, Oliver; Higashi, Richard; Nair, K Sreekumaran; Sumner, Susan; Subramaniam, Shankar
The Metabolomics Workbench, available at www.metabolomicsworkbench.org, is a public repository for metabolomics metadata and experimental data spanning various species and experimental platforms, metabolite standards, metabolite structures, protocols, tutorials, training material and other educational resources. It provides a computational platform to integrate, analyze, track, deposit and disseminate large volumes of heterogeneous data from a wide variety of metabolomics studies, including mass spectrometry (MS) and nuclear magnetic resonance spectrometry (NMR) data spanning over 20 different species and covering all the major taxonomic categories, including humans and other mammals, plants, insects, invertebrates and microorganisms. Additionally, a number of protocols are provided for a range of metabolite classes, sample types, and both MS and NMR-based studies, along with a metabolite structure database. The metabolites characterized in the studies available on the Metabolomics Workbench are linked to chemical structures in the metabolite structure database to facilitate comparative analysis across studies. The Metabolomics Workbench, part of the data coordinating effort of the National Institutes of Health (NIH) Common Fund's Metabolomics Program, provides data from the Common Fund's Metabolomics Resource Cores, metabolite standards, and analysis tools to the wider metabolomics community and seeks data depositions from metabolomics researchers across the world. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gagnebin, Yoric; Tonoli, David; Lescuyer, Pierre; Ponte, Belen; de Seigneux, Sophie; Martin, Pierre-Yves; Schappler, Julie; Boccard, Julien; Rudaz, Serge
Among the various biological matrices used in metabolomics, urine is a biofluid of major interest because of its non-invasive collection and its availability in large quantities. However, significant sources of variability in urine metabolomics based on UHPLC-MS are related to the analytical drift and variation of the sample concentration, thus requiring normalization. A sequential normalization strategy was developed to remove these detrimental effects, including: (i) pre-acquisition sample normalization by individual dilution factors to narrow the concentration range and to standardize the analytical conditions, (ii) post-acquisition data normalization by quality control-based robust LOESS signal correction (QC-RLSC) to correct for potential analytical drift, and (iii) post-acquisition data normalization by MS total useful signal (MSTUS) or probabilistic quotient normalization (PQN) to prevent the impact of concentration variability. This generic strategy was performed with urine samples from healthy individuals and was further implemented in the context of a clinical study to detect alterations in urine metabolomic profiles due to kidney failure. In the case of kidney failure, the relation between creatinine/osmolality and the sample concentration is modified, and relying only on these measurements for normalization could be highly detrimental. The sequential normalization strategy was demonstrated to significantly improve patient stratification by decreasing the unwanted variability and thus enhancing data quality. Copyright © 2016 Elsevier B.V. All rights reserved.
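The two post-acquisition concentration corrections named above can be illustrated with a minimal sketch. It assumes a simplified reading of both methods: MSTUS is reduced to dividing each sample by its summed signal, and PQN uses the per-feature median across samples as the reference spectrum. The function names and toy intensities are hypothetical and are not taken from the study.

```python
from statistics import median


def mstus_normalize(sample):
    """Scale one sample so that its total signal sums to 1 (simplified MSTUS)."""
    total = sum(sample)
    return [x / total for x in sample]


def pqn_normalize(samples):
    """Probabilistic quotient normalization against the median spectrum."""
    n_features = len(samples[0])
    # Reference spectrum: median intensity of each feature across all samples.
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        # Quotient of each feature to the reference; the median quotient
        # is taken as the most probable dilution factor of the sample.
        quotients = [s[j] / reference[j] for j in range(n_features) if reference[j] > 0]
        factor = median(quotients)
        normalized.append([x / factor for x in s])
    return normalized


# Toy data: three samples with the same profile at different dilutions.
samples = [
    [10.0, 20.0, 30.0, 40.0],
    [20.0, 40.0, 60.0, 80.0],
    [5.0, 10.0, 15.0, 20.0],
]
normalized = pqn_normalize(samples)
```

In this toy case the three samples differ only by dilution, so PQN maps them onto the same profile. Real urine data would also need the drift correction step (QC-RLSC), which is not sketched here.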
Kortesniemi, Maaria; Vuorinen, Anssi L; Sinkkonen, Jari; Yang, Baoru; Rajala, Ari; Kallio, Heikki
The oilseeds of the commercially important oilseed rape (Brassica napus) and turnip rape (Brassica rapa) were investigated with (1)H NMR metabolomics. The compositions of ripened (cultivated in field trials) and developing seeds (cultivated in controlled conditions) were compared in multivariate models using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Differences in the major lipids and the minor metabolites between the two species were found. A higher content of polyunsaturated fatty acids and sucrose was observed in turnip rape, while the overall oil content and sinapine levels were higher in oilseed rape. The genotype traits were negligible compared to the effect of the growing site and concomitant conditions on the oilseed metabolome. This study demonstrates the applicability of NMR-based analysis in determining the species, geographical origin, developmental stage, and quality of oilseed Brassicas. Copyright © 2014 Elsevier Ltd. All rights reserved.
Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick
Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.
involved in rheumatoid arthritis. PLoS Comput Biol 2011;7. Zhu J. Stitching together multiple data dimensions reveals interacting metabolomic...capability, detecting >10,000 metabolites, together with environmental exposure, dietary intake, microbial activity, and pharmaceutical drugs. Thus...research to clinical care has constantly seen huge disappointments. With the accumulation of detailed, information-rich data, human subjects start to
Bro, Rasmus; Nielsen, Hans Jørgen; Savorani, Francesco
We have recently shown that fluorescence spectroscopy of plasma samples has promising abilities regarding early detection of colorectal cancer. In the present paper, these results were further developed by combining fluorescence with the biomarkers CEA and TIMP-1 and traditional metabolomic measurements in the form of (1)H NMR spectroscopy. The results indicate that using an extensive profile established by combining such measurements together with the biomarkers is better than using single markers....
Catherine G Vasilopoulou
Metabolism being a fundamental part of molecular physiology, elucidating the structure and regulation of metabolic pathways is crucial for obtaining a comprehensive perspective of cellular function and understanding the underlying mechanisms of its dysfunction(s). Therefore, quantifying an accurate metabolic network activity map under various physiological conditions is among the major objectives of systems biology in the context of many biological applications. Especially for the CNS, metabolic network activity analysis can substantially enhance our knowledge about the complex structure of the mammalian brain and the mechanisms of neurological disorders, leading to the design of effective therapeutic treatments. Metabolomics has emerged as the high-throughput quantitative analysis of the concentration profile of small molecular weight metabolites, which act as reactants and products in metabolic reactions and as regulatory molecules of proteins participating in many biological processes. Thus, the metabolic profile provides a metabolic activity fingerprint, through the simultaneous analysis of tens to hundreds of molecules of pathophysiological and pharmacological interest. The application of metabolomics is in its standardization phase in general, and the challenges of establishing a standardized procedure are even more pronounced in brain studies. In this review, we support the value of metabolomics in brain research. Moreover, we demonstrate the challenges of designing and setting up a reliable brain metabolomic study, which, among other parameters, has to take into consideration the sex differentiation and the complexity of brain physiology manifested in its regional variation. We finally propose ways to overcome these challenges and design a study that produces reproducible and consistent results.
Irshad, H; Montaser-Kouhsari, L; Waltz, G; Bucur, O; Nowak, J A; Dong, F; Knoblauch, N W; Beck, A H
The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F-M = 93.68%), followed by the crowdsourced contributor levels 1, 2, and 3 and the automated method, which showed relatively similar performance (F-M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist
Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony
To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
Plake, Conrad; Royer, Loic; Winnenburg, Rainer; Hakenberg, Jörg; Schroeder, Michael
High-throughput screens such as microarrays and RNAi screens produce huge amounts of data. They typically result in hundreds of genes, which are often further explored and clustered via enriched GeneOntology terms. The strength of such analyses is that they build on high-quality manual annotations provided with the GeneOntology. However, the weakness is that annotations are restricted to process, function and location and that they do not cover all known genes in model organisms. GoGene addresses this weakness by complementing high-quality manual annotation with high-throughput text mining extracting co-occurrences of genes and ontology terms from literature. GoGene contains over 4,000,000 associations between genes and gene-related terms for 10 model organisms extracted from more than 18,000,000 PubMed entries. It does not cover only process, function and location of genes, but also biomedical categories such as diseases, compounds, techniques and mutations. By bringing it all together, GoGene provides the most recent and most complete facts about genes and can rank them according to novelty and importance. GoGene accepts keywords, gene lists, gene sequences and protein sequences as input and supports search for genes in PubMed, EntrezGene and via BLAST. Since all associations of genes to terms are supported by evidence in the literature, the results are transparent and can be verified by the user. GoGene is available at http://gopubmed.org/gogene.
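The co-occurrence idea behind this kind of text mining can be sketched in a few lines. This is an illustrative toy, not GoGene's pipeline: it assumes single-word gene symbols and terms, matches them by exact token, and counts each gene-term pair once per document. The function name and sample sentences are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def cooccurrences(documents, genes, terms):
    """Count documents in which a gene symbol and a term both appear."""
    counts = Counter()
    for text in documents:
        # Lowercased alphanumeric tokens; punctuation is discarded.
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        found_genes = [g for g in genes if g.lower() in tokens]
        found_terms = [t for t in terms if t.lower() in tokens]
        # Every gene found in the document is paired with every term found.
        counts.update(product(found_genes, found_terms))
    return counts


documents = [
    "TP53 mutations are frequent in cancer.",
    "BRCA1 and TP53 are linked to cancer and apoptosis.",
    "BRCA1 acts in a DNA repair pathway.",
]
counts = cooccurrences(documents, ["TP53", "BRCA1"], ["cancer", "apoptosis"])
```

Because each count traces back to specific documents, an association can be checked against its supporting text, which is the transparency property the abstract emphasizes.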
Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens
The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user interface to provide content-based access to the video stream, and also for media browsing on a streaming server.
Stokes Harold W
Background Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms. Description By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data) can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided. Conclusion ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.
Theodore R Sana
Malaria is a global infectious disease that threatens the lives of millions of people. Transcriptomics, proteomics and functional genomics studies, as well as sequencing of the Plasmodium falciparum and Homo sapiens genomes, have shed new light on this host-parasite relationship. Recent advances in accurate mass measurement mass spectrometry, sophisticated data analysis software, and availability of biological pathway databases have converged to facilitate our global, untargeted biochemical profiling study of in vitro P. falciparum-infected (IRBC) and uninfected (NRBC) erythrocytes. In order to expand the number of detectable metabolites, several key analytical steps in our workflows were optimized. Untargeted and targeted data mining resulted in detection of over one thousand features or chemical entities. Untargeted features were annotated via matching to the METLIN metabolite database. For targeted data mining, we queried the data using a compound database derived from a metabolic reconstruction of the P. falciparum genome. In total, over one hundred and fifty differential annotated metabolites were observed. To corroborate the representation of known biochemical pathways from our data, an inferential pathway analysis strategy was used to map annotated metabolites onto the BioCyc pathway collection. This hypothesis-generating approach resulted in over-representation of many metabolites onto several IRBC pathways, most prominently glycolysis. In addition, components of the "branched" TCA cycle, partial urea cycle, and nucleotide, amino acid, chorismate, sphingolipid and fatty acid metabolism were found to be altered in IRBCs. Interestingly, we detected and confirmed elevated levels for cyclic ADP ribose and phosphoribosyl AMP in IRBCs, a novel observation. These metabolites may play a role in regulating the release of intracellular Ca(2+) during P. falciparum infection. Our results support a strategy of global metabolite profiling by untargeted
Recent advances in metabolomics technologies have resulted in high-quality (time-resolved) metabolic profiles with an increasing coverage of metabolic pathways. These data profiles represent read-outs from often non-linear dynamics of metabolic networks. Yet, metabolic profiles have largely been explored with regression-based approaches that only capture linear relationships, rendering it difficult to determine the extent to which the data reflect the underlying reaction rates and their couplings. Here we propose an approach termed Stoichiometric Correlation Analysis (SCA), based on correlation between positive linear combinations of log-transformed metabolic profiles. The log-transformation is motivated by the evidence that metabolic networks can be modeled by the mass action law and kinetics derived from it. Unlike the existing approaches, which establish a relation between pairs of metabolites, SCA facilitates the discovery of higher-order dependence between more than two metabolites. By using a paradigmatic model of the tricarboxylic acid cycle we show that the higher-order dependence reflects the coupling of concentrations of reactant complexes, capturing the subtle difference between the employed enzyme kinetics. Using time-resolved metabolic profiles from Arabidopsis thaliana and Escherichia coli, we show that SCA can be used to quantify the difference in coupling of reactant complexes, and hence reaction rates, underlying the stringent response in these model organisms. By using SCA with data from natural variation of wild and domesticated wheat and tomato accessions, we demonstrate that domestication is accompanied by loss of such couplings in these species. Therefore, application of SCA to metabolomics data from natural variation in wild and domesticated populations provides a mechanistic way of understanding domestication and its relation to metabolic networks.
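The core quantity described above, a correlation between positive linear combinations of log-transformed profiles, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the metabolite names, coefficients, and profiles are hypothetical, and the profile of C is constructed as 0.5·A·B so that log C equals log A + log B plus a constant, as a mass-action rate would suggest.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def log_combination(profiles, coeffs):
    """Positive linear combination of log-transformed profiles, per time point."""
    if any(c <= 0 for c in coeffs.values()):
        raise ValueError("coefficients must be positive")
    n_points = len(next(iter(profiles.values())))
    return [
        sum(c * math.log(profiles[m][t]) for m, c in coeffs.items())
        for t in range(n_points)
    ]


# Hypothetical time-resolved concentrations; C is built as 0.5 * A * B.
profiles = {
    "A": [1.0, 2.0, 4.0, 8.0, 3.0],
    "B": [5.0, 1.0, 2.0, 1.0, 4.0],
    "C": [2.5, 1.0, 4.0, 4.0, 6.0],
}

# Higher-order dependence: the complex A + B against C.
r_complex = pearson(
    log_combination(profiles, {"A": 1, "B": 1}),
    log_combination(profiles, {"C": 1}),
)
# Pairwise dependence: A alone against C.
r_pair = pearson(
    log_combination(profiles, {"A": 1}),
    log_combination(profiles, {"C": 1}),
)
```

Here the combination of A and B correlates perfectly with C while A alone does not, which is the kind of higher-order coupling that a pairwise analysis would miss.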
Roberts, Kirk; Demner-Fushman, Dina
This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.
Suzuki, Makoto; Nishiumi, Shin; Matsubara, Atsuki; Azuma, Takeshi; Yoshida, Masaru
Improvements in analytical technologies have made it possible to rapidly determine the concentrations of thousands of metabolites in any biological sample, which has resulted in metabolome analysis being applied to various types of research, such as clinical, cell biology, and plant/food science studies. The metabolome represents all of the end products and by-products of the numerous complex metabolic pathways operating in a biological system. Thus, metabolome analysis allows one to survey the global changes in an organism's metabolic profile and gain a holistic understanding of the changes that occur in organisms during various biological processes, e.g., during disease development. In clinical metabolomic studies, there is a strong possibility that differences in the metabolic profiles of human specimens reflect disease-specific states. Recently, metabolome analysis of biofluids, e.g., blood, urine, or saliva, has been increasingly used for biomarker discovery and disease diagnosis. Mass spectrometry-based techniques have been extensively used for metabolome analysis because they exhibit high selectivity and sensitivity during the identification and quantification of metabolites. Here, we describe metabolome analysis using liquid chromatography-mass spectrometry, gas chromatography-mass spectrometry, and capillary electrophoresis-mass spectrometry. Furthermore, the findings of studies that attempted to discover biomarkers of gastroenterological cancer are also outlined. Finally, we discuss metabolome analysis-based disease diagnosis. Copyright © 2014 Elsevier B.V. All rights reserved.
Dragsted, L. O.; Kristensen, M.; Ravn-Haren, Gitte
Metabolomics is a promising tool for searching out new biomarkers and the development of hypotheses in nutrition research. This chapter will describe the design of human dietary intervention studies where samples are collected for metabolomics analyses as well as the analytical issues and data...
Schrimpe-Rutledge, Alexandra C.; Codreanu, Simona G.; Sherrod, Stacy D.; McLean, John A.
Metabolites are building blocks of cellular function. These species are involved in enzyme-catalyzed chemical reactions and are essential for cellular function. Upstream biological disruptions result in a series of metabolomic changes and, as such, the metabolome holds a wealth of information that is thought to be most predictive of phenotype. Uncovering this knowledge is a work in progress. The field of metabolomics is still maturing; the community has leveraged proteomics experience when applicable and developed a range of sample preparation and instrument methodology along with myriad data processing and analysis approaches. Research focuses have now shifted toward a fundamental understanding of the biology responsible for metabolomic changes. There are several types of metabolomics experiments including both targeted and untargeted analyses. While untargeted, hypothesis generating workflows exhibit many valuable attributes, challenges inherent to the approach remain. This Critical Insight comments on these challenges, focusing on the identification process of LC-MS-based untargeted metabolomics studies—specifically in mammalian systems. Biological interpretation of metabolomics data hinges on the ability to accurately identify metabolites. The range of confidence associated with identifications that is often overlooked is reviewed, and opportunities for advancing the metabolomics field are described.
Koek, Maud Marijtje
Metabolomics involves the unbiased quantitative and qualitative analysis of the complete set of metabolites present in cells, body fluids and tissues. Gas chromatography coupled to mass spectrometry (GC-MS) is very suitable for metabolomics analysis, as it combines high separation power with
This thesis focusses on metabolomics approaches performed in cultured cells and blood samples from patients with peroxisomal disorders. By applying both targeted and untargeted metabolomics, the aim of these approaches was to study the functional consequences of the primary genetic defects causing
Marques, Ana Patrícia; Serralheiro, Maria Luisa; Ferreira, António E. N.; Freire, Ana Ponces; Cordeiro, Carlos; Silva, Marta Sousa
Metabolomics is a key discipline in systems biology, together with genomics, transcriptomics, and proteomics. In this omics cascade, the metabolome represents the biochemical products that arise from cellular processes and is often regarded as the final response of a biological system to environmental or genetic changes. The overall screening…
Miller, Marion G
Metabolomic approaches have the potential to make an exceptional contribution to understanding how chemicals and other environmental stressors can affect both human and environmental health. However, the application of metabolomics to environmental exposures, although getting underway, has not yet been extensively explored. This review will use a SWOT analysis model to discuss some of the strengths, weaknesses, opportunities, and threats that are apparent to an investigator venturing into this relatively new field. SWOT has been used extensively in business settings to uncover new outlooks and identify problems that would impede progress. The field of environmental metabolomics provides great opportunities for discovery, and this is recognized by a high level of interest in potential applications. However, understanding the biological consequence of environmental exposures can be confounded by inter- and intra-individual differences. Metabolomic profiles can yield a plethora of data, the interpretation of which is complex and still being evaluated and researched. The development of the field will depend on the availability of technologies for data handling that also permit ready access to metabolomic databases. Understanding the relevance of metabolomic endpoints to organism health vs adaptation vs variation is an important step in understanding what constitutes a substantive environmental threat. Metabolomic applications in reproductive research are discussed. Overall, the development of a comprehensive mechanistic-based interpretation of metabolomic changes offers the possibility of providing information that will significantly contribute to the protection of human health and the environment.
Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying
Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
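The abstract does not say which hashing scheme maps visual features to hash codes. As one plausible illustration, the sketch below uses random-hyperplane (sign) hashing, a standard locality-sensitive technique that is swapped in here as an assumption; all function names and the toy feature vector are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Pack the sign of each projection into the bits of one integer."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(v * w for v, w in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(dim=8, n_bits=16)
feature = [0.3, -1.2, 0.8, 0.1, 2.0, -0.5, 0.7, 1.1]
scaled = [2.0 * x for x in feature]      # same direction, different magnitude
opposite = [-x for x in feature]         # opposite direction

code = hash_code(feature, planes)
```

Vectors pointing in similar directions tend to share most bits, so candidate neighbors can be found by comparing short integer codes with a Hamming distance instead of comparing the full high-dimensional features.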
Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko
This paper describes a learning assistant system that uses motion capture data and annotation to teach "Naginata-jutsu" (the skill of wielding the Japanese halberd) performance. There are some video annotation tools, such as YouTube. However, these video-based tools offer only a single angle of view. Our approach, which uses motion-captured data, allows the performance to be viewed from any angle. A lecturer can write annotations related to parts of the body. We compared the effectiveness of the YouTube annotation tool with that of the proposed system. The experimental result showed that our system triggered more annotations than the YouTube annotation tool.
Stegmann, Mikkel Bille
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
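The range operations mentioned above (overlap detection and coverage calculation) can be sketched in a few lines. This is a plain-Python illustration, not the Bioconductor API: the function names are hypothetical, ranges are assumed to be 1-based closed intervals (start, end), and the overlap search is a naive all-pairs scan rather than the efficient algorithms the packages provide.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, end) and b share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]

def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs of overlapping ranges (naive scan)."""
    return [(i, j)
            for i, q in enumerate(query)
            for j, s in enumerate(subject)
            if overlaps(q, s)]

def coverage(ranges, length):
    """Per-position depth over positions 1..length, using a difference array.

    Assumes every range satisfies 1 <= start <= end <= length.
    """
    diff = [0] * (length + 2)
    for start, end in ranges:
        diff[start] += 1        # depth rises at the first covered position
        diff[end + 1] -= 1      # and falls just past the last one
    depth, out = 0, []
    for pos in range(1, length + 1):
        depth += diff[pos]
        out.append(depth)
    return out

reads = [(1, 4), (3, 6)]        # toy read alignments
genes = [(5, 8)]                # toy annotated feature
print(find_overlaps(reads, genes))
print(coverage(reads, 8))
```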
Evangelatos, Nikolaos; Bauer, Pia; Reumann, Matthias; Satyamoorthy, Kapaettu; Lehrach, Hans; Brand, Angela
Sepsis, with its often devastating consequences for patients and their families, remains a major public health concern that poses an increasing financial burden. Early resuscitation together with the elucidation of the biological pathways and pathophysiological mechanisms with the use of "-omics" technologies have started changing the clinical and research landscape in sepsis. Metabolomics (i.e., the study of the metabolome), an "-omics" technology further down in the "-omics" cascade between the genome and the phenome, could be particularly fruitful in sepsis research with the potential to alter the clinical practice. Apart from its benefit for the individual patient, metabolomics has an impact on public health that extends beyond its applications in medicine. In this review, we present recent developments in metabolomics research in sepsis, with a focus on pneumonia, and we discuss the impact of metabolomics on public health, with a focus on free/libre open source software. © 2018 S. Karger AG, Basel.
Misra, Biswapriya B
Rapid advances in mass spectrometry (MS) and nuclear magnetic resonance (NMR)-based platforms for metabolomics have led to an upsurge of data every single year. Newer high-throughput platforms, hyphenated technologies, miniaturization, and tool kits in data acquisition efforts in metabolomics have led to additional challenges in metabolomics data pre-processing, analysis, interpretation, and integration. Thanks to the informatics, statistics, and computational community, new resources continue to develop for metabolomics researchers. The purpose of this review is to provide a summary of the metabolomics tools, software, and databases that were developed or improved during 2016-2017, thus, enabling readers, developers, and researchers access to a succinct but thorough list of resources for further improvisation, implementation, and application in due course of time. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Spicer, Rachel A; Salek, Reza; Steinbeck, Christoph
The Metabolomics Standards Initiative (MSI) guidelines were first published in 2007. These guidelines provided reporting standards for all stages of metabolomics analysis: experimental design, biological context, chemical analysis and data processing. Since 2012, a series of public metabolomics databases and repositories, which accept the deposition of metabolomic datasets, have arisen. In this study, the compliance of 399 public data sets, from four major metabolomics data repositories, to the biological context MSI reporting standards was evaluated. None of the reporting standards were complied with in every publicly available study, although adherence rates varied greatly, from 0 to 97%. The plant minimum reporting standards were the most complied with and the microbial and in vitro were the least. Our results indicate the need for reassessment and revision of the existing MSI reporting standards.
Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.; Porwollik, Steffen; Jones, Marcus B.; Yoon, Hyunjin; Payne, Samuel H.; Martin, Jessica L.; Burnet, Meagan C.; Monroe, Matthew E.; Venepally, Pratap; Smith, Richard D.; Peterson, Scott; Heffron, Fred; Mcclelland, Michael; Adkins, Joshua N.
Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example, systems biology-oriented genome-scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop properly functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and, importantly, to detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provide protein-level experimental confirmation for 44% of predicted protein-coding genes, suggest revisions to 48 genes assigned incorrect translational start sites, and uncover 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regard to the annotation of mature protein products.
Ibáñez, Clara; Simó, Carolina; García-Cañas, Virginia; Cifuentes, Alejandro; Castro-Puyana, María
Graphical abstract: -- Highlights: •Foodomics allows studying food and nutrition through the application of advanced omics approaches. •CE-MS plays a crucial role as analytical platform to carry out omics studies. •CE-MS applications for food metabolomics, proteomics and peptidomics are presented. -- Abstract: In the current post-genomic era, Foodomics has been defined as a discipline that studies food and nutrition through the application of advanced omics approaches. Foodomics involves the use of genomics, transcriptomics, epigenetics, proteomics, peptidomics, and/or metabolomics to investigate food quality, safety, traceability and bioactivity. In this context, capillary electrophoresis-mass spectrometry (CE-MS) has been applied mainly in food proteomics, peptidomics and metabolomics. The aim of this review work is to present an overview of the most recent developments and applications of CE-MS as analytical platform for Foodomics, covering the relevant works published from 2008 to 2012. The review provides also information about the integration of several omics approaches in the new Foodomics field
Castillo Luis F.
Gene annotation is a process that encompasses multiple approaches to the analysis of nucleic acid or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism's genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by the Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained from the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created, in which detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.
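The two data-reduction steps described above (filtering hits by coverage, identity, query length and e-value, then collapsing redundant associations) can be sketched as follows. The abstract gives no cut-off values, so every threshold here is a hypothetical placeholder, as are the Hit record and the function names; the original work used Ruby scripts, whereas this sketch is in Python.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    query_id: str
    query_length: int        # length of the query sequence
    aligned_length: int      # length of the aligned region on the query
    percent_identity: float
    evalue: float

def query_coverage(hit):
    """Fraction of the query covered by the alignment."""
    return hit.aligned_length / hit.query_length

def passes(hit, min_cov=0.5, min_ident=40.0, min_len=100, max_evalue=1e-5):
    """Keep a hit only if it clears all four (hypothetical) thresholds."""
    return (query_coverage(hit) >= min_cov
            and hit.percent_identity >= min_ident
            and hit.query_length >= min_len
            and hit.evalue <= max_evalue)

def unique_associations(edges):
    """Collapse repeated (source, relation, target) triples, preserving order."""
    return list(dict.fromkeys(edges))

hits = [Hit("q1", 400, 300, 85.0, 1e-30), Hit("q2", 80, 80, 99.0, 1e-20)]
print([h.query_id for h in hits if passes(h)])
```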
Baldock, Richard A; Armit, Chris
"The Atlas of Mouse Development" by Kaufman is a classic paper atlas that is the de facto standard for the definition of mouse embryo anatomy in the context of standard histological images. We have re-digitised the original H&E stained tissue sections used for the book at high resolution and transferred the hand-drawn annotations to digital form. We have augmented the annotations with standard ontological assignments (EMAPA anatomy) and made the data freely available via an online viewer (eHistology) and from the University of Edinburgh DataShare archive. The dataset captures and preserves the definitive anatomical knowledge of the original atlas, provides a core image set for deeper community annotation and teaching, and delivers a unique high-quality set of high-resolution histological images through mammalian development for manual and automated analysis. © The Authors 2017. Published by Oxford University Press.
Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia
Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated with features such as gene structure, alternative splicing, and noncoding loci. Genome annotation information is commonly stored as plain text in General Feature Format (GFF) files, which can be hundreds or thousands of Mb in size. Manipulating GFF files is therefore a challenge for biologists who have no bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of eukaryotic genomes and then generating statistical descriptions of six indices for visualization. GFFview is very useful for investigating the quality of, and differences between, de novo assembled transcriptomes in RNA-seq studies.
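For readers unfamiliar with the format, a minimal sketch of parsing GFF and summarizing it is given below. It relies only on the standard nine tab-separated GFF columns; the statistics computed (count and mean length per feature type) are illustrative and are not claimed to be the six indices reported by GFFview, and the function names are hypothetical.

```python
from collections import Counter

def parse_gff(lines):
    """Yield one dict per feature line, skipping comments and malformed rows."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols
        yield {"seqid": seqid, "type": ftype, "start": int(start),
               "end": int(end), "strand": strand, "attributes": attrs}

def summarize(features):
    """Map each feature type to (count, mean length); coordinates are 1-based inclusive."""
    counts, total_len = Counter(), Counter()
    for f in features:
        counts[f["type"]] += 1
        total_len[f["type"]] += f["end"] - f["start"] + 1
    return {t: (counts[t], total_len[t] / counts[t]) for t in counts}

sample = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1",
    "chr1\tsrc\texon\t1\t40\t.\t+\t.\tParent=g1",
    "chr1\tsrc\texon\t61\t100\t.\t+\t.\tParent=g1",
]
print(summarize(parse_gff(sample)))
```

Because parse_gff is a generator, a file handle can be passed in place of the list, so a file of hundreds of Mb is processed line by line without being loaded into memory.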
Conner, Ronald C.
This 25-page annotated bibliography describes the legal reference materials in the special collection of a medium-sized public library. Sources are listed in 12 categories: cases, dictionaries, directories, encyclopedias, forms, references for the lay person, general, indexes, laws and legislation, legal research aids, periodicals, and specialized…
Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.
Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.
J.C. van de Pol (Jaco)
A simple kind of strategy annotations is investigated, giving rise to a class of strategies, including leftmost-innermost. It is shown that under certain restrictions, an interpreter can be written which computes the normal form of a term in a bottom-up traversal. The main contribution
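As a rough illustration of leftmost-innermost normalization in a bottom-up traversal, consider the sketch below. It is not the interpreter from the paper and does not model strategy annotations; the term encoding (nested tuples), the function names, and the example rewrite system (Peano addition) are all assumptions chosen for brevity.

```python
def rewrite_root(term):
    """Apply one rule at the root if one matches, else return None.

    Example rules (Peano addition):
      add(0, y)    -> y
      add(s(x), y) -> s(add(x, y))
    """
    if isinstance(term, tuple) and term[0] == "add":
        _, x, y = term
        if x == "0":
            return y
        if isinstance(x, tuple) and x[0] == "s":
            return ("s", ("add", x[1], y))
    return None

def normalize(term):
    """Leftmost-innermost: normalize arguments left to right, then try the root."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(arg) for arg in term[1:])
    reduct = rewrite_root(term)
    return term if reduct is None else normalize(reduct)

one = ("s", "0")
two = ("s", one)
print(normalize(("add", two, one)))
```

Because the arguments are already in normal form when a rule fires at the root, only the newly created part of the reduct can contain further redexes; this simple version re-traverses the whole reduct anyway and terminates only for terminating rewrite systems.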
Benoit, William L.
Materials dealing with aspects of argumentation theory are cited in this annotated bibliography. The 50 citations are organized by topic as follows: (1) argumentation; (2) the nature of argument; (3) traditional perspectives on argument; (4) argument diagrams; (5) Chaim Perelman's theory of rhetoric; (6) the evaluation of argument; (7) argument…
J.D. Strachan and G. Corrigan
This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables.
National Agricultural Library (USDA), Washington, DC.
This annotated bibliography on nutrition and adolescent pregnancy is intended to be a source of technical assistance for nurses, nutritionists, physicians, educators, social workers, and other personnel concerned with improving the health of teenage mothers and their babies. It is divided into two major sections. The first section lists selected…
E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen
This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...
Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.
This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group
Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis
This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups...
National Cancer Inst. (NIH), Bethesda, MD.
This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…
From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…
Li, Shengting; Ma, Lijia; Li, Heng
Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes based on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...
Heylen, Dirk K.J.; Reidsma, Dennis; Ordelman, Roeland J.F.; Devillers, L.; Martin, J-C.; Cowie, R.; Batliner, A.
We discuss the annotation procedure for mental state and emotion that is under development for the AMI (Augmented Multiparty Interaction) corpus. The categories that were found to be most appropriate relate not only to emotions but also to (meta-)cognitive states and interpersonal variables. The
new applications for the ePNK and, in particular, visualizing the result of an application in the graphical editor of the ePNK by using annotations, and interacting with the end user using these annotations. In this paper, we give an overview of the concepts of ePNK applications by discussing the implementation...
Liu, Weifeng; Tao, Dacheng
The rapid development of computer hardware and Internet technology makes large-scale data-dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) has therefore received intensive attention in recent years and has been successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smooths the conditional distribution for classification along the manifold encoded in the graph Laplacian; however, it is observed that LR biases the classification function toward a constant function, which possibly results in poor generalization. In addition, LR is developed to handle uniformly distributed data (or single-view data), although instances or objects, such as images and videos, are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address the above two problems in LR-based image annotation. In particular, mHR optimally combines multiple HR, each of which is obtained from a particular view of instances, and steers the classification function so that it varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.
Pullin, Richard A.
This annotated bibliography lists 310 articles from the "Journal of Cooperative Education" from Volumes XIX-XXXII, 1983-1997. Annotations are presented in the order they appear in the journal; author and subject indexes are provided. (JOW)
Schwartz, David Charles; Severin, Jessica
There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.
Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B
Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
Kalkatawi, Manal M.
Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang
The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.
Smirnov, Kirill S; Maier, Tanja V; Walker, Alesia; Heinzmann, Silke S; Forcisi, Sara; Martinez, Inés; Walter, Jens; Schmitt-Kopplin, Philippe
The review highlights the role of metabolomics in studying human gut microbial metabolism. Microbial communities in our gut exert a multitude of functions with huge impact on human health and disease. Within the meta-omics discipline, gut microbiome is studied by (meta)genomics, (meta)transcriptomics, (meta)proteomics and metabolomics. The goal of metabolomics research applied to fecal samples is to perform their metabolic profiling, to quantify compounds and classes of interest, to characterize small molecules produced by gut microbes. Nuclear magnetic resonance spectroscopy and mass spectrometry are main technologies that are applied in fecal metabolomics. Metabolomics studies have been increasingly used in gut microbiota related research regarding health and disease with main focus on understanding inflammatory bowel diseases. The elucidated metabolites in this field are summarized in this review. We also addressed the main challenges of metabolomics in current and future gut microbiota research. The first challenge reflects the need of adequate analytical tools and pipelines, including sample handling, selection of appropriate equipment, and statistical evaluation to enable meaningful biological interpretation. The second challenge is related to the choice of the right animal model for studies on gut microbiota. We exemplified this using NMR spectroscopy for the investigation of cross-species comparison of fecal metabolite profiles. Finally, we present the problem of variability of human gut microbiota and metabolome that has important consequences on the concepts of personalized nutrition and medicine. Copyright © 2016 Elsevier GmbH. All rights reserved.
Washio, Jumpei; Takahashi, Nobuhiro
Oral diseases are known to be closely associated with oral biofilm metabolism, while cancer tissue is reported to possess specific metabolism such as the 'Warburg effect'. Metabolomics might be a useful method for clarifying the whole metabolic systems that operate in oral biofilm and oral cancer; however, technical limitations have hampered such research. Fortunately, metabolomics techniques have developed rapidly in the past decade, which has helped to solve these difficulties. In vivo metabolomic analyses of the oral biofilm have produced various findings. Some of these findings agreed with the in vitro results obtained in conventional metabolic studies using representative oral bacteria, while others differed markedly from them. Metabolomic analyses of oral cancer tissue not only revealed differences between metabolomic profiles of cancer and normal tissue, but have also suggested that a specific metabolic system operates in oral cancer tissue. Saliva contains a variety of metabolites, some of which might be associated with oral or systemic disease; therefore, metabolomics analysis of saliva could be useful for identifying disease-specific biomarkers. Metabolomic analyses of the oral biofilm, oral cancer, and saliva could contribute to the development of accurate diagnostic techniques, safe and effective treatments, and preventive strategies for oral and systemic diseases.
Mhlongo, Msizi I.; Piater, Lizelle A.; Madala, Ntakadzeni E.; Labuschagne, Nico; Dubery, Ian A.
Plant roots communicate with microbes in a sophisticated manner through chemical communication within the rhizosphere, thereby leading to biofilm formation of beneficial microbes and, in the case of plant growth-promoting rhizomicrobes/-bacteria (PGPR), resulting in priming of defense, or induced resistance in the plant host. The knowledge of plant–plant and plant–microbe interactions has been greatly extended over recent years; however, the chemical communication leading to priming is far from being well understood. Furthermore, linkage between below- and above-ground plant physiological processes adds to the complexity. In metabolomics studies, the main aim is to profile and annotate all exo- and endo-metabolites in a biological system that drive and participate in physiological processes. Recent advances in this field have enabled researchers to analyze hundreds of compounds in one sample over a short time period. Here, from a metabolomics viewpoint, we review the interactions within the rhizosphere and subsequent above-ground ‘signalomics’, and emphasize the contributions that mass spectrometric-based metabolomic approaches can bring to the study of plant-beneficial and priming events. PMID:29479360
Ilakkuvan, Vinu; Tacelosky, Michael; Ivey, Keith C; Pearson, Jennifer L; Cantrell, Jennifer; Vallone, Donna M; Abrams, David B; Kirchner, Thomas R
Photographs are an effective way to collect detailed and objective information about the environment, particularly for public health surveillance. However, accurately and reliably annotating (ie, extracting information from) photographs remains difficult, a critical bottleneck inhibiting the use of photographs for systematic surveillance. The advent of distributed human computation (ie, crowdsourcing) platforms represents a veritable breakthrough, making it possible for the first time to accurately, quickly, and repeatedly annotate photos at relatively low cost. This paper describes a methods protocol, using photographs from point-of-sale surveillance studies in the field of tobacco control to demonstrate the development and testing of custom-built tools that can greatly enhance the quality of crowdsourced annotation. Enhancing the quality of crowdsourced photo annotation requires a number of approaches and tools. The crowdsourced photo annotation process is greatly simplified by decomposing the overall process into smaller tasks, which improves accuracy and speed and enables adaptive processing, in which irrelevant data is filtered out and more difficult targets receive increased scrutiny. Additionally, zoom tools enable users to see details within photographs and crop tools highlight where within an image a specific object of interest is found, generating a set of photographs that answer specific questions. Beyond such tools, optimizing the number of raters (ie, crowd size) for accuracy and reliability is an important facet of crowdsourced photo annotation. This can be determined in a systematic manner based on the difficulty of the task and the desired level of accuracy, using receiver operating characteristic (ROC) analyses. Usability tests of the zoom and crop tool suggest that these tools significantly improve annotation accuracy. The tests asked raters to extract data from photographs, not for the purposes of assessing the quality of that data, but rather to
Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle
A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometers against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.
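The idea of scoring a metabolite-spectrum match by how well predicted fragments explain an observed spectrum can be illustrated with a minimal sketch. This is not the MIDAS scoring function: the names `explained_intensity` and `rank_candidates`, the 10 ppm tolerance, and the score itself (fraction of observed intensity matched by any predicted fragment) are assumptions chosen for illustration, and the predicted fragment m/z values are taken as given rather than enumerated by bond dissociation.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def explained_intensity(observed: List[Peak],
                        predicted_mz: List[float],
                        tol_ppm: float = 10.0) -> float:
    """Fraction of observed intensity matched by a predicted fragment.

    Illustrative score only; tol_ppm is an assumed mass tolerance.
    """
    total = sum(intensity for _, intensity in observed)
    if total == 0:
        return 0.0
    matched = 0.0
    for mz, intensity in observed:
        tol = mz * tol_ppm * 1e-6  # absolute tolerance at this m/z
        if any(abs(mz - p) <= tol for p in predicted_mz):
            matched += intensity
    return matched / total


def rank_candidates(observed: List[Peak],
                    candidates: Dict[str, List[float]],
                    tol_ppm: float = 10.0) -> List[Tuple[str, float]]:
    """Sort candidate metabolites by descending explained intensity."""
    scored = [(name, explained_intensity(observed, frags, tol_ppm))
              for name, frags in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    spectrum = [(100.0, 50.0), (150.0, 30.0), (200.0, 20.0)]
    hypothetical = {"A": [100.0005, 150.0], "B": [200.001]}
    print(rank_candidates(spectrum, hypothetical))
```

As the abstract itself cautions, a top-ranked candidate from any such search is only a hypothesis that still needs confirmation against a standard compound.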
Deborde, Catherine; Jacob, Daniel
Plant primary metabolites are organic compounds that are common to all or most plant species and are essential for plant growth, development, and reproduction. They are intermediates and products of metabolism involved in photosynthesis and other biosynthetic processes. Primary metabolites belong to different compound families, mainly carbohydrates, organic acids, amino acids, nucleotides, fatty acids, steroids, or lipids. Until recently, unlike the Human Metabolome Database (http://www.hmdb.ca) dedicated to human metabolism, there was no centralized database or repository dedicated exclusively to the plant kingdom that contained information on metabolites and their concentrations in a detailed experimental context. MeRy-B is the first platform for plant ¹H-NMR metabolomic profiles (MeRy-B, http://bit.ly/meryb), designed to provide a knowledge base of curated plant profiles and metabolites obtained by NMR, together with the corresponding experimental and analytical metadata. MeRy-B contains lists of plant metabolites, mostly primary metabolites and unknown compounds, with information about experimental conditions, the factors studied, and metabolite concentrations for 19 different plant species (Arabidopsis, broccoli, daphne, grape, maize, barrel clover, melon, Ostreococcus tauri, palm date, palm tree, peach, pine tree, eucalyptus, plantain, rice, strawberry, sugar beet, tomato, vanilla), compiled from more than 2,300 annotated NMR profiles for various organs or tissues deposited by 30 different private or public contributors in September 2013. Currently, about half of the data deposited in MeRy-B is publicly available. In this chapter, readers will be shown how to (1) navigate through and retrieve data of publicly available projects on MeRy-B website; (2) visualize lists of experimentally identified metabolites and their concentrations in all plant species present in MeRy-B; (3) get primary metabolite list for a particular plant species in MeRy-B; and for a
Producing production-quality information systems from conceptual descriptions is a time-consuming process that employs many of the world's programmers. Although most of this programming is fairly routine, the process has not been amenable to simple automation because conceptual models do not provide sufficient parameters to make all the implementation decisions that are required, and numerous special cases arise in practice. Most commercial CASE tools address these problems by essentially implementing a waterfall model in which the development proceeds from analysis through design, layout and coding phases in a partially automated manner, but the analyst/programmer must heavily edit each intermediate stage. This paper demonstrates that by recognising the nature of information systems, it is possible to specify applications completely using a conceptual model that has been annotated with additional parameters that guide automated implementation. More importantly, it will be argued that a manageable number of annotations are sufficient to implement realistic applications, and techniques will be described that enabled the author's commercial CASE tool, the Intelligent Developer, to automate implementation without requiring complex theorem proving technology.
Lipopolysaccharides (LPSs), as MAMP molecules, trigger the activation of signal transduction pathways involved in defence. Currently, plant metabolomics is providing new dimensions into understanding the intracellular adaptive responses to external stimuli. The effect of LPS on the metabolomes of Arabidopsis thaliana cells and leaf tissue was investigated over a 24 h period. Cellular metabolites and those secreted into the medium were extracted with methanol, and liquid chromatography coupled to mass spectrometry was used for quantitative and qualitative analyses. Multivariate statistical data analyses were used to extract interpretable information from the generated multidimensional LC-MS data. The results show that LPS perception triggered differential changes in the metabolomes of cells and leaves, leading to variation in the biosynthesis of specialised secondary metabolites. Time-dependent changes in metabolite profiles were observed and biomarkers associated with the LPS-induced response were tentatively identified. These include the phytohormones salicylic acid and jasmonic acid, and also the associated methyl esters and sugar conjugates. The induced defensive state resulted in increases in indole and other glucosinolates, indole derivatives, camalexin as well as cinnamic acid derivatives and other phenylpropanoids. These annotated metabolites indicate dynamic reprogramming of metabolic pathways that are functionally related towards creating an enhanced defensive capacity. The results reveal new insights into the mode of action of LPS as an activator of plant innate immunity, broaden knowledge about the defence metabolite pathways involved in Arabidopsis responses to LPS, and identify specialised metabolites of functional importance that can be employed to enhance immunity against pathogen infection.
In this article, we will provide a description of metabolomics in comparison with other, better known “omics” disciplines such as genomics and proteomics. In addition, we will review the current rationale for the implementation of metabolomics in cardiology, its basic methodology and the available data from human studies in this discipline. The topics covered will delineate the importance of being able to use the metabolomic information to understand the mechanisms of diseases from the perspective of systems biology, and as a non-invasive approach to the diagnosis, grading and treatment of cardiovascular diseases.
Tan, S Z; Begley, P; Mullard, G; Hollywood, K A; Bishop, P N
Metabolomics is the study of endogenous and exogenous metabolites in biological systems, which aims to provide comparative semi-quantitative information about all metabolites in the system. Metabolomics is an emerging and potentially powerful tool in ophthalmology research. It is therefore important for health professionals and researchers involved in the speciality to understand the basic principles of metabolomics experiments. This article provides an overview of the experimental workflow and examples of its use in ophthalmology research from the study of disease metabolism and pathogenesis to identification of biomarkers. PMID:26987591
Thomas, Funmilola Clara; Mudaliar, Manikhandan; Tassi, Riccardo; McNeilly, Tom N; Burchmore, Richard; Burgess, Karl; Herzyk, Pawel; Zadoks, Ruth N; Eckersall, P David
Intramammary infection leading to bovine mastitis is the leading disease problem affecting dairy cows and has marked effects on the milk produced by infected udder quarters. An experimental model of Streptococcus uberis mastitis has previously been investigated for clinical, immunological and pathophysiological alteration in milk, and has been the subject of peptidomic and quantitative proteomic investigation. The same sample set has now been investigated with a metabolomics approach using liquid chromatography and mass spectrometry. The analysis revealed over 3000 chromatographic peaks, of which 690 were putatively annotated with a metabolite. Hierarchical clustering analysis and principal component analysis demonstrated that metabolite changes due to S. uberis infection were maximal at 81 hours post challenge with metabolites in the milk from the resolution phase at 312 hours post challenge being closest to the pre-challenge samples. Metabolic pathway analysis revealed that the majority of the metabolites mapped to carbohydrate and nucleotide metabolism show a decreasing trend in concentration up to 81 hours post-challenge whereas an increasing trend was found in lipid metabolites and di-, tri- and tetra-peptides up to the same time point. The increase in these peptides coincides with an increase in larger peptides found in the previous peptidomic analysis and is likely to be due to protease degradation of milk proteins. Components of bile acid metabolism, linked to the FXR pathway regulating inflammation, were also increased. Metabolomic analysis of the response in milk during mastitis provides an essential component to the full understanding of the mammary gland's response to infection.
Helms, J Bernd; Kaloyanova, Dora V; Strating, Jeroen R P; van Hellemond, Jaap J; van der Schaar, Hilde M; Tielens, Aloysius G M; van Kuppeveld, Frank J M; Brouwers, Jos F
The hydrophobic molecules of the metabolome - also named the lipidome - constitute a major part of the entire metabolome. Novel technologies show the existence of a staggering number of individual lipid species, the biological functions of which are, with the exception of only a few lipid species, unknown. Much can be learned from pathogens that have evolved to take advantage of the complexity of the lipidome to escape the immune system of the host organism and to allow their survival and replication. Different types of pathogens target different lipids as shown in interaction maps, allowing visualization of differences between different types of pathogens. Bacterial and viral pathogens target predominantly structural and signaling lipids to alter the cellular phenotype of the host cell. Fungal and parasitic pathogens have complex lipidomes themselves and target predominantly the release of polyunsaturated fatty acids from the host cell lipidome, resulting in the generation of eicosanoids by either the host cell or the pathogen. Thus, whereas viruses and bacteria induce predominantly alterations in lipid metabolites at the host cell level, eukaryotic pathogens focus on interference with lipid metabolites affecting systemic inflammatory reactions that are part of the immune system. A better understanding of the interplay between host-pathogen interactions will not only help elucidate the fundamental role of lipid species in cellular physiology, but will also aid in the generation of novel therapeutic drugs. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Schramm, Katharina; Adamski, Jerzy; Gieger, Christian; Herder, Christian; Carstensen, Maren; Peters, Annette; Rathmann, Wolfgang; Roden, Michael; Strauch, Konstantin; Suhre, Karsten; Kastenmüller, Gabi; Prokisch, Holger; Theis, Fabian J.
Biological systems consist of multiple organizational levels, all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the ‘human blood metabolome-transcriptome interface’ (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated with intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease. PMID:26086077
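A correlation-based transcript-metabolite network of the kind described above can be sketched in a few lines. The abstract does not state which correlation measure or significance procedure was used, so everything here is an assumption for illustration: Pearson correlation, a plain cut-off on |r| (`min_abs_r`) standing in for a proper significance test with multiple-testing correction, and the hypothetical function names `pearson` and `association_edges`.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation signal
    return sxy / sqrt(sxx * syy)


def association_edges(transcripts: Dict[str, List[float]],
                      metabolites: Dict[str, List[float]],
                      min_abs_r: float = 0.8) -> List[Tuple[str, str, float]]:
    """Return (transcript, metabolite, r) for pairs with |r| >= min_abs_r.

    The |r| cut-off is an assumed stand-in for a real significance test.
    """
    edges = []
    for t_name, t_values in transcripts.items():
        for m_name, m_values in metabolites.items():
            r = pearson(t_values, m_values)
            if abs(r) >= min_abs_r:
                edges.append((t_name, m_name, r))
    return edges


if __name__ == "__main__":
    toy_transcripts = {"T1": [1, 2, 3, 4], "T2": [1, -1, 1, -1]}
    toy_metabolites = {"M1": [2, 4, 6, 8], "M2": [4, 3, 2, 1]}
    for edge in association_edges(toy_transcripts, toy_metabolites):
        print(edge)
```

The resulting edge list is the raw material for a bipartite network; with hundreds of transcripts and metabolites, a corrected p-value threshold would be needed in place of the fixed cut-off.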
Smrithi Sugumaran Menon
Human exposure to ionizing radiation disrupts normal metabolic processes in cells and organs by inducing complex biological responses that interfere with gene and protein expression. Conventional dosimetry, monitoring of prodromal symptoms and peripheral lymphocyte counts are of limited value as organ and tissue specific biomarkers for personnel exposed to radiation, particularly, weeks or months after exposure. Analysis of metabolites generated in known stress-responsive pathways by molecular profiling helps to predict the physiological status of an individual in response to environmental or genetic perturbations. Thus, a multi-metabolite profile obtained from a high resolution mass spectrometry-based metabolomics platform offers potential for identification of robust biomarkers to predict radiation toxicity of organs and tissues resulting from exposures to therapeutic or non-therapeutic ionizing radiation. Here, we review the status of radiation metabolomics and explore applications as a standalone technology, as well as its integration in systems biology, to facilitate a better understanding of the molecular basis of radiation response. Finally, we draw attention to the identification of specific pathways that can be targeted for the development of therapeutics to alleviate or mitigate harmful effects of radiation exposure.
Vertes, Akos [George Washington Univ., Washington, DC (United States)]
Small molecules constitute a large part of the world around us, including fossil and some renewable energy sources. Solar energy harvested by plants and bacteria is converted into energy rich small molecules on a massive scale. Some of the worst contaminants of the environment and compounds of interest for national security also fall in the category of small molecules. The development of large scale metabolomic analysis methods lags behind the state of the art established for genomics and proteomics. This is commonly attributed to the diversity of molecular classes included in a metabolome. Unlike nucleic acids and proteins, metabolites do not have standard building blocks, and, as a result, their molecular properties exhibit a wide spectrum. This impedes the development of dedicated separation and spectroscopic methods. Mass spectrometry (MS) is a strong contender in the quest for a quantitative analytical tool with extensive metabolite coverage. Although various MS-based techniques are emerging for metabolomics, many of these approaches include extensive sample preparation that makes large scale studies resource intensive and slow. New ionization methods are redefining the range of analytical problems that can be solved using MS. This project developed new approaches for the direct analysis of small molecules in unprocessed samples, and pushed the limits of ultratrace analysis in volume limited complex samples. The projects resulted in techniques that enabled metabolomics investigations with enhanced molecular coverage, as well as the study of cellular response to stimuli on a single cell level. Effectively, individual cells became reaction vessels, where we followed the response of a complex biological system to external perturbation. We established two new analytical platforms for the direct study of metabolic changes in cells and tissues following external perturbation. For this purpose we developed a novel technique, laser ablation electrospray
Ethylene regulates a myriad physiological and biochemical processes in ripening fruits and is accepted as the ripening hormone for the climacteric fruits. However, its effects on metabolome and resulting fruit quality are not yet fully understood, particularly when some of the ripening-associated bi...
Tan, Shi Z; Mullard, Graham; Hollywood, Katherine A; Dunn, Warwick B; Bishop, Paul N
Time-dependent post-mortem biochemical changes have been demonstrated in donor cornea and vitreous, but there have been no published studies to date that objectively measure post-mortem changes in the retinal metabolome over time. The aim of the study was firstly, to investigate post-mortem, time-dependent changes in the rat retinal metabolome and secondly, to compare the metabolite composition of healthy rat ocular tissues. To study post-mortem changes in the rat retinal metabolome, globes were enucleated and stored at 4 °C and sampled at 0, 2, 4, 8, 24 and 48 h post-mortem. To study the metabolite composition of rat ocular tissues, eyes were dissected immediately after culling to isolate the cornea, lens, vitreous and retina, prior to storing at -80 °C. Tissue extracts were subjected to Gas Chromatography-Mass Spectrometry (GC-MS) and Ultra High Performance Liquid Chromatography-Mass Spectrometry (UHPLC-MS). Generally, the metabolic composition of the retina was stable for 8 h post-mortem when eyes were stored at 4 °C, but showed increasing changes thereafter. However, some more rapid changes were observed, such as increases in TCA cycle metabolites after 2 h post-mortem, whereas some metabolites such as fatty acids only showed decreases in concentration from 24 h. A total of 42 metabolites were identified across the ocular tissues by GC-MS (MSI level 1) and 2782 metabolites were annotated by UHPLC-MS (MSI level 2) according to MSI reporting standards. Many of the metabolites detected were common to all of the tissues but some metabolites showed partitioning between different ocular structures with 655, 297, 93 and 13 metabolites being uniquely detected in the retina, lens, cornea and vitreous respectively. Only a small percentage (1.6%) of metabolites found in the vitreous were only detected in the retina and not other tissues. In conclusion, mass spectrometry-based techniques have been used for the first time to compare the metabolic composition of
Basu, Siddhartha; Fey, Petra; Jimenez-Morales, David; Dodson, Robert J; Chisholm, Rex L
dictyBase is the model organism database for the social amoeba Dictyostelium discoideum and related species. The primary mission of dictyBase is to provide the biomedical research community with well-integrated high quality data, and tools that enable original research. Data presented at dictyBase is obtained from sequencing centers, groups performing high throughput experiments such as large-scale mutagenesis studies, and RNAseq data, as well as a growing number of manually added functional gene annotations from the published literature, including Gene Ontology, strain, and phenotype annotations. Through the Dicty Stock Center we provide the community with an impressive number of annotated strains and plasmids. Recently, dictyBase accomplished a major overhaul to adapt an outdated infrastructure to the current technological advances, thus facilitating the implementation of innovative tools and comparative genomics. It also provides new strategies for high quality annotations that enable bench researchers to benefit from the rapidly increasing volume of available data. dictyBase is highly responsive to its users' needs, building a successful relationship that capitalizes on the vast efforts of the Dictyostelium research community. dictyBase has become the trusted data resource for Dictyostelium investigators, other investigators or organizations seeking information about Dictyostelium, as well as educators who use this model system. © 2015 Wiley Periodicals, Inc.
Anderson, C.D.; McDougall, G.H.G.
This document is an updated and expanded version of an earlier annotated bibliography by Dr. C. Dennis Anderson and Carman Cullen (A Review and Annotation of Energy Research on Consumers, March 1978). It is the final draft of the major report that will be published in English and French and made publicly available through the Consumer Research and Evaluation Branch of Consumer and Corporate Affairs, Canada. Two agencies granting permission to include some of their energy abstracts are the Rand Corporation and the DOE Technical Information Center. The bibliography consists mainly of empirical studies, including surveys and experiments. It also includes a number of descriptive and econometric studies that utilize secondary data. Many of the studies provide summaries of research in specific areas, and point out directions for future research efforts. 14 tables.
McCauley, Stephen; de Groot, Saskia; Mailund, Thomas
Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra......- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping...... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...
Nawrocki, Eric P
Many different types of functional non-coding RNAs participate in a wide range of important cellular functions but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence, and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.
An annotated summary of 204 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified, as are the materials deburred.
Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori
Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P
Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.
Background Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes, which consists of detecting genes' position and structure and inferring their function (as well as that of other genomic features). Structural and functional annotation both require the complex chaining of numerous different software tools, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage the huge amounts of data released by sequencing projects. Several pipelines already automate some of this complex chaining but still require a substantial contribution from biologists to supervise and control the results at various steps. Results Here we propose an innovative automated platform, FIGENIX, which includes an expert system capable of substituting for human expertise at several key steps. FIGENIX currently automates complex pipelines of structural and functional annotation under the supervision of the expert system (which can, for example, make key decisions, check intermediate results or refine the dataset). The quality of the results produced by FIGENIX is comparable to that obtained by expert biologists, with a drastic gain in terms of time costs and avoidance of errors due to the human manipulation of data. Conclusion The core engine and expert system of the FIGENIX platform currently handle complex annotation processes of broad interest for the genomic community. They could be easily adapted to new or more specialized pipelines, such as the annotation of miRNAs, the classification of complex multigenic families, and the annotation of regulatory elements and other genomic features of interest.
Nicole L Washington
Scientists and clinicians who study genetic alterations and disease have traditionally described phenotypes in natural language. The considerable variation in these free-text descriptions has posed a hindrance to the important task of identifying candidate genes and models for human diseases and indicates the need for a computationally tractable method to mine data resources for mutant phenotypes. In this study, we tested the hypothesis that ontological annotation of disease phenotypes will facilitate the discovery of new genotype-phenotype relationships within and across species. To describe phenotypes using ontologies, we used an Entity-Quality (EQ) methodology, wherein the affected entity (E) and how it is affected (Q) are recorded using terms from a variety of ontologies. Using this EQ method, we annotated the phenotypes of 11 gene-linked human diseases described in Online Mendelian Inheritance in Man (OMIM). These human annotations were loaded into our Ontology-Based Database (OBD) along with other ontology-based phenotype descriptions of mutants from various model organism databases. Phenotypes recorded with this EQ method can be computationally compared based on the hierarchy of terms in the ontologies and the frequency of annotation. We utilized four similarity metrics to compare phenotypes and developed an ontology of homologous and analogous anatomical structures to compare phenotypes between species. Using these tools, we demonstrate that we can identify, through the similarity of the recorded phenotypes, other alleles of the same gene, other members of a signaling pathway, and orthologous genes and pathway members across species. We conclude that EQ-based annotation of phenotypes, in conjunction with a cross-species ontology, and a variety of similarity metrics can identify biologically meaningful similarities between genes by comparing phenotypes alone. This annotation and search method provides a novel and efficient means to identify
In systems verification we are often concerned with multiple, inter-dependent properties that a program must satisfy. To prove that a program satisfies a given property, the correctness of intermediate states of the program must be characterized. However, this intermediate reasoning is not always phrased such that it can be easily re-used in the proofs of subsequent properties. We introduce a function annotation logic that extends Hoare logic in two important ways: (1) when proving that a function satisfies a Hoare triple, intermediate reasoning is automatically stored as function annotations, and (2) these function annotations can be exploited in future Hoare logic proofs. This reduces duplication of reasoning between the proofs of different properties, whilst serving as a drop-in replacement for traditional Hoare logic to avoid the costly process of proof refactoring. We explain how this was implemented in Isabelle/HOL and applied to an experimental branch of the seL4 microkernel to significantly reduce the size and complexity of existing proofs.
Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N
Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.
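The interval-tree lookup mentioned in the abstract above can be illustrated with a minimal sketch. This is not Jannovar's implementation (which is written in Java); it is a generic centered interval tree in Python, and the transcript identifiers and coordinates are invented for illustration. Coordinates are assumed to be closed (both ends inclusive).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (start, end, transcript_id) with closed coordinates; all values are hypothetical.
Interval = Tuple[int, int, str]


@dataclass
class Node:
    center: int
    by_start: List[Interval]  # intervals containing center, ascending start
    by_end: List[Interval]    # the same intervals, descending end
    left: Optional["Node"]
    right: Optional["Node"]


def build(intervals: List[Interval]) -> Optional[Node]:
    """Build a centered interval tree from a list of intervals."""
    if not intervals:
        return None
    # The center is an actual endpoint, so at least one interval contains it
    # and both recursive calls receive strictly fewer intervals.
    points = sorted(p for s, e, _ in intervals for p in (s, e))
    center = points[len(points) // 2]
    here = [iv for iv in intervals if iv[0] <= center <= iv[1]]
    left = [iv for iv in intervals if iv[1] < center]
    right = [iv for iv in intervals if iv[0] > center]
    return Node(
        center,
        sorted(here, key=lambda iv: iv[0]),
        sorted(here, key=lambda iv: -iv[1]),
        build(left),
        build(right),
    )


def query(node: Optional[Node], pos: int) -> List[str]:
    """Return the sorted ids of all intervals that contain position pos."""
    hits: List[str] = []
    while node is not None:
        if pos < node.center:
            # Every interval here ends at or after center, so only start matters.
            for iv in node.by_start:
                if iv[0] > pos:
                    break
                hits.append(iv[2])
            node = node.left
        elif pos > node.center:
            # Every interval here starts at or before center, so only end matters.
            for iv in node.by_end:
                if iv[1] < pos:
                    break
                hits.append(iv[2])
            node = node.right
        else:
            hits.extend(iv[2] for iv in node.by_start)
            break
    return sorted(hits)


transcripts = [(100, 500, "tx1"), (300, 800, "tx2"), (900, 1200, "tx3")]
tree = build(transcripts)
print(query(tree, 400))  # transcripts overlapping a variant at position 400
```

The tree is built once per chromosome and then queried once per variant, so each lookup visits one root-to-leaf path plus the matching intervals instead of scanning every transcript.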
Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert
As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advances in high-throughput gene expression microarray technology have inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access of data from many experiments have caused inconsistencies which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to the breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to index breast cancer microarray samples from GEO using BCM-CO and MGED ontology and developed a prototype system with web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108
Koek, M.M.; Muilwijk, B.; Werf, M.J. van der; Hankemeier, T.
An analytical method was set up suitable for the analysis of microbial metabolomes, consisting of an oximation and silylation derivatization reaction and subsequent analysis by gas chromatography coupled to mass spectrometry. Microbial matrixes contain many compounds that potentially interfere with
Matsumoto, Mitsuharu; Kibe, Ryoko; Ooga, Takushi; Aiba, Yuji; Kurihara, Shin; Sawaki, Emiko; Koga, Yasuhiro; Benno, Yoshimi
Low–molecular-weight metabolites produced by intestinal microbiota play a direct role in health and disease. In this study, we analyzed the colonic luminal metabolome using capillary electrophoresis mass spectrometry with time-of-flight (CE-TOFMS) —a novel technique for analyzing and differentially displaying metabolic profiles— in order to clarify the metabolite profiles in the intestinal lumen. CE-TOFMS identified 179 metabolites from the colonic luminal metabolome and 48 metabolites were present in significantly higher concentrations and/or incidence in the germ-free (GF) mice than in the Ex-GF mice (p metabolome and a comprehensive understanding of intestinal luminal metabolome is critical for clarifying host-intestinal bacterial interactions. PMID:22724057
the link between high throughput metabolomics data generated on different analytical platforms, discover important metabolites deriving from the digestion processes in the gut, and automate metabolic pathway discovery from mass spectrometry. PLS (partial least squares) based chemometric methods were...
Richard D. Beger
Cancer is a devastating disease that alters the metabolism of a cell and the surrounding milieu. Metabolomics is a growing and powerful technology capable of detecting hundreds to thousands of metabolites in tissues and biofluids. The recent advances in metabolomics technologies have enabled a deeper investigation into the metabolism of cancer and a better understanding of how cancer cells use glycolysis, known as the “Warburg effect,” advantageously to produce the amino acids, nucleotides and lipids necessary for tumor proliferation and vascularization. Currently, metabolomics research is being used to discover diagnostic cancer biomarkers in the clinic, to better understand its complex heterogeneous nature, to discover pathways involved in cancer that could be used for new targets and to monitor metabolic biomarkers during therapeutic intervention. These metabolomics approaches may also provide clues to personalized cancer treatments by providing useful information to the clinician about the cancer patient’s response to medical interventions.
Several fundamental requirements must be met so that NMR-based metabolomics and the related technique of metabonomics can be formally adopted into environmental monitoring and chemical risk assessment. Here we report an intercomparison exercise which has evaluated the effectivene...
Metabolomics datasets, by definition, comprise measurements of large numbers of metabolites. Both technical (analytical) and biological factors will induce variation within these measurements that is not consistent across all metabolites. Consequently, criteria are required to...
Zheng, Hong; Yde, Christian C; Arnberg, Karina
The plasma and urine metabolome of 192 overweight 12-15-year-old adolescents (BMI of 25.4 ± 2.3 kg/m(2)) were examined in order to elucidate gender, pubertal development measured as Tanner stage, physical activity measured as number of steps taken daily, and intra-/interindividual differences...... and the metabolome could be identified. The present study for the first time provides comprehensive information about associations between the metabolome and gender, pubertal development, and physical activity in overweight adolescents, which is an important subject group to approach in the prevention of obesity...... affecting the metabolome detected by proton NMR spectroscopy. Higher urinary excretion of citrate, creatinine, hippurate, and phenylacetylglutamine and higher plasma level of phosphatidylcholine and unsaturated lipid were found for girls compared with boys. The results suggest that gender differences...
Pedersen, Helle Krogh; Gudmundsdottir, Valborg; Nielsen, Henrik Bjørn
Insulin resistance is a forerunner state of ischaemic cardiovascular disease and type 2 diabetes. Here we show how the human gut microbiome impacts the serum metabolome and associates with insulin resistance in 277 non-diabetic Danish individuals. The serum metabolome of insulin-resistant individuals is characterized by increased levels of branched-chain amino acids (BCAAs), which correlate with a gut microbiome that has an enriched biosynthetic potential for BCAAs and is deprived of genes encoding bacterial inward transporters for these amino acids. Prevotella copri and Bacteroides vulgatus...
U.S. Environmental Protection Agency — GC/MS data from the metabolomic profiling of green frog livers after exposure to pesticides and their mixtures. This dataset is associated with the following...
Recent advances in the fields of digital photography, networking and computing have made it easier than ever for users to store and share photographs. However, without sufficient metadata, e.g., in the form of tags, photos are difficult to find and organize. In this paper, we describe a system that recommends tags for image annotation. We postulate that the use of low-level global visual features can improve the quality of the tag recommendation process when compared to a baseline statistical method based on tag co-occurrence. We present results from experiments conducted using photos and metadata sourced from the Flickr photo website that suggest that the use of visual features improves the mean average precision (MAP) of the system and increases the system's ability to suggest different tags, therefore justifying the associated increase in complexity.
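For readers unfamiliar with the mean average precision (MAP) metric cited in the abstract above, the following is a minimal sketch of one common definition, applied to ranked tag recommendations. The tags and photos are invented, and normalising by the number of relevant tags is an assumption here; the paper may use a different convention (for example, a cutoff at a fixed rank).

```python
from typing import Iterable, List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank where a relevant tag appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, tag in enumerate(ranked, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[List[str], Set[str]]]) -> float:
    """Mean of the per-photo average precision values."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical recommendations for two photos: (ranked suggestions, true tags).
runs = [
    (["beach", "sea", "dog", "sand"], {"beach", "sand"}),
    (["cat", "city"], {"city"}),
]
print(mean_average_precision(runs))
```

Because precision is accumulated only at ranks where a correct tag appears, a system that places correct tags earlier in the list scores higher even when the set of suggested tags is the same.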
Scalbert, Augustin; Brennan, Lorraine; Manach, Claudine; Andres-Lacueva, Cristina; Dragsted, Lars O; Draper, John; Rappaport, Stephen M; van der Hooft, Justin J J; Wishart, David S
The food metabolome is defined as the part of the human metabolome directly derived from the digestion and biotransformation of foods and their constituents. With >25,000 compounds known in various foods, the food metabolome is extremely complex, with a composition varying widely according to the diet. By its very nature it represents a considerable and still largely unexploited source of novel dietary biomarkers that could be used to measure dietary exposures with a high level of detail and precision. Most dietary biomarkers currently have been identified on the basis of our knowledge of food compositions by using hypothesis-driven approaches. However, the rapid development of metabolomics resulting from the development of highly sensitive modern analytic instruments, the availability of metabolite databases, and progress in (bio)informatics has made agnostic approaches more attractive as shown by the recent identification of novel biomarkers of intakes for fruit, vegetables, beverages, meats, or complex diets. Moreover, examples also show how the scrutiny of the food metabolome can lead to the discovery of bioactive molecules and dietary factors associated with diseases. However, researchers still face hurdles, which slow progress and need to be resolved to bring this emerging field of research to maturity. These limits were discussed during the First International Workshop on the Food Metabolome held in Glasgow. Key recommendations made during the workshop included more coordination of efforts; development of new databases, software tools, and chemical libraries for the food metabolome; and shared repositories of metabolomic data. Once achieved, major progress can be expected toward a better understanding of the complex interactions between diet and human health. © 2014 American Society for Nutrition.
Metabolomics is the comprehensive assessment of low molecular weight organic metabolites within a biological system. The identification and characterization of several chemical species, or metabolic fingerprinting, is an emergent approach in the metabolomics field that provides a valuable “snapshot” of metabolic profiles. This approach is finding an increasing number of applications in many areas including cancer research, drug discovery and food science. The combined use of NMR spectroscopy, data ...
Metabolomics, the multi-targeted analysis of endogenous metabolites from biological samples, can be efficiently applied to screen disease biomarkers and investigate pathophysiological processes. Metabolites change rapidly in response to physiological perturbations, making them the closest link to disease phenotypes. This study explored the role of metabolomics in gaining mechanistic insight into disease processes and in searching for novel biomarkers of human diseases.
Fan, Teresa W-M.; Lorkiewicz, Pawel; Sellers, Katherine; Moseley, Hunter N.B.; Higashi, Richard M.; Lane, Andrew N.
Advances in analytical methodologies, principally nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS), during the last decade have made large-scale analysis of the human metabolome a reality. This is leading to the reawakening of the importance of metabolism in human diseases, particularly cancer. The metabolome is the functional readout of the genome, functional genome, and proteome; it is also an integral partner in molecular regulations for homeostasis. The interrogation of the metabolome, or metabolomics, is now being applied to numerous diseases, largely by metabolite profiling for biomarker discovery, but also in pharmacology and therapeutics. Recent advances in stable isotope tracer-based metabolomic approaches enable unambiguous tracking of individual atoms through compartmentalized metabolic networks directly in human subjects, which promises to decipher the complexity of the human metabolome at an unprecedented pace. This knowledge will revolutionize our understanding of complex human diseases, clinical diagnostics, as well as individualized therapeutics and drug response. In this review, we focus on the use of stable isotope tracers with metabolomics technologies for understanding metabolic network dynamics in both model systems and in clinical applications. Atom-resolved isotope tracing via the two major analytical platforms, NMR and MS, has the power to determine novel metabolic reprogramming in diseases, discover new drug targets, and facilitate ADME studies. We also illustrate new metabolic tracer-based imaging technologies, which enable direct visualization of metabolic processes in vivo. We further outline current practices and future requirements for biochemoinformatics development, which is an integral part of translating stable isotope-resolved metabolomics into clinical reality. PMID:22212615
Newgard, Christopher B
Metabolomics, or the comprehensive profiling of small molecule metabolites in cells, tissues, or whole organisms, has undergone a rapid technological evolution in the past two decades. These advances have led to the application of metabolomics to defining predictive biomarkers for incident cardiometabolic diseases and, increasingly, as a blueprint for understanding those diseases' pathophysiologic mechanisms. Progress in this area and challenges for the future are reviewed here. Copyright © 2017 Elsevier Inc. All rights reserved.
Angelica Dessì; Vassilios Fanos
Pediatric obesity represents an important health issue. In recent years applications of metabolomics have led to evaluation of responses to the various nutrients (nutrigenomics), in particular to lipids, in diseases such as obesity and diabetes. The experimental data and the studies in pediatrics that evaluated the metabolic condition in infant obesity are presented. It is thus to be hoped that future progress in connection with this new technique, together with a metabolomic study of mother...
Alessandro M. Varani
The Xylella fastidiosa comparative genomic database is a scientific resource with the aim to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high quality genomic annotation, the functional and comparative genomic analysis and the identification and mapping of prophage-like elements. It is available from web site http://www.xylella.lncc.br.
Ding, Zhaotang; Jia, Sisi; Wang, Yu; Xiao, Jun; Zhang, Yinfei
In order to study the response of tea plants to P stress, we conducted ionomic and metabolomic analyses by ICP-OES, GC-MS and LC-MS. The results demonstrated that P was antagonistic with S, and was cooperative with Cu, Zn, Mn and Fe under P-deficiency. However, P was antagonistic with Mn, Fe and S, and was cooperative with Cu and Zn under P-excess. Moreover, P-deficiency or excess reduced the syntheses of flavonoids and phosphorylated metabolites. P-deficiency decreased the amount of glutamate and increased the content of glutamine, while P-excess decreased the content of glutamine. Besides, P-deficiency increased three organic acids and decreased three organic acids. P-excess increased the contents of malic acid, oxalic acid, ribonic acid and other compounds involved in primary metabolism, but decreased the contents of p-coumaric acid and indoleacrylic acid, which are related to secondary metabolism. Furthermore, the contents of Mn and Zn were found to be positively related to the amounts of myricetin and quercetin, and the content of Mn to be positively related to the amount of arabinose. The results implied that the P stresses severely disturbed the metabolism of minerals and metabolites in tea plants, which influenced the yield and quality of tea. Copyright © 2017. Published by Elsevier Masson SAS.
Xiao, Chaoni; Wu, Man; Chen, Yongyong; Zhang, Yajun; Zhao, Xinfeng; Zheng, Xiaohui
The distribution of metabolites in the different root parts of Cortex Moutan (the root bark of Paeonia suffruticosa Andrews) is not well understood; therefore, scientific evidence is not available for quality assessment of Cortex Moutan. To reveal metabolomic variations in Cortex Moutan in order to gain deeper insights to enable quality control. Metabolomic variations in the different root parts of Cortex Moutan were characterised using high-performance liquid chromatography combined with mass spectrometry (HPLC-MS) and multivariate data analysis. The discriminating metabolites in different root parts were evaluated by the one-way analysis of variance and a fold change parameter. The metabolite profiles of Cortex Moutan were largely dominated by five primary and 41 secondary metabolites. Higher levels of malic acid, gallic acid and mudanoside-B were mainly observed in the second lateral roots, whereas dihydroxyacetophenone, benzoyloxypaeoniflorin, suffruticoside-A, kaempferol dihexoside, mudanpioside E and mudanpioside J accumulated in the first lateral and axial roots. The highest contents of paeonol, galloyloxypaeoniflorin and procyanidin B were detected in the axial roots. Accordingly, metabolite compositions of Cortex Moutan were found to vary among different root parts. The axial roots have higher quality than the lateral roots in Cortex Moutan due to the accumulation of bioactive secondary metabolites associated with plant physiology. These findings provided important scientific evidence for grading Cortex Moutan on the general market. Copyright © 2014 John Wiley & Sons, Ltd.
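The two screening statistics named in the abstract above, a fold change and a one-way analysis of variance, can be sketched as follows. The intensity values and group names are invented for illustration, the fold change is assumed to be a simple ratio of group means, and only the F statistic is computed; converting it to a p-value requires the F distribution, which is not part of the Python standard library.

```python
from statistics import mean
from typing import List, Sequence


def fold_change(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Ratio of the mean intensity in group_a to the mean intensity in group_b."""
    return mean(group_a) / mean(group_b)


def anova_f(groups: List[Sequence[float]]) -> float:
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    grand = mean(all_values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical peak intensities of one metabolite in three root parts.
axial = [4.0, 5.0, 6.0]
first_lateral = [2.0, 3.0, 4.0]
second_lateral = [1.0, 2.0, 3.0]

print(fold_change(axial, second_lateral))
print(anova_f([axial, first_lateral, second_lateral]))
```

In a screening workflow of this kind, a metabolite is typically flagged as discriminating only when both the fold change and the ANOVA result pass chosen thresholds; the thresholds used in the study are not stated in the abstract.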
Luque de Castro, M D; Delgado-Povedano, M M
Metabolomics, one of the most recently emerged "omics", has taken advantage of ultrasound (US) to improve sample preparation (SP) steps. The metabolomics-US assisted SP step binomial has experienced a dissimilar development that has depended on the area (vegetal or animal) and the SP step. Thus, vegetal metabolomics and US assisted leaching have received the greatest attention (encompassing subdisciplines such as metallomics, xenometabolomics and, mainly, lipidomics), but liquid-liquid extraction and (bio)chemical reactions in metabolomics have also taken advantage of US energy. Clinical and animal samples have also benefited from US assisted SP in metabolomics studies, but to a lesser extent. The main effects of US have been shortening of the time required for the given step, and/or increase of its efficiency or availability for automation; nevertheless, attention paid to potential degradation caused by US has been scant or nil. Achievements and weak points of the metabolomics-US assisted SP step binomial are discussed and possible solutions to the present shortcomings are presented. Copyright © 2013 Elsevier B.V. All rights reserved.
Lam, Sin Man; Wang, Yuan; Li, Bowen; Du, Jie; Shui, Guanghou
Metabolomics, which targets the extensive characterization and quantitation of global metabolites from both endogenous and exogenous sources, has emerged as a novel technological avenue to advance the field of precision medicine principally driven by genomics-oriented approaches. In particular, metabolomics has revealed the cardinal roles that the environment exerts in driving the progression of major diseases threatening public health. Herein, the existent and potential applications of metabolomics in two key areas of precision cardiovascular medicine will be critically discussed: 1) the use of metabolomics in unveiling novel disease biomarkers and pathological pathways; 2) the contribution of metabolomics in cardiovascular drug development. Major issues concerning the statistical handling of big data generated by metabolomics, as well as its interpretation, will be briefly addressed. Finally, the need for integration of various omics branches and adopting a multi-omics approach to precision medicine will be discussed. Copyright © 2017 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Zhang, Peixu; Zhang, Weiguanliu; Lang, Yue; Qu, Yan; Chu, Fengna; Chen, Jiafeng; Cui, Li
Tuberculosis meningitis (TBM) is a prevalent form of extra-pulmonary tuberculosis that causes substantial morbidity and mortality. Diagnosis of TBM is difficult because of the limited sensitivity of existing laboratory techniques. A metabolomics approach can be used to investigate the sets of metabolites of both bacteria and host, and has been used to clarify the mechanisms underlying disease development and identify metabolic changes, leading to improved methods for diagnosis, treatment, and prognostication. Mass spectrometry (MS) is a major analysis platform used in metabolomics, and MS-based metabolomics provides wide metabolite coverage, because of its high sensitivity, and is useful for the investigation of Mycobacterium tuberculosis (Mtb) and related diseases. It has been used to investigate TBM diagnosis; however, the processes involved in the MS-based metabolomics approach are complex and flexible, and often consist of several steps, and small changes in the methods used can have a huge impact on the final results. Here, the process of MS-based metabolomics is summarized and its applications in Mtb and Mtb-related diseases discussed. Moreover, the current status of TBM metabolomics is described. Copyright © 2018. Published by Elsevier B.V.
Burnett, N.; Jeffries, J.; Mach, J.; Robson, M.; Pajot, D.; Harrigan, J.; Lebsack, T.; Mullen, D.; Rat, F.; Theys, P.
What is quality? How do you achieve it? How do you keep it once you have got it? The answer for industry at large is the three-step hierarchy of quality control, quality assurance and Total Quality Management. An overview is given of the history of the quality movement, illustrated with examples from Schlumberger operations, as well as the oil industry's approach to quality. An introduction to Schlumberger's quality-associated ClientLink program is presented. 15 figs., 4 ills., 16 refs
Background Magnaporthe oryzae is the causal agent of rice blast, the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6 (http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html). However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of the M. oryzae genome assembly. Methods A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation, a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins
Winnike, Jason H; Busby, Marjorie G; Watkins, Paul B; O'Connell, Thomas M
Background: Although the effects of acute dietary interventions on the human metabolome have been studied, the extent to which the metabolome can be normalized by extended dietary standardization has not yet been examined.
To date, variation in nectar chemistry of flowering plants has not been studied in detail. Such variation exerts considerable influence on pollinator-plant interactions, as well as on flower traits that play important roles in the selection of a plant for visitation by specific pollinators. Over the past 60 years the Aquilegia genus has been used as a key model for speciation studies. In this study, we defined the metabolomic profiles of flower samples of two Aquilegia species, A. canadensis and A. pubescens. We identified a total of 75 metabolites that were classified into six main categories: organic acids, fatty acids, amino acids, esters, sugars, and unknowns. The mean abundances of 25 of these metabolites were significantly different between the two species, providing insights into interspecies variation in floral chemistry. Using the PlantSEED biochemistry database, we found that the majority of these metabolites are involved in biosynthetic pathways. Finally, we explored the annotated genome of A. coerulea using the PlantSEED pipeline and reconstructed the metabolic network of Aquilegia. This network, which contains the metabolic pathways involved in generating the observed chemical variation, is now publicly available from the DOE Systems Biology Knowledge Base (KBase; http://kbase.us).
Salvia miltiorrhiza (S. miltiorrhiza) Bunge is broadly used as an herbal medicine for the clinical treatment of cardiovascular and cerebrovascular diseases. Despite its commercial and medicinal value, few systematic studies on the metabolome of S. miltiorrhiza roots have been carried out so far. We systematically described the metabolic profiles of S. miltiorrhiza using high pressure liquid chromatography mass spectrometry (LC/MS) in conjunction with multivariate statistical analyses, aiming to monitor the biological variation of secondary metabolites across three locations and four S. miltiorrhiza genotypes. A total of 40 bioactive constituents were putatively annotated in S. miltiorrhiza root samples. This study found that both the same S. miltiorrhiza genotype growing at three different locations and four S. miltiorrhiza genotypes growing at the same location had significant metabonomic differences, as identified by the principal component analysis (PCA) approach. Using orthogonal projection to latent structure with discriminant analysis (OPLS-DA), 16 and 14 secondary metabolites were identified as potential location-specific and genotype-specific markers in S. miltiorrhiza, respectively. The specificity of LC/MS profiles offered a powerful tool to discriminate S. miltiorrhiza samples according to genotype or location.
Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda
The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis
Mungall Christopher J
Background The Gene Ontology project supports categorization of gene products according to their location of action, the molecular functions that they carry out, and the processes that they are involved in. Although the ontologies are intentionally developed to be taxon neutral, and to cover all species, there are inherent taxon specificities in some branches. For example, the process 'lactation' is specific to mammals and the location 'mitochondrion' is specific to eukaryotes. The lack of an explicit formalization of these constraints can lead to errors and inconsistencies in automated and manual annotation. Results We have formalized the taxonomic constraints implicit in some GO classes, and specified these at various levels in the ontology. We have also developed an inference system that can be used to check for violations of these constraints in annotations. Using the constraints in conjunction with the inference system, we have detected and removed errors in annotations and improved the structure of the ontology. Conclusions Detection of inconsistencies in taxon-specificity enables gradual improvement of the ontologies, the annotations, and the formalized constraints. This is progressively improving the quality of our data. The full system is available for download, and new constraints or proposed changes to constraints can be submitted online at https://sourceforge.net/tracker/?atid=605890&group_id=36855.
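The kind of check described in this abstract can be illustrated with a minimal sketch. It is not the authors' inference system: the toy taxonomy, the `ONLY_IN_TAXON` table, and the function names are all hypothetical, and only the two constraints named in the abstract ('lactation' restricted to mammals, 'mitochondrion' restricted to eukaryotes) are encoded.

```python
# Hypothetical, greatly simplified taxonomy: child taxon -> parent taxon.
PARENT = {
    "Homo sapiens": "Mammalia",
    "Mammalia": "Eukaryota",
    "Saccharomyces cerevisiae": "Eukaryota",
    "Escherichia coli": "Bacteria",
}

# Assumed "only in taxon" constraints, taken from the two examples in the abstract.
ONLY_IN_TAXON = {
    "lactation": "Mammalia",
    "mitochondrion": "Eukaryota",
}


def lineage(taxon):
    """Return the taxon together with all of its ancestors in PARENT."""
    seen = {taxon}
    while taxon in PARENT:
        taxon = PARENT[taxon]
        seen.add(taxon)
    return seen


def violations(annotations):
    """Return the (gene, term, taxon) triples that break a constraint."""
    bad = []
    for gene, term, taxon in annotations:
        required = ONLY_IN_TAXON.get(term)
        if required is not None and required not in lineage(taxon):
            bad.append((gene, term, taxon))
    return bad


if __name__ == "__main__":
    example = [
        ("g1", "lactation", "Homo sapiens"),
        ("g2", "lactation", "Saccharomyces cerevisiae"),
        ("g3", "mitochondrion", "Escherichia coli"),
    ]
    for hit in violations(example):
        print("constraint violated:", hit)
```

A real system would also need "never in taxon" constraints and propagation of constraints down the ontology hierarchy, which this sketch leaves out.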
Bada, Michael; Hunter, Lawrence
A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus have illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities. Copyright © 2010 Elsevier Inc. All rights reserved.
Sato, Mayumi; Miyagi, Atsuko; Yoneyama, Shozo; Gisusi, Seiki; Tokuji, Yoshihiko; Kawai-Yamada, Maki
Maitake mushroom (Grifola frondosa [Dicks.] Gray) is generally cultured using the sawdust of broadleaf trees. The maitake strain Gf433 has high production efficiency and yields high-quality fruiting bodies even when 30% of the birch sawdust in the basal substrate is replaced with conifer sawdust. We performed metabolome analysis to investigate the effect of different cultivation components on the metabolism of Gf433 and Mori52, using CE-MS on their fruiting bodies grown under different cultivation conditions to quantify the levels of amino acids, organic acids, and phosphorylated organic acids. We found that the amino acid and organic acid contents of Gf433 were not affected by the kind of sawdust. However, Gf433 contained more organic acids and fewer amino acids than Mori52, and Gf433 also contained more chitin than Mori52. We believe that these differences in the metabolome contents of the two strains are related to the high production efficiency of Gf433.
Huang, Daisie I; Cronk, Quentin C B
Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D
Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generate annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.
Enright Anton J
Background MicroRNAs (miRNAs) are important regulators of gene expression and have been implicated in development, differentiation and pathogenesis. Hundreds of miRNAs have been discovered in mammalian genomes. Approximately 50% of mammalian miRNAs are expressed from introns of protein-coding genes; the primary transcript (pri-miRNA) is therefore assumed to be the host transcript. However, very little is known about the structure of pri-miRNAs expressed from intergenic regions. Here we annotate transcript boundaries of miRNAs in human, mouse and rat genomes using various transcription features. The 5' end of the pri-miRNA is predicted from transcription start sites, CpG islands and 5' CAGE tags mapped in the upstream flanking region surrounding the precursor miRNA (pre-miRNA). The 3' end of the pri-miRNA is predicted based on the mapping of polyA signals, and supported by cDNA/EST and ditags data. The predicted pri-miRNAs are also analyzed for promoter and insulator-associated regulatory regions. Results We define sets of conserved and non-conserved human, mouse and rat pre-miRNAs using bidirectional BLAST and synteny analysis. Transcription features in their flanking regions are used to demarcate the 5' and 3' boundaries of the pri-miRNAs. The lengths and boundaries of primary transcripts are highly conserved between orthologous miRNAs. A significant fraction of pri-miRNAs have lengths between 1 and 10 kb, with very few introns. We annotate a total of 59 pri-miRNA structures, which include 82 pre-miRNAs. 36 pri-miRNAs are conserved in all 3 species. In total, 18 of the confidently annotated transcripts express more than one pre-miRNA. The upstream regions of 54% of the predicted pri-miRNAs are found to be associated with promoter and insulator regulatory sequences. Conclusion Little is known about the primary transcripts of intergenic miRNAs. Using comparative data, we are able to identify the boundaries of a significant proportion of
Morusiewicz, Linda; Valett, Jon D.
An annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory is given. More than 100 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. All materials have been grouped into eight general subject areas for easy reference: The Software Engineering Laboratory; The Software Engineering Laboratory: Software Development Documents; Software Tools; Software Models; Software Measurement; Technology Evaluations; Ada Technology; and Data Collection. Subject and author indexes further classify these documents by specific topic and individual author.
Analysis of natural product pattern (metabolites; metabolomics) and its formation (pathway; biosynthesis) in plants, especially in non-model or crop plants such as medicinal and aromatic plants (MAPs), is a research field with significant potential for breeders, growers and consumers. Constant and sustainable quality of final MAP products is increasingly important. Polyphenols are among the most important compounds for the antioxidant properties of MAPs and are often, if not identified as the active principle, used as lead compounds in quality assessment of herbal drugs and related preparations (herbal tea, alcoholic extracts, etc.). Therefore, offering an efficient, robust, reliable and fast tool to determine these quality features of MAPs will protect growers, industrial users and consumers from possible fraud.
Juan A. Galarza
In this paper we report the public availability of transcriptome resources for the aposematic wood tiger moth (Parasemia plantaginis). Comprehensive assembly methods, quality statistics, and annotation are provided. This reference transcriptome may serve as a useful resource for investigating functional gene activity in aposematic Lepidopteran species. All data are freely available at the European Nucleotide Archive (http://www.ebi.ac.uk/ena) under study accession number PRJEB14172.
This bibliography is intended to provide objective basic information on science and the environment, and includes books and journal articles that should be understandable to the nonspecialist. The bibliographic entries are arranged alphabetically by author within broad categories such as acid precipitation, air quality, climate change, ecology and ecosystems, energy, environmental assessment, hazardous wastes, law, natural history, ozone depletion, parks and urban issues, public policy, toxic chemicals, and water resources. Most entries have brief annotations.
A growing body of evidence has shown the intimate relationship between metabolomic profiles and insulin resistance (IR) in obese adults, while little is known about childhood obesity. In this review, we searched available papers addressing metabolomic profiles and IR in obese children from inception to February 2016 on MEDLINE, Web of Science, the Cochrane Library, ClinicalTrials.gov, and EMBASE. HOMA-IR was applied as a surrogate marker of IR and related metabolic disorders at both baseline and follow-up. To minimize selection bias, two investigators independently completed this work. After critical selection, 10 studies (including 2,673 participants) were eligible and were evaluated using QUADOMICS for quality assessment. Six of the 10 studies were classified as “high quality.” We then compiled all the metabolites identified in each study and found that amino acid metabolism and lipid metabolism were the main affected metabolic pathways in obese children. Among the identified metabolites, branched-chain amino acids (BCAAs), aromatic amino acids (AAAs), and acylcarnitines were most frequently reported to be associated with IR as biomarkers. Additionally, BCAAs and tyrosine seemed to be relevant to future metabolic risk in the long-term follow-up cohorts, emphasizing the importance of early diagnosis and prevention strategies. Because of the limited scale and design heterogeneity of existing studies, future work might focus on validating the above findings in larger-scale longitudinal studies with elaborate design.
The recent thriving development of biobanks and associated high-throughput phenotyping studies requires the elaboration of large-scale approaches for monitoring biological sample quality and compliance with standard protocols. We present a metabolomic investigation of human blood samples that delineates pitfalls and guidelines for the collection, storage and handling procedures for serum and plasma. A series of eight pre-processing technical parameters is systematically investigated along variable ranges commonly encountered across clinical studies. While metabolic fingerprints, as assessed by nuclear magnetic resonance, are not significantly affected by altered centrifugation parameters or delays between sample pre-processing (blood centrifugation) and storage, our metabolomic investigation highlights that both the delay and storage temperature between blood draw and centrifugation are the primary parameters impacting serum and plasma metabolic profiles. Storing the blood drawn at 4 °C is shown to be a reliable routine to confine variability associated with idle time prior to sample pre-processing. Based on their fine sensitivity to pre-analytical parameters and protocol variations, metabolic fingerprints could be exploited as valuable ways to determine compliance with standard procedures and quality assessment of blood samples within large multi-omic clinical and translational cohort studies.
Ames, L.L.; Rai, D.; Serne, R.J.
The annotated bibliography is divided into sections on chemistry and geochemistry, migration and accumulation, cultural distributions, natural distributions, and bibliographies and annual reviews. (LK)
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein-coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.
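The third codon position GC content used in this abstract is a simple ratio, and a short sketch may make it concrete. The code below is an illustration only, not the Artemis or MICheck implementation; the function names `gc3_content` and `frame_gc3` are hypothetical, and the frame comparison is a crude stand-in for a frame plot.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C bases at the third position of each complete codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    third_bases = seq[2:usable:3]         # indices 2, 5, 8, ...
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def frame_gc3(seq: str) -> tuple:
    """GC3 computed for each of the three forward reading frames."""
    return tuple(gc3_content(seq[offset:]) for offset in range(3))


if __name__ == "__main__":
    toy = "ATGGCCGCGTAA"  # made-up sequence: codons ATG GCC GCG TAA
    print(gc3_content(toy))
    print(frame_gc3(toy))
```

In a GC-rich genome the true coding frame is expected to show the highest GC3, which is the signal the abstract describes comparing against annotated gene locations; a practical tool would compute this in sliding windows on both strands.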
Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B
Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
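The sequence of steps outlined above (per-base signal, smoothing, segmentation into small blocks, clustering into larger annotations) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method of any specific tool: a moving average stands in for the smoothing, a fixed threshold for the segmentation, and gap-based merging for the clustering, and all function names are hypothetical.

```python
def smooth(signal, window=3):
    """Moving average over a centred window, shortened at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def segment(signal, threshold):
    """Return half-open (start, end) blocks where signal >= threshold."""
    blocks, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            blocks.append((start, i))
            start = None
    if start is not None:
        blocks.append((start, len(signal)))
    return blocks


def merge(blocks, max_gap):
    """Join neighbouring blocks separated by at most max_gap positions."""
    merged = []
    for start, end in blocks:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    raw = [0, 0, 5, 5, 0, 5, 5, 0]  # made-up per-base signal
    small_blocks = segment(raw, threshold=1)
    print(small_blocks)
    print(merge(small_blocks, max_gap=1))
```

Real analyses typically replace each step with a statistical model (for example, peak callers or hidden Markov models), but the data flow is the same.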
A diverse collection of 32 pepper accessions was analysed for variation in health-related metabolites, such as carotenoids, capsaicinoids, flavonoids and vitamins C and E. For each of the metabolites analysed, there was a lot of variation among the accessions and it was possible to
Wang, Lei; Sun, Xiaoliang; Weiszmann, Jakob; Weckwerth, Wolfram
Grapevine is a fruit crop with worldwide economic importance. The grape berry undergoes complex biochemical changes from fruit set until ripening. This ripening process and production processes define the wine quality. Thus, a thorough understanding of berry ripening is crucial for the prediction of wine quality. For a systemic analysis of grape berry development we applied mass spectrometry based platforms to analyse the metabolome and proteome of Early Campbell at 12 stages covering major d...
Freidin, Maxim B; Wells, Helena R R; Potter, Tilly; Livshits, Gregory; Menni, Cristina; Williams, Frances M K
Fatigue is a sensation of unbearable tiredness that frequently accompanies chronic widespread musculoskeletal pain (CWP) and inflammatory joint disease. Its mechanisms are poorly understood and there is a lack of effective biomarkers for diagnosis and onset prediction. We studied the circulating metabolome in a population sample characterised for CWP to identify biomarkers showing specificity for fatigue. Untargeted metabolomic profiling was conducted on fasting plasma and serum samples of 1106 females with and without CWP from the TwinsUK cohort. Linear mixed-effects models accounting for covariates were used to determine relationships between fatigue and metabolites. Receiver operating characteristic (ROC) analysis was used to determine the predictive value of metabolites for fatigue. While no association between fatigue and metabolites was identified in twins without CWP (n=711), in participants with CWP (n=395), levels of eicosapentaenoate (EPA) ω-3 fatty acid were significantly reduced in those with fatigue (β=-0.452±0.116; p=1.2×10⁻⁴). A significant association between fatigue and two other metabolites also emerged when BMI was excluded from the model: 3-carboxy-4-methyl-5-propyl-2-furanpropanoate (CMPF) and C-glycosyltryptophan (p=1.5×10⁻⁴ and p=3.1×10⁻⁴, respectively). ROC analysis identified a combination of 15 circulating metabolites with good predictive potential for fatigue in CWP (AUC=75%; 95% CI 69-80%). The results of this agnostic metabolomics screening show that fatigue is metabolically distinct from CWP, and is associated with a decrease in circulating levels of EPA. Our panel of circulating metabolites provides the starting point for a diagnostic test for fatigue in CWP. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Murgia, Federica; Muroni, Antonella; Puligheddu, Monica; Polizzi, Lorenzo; Barberini, Luigi; Orofino, Gianni; Solla, Paolo; Poddighe, Simone; Del Carratore, Francesco; Griffin, Julian L; Atzori, Luigi; Marrosu, Francesco
Drug resistance is a critical issue in the treatment of epilepsy, contributing to clinical emergencies and increasing both serious social and economic burdens on the health system. The wide variety of potential drug combinations, followed by often failed consecutive attempts to match drugs to an individual patient, may mean that this treatment stage lasts for years with suboptimal benefit to the patient. Given these challenges, it is valuable to explore the availability of new methodologies able to shorten the period of determining a rational pharmacologic treatment. Metabolomics could provide such a tool to investigate possible markers of drug resistance in subjects with epilepsy. Blood samples were collected from (1) controls (C) (n = 35), (2) patients with epilepsy "responder" (R) (n = 18), and (3) patients with epilepsy "non-responder" (NR) (n = 17) to the drug therapy. The samples were analyzed using nuclear magnetic resonance spectroscopy, followed by multivariate statistical analysis. A different metabolic profile based on metabolomics analysis of the serum was observed between C and patients with epilepsy and also between R and NR patients. It was possible to identify the discriminant metabolites for the three classes under investigation. Serum from patients with epilepsy was characterized by increased levels of 3-OH-butyrate, 2-OH-valerate, 2-OH-butyrate, acetoacetate, acetone, acetate, choline, alanine, glutamate, scyllo-inositol (C < R < NR), and decreased concentration of glucose, lactate, and citrate compared to C (C > R > NR). In conclusion, metabolomics may represent an important tool for discovery of differences between subjects affected by epilepsy responding or resistant to therapies and for the study of its pathophysiology, optimizing the therapeutic resources and the quality of life of patients.
Purpose Drug resistance is a critical issue in the treatment of epilepsy, contributing to clinical emergencies and increasing both serious social and economic burdens on the health system. The wide variety of potential drug combinations, followed by often failed consecutive attempts to match drugs to an individual patient, may mean that this treatment stage lasts for years with suboptimal benefit to the patient. Given these challenges, it is valuable to explore the availability of new methodologies able to shorten the period of determining a rational pharmacologic treatment. Metabolomics could provide such a tool to investigate possible markers of drug resistance in subjects with epilepsy. Methods Blood samples were collected from (1) controls (C) (n = 35), (2) patients with epilepsy “responder” (R) (n = 18), and (3) patients with epilepsy “non-responder” (NR) (n = 17) to the drug therapy. The samples were analyzed using nuclear magnetic resonance spectroscopy, followed by multivariate statistical analysis. Key findings A different metabolic profile based on metabolomics analysis of the serum was observed between C and patients with epilepsy and also between R and NR patients. It was possible to identify the discriminant metabolites for the three classes under investigation. Serum from patients with epilepsy was characterized by increased levels of 3-OH-butyrate, 2-OH-valerate, 2-OH-butyrate, acetoacetate, acetone, acetate, choline, alanine, glutamate, scyllo-inositol (C < R < NR), and decreased concentration of glucose, lactate, and citrate compared to C (C > R > NR). Significance In conclusion, metabolomics may represent an important tool for discovery of differences between subjects affected by epilepsy responding or resistant to therapies and for the study of its pathophysiology, optimizing the therapeutic resources and the quality of life of patients.
Luo, Xian; Li, Liang
In cellular metabolomics, it is desirable to carry out metabolomic profiling using a small number of cells in order to save time and cost. In some applications (e.g., working with circulating tumor cells in blood), only a limited number of cells are available for analysis. In this report, we describe a method based on high-performance chemical isotope labeling (CIL) nanoflow liquid chromatography mass spectrometry (nanoLC-MS) for high-coverage metabolomic analysis of small numbers of cells (i.e., ≤10000 cells). As an example, ¹²C-/¹³C-dansyl labeling of the metabolites in lysates of 100, 1000, and 10000 MCF-7 breast cancer cells was carried out using a new labeling protocol tailored to handle small amounts of metabolites. Chemical-vapor-assisted ionization in a captivespray interface was optimized for improving metabolite ionization and increasing robustness of nanoLC-MS. Compared to microflow LC-MS, the nanoflow system provided much improved metabolite detectability with a significantly reduced sample amount required for analysis. Experimental duplicate analyses of biological triplicates resulted in the detection of 1620 ± 148, 2091 ± 89 and 2402 ± 80 (n = 6) peak pairs or metabolites in the amine/phenol submetabolome from the ¹²C-/¹³C-dansyl labeled lysates of 100, 1000, and 10000 cells, respectively. About 63-69% of these peak pairs could be either identified using a dansyl labeled standard library or mass-matched to chemical structures in human metabolome databases. We envisage the routine applications of this method for high-coverage quantitative cellular metabolomics using a starting material of 10000 cells. Even for analyzing 100 or 1000 cells, although the metabolomic coverage is reduced from the maximal coverage, this method can still detect thousands of metabolites, allowing the analysis of a large fraction of the metabolome and focused analysis of the detectable metabolites.
Sifrim, Alejandro; Van Houdt, Jeroen Kj; Tranchevent, Leon-Charles; Nowakowska, Beata; Sakai, Ryo; Pavlopoulos, Georgios A; Devriendt, Koen; Vermeesch, Joris R; Moreau, Yves; Aerts, Jan
The increasing size and complexity of exome/genome sequencing data require new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation, and failure to consider existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.
Xenobiotic exposure, especially high-dose or repeated exposure of xenobiotics, can elicit detrimental effects on biological systems through diverse mechanisms. Changes in metabolic systems, including formation of reactive metabolites and disruption of endogenous metabolism, are not only the common consequences of toxic xenobiotic exposure, but in many cases are the major causes behind development of xenobiotic-induced toxicities (XIT). Therefore, examining the metabolic events associated with XIT generates mechanistic insights into the initiation and progression of XIT, and provides guidance for prevention and treatment. Traditional bioanalytical platforms that target only a few suspected metabolites are capable of validating the expected outcomes of xenobiotic exposure. However, these approaches lack the capacity to define global changes and to identify unexpected events in the metabolic system. Recent developments in high-throughput metabolomics have dramatically expanded the scope and potential of metabolite analysis. Among all analytical techniques adopted for metabolomics, liquid chromatography-mass spectrometry (LC-MS) has been most widely used for metabolomic investigations of XIT due to its versatility and sensitivity in metabolite analysis. In this review, the technical platforms of LC-MS-based metabolomics, including experimental model, sample preparation, instrumentation, and data analysis, are discussed. Applications of LC-MS-based metabolomics in exploratory and hypothesis-driven investigations of XIT are illustrated by case studies of xenobiotic metabolism and endogenous metabolism associated with xenobiotic exposure.
Gowda, G.A. Nagana; Raftery, Daniel
The field of metabolomics continues to witness rapid growth driven by fundamental studies, methods development, and applications in a number of disciplines that include biomedical science, plant and nutrition sciences, drug development, energy and environmental sciences, toxicology, etc. NMR spectroscopy is one of the two most widely used analytical platforms in the metabolomics field, along with mass spectrometry (MS). NMR's excellent reproducibility and quantitative accuracy, its ability to identify structures of unknown metabolites, its capacity to generate metabolite profiles using intact biospecimens with no need for separation, and its capabilities for tracing metabolic pathways using isotope labeled substrates offer unique strengths for metabolomics applications. However, NMR's limited sensitivity and resolution continue to pose a major challenge and have restricted both the number and the quantitative accuracy of metabolites analyzed by NMR. Further, the analysis of highly complex biological samples has increased the demand for new methods with improved detection, better unknown identification, and more accurate quantitation of larger numbers of metabolites. Recent efforts have contributed significant improvements in these areas, and have thereby enhanced the pool of routinely quantifiable metabolites. Additionally, efforts focused on combining NMR and MS promise opportunities to exploit the combined strength of the two analytical platforms for direct comparison of the metabolite data, unknown identification and reliable biomarker discovery that continue to challenge the metabolomics field. This article presents our perspectives on the emerging trends in NMR-based metabolomics and NMR's continuing role in the field with an emphasis on recent and ongoing research from our laboratory. PMID:26476597
A variety of chemicals produced by plants, often referred to as 'phytochemicals', have been used as medicines, food, fuels and industrial raw materials. Recent advances in the study of genomics and metabolomics in plant science have accelerated our understanding of the mechanisms, regulation and evolution of the biosynthesis of specialized plant products. We can now address such questions as how the metabolomic diversity of plants originates at the genome level, and how we should apply this knowledge to drug discovery, industry and agriculture. Our research group has focused on metabolomics-based functional genomics over the last 15 years and we have developed a new research area called 'Phytochemical Genomics'. In this review, the development of a research platform for plant metabolomics is discussed first, to provide a better understanding of the chemical diversity of plants. Then, representative applications of metabolomics to functional genomics in a model plant, Arabidopsis thaliana, are described. The extension of integrated multi-omics analyses to non-model specialized plants, e.g., medicinal plants, is presented, including the identification of novel genes, metabolites and networks for the biosynthesis of flavonoids, alkaloids, sulfur-containing metabolites and terpenoids. Further, functional genomics studies on a variety of medicinal plants are presented. I also discuss future trends in pharmacognosy and related sciences.
Web services allow permanent access to music from all over the world. Especially in the case of web services with user-supplied content, e.g., YouTube™, the available metadata is often incomplete or erroneous. On the other hand, a vast amount of high-quality and musically relevant metadata has been annotated in research areas such as Music Information Retrieval (MIR). Although they have great potential, these musical annotations are often inaccessible to users outside the academic world. With our contribution, we want to bridge this gap by enriching publicly available multimedia content with musical annotations available in research corpora, while maintaining easy access to the underlying data. Our web-based tools offer researchers and music lovers novel possibilities to interact with and navigate through the content. In this paper, we consider a research corpus called the Weimar Jazz Database (WJD) as an illustrative example scenario. The WJD contains various annotations related to famous jazz solos. First, we establish a link between the WJD annotations and corresponding YouTube videos employing existing retrieval techniques. With these techniques, we were able to identify 988 corresponding YouTube videos for 329 solos out of 456 solos contained in the WJD. We then embed the retrieved videos in a recently developed web-based platform and enrich the videos with solo transcriptions that are part of the WJD. Furthermore, we integrate publicly available data resources from the Semantic Web in order to extend the presented information, for example, with a detailed discography or artist-related information. Our contribution illustrates the potential of modern web-based technologies for the digital humanities, and novel ways for improving access and interaction with digitized multimedia content.
Hasler-Sheetal, Harald; Holmer, Marianne; Weckwerth, Wolfram
Environmental metabolomics has become increasingly relevant in marine ecological studies. One example is the new insight it has revealed into the stress response of Zostera marina. This is essential to understand how, at which level and to what extent aquatic plants adapt, tolerate and react to environmental...... stressors. We exposed Z. marina to water column anoxia and assessed the diurnal metabolomic response by GC-TOF-MS based metabolomics, identifying 109 known and 217 unknown metabolites. During daytime, photosynthetic oxygen production prevents severe effects of anoxia on the metabolome (complete set of small...... the applicability of metabolomics to assess environmental stress responses of Zostera marina....
Gomez-Casati, Diego F; Zanor, Maria I; Busi, María V
In recent years, there has been an increase in the number of metabolomic approaches used, in parallel with proteomic and functional genomic studies. The wide variety of chemical types of metabolites available has also accelerated the use of different techniques in the investigation of the metabolome. At present, metabolomics is applied to investigate several human diseases, to improve their diagnosis and prevention, and to design better therapeutic strategies. In addition, metabolomic studies are also being carried out in areas such as toxicology and pharmacology, crop breeding, and plant biotechnology. In this review, we emphasize the use and application of metabolomics in human diseases and plant research to improve human health.
Bovo, S; Mazzoni, G; Calò, D G; Galimberti, G; Fanelli, F; Mezzullo, M; Schiavo, G; Scotti, E; Manisi, A; Samoré, A B; Bertolini, F; Trevisi, P; Bosi, P; Dall'Olio, S; Pagotto, U; Fontanesi, L
Metabolomics has opened new possibilities to investigate metabolic differences among animals. In this study, we applied a targeted metabolomic approach to deconstruct the pig sex metabolome as defined by castrated males and entire gilts. Plasma from 545 performance-tested Italian Large White pigs (172 castrated males and 373 females) sampled at about 160 kg live weight were analyzed for 186 metabolites using the Biocrates AbsoluteIDQ p180 Kit. After filtering, 132 metabolites (20 AA, 11 biogenic amines, 1 hexose, 13 acylcarnitines, 11 sphingomyelins, 67 phosphatidylcholines, and 9 lysophosphatidylcholines) were retained for further analyses. Sparse partial least squares discriminant analysis was applied as the multivariate approach, together with a specifically designed statistical pipeline that included a permutation test and a 10-fold cross-validation procedure that produced stability and effect size statistics for each metabolite. Using this approach, we identified 85 biomarkers (with metabolites from all analyzed chemical families) that contributed to the differences between the 2 groups of pigs ( metabolic shift in castrated males toward energy storage and lipid production. Similar general patterns were observed for most sphingomyelins, phosphatidylcholines, and lysophosphatidylcholines. Metabolomic pathway analysis and pathway enrichment identified several differences between the 2 sexes. This metabolomic overview provided new clues on the biochemical mechanisms underlying sexual dimorphism that, on one hand, might explain differences in terms of economic traits between castrated male pigs and entire gilts and, on the other hand, could strengthen the pig as a model to define metabolic mechanisms related to fat deposition.
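The idea behind the permutation test named in this abstract can be shown with a deliberately simplified stand-in. The sketch below is not sparse PLS-DA and not the authors' pipeline: it tests a single metabolite by shuffling group labels and comparing absolute mean differences. The function name, the number of permutations, and the seed are assumptions chosen for illustration.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided label-permutation test on the difference of group means.

    Returns the fraction of shuffled label assignments whose absolute mean
    difference is at least as large as the observed one. The +1 correction
    keeps the estimate away from an impossible p-value of exactly zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Made-up concentrations for one metabolite in two groups.
    castrated = [10, 11, 12, 13, 14, 10, 11, 12]
    gilts = [1, 2, 3, 4, 5, 1, 2, 3]
    print(permutation_p_value(castrated, gilts))
```

In a multivariate setting the same shuffling would be applied to the class labels before refitting the whole model, so that the reference distribution reflects the full procedure rather than one variable.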
Johnson, W.; Kido Soule, M. C.; Longnecker, K.; Kujawinski, E. B.
Microbial consortia function via the exchange and transformation of small organic molecules or metabolites. These metabolites make up a pool of rapidly cycling organic matter in the ocean that is challenging to characterize due to its low concentrations. We seek to determine the distribution of these molecules and the factors that shape their abundance and flux. Through measurements of the abundance of a core set of metabolites, including nucleic acids, amino acids, sugars, vitamins, and signaling molecules, we gain a real-time snapshot of microbial activity. We used a targeted metabolomics technique to profile metabolite abundance in particulate and dissolved organic matter extracts collected from a 14,000 km transect running from 38°S to 55°N in the Western Atlantic Ocean. This extensive dataset is the first of its kind in the Atlantic Ocean and allows us to explore connections among metabolites as well as latitudinal trends in metabolite abundance. We found changes in the intracellular abundance of certain metabolites between low and high nutrient regions and a wide distribution of certain dissolved vitamins in the surface ocean. These measurements give us baseline data on the distribution of these metabolites and allow us to extend our understanding of microbial community activity in different regions of the ocean.
Jeffrey S Breunig
Metabolism, the conversion of nutrients into usable energy and biochemical building blocks, is an essential feature of all cells. The genetic factors responsible for inter-individual metabolic variability remain poorly understood. To investigate genetic causes of metabolome variation, we measured the concentrations of 74 metabolites across ~100 segregants from a Saccharomyces cerevisiae cross by liquid chromatography-tandem mass spectrometry. We found 52 quantitative trait loci for 34 metabolites. These included linkages due to overt changes in metabolic genes, e.g., linking pyrimidine intermediates to the deletion of ura3. They also included linkages not directly related to metabolic enzymes, such as those for five central carbon metabolites to ira2, a Ras/PKA pathway regulator, and for the metabolites S-adenosyl-methionine and S-adenosyl-homocysteine to slt2, a MAP kinase involved in cell wall integrity. The variant of ira2 that elevates metabolite levels also increases glucose uptake and ethanol secretion. These results highlight specific examples of genetic variability, including in genes without prior known metabolic regulatory function, that impact yeast metabolism.
Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.
We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
Colasante, Meg; Douglas, Kathy
Annotation of video provides students with the opportunity to view and engage with audiovisual content in an interactive and participatory way rather than in passive-receptive mode. This article discusses research into the use of video annotation in four vocational programs at RMIT University in Melbourne, which allowed students to interact with…
Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from anywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.
This annotated bibliography is divided into three sections. Section I contains annotations of general publications on work time options. Section II presents resources on flexitime and the compressed work week. In Section III are found resources related to these reduced work time options: permanent part-time employment, job sharing, voluntary…
da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C
The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
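The consensus re-ranking idea described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the NAP implementation: the fingerprints are plain Python sets of hypothetical substructure identifiers, structural similarity is the Tanimoto coefficient on those sets, and the equal weighting between the in silico score and neighbour similarity is an arbitrary choice.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rerank(candidates, neighbor_tops, weight=0.5):
    """Re-rank in silico candidates for one network node.

    candidates    : list of (name, in_silico_score, fingerprint_set)
    neighbor_tops : fingerprint sets of the top candidates of neighbouring nodes
    weight        : hypothetical mixing weight given to neighbour similarity
    """
    def consensus(candidate):
        _, score, fingerprint = candidate
        if not neighbor_tops:
            return score  # isolated node: fall back to the in silico score
        similarity = sum(tanimoto(fingerprint, n) for n in neighbor_tops) / len(neighbor_tops)
        return (1 - weight) * score + weight * similarity

    return sorted(candidates, key=consensus, reverse=True)

if __name__ == "__main__":
    candidates = [("A", 0.6, {1, 2, 3}), ("B", 0.5, {4, 5, 6})]
    neighbors = [{4, 5, 6}, {4, 5, 7}]
    print([name for name, _, _ in rerank(candidates, neighbors)])
```

In this toy case candidate B overtakes A because it is structurally closer to what the neighbouring nodes suggest, which mirrors the intuition that spectra connected in a molecular network tend to come from related structures.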
Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo
New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa, but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key area of investigation in bacteriology. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences, which are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool tolerates sequence errors and retains its capabilities when annotating highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained from metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).
This study examined the effect of online metacognitive strategies, hypermedia annotations, and motivation on reading comprehension in a Taiwanese hypertext environment. A path analysis model was proposed based on the assumption that if English as a foreign language learners frequently use online metacognitive strategies and hypermedia annotations,…
Wise, Michael J.
Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…
Ceolin, D.; Nottamkandath, A.; Fokkink, W.J.; Dimitrakos, Th.; Moona, R.; Patel, Dh.; Harrison McKnight, D.
Museums are rapidly digitizing their collections, and face a huge challenge to annotate every digitized artifact in store. Therefore they are opening up their archives for receiving annotations from experts world-wide. This paper presents an architecture for choosing the most eligible set of
Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max
In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…
Fisseni, B.; Kurji, A.; Löwe, B.
We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it
Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.
Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…
Samejima, Masaki; Hisakane, Daichi; Komoda, Norihisa
Purpose: The purpose of this paper is to automatically annotate learners' opinions with an attribute (a problem, a solution, or no annotation) in order to support the learners' discussion without a facilitator. The case method aims at discussing problems and solutions in a target case. However, the learners miss discussing some of the problems and solutions.…
Ab initio gene prediction and evidence alignment were used to produce the first annotations for the fathead minnow SOAPdenovo genome assembly. Additionally, a genome browser hosted at genome.setac.org provides simplified access to the annotation data in context with fathead minno...
Jorge, Tiago F.; Mata, Ana T.
Metabolomics is a research field used to acquire comprehensive information on the composition of a metabolite pool to provide a functional screen of the cellular state. Studies of the plant metabolome include the analysis of a wide range of chemical species with very diverse physico-chemical properties, and therefore powerful analytical tools are required for the separation, characterization and quantification of this vast compound diversity present in plant matrices. In this review, challenges in the use of mass spectrometry (MS) as a quantitative tool in plant metabolomics experiments are discussed, and important criteria for the development and validation of MS-based analytical methods provided. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644967
Savorani, Francesco; Rasmussen, Morten Arendt; Mikkelsen, Mette Skau
This paper outlines the advantages and disadvantages of using high throughput NMR metabolomics for nutritional studies with emphasis on the workflow and data analytical methods for generation of new knowledge. The paper describes one-by-one the major research activities in the interdisciplinary...... metabolomics platform and highlights the opportunities that NMR spectra can provide in future nutrition studies. Three areas are emphasized: (1) NMR as an unbiased and non-destructive platform for providing an overview of the metabolome under investigation, (2) NMR for providing versatile information and data...... structures for multivariate pattern recognition methods and (3) NMR for providing a unique fingerprint of the lipoprotein status of the subject. For the first time in history, by combining NMR spectroscopy and chemometrics we are able to perform inductive nutritional research as a complement to the deductive...
Fearnley, Liam G; Inouye, Michael
Metabolomics is becoming feasible for population-scale studies of human disease. In this review, we survey epidemiological studies that leverage metabolomics and multi-omics to gain insight into disease mechanisms. We outline key practical, technological and analytical limitations while also highlighting recent successes in integrating these data. The use of multi-omics to infer reaction rates is discussed as a potential future direction for metabolomics research, as a means of identifying biomarkers as well as inferring causality. Furthermore, we highlight established analysis approaches as well as simulation-based methods currently used in single- and multi-cell levels in systems biology. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.