BACKGROUND: In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions with similar theoretical masses. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rate (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search are discussed, including setting a threshold, estimating the FDR, and the types of elemental composition databases most reliable for searching. METHODOLOGY/PRINCIPAL FINDINGS: The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that the FDR could be improved by using a smaller compound database with higher completeness. CONCLUSIONS/SIGNIFICANCE: High-accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.
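As a rough illustration of why mass accuracy and database content both matter, the sketch below searches one measured mass against a toy two-entry database at two tolerances. It is not the FDR procedure of the study; the database contents, the query mass, the tolerances and the function names are all assumptions made for the example.

```python
# Toy elemental-composition search by mass tolerance (illustrative only).
# Monoisotopic masses of the most abundant isotopes, in Da.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Hypothetical mini-database: name -> element counts.
DATABASE = {
    "glucose (C6H12O6)": {"C": 6, "H": 12, "O": 6},
    "theophylline (C7H8N4O2)": {"C": 7, "H": 8, "N": 4, "O": 2},
}


def mono_mass(formula):
    """Monoisotopic mass of a neutral formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def search(measured, database, tol_ppm):
    """Names of all entries whose theoretical mass lies within tol_ppm."""
    return sorted(
        name
        for name, formula in database.items()
        if abs(ppm_error(measured, mono_mass(formula))) <= tol_ppm
    )


query = 180.0634  # assumed neutral monoisotopic mass of the unknown
print(search(query, DATABASE, 10.0))  # wide window: both entries match
print(search(query, DATABASE, 1.0))   # narrow window: only glucose matches
```

The two entries differ by roughly 7 ppm, so the wider window cannot separate them. Each additional near-isobaric entry in a larger database is a further chance of an incorrect hit at a given tolerance.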
Bouhifd, Mounir; Beger, Richard; Flynn, Thomas; Guo, Lining; Harris, Georgina; Hogberg, Helena; Kaddurah-Daouk, Rima; Kamp, Hennicke; Kleensang, Andre; Maertens, Alexandra; Odwin-DaCosta, Shelly; Pamies, David; Robertson, Donald; Smirnova, Lena; Sun, Jinchun; Zhao, Liang; Hartung, Thomas
Metabolomics promises a holistic phenotypic characterization of biological responses to toxicants. This technology is based on advanced chemical analytical tools with reasonable throughput, including mass-spectroscopy and NMR. Quality assurance, however - from experimental design, sample preparation, metabolite identification, to bioinformatics data-mining - is urgently needed to assure both quality of metabolomics data and reproducibility of biological models. In contrast to microarray-based transcriptomics, where consensus on quality assurance and reporting standards has been fostered over the last two decades, quality assurance of metabolomics is only now emerging. Regulatory use in safety sciences, and even proper scientific use of these technologies, demand quality assurance. In an effort to promote this discussion, an expert workshop discussed the quality assurance needs of metabolomics. The goals for this workshop were 1) to consider the challenges associated with metabolomics as an emerging science, with an emphasis on its application in toxicology and 2) to identify the key issues to be addressed in order to establish and implement quality assurance procedures in metabolomics-based toxicology. Consensus has still to be achieved regarding best practices to make sure sound, useful, and relevant information is derived from these new tools.
competitive position in that marketplace. In many cases, this reassessment has led to the adoption of a new way of managing and measuring quality and...wide quality control (CWQC), plan-do-check-act cycle of continuous improvement (PDCA), and quality improvement teams and boards. Other Miscellaneous...A. (1981). Banking on high quality. Quality Progress, 14(12), 14-19. Key Terms: Structure and Organization/ Case Histories. Abstract: The authors
Dudzik, Danuta; Barbas-Bernardos, Cecilia; García, Antonia; Barbas, Coral
Untargeted metabolomics, as a global approach, has already proven its great potential for the investigation of health and disease, as well as its wide applicability to other research areas. Although great progress has been made on the feasibility of metabolomics experiments, some challenges remain, including all sources of fluctuation and bias affecting every step of multiplatform untargeted metabolomics studies. Identifying and reducing the main sources of unwanted variation in the pre-analytical, analytical and post-analytical phases of metabolomics experiments is essential to ensure high data quality. Harmonized guidelines for quality assurance, such as those available for targeted analysis, are still lacking. In this review, the sources of variation to be considered and minimized are discussed, along with methodologies and strategies for monitoring and improving the quality of the results. The information given is based on evidence from different groups as well as our own experience, with recommendations for each stage of the metabolomics workflow. The comprehensive overview and tools presented here might serve other researchers interested in monitoring, controlling and improving the reliability of their findings by implementing good experimental quality practices in untargeted metabolomics studies. Copyright © 2017 Elsevier B.V. All rights reserved.
is a presentation of a core consistency diagnostic aiding in determining the number of components in a PARAFAC2 model. It is of great importance to validate especially PLS-DA models and if not done properly, the developed models might reveal spurious groupings. Furthermore, data from metabolomics studies contain...... and the results indicate that GC-MS-based metabolomics in combination with PARAFAC2 modelling is applicable for extracting relevant biological information from the plasma samples. Overall, the work in this thesis shows that suitable and properly validated chemometrics models used in metabolomics are very useful...
Kamstrup-Nielsen, Maja Hermann
how to properly handle complex metabolomics data, in order to achieve reliable and valid multivariate models. This has been illustrated by three case studies with examples of forecasting breast cancer and early detection of colorectal cancer based on data from nuclear magnetic resonance (NMR...... based on NMR data with RRV and known risk markers. The sensitivity and specificity values are 0.80 and 0.79, respectively, for a test set validated model. The second case study is based on plasma samples with verified colorectal cancer and three types of control samples analysed by fluorescence...... spectroscopy a potential tool in early detection of colorectal cancer. Finally, plasma samples have been analysed using GC-MS. The method requires extensive sample preparation and therefore the study can only be considered a feasibility study with room for optimization. However, 14 plasma samples were analysed...
The aim of this study was to elucidate the underlying biochemical processes and to identify potential key molecules of the meat quality traits drip loss, pH of meat 1 h post-mortem (pH1), pH of meat 24 h post-mortem (pH24) and meat color. An untargeted metabolomics approach detected the profiles of 393 annotated and 1,600 unknown metabolites in 97 Duroc × Pietrain pigs. Despite obvious differences between the statistical approaches, the four applied methods, namely correlation analysis, principal component analysis, weighted network analysis (WNA) and random forest regression (RFR), revealed mainly concordant results. Our findings lead to the conclusion that the meat quality traits pH1, pH24 and color are strongly influenced by processes of post-mortem energy metabolism such as glycolysis and the pentose phosphate pathway, whereas drip loss is significantly associated with metabolites of lipid metabolism. In the case of drip loss, RFR was the most suitable method to identify reliable biomarkers and to predict the phenotype based on metabolites. On the other hand, WNA provided the best parameters to investigate the metabolite interactions and to clarify the complex molecular background of meat quality traits. In summary, it was possible to attain findings on the interaction of meat quality traits and their underlying biochemical processes. The detected key metabolites might be better indicators of meat quality, especially of drip loss, than the measured phenotype itself and might potentially be used as bioindicators.
The identification of translation initiation sites (TISs) constitutes an important aspect of sequence-based genome analysis. An erroneous TIS annotation can impair the identification of regulatory elements and N-terminal signal peptides, and may also flaw the determination of descent for any particular gene. We have formulated a reference-free method to score the TIS annotation quality. The method is based on a comparison of the observed and expected distribution of all TISs in a particular genome given prior gene-calling. We have assessed the TIS annotations for all available NCBI RefSeq microbial genomes and found that approximately 87% are of appropriate quality, whereas 13% need substantial improvement. We have analyzed a number of factors that could affect TIS annotation quality, such as GC-content, taxonomy, the fraction of genes with a Shine-Dalgarno sequence and the year of publication. The analysis showed that only the first factor has a clear effect. We have then formulated a straightforward Principal Component Analysis (PCA)-based TIS identification strategy to self-organize and score potential TISs. The strategy is independent of reference data and a priori calculations. A representative set of 277 genomes was subjected to the analysis and we found a clear increase in TIS annotation quality for the genomes with a low quality score. The PCA-based annotation was also compared with annotation by the current tool of reference, Prodigal. The comparison for the model genome of Escherichia coli K12 showed that both methods supplement each other and that prediction agreement can be used as an indicator of a correct TIS annotation. Importantly, the data suggest that adding a PCA-based strategy to a Prodigal prediction can be used to 'flag' TIS annotations for re-evaluation and, in addition, to evaluate a given annotation when a Prodigal annotation is lacking.
Habchi, Baninia; Alves, Sandra; Jouan-Rimbaud Bouveresse, Delphine; Appenzeller, Brice; Paris, Alain; Rutledge, Douglas N; Rathahao-Paris, Estelle
Due to the presence of pollutants in the environment and food, the assessment of human exposure is required. This necessitates high-throughput approaches enabling large-scale analysis and, as a consequence, the use of high-performance analytical instruments to obtain highly informative metabolomic profiles. In this study, direct introduction mass spectrometry (DIMS) was performed using a Fourier transform ion cyclotron resonance (FT-ICR) instrument equipped with a dynamically harmonized cell. Data quality was evaluated based on mass resolving power (RP), mass measurement accuracy, and ion intensity drifts from the repeated injections of a quality control (QC) sample along the analytical process. The large DIMS data size entails the use of bioinformatic tools for the automatic selection of common ions found in all QC injections and for robustness assessment and correction of any technical drifts. RP values greater than 10^6 and mass measurement accuracy better than 1 ppm were obtained using broadband mode, resulting in the detection of isotopic fine structure. Hence, a very accurate relative isotopic mass defect (RΔm) value was calculated. This significantly reduces the number of elemental composition (EC) candidates and greatly improves compound annotation. A very satisfactory estimate of repeatability of both peak intensity and mass measurement was demonstrated. Although a non-negligible ion intensity drift was observed for negative ion mode data, a normalization procedure was easily applied to correct this phenomenon. This study illustrates the performance and robustness of the dynamically harmonized FT-ICR cell for large-scale high-throughput metabolomic analyses in routine conditions. Graphical abstract: Analytical performance of FT-ICR instrument equipped with a dynamically harmonized cell.
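To make the link between resolving power and isotopic fine structure concrete, the sketch below checks whether the 13C and 15N contributions to an M+1 peak can be separated. The m/z of 400, the lower resolving power of 30,000, the singly charged ion and the "separation larger than one FWHM" criterion are assumptions chosen for illustration, not values or criteria taken from the study.

```python
# Can two isotopic fine-structure peaks be told apart at a given resolving
# power? Illustrative only; assumes a singly charged ion.

# Mass differences between heavy and light isotopes, in Da.
DELTA_13C = 13.0033548 - 12.0
DELTA_15N = 15.0001089 - 14.0030740


def fwhm(mz, resolving_power):
    """Peak width (full width at half maximum) implied by RP = m / delta_m."""
    return mz / resolving_power


def is_resolved(mz, resolving_power, separation):
    """Crude criterion (an assumption): two peaks count as resolved when
    their separation exceeds one FWHM."""
    return separation > fwhm(mz, resolving_power)


# The 13C and 15N isotopologues of the same ion differ by about 0.0063 Da.
separation = DELTA_13C - DELTA_15N

print(is_resolved(400.0, 1e6, separation))  # FWHM = 0.0004 Da -> True
print(is_resolved(400.0, 3e4, separation))  # FWHM ~ 0.0133 Da -> False
```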
Background: Metabolomic studies are targeted at identifying and quantifying all metabolites in a given biological context. Among the tools used for metabolomic research, mass spectrometry is one of the most powerful. However, metabolomics by mass spectrometry always reveals a high number of unknown compounds, which complicates in-depth mechanistic or biochemical understanding. In principle, mass spectrometry can be utilized within strategies of de novo structure elucidation of small molecules, starting with the computation of the elemental composition of an unknown metabolite using accurate masses with errors [...] Results: High mass accuracy ([...] 95% of false candidates. This orthogonal filter can condense several thousand candidates down to only a small number of molecular formulas. Example calculations for 10, 5, 3, 1 and 0.1 ppm mass accuracy are given. Corresponding software scripts can be downloaded from http://fiehnlab.ucdavis.edu. A comparison of eight chemical databases revealed that PubChem and the Dictionary of Natural Products can be recommended for automatic queries using molecular formulae. Conclusion: More than 1.6 million molecular formulae in the range 0–500 Da were generated in an exhaustive manner under strict observation of mathematical and chemical rules. Assuming that ion species are fully resolved (either by chromatography or by high-resolution mass spectrometry), we conclude that a mass spectrometer capable of 3 ppm mass accuracy and 2% error for isotopic abundance patterns outperforms mass spectrometers with less than 1 ppm mass accuracy or even hypothetical mass spectrometers with 0.1 ppm mass accuracy that do not include isotope information in the calculation of molecular formulae.
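The effect of mass accuracy on the number of formula candidates can be sketched with a brute-force enumeration. This is a minimal illustration, not the downloadable scripts mentioned above: it is restricted to C, H, N and O, the element limits are arbitrary assumptions, and no chemical rules (valence, element ratios) or isotope pattern filters are applied.

```python
# Brute-force CHNO formula enumeration within a ppm window (illustrative only).
# Monoisotopic masses of the most abundant isotopes, in Da.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def candidates(target, tol_ppm, max_c=15, max_h=30, max_n=5, max_o=10):
    """Return every (C, H, N, O) count tuple whose monoisotopic mass lies
    within tol_ppm of target. The upper limits are assumed, not derived."""
    hits = []
    for c in range(max_c + 1):
        for h in range(max_h + 1):
            for n in range(max_n + 1):
                for o in range(max_o + 1):
                    mass = (c * MONO["C"] + h * MONO["H"]
                            + n * MONO["N"] + o * MONO["O"])
                    if abs(mass - target) / target * 1e6 <= tol_ppm:
                        hits.append((c, h, n, o))
    return hits


# Use the exact mass of C6H12O6 (glucose) as the "measured" value.
target = 6 * MONO["C"] + 12 * MONO["H"] + 6 * MONO["O"]

wide = candidates(target, 10.0)
narrow = candidates(target, 1.0)
print(len(wide), len(narrow))
```

Even a narrow window can leave several purely arithmetic candidates, which is where a second, orthogonal constraint such as the isotopic abundance pattern described in the abstract would come in. That filter is not implemented here.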
Shahaf, N.; Franceschi, P.; Arapitsas, P.; Rogachev, I.; Vrhovsek, U.; Wehrens, H.R.M.J.
RATIONALE Estimation of mass measurement accuracy is an elementary step in the application of mass spectroscopy (MS) data towards metabolite annotations and has been addressed several times in the past. However, the reproducibility of mass measurements over a diverse set of analytes and in variable
Yin, Peiyuan; Peter, Andreas; Franken, Holger; Zhao, Xinjie; Neukamm, Sabine S; Rosenbaum, Lars; Lucio, Marianna; Zell, Andreas; Häring, Hans-Ulrich; Xu, Guowang; Lehmann, Rainer
Metabolomics is a powerful tool that is increasingly used in clinical research. Although excellent sample quality is essential, it can easily be compromised by undetected preanalytical errors. We set out to identify critical preanalytical steps and biomarkers that reflect preanalytical inaccuracies. We systematically investigated the effects of preanalytical variables (blood collection tubes, hemolysis, temperature and time before further processing, and number of freeze-thaw cycles) on metabolomics studies of clinical blood and plasma samples using a nontargeted LC-MS approach. Serum and heparinate blood collection tubes led to chemical noise in the mass spectra. Distinct, significant changes of 64 features in the EDTA-plasma metabolome were detected when blood was exposed to room temperature for 2, 4, 8, and 24 h. The resulting pattern was characterized by increases in hypoxanthine and sphingosine 1-phosphate (800% and 380%, respectively, at 2 h). In contrast, the plasma metabolome was stable for up to 4 h when EDTA blood samples were immediately placed in iced water. Hemolysis also caused numerous changes in the metabolic profile. Unexpectedly, up to 4 freeze-thaw cycles only slightly changed the EDTA-plasma metabolome, but increased the individual variability. Nontargeted metabolomics investigations led to the following recommendations for the preanalytical phase: test the blood collection tubes, avoid hemolysis, place whole blood immediately in ice water, use EDTA plasma, and preferably use nonrefrozen biobank samples. To exclude outliers due to preanalytical errors, inspect the biomarker signal intensities reflecting systematic as well as accidental and preanalytical inaccuracies before processing the bioinformatics data. © 2013 American Association for Clinical Chemistry.
Lee, Kyung-Min; Jeon, Jun-Yeong; Lee, Byeong-Ju; Lee, Hwanhui; Choi, Hyung-Kyoon
Metabolomics has been used as a powerful tool for the analysis and quality assessment of the natural product (NP)-derived medicines. It is increasingly being used in the quality control and standardization of NP-derived medicines because they are composed of hundreds of natural compounds. The most common techniques that are used in metabolomics consist of NMR, GC-MS, and LC-MS in combination with multivariate statistical analyses including principal components analysis (PCA) and partial least squares-discriminant analysis (PLS-DA). Currently, the quality control of the NP-derived medicines is usually conducted using HPLC and is specified by one or two indicators. To create a superior quality control framework and avoid adulterated drugs, it is necessary to be able to determine and establish standards based on multiple ingredients using metabolic profiling and fingerprinting. Therefore, the application of various analytical tools in the quality control of NP-derived medicines forms the major part of this review. Veregen ® (Medigene AG, Planegg/Martinsried, Germany), which is the first botanical prescription drug approved by US Food and Drug Administration, is reviewed as an example that will hopefully provide future directions and perspectives on metabolomics technologies available for the quality control of NP-derived medicines.
In recent years, omic sciences have been increasingly employed in a multitude of research fields thanks to their high-throughput capabilities and holistic approach. Among the omic sciences, metabolomics and foodomics have recently emerged in the investigation of food and nutrition and their relat...... carried out both in Italy and in Denmark, outlines the analytical pipeline of the foodomic approach and highlights the current challenges in the field (Chapter 2.3). The thesis traces the path of modern foodomics and metabolomics from the definition and description of food quality (Chapters 3 to 6......), to the profiling of the metabolome (Chapters 7 to 8.5), and finally the investigation of the impact of food on the human health, the prevention of diseases, and the identification of biomarkers of health status (Chapters 8.6 and 8.7). The impact of factors such as genetic modification or farming method...
Liu, Shao; Liang, Yi-Zeng; Liu, Hai-Tao
Traditional Chinese medicines (TCMs) pose a great challenge for quality control and efficacy evaluation because of the complexity of their chemical composition. Chemometric techniques provide a good opportunity for mining more useful chemical information from TCMs, which makes the application of chemometrics in the field of TCMs both natural and necessary. This review focuses on important recent chemometric tools for chromatographic fingerprinting, including peak alignment and baseline correction, and on applications of chemometrics in metabolomics and the modernization of TCMs, including authentication and quality evaluation of TCMs, evaluation of the efficacy of TCMs and the essence of TCM syndrome. In the conclusions, the general trends and some recommendations for improving chromatographic metabolomics data analysis are provided. Copyright © 2016 Elsevier B.V. All rights reserved.
Kurita, Kenji L; Glassey, Emerson; Linington, Roger G
Traditional natural products discovery using a combination of live/dead screening followed by iterative bioassay-guided fractionation affords no information about compound structure or mode of action until late in the discovery process. This leads to high rates of rediscovery and low probabilities of finding compounds with unique biological and/or chemical properties. By integrating image-based phenotypic screening in HeLa cells with high-resolution untargeted metabolomics analysis, we have developed a new platform, termed Compound Activity Mapping, that is capable of directly predicting the identities and modes of action of bioactive constituents for any complex natural product extract library. This new tool can be used to rapidly identify novel bioactive constituents and provide predictions of compound modes of action directly from primary screening data. This approach inverts the natural products discovery process from the existing "grind and find" model to a targeted, hypothesis-driven discovery model where the chemical features and biological function of bioactive metabolites are known early in the screening workflow, and lead compounds can be rationally selected based on biological and/or chemical novelty. We demonstrate the utility of the Compound Activity Mapping platform by combining 10,977 mass spectral features and 58,032 biological measurements from a library of 234 natural products extracts and integrating these two datasets to identify 13 clusters of fractions containing 11 known compound families and four new compounds. Using Compound Activity Mapping we discovered the quinocinnolinomycins, a new family of natural products with a unique carbon skeleton that cause endoplasmic reticulum stress.
Simader, Alexandra Maria; Kluger, Bernhard; Neumann, Nora Katharina Nicole; Bueschl, Christoph; Lemmens, Marc; Lirk, Gerald; Krska, Rudolf; Schuhmacher, Rainer
Metabolomics experiments often comprise large numbers of biological samples resulting in huge amounts of data. This data needs to be inspected for plausibility before data evaluation to detect putative sources of error e.g. retention time or mass accuracy shifts. Especially in liquid chromatography-high resolution mass spectrometry (LC-HRMS) based metabolomics research, proper quality control checks (e.g. for precision, signal drifts or offsets) are crucial prerequisites to achieve reliable and comparable results within and across experimental measurement sequences. Software tools can support this process. The software tool QCScreen was developed to offer a quick and easy data quality check of LC-HRMS derived data. It allows a flexible investigation and comparison of basic quality-related parameters within user-defined target features and the possibility to automatically evaluate multiple sample types within or across different measurement sequences in a short time. It offers a user-friendly interface that allows an easy selection of processing steps and parameter settings. The generated results include a coloured overview plot of data quality across all analysed samples and targets and, in addition, detailed illustrations of the stability and precision of the chromatographic separation, the mass accuracy and the detector sensitivity. The use of QCScreen is demonstrated with experimental data from metabolomics experiments using selected standard compounds in pure solvent. The application of the software identified problematic features, samples and analytical parameters and suggested which data files or compounds required closer manual inspection. QCScreen is an open source software tool which provides a useful basis for assessing the suitability of LC-HRMS data prior to time consuming, detailed data processing and subsequent statistical analysis. It accepts the generic mzXML format and thus can be used with many different LC-HRMS platforms to process both multiple
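The kind of check described here can be sketched in a few lines. This is not QCScreen's code or interface: the thresholds, the record layout, the sample names and the measured values below are all hypothetical and serve only to show how mass accuracy and retention time shifts might be flagged per sample.

```python
# Flag QC measurements whose mass error or retention-time shift is too large.
# Thresholds and data layout are assumptions, not QCScreen defaults.

def check_feature(measured_mz, expected_mz, measured_rt, expected_rt,
                  max_ppm=5.0, max_rt_shift=0.2):
    """Return a list of problems found for one target in one sample."""
    ppm = (measured_mz - expected_mz) / expected_mz * 1e6
    rt_shift = measured_rt - expected_rt  # minutes
    problems = []
    if abs(ppm) > max_ppm:
        problems.append("mass accuracy")
    if abs(rt_shift) > max_rt_shift:
        problems.append("retention time")
    return problems


# Hypothetical measurements of one target compound
# (expected m/z 180.0634, expected retention time 5.00 min).
samples = {
    "QC_01": (180.0635, 5.02),
    "QC_02": (180.0650, 5.01),  # about 8.9 ppm off
    "QC_03": (180.0633, 5.40),  # 0.40 min late
}

report = {
    name: check_feature(mz, 180.0634, rt, 5.00)
    for name, (mz, rt) in samples.items()
}
for name, problems in report.items():
    print(name, problems or "ok")
```

Samples with a non-empty problem list would be the ones singled out for closer manual inspection before further processing.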
Díaz, Ramon; Gallart-Ayala, Hector; Sancho, Juan V; Nuñez, Oscar; Zamora, Tatiana; Martins, Claudia P B; Hernández, Félix; Hernández-Cassou, Santiago; Saurina, Javier; Checa, Antonio
This work focuses on the influence of the selected LC-HRMS platform on the final annotated compounds in non-targeted metabolomics. Two platforms that differed in columns, mobile phases, gradients, chromatographs, mass spectrometers (Orbitrap [Platform#1] and Q-TOF [Platform#2]), data processing and marker selection protocols were compared. A total of 42 wine samples from three different protected denominations of origin (PDO) were analyzed. At the feature level, good (O)PLS-DA models were obtained for both platforms (Q(2)[Platform#1]=0.89, 0.83 and 0.72; Q(2)[Platform#2]=0.86, 0.86 and 0.77 for Penedes, Ribera del Duero and Rioja wines respectively) with 100% correctly classified samples in all cases. At the annotated metabolite level, the platforms proposed 9 and 8 annotated metabolites respectively, which were identified by matching standards or the MS/MS spectra of the compounds. At this stage, there was no overlap between platforms regarding the suggested metabolites. When screened on the raw data, 6 and 5 of these compounds were detected on the other platform with a similar trend. Some of the detected metabolites showed complementary information when integrated on biological pathways. Through the use of some examples at the annotated metabolite level, possible explanations of this initial divergence in the results are presented. This work shows the complications that may arise when comparing non-targeted metabolomics platforms, even when metabolite-focused approaches are used in the identification. Copyright © 2016 Elsevier B.V. All rights reserved.
Guo, An Chi; Jewison, Timothy; Wilson, Michael; Liu, Yifeng; Knox, Craig; Djoumbou, Yannick; Lo, Patrick; Mandal, Rupasri; Krishnamurthy, Ram; Wishart, David S.
The Escherichia coli Metabolome Database (ECMDB, http://www.ecmdb.ca) is a comprehensively annotated metabolomic database containing detailed information about the metabolome of E. coli (K-12). Modelled closely on the Human and Yeast Metabolome Databases, the ECMDB contains >2600 metabolites with links to ~1500 different genes and proteins, including enzymes and transporters. The information in the ECMDB has been collected from dozens of textbooks, journal articles and electronic databases. E...
The FDA mandates that digital electrocardiograms (ECGs) from 'thorough' QTc trials be submitted into the ECG Warehouse in Health Level 7 extended markup language format with annotated onset and offset points of waveforms. The FDA did not disclose the exact Warehouse metrics and minimal acceptable quality standards. The author describes the Warehouse scoring algorithms and metrics used by FDA, points out ways to improve FDA review and suggests Warehouse benefits for pharmaceutical sponsors. The Warehouse ranks individual ECGs according to their score for each quality metric and produces histogram distributions with Warehouse-specific thresholds that identify ECGs of questionable quality. Automatic Warehouse algorithms assess the quality of QT annotation and duration of manual QT measurement by the central ECG laboratory.
Background: Macromolecular visualization as well as automated structural and functional annotation tools play an increasingly important role in the post-genomic era, contributing significantly towards the understanding of molecular systems and processes. For example, three-dimensional (3D) models help in exploring protein active sites and functional hot spots that can be targeted in drug design. Automated annotation and visualization pipelines can also reveal other functionally important attributes of macromolecules. These goals depend on the availability of advanced tools that better integrate the existing databases, annotation servers and other resources with state-of-the-art rendering programs. Results: We present a new tool for protein structure analysis, with the focus on annotation and visualization of protein complexes, which is an extension of our previously developed POLYVIEW web server. By integrating web technology with state-of-the-art software for macromolecular visualization, such as the PyMol program, POLYVIEW-3D enables combining versatile structural and functional annotations with a simple web-based interface for creating publication-quality structure renderings, as well as animated images for Powerpoint™, web sites and other electronic resources. The service is platform independent and no plug-ins are required. Several examples of how POLYVIEW-3D can be used for structural and functional analysis in the context of protein-protein interactions are presented to illustrate the available annotation options. Conclusion: The POLYVIEW-3D server features PyMol image rendering that provides detailed and high-quality presentation of macromolecular structures, with an easy-to-use web-based interface. POLYVIEW-3D also provides a wide array of options for automated structural and functional analysis of proteins and their complexes. Thus, the POLYVIEW-3D server may become an important resource for researchers and educators in
Johannesen, Lars; Galeotti, Loriano
An algorithm to determine the quality of electrocardiograms (ECGs) can enable inexperienced nurses and paramedics to record ECGs of sufficient diagnostic quality. Previously, we proposed an algorithm for determining if ECG recordings are of acceptable quality, which was entered in the PhysioNet Challenge 2011. In the present work, we propose an improved two-step algorithm, which first rejects ECGs with macroscopic errors (signal absent, large voltage shifts or saturation) and subsequently quantifies the noise (baseline, powerline or muscular noise) on a continuous scale. The performance of the improved algorithm was evaluated using the PhysioNet Challenge database (1500 ECGs rated by humans for signal quality). We achieved a classification accuracy of 92.3% on the training set and 90.0% on the test set. The improved algorithm is capable of detecting ECGs with macroscopic errors and giving the user a score of the overall quality. This allows the user to assess the degree of noise and decide if it is acceptable depending on the purpose of the recording.
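The two-step idea (reject gross failures first, then score noise on a continuous scale) can be sketched as follows. This is not the authors' algorithm: the thresholds are invented, large voltage shifts are not handled, and the noise score used here (mean squared sample-to-sample difference divided by signal variance) is a stand-in assumption for their baseline, powerline and muscular noise estimates.

```python
import math
import statistics


def assess_ecg(signal, saturation_level=5.0, flat_range=1e-3):
    """Step 1: reject macroscopic errors. Step 2: return a noise score.

    Thresholds are hypothetical; the score is a crude high-frequency proxy
    where larger values indicate a noisier recording.
    """
    # Step 1a: signal absent (essentially a flat line).
    if max(signal) - min(signal) < flat_range:
        return ("reject", "signal absent")
    # Step 1b: amplifier saturation.
    if max(abs(x) for x in signal) >= saturation_level:
        return ("reject", "saturation")
    # Step 2: continuous noise score.
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    high_freq_power = sum(d * d for d in diffs) / len(diffs)
    return ("scored", high_freq_power / statistics.pvariance(signal))


# Synthetic test signals (not real ECGs).
clean = [math.sin(2 * math.pi * k / 100) for k in range(500)]
noisy = [x + 0.2 * (-1) ** k for k, x in enumerate(clean)]
flat = [0.0] * 500
clipped = clean[:250] + [5.0] * 250

for name, sig in [("clean", clean), ("noisy", noisy),
                  ("flat", flat), ("clipped", clipped)]:
    print(name, assess_ecg(sig))
```

Returning a score rather than a pass/fail verdict in the second step leaves the acceptance decision to the user, in line with the abstract's point that the acceptable degree of noise depends on the purpose of the recording.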
Liang, Yu-Jen; Lin, Yu-Ting; Chen, Chia-Wei; Lin, Chien-Wei; Chao, Kun-Mao; Pan, Wen-Harn; Yang, Hsin-Chou
Metabolomics data provide unprecedented opportunities to decipher metabolic mechanisms by analyzing hundreds to thousands of metabolites. Data quality concerns and complex batch effects in metabolomics must be appropriately addressed through statistical analysis. This study developed an integrated analysis tool for metabolomics studies to streamline the complete analysis flow from initial data preprocessing to downstream association analysis. We developed Statistical Metabolomics Analysis-An R Tool (SMART), which can analyze input files with different formats, visually represent various types of data features, implement peak alignment and annotation, conduct quality control for samples and peaks, explore batch effects, and perform association analysis. A pharmacometabolomics study of antihypertensive medication was conducted and data were analyzed using SMART. Neuromedin N was identified as a metabolite significantly associated with angiotensin-converting-enzyme inhibitors in our metabolome-wide association analysis (p = 1.56 × 10^-4 in an analysis of covariance (ANCOVA) with an adjustment for unknown latent groups and p = 1.02 × 10^-4 in an ANCOVA with an adjustment for hidden substructures). This endogenous neuropeptide is highly related to neurotensin and neuromedin U, which are involved in blood pressure regulation and smooth muscle contraction. The SMART software, a user guide, and example data can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/metabolomics/SMART.htm.
Bernillon, Stéphane; Biais, Benoit; Deborde, Catherine
Melon (Cucumis melo L.) is a global crop in terms of economic importance and nutritional quality. The aim of this study was to explore the variability in metabolite and elemental composition of several commercial varieties of melon in various environmental conditions. Volatile and non......-volatile metabolites as well as mineral elements were profiled in the flesh of mature fruit, employing a range of complementary analytical technologies. More than 1,000 metabolite signatures and 19 mineral elements were determined. Data analyses revealed variations related to factors such as variety, growing season...... tools to characterize the quality of fruits cultivated under commercial conditions. They can also provide knowledge on fruit metabolism and the mechanisms of plant response to environmental modifications, thereby paving the way for metabolomics-guided improvement of cultural practices for better fruit...
Diet, dietary patterns, and other environmental factors such as exposure to toxins play an important role in the prevention or development of many diseases, such as obesity and type 2 diabetes, and consequently in the health status of individuals. A major challenge nowadays is to identify novel biomarkers that detect metabolic dysfunction as early as possible and predict the evolution of health status, in order to refine nutritional advice for specific population groups. Omics technologies such as genomics, transcriptomics, proteomics, and metabolomics, coupled with statistical and bioinformatics tools, have already shown great potential in this research field, even if only a few biomarkers have been validated so far. For the past two decades, important analytical techniques have been developed to detect as many metabolites as possible in human biofluids such as urine, blood, and saliva. In the field of food science and nutrition, many studies have been carried out on food authenticity, quality, and safety, as well as on food processing. Furthermore, metabolomic investigations have been carried out to discover new early biomarkers of metabolic dysfunction and predictive biomarkers of developing pathologies (obesity, metabolic syndrome, type 2 diabetes, etc.). Great emphasis is also placed on the development of methodologies to identify and validate biomarkers of nutrient exposure. © 2017 Elsevier Inc. All rights reserved.
Metabolomics provides a holistic approach to investigate the perturbations in human metabolism with respect to a specific exposure. In nutritional metabolomics, the research question is generally related to the effect of a specific food intake on metabolic profiles commonly of plasma or urine...... strategy influences the patterns identified as important for the nutritional question under study. Therefore, in depth understanding of the study design and the specific effects of the analytical technology on the produced data is extremely important to achieve high quality data handling. Besides data...... handling, this thesis also deals with biological interpretation of postprandial metabolism and trans fatty acid (TFA) intake. Two nutritional issues were objects of investigation: 1) metabolic states as a function of time since the last meal and 2) markers related to intakes of cis- and trans-fat. Plasma...
Morris, P. J.; Kelly, M. A.; Lowery, D. B.; Macklin, J. A.; Morris, R. A.; Tremonte, D.; Wang, Z.
The single greatest problem with the federation of scientific data is the assessment of the quality and validity of the aggregated data in the context of particular research problems, that is, its fitness for use. There are three critical data quality issues in networks of distributed natural science collections data, as in all scientific data: identifying and correcting errors, maintaining currency, and assessing fitness for use. To this end, we have designed and implemented a prototype network in the domain of natural science collections. This prototype is built over the open source Map-Reduce platform Hadoop with a network client in the open source collections management system Specify 6. We call this network “Filtered Push” as, at its core, annotations are pushed from the network edges to relevant authoritative repositories, where humans and software filter the annotations before accepting them as changes to the authoritative data. The Filtered Push software is a domain-neutral framework for originating, distributing, and analyzing record-level annotations. Network participants can subscribe to notifications arising from ontology-based analyses of new annotations or of purpose-built queries against the network's global history of annotations. Quality and fitness for use of distributed natural science collections data can be addressed with Filtered Push software by implementing a network that allows data providers and consumers to define potential errors in data, develop metrics for those errors, specify workflows to analyze distributed data to detect potential errors, and close the quality management cycle by providing a network architecture for pushing assertions about data quality, such as corrections, back to the curators of the participating data sets. Quality issues in distributed scientific data have several things in common: (1) Statements about data quality should be regarded as hypotheses about inconsistencies between perhaps several records, data
Saito, Kazuki; Matsuda, Fumio
Metabolomics now plays a significant role in fundamental plant biology and applied biotechnology. Plants collectively produce a huge array of chemicals, far more than are produced by most other organisms; hence, metabolomics is of great importance in plant biology. Although substantial improvements have been made in the field of metabolomics, the uniform annotation of metabolite signals in databases and informatics through international standardization efforts remains a challenge, as does the development of new fields such as fluxome analysis and single cell analysis. The principle of transcript and metabolite cooccurrence, particularly transcriptome coexpression network analysis, is a powerful tool for decoding the function of genes in Arabidopsis thaliana. This strategy can now be used for the identification of genes involved in specific pathways in crops and medicinal plants. Metabolomics has gained importance in biotechnology applications, as exemplified by quantitative loci analysis, prediction of food quality, and evaluation of genetically modified crops. Systems biology driven by metabolome data will aid in deciphering the secrets of plant cell systems and their application to biotechnology.
Tedersoo, Leho; Abarenkov, Kessy; Nilsson, R Henrik; Schüssler, Arthur; Grelet, Gwen-Aëlle; Kohout, Petr; Oja, Jane; Bonito, Gregory M; Veldre, Vilmar; Jairus, Teele; Ryberg, Martin; Larsson, Karl-Henrik; Kõljalg, Urmas
Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source were supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.
Nakato, Ryuichiro; Shirahige, Katsuhiko
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis can detect protein/DNA-binding and histone-modification sites across an entire genome. Recent advances in sequencing technologies and analyses enable us to compare hundreds of samples simultaneously; such large-scale analysis has potential to reveal the high-dimensional interrelationship level for regulatory elements and annotate novel functional genomic regions de novo. Because many experimental considerations are relevant to the choice of a method in a ChIP-seq analysis, the overall design and quality management of the experiment are of critical importance. This review offers guiding principles of computation and sample preparation for ChIP-seq analyses, highlighting the validity and limitations of the state-of-the-art procedures at each step. We also discuss the latest challenges of single-cell analysis that will encourage a new era in this field. © The Author 2016. Published by Oxford University Press.
Zhang, Wenchao; Zhao, Patrick X
Extracted ion chromatogram (EIC) extraction and chromatographic peak detection are two important processing procedures in liquid chromatography/mass spectrometry (LC/MS)-based metabolomics data analysis. Most commonly, the LC/MS technique employs electrospray ionization as the ionization method. The EICs from LC/MS data are often noisy and contain high background signals. Furthermore, the chromatographic peak quality varies with respect to its location in the chromatogram and most peaks have zigzag shapes. Therefore, there is a critical need to develop effective metrics for quality evaluation of EICs and chromatographic peaks in LC/MS-based metabolomics data analysis. We investigated a comprehensive set of potential quality evaluation metrics for extracted EICs and detected chromatographic peaks. Specifically, for EIC quality evaluation, we analyzed the mass chromatographic quality index (MCQ index) and propose a novel quality evaluation metric, the EIC-related global zigzag index, which is based on an EIC's first order derivatives. For chromatographic peak quality evaluation, we analyzed and compared six metrics: sharpness, Gaussian similarity, signal-to-noise ratio, peak significance level, triangle peak area similarity ratio and the peak-related local zigzag index. Although the MCQ index is suited for selecting and aligning analyte components, it cannot fairly evaluate EICs with high background signals or those containing only a single peak. Our proposed EIC-related global zigzag index is robust enough to evaluate EIC qualities in both scenarios. Of the six peak quality evaluation metrics, the sharpness, peak significance level, and zigzag index outperform the others due to the zigzag nature of LC/MS chromatographic peaks. Furthermore, using several peak quality metrics in combination is more efficient than individual metrics in peak quality evaluation.
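To make the idea of derivative-based and shape-based peak metrics concrete, here is a rough sketch of two such measures. The abstract does not give the formulas, so both definitions are assumptions for illustration only: `zigzag_fraction` counts sign flips in the first differences, and `gaussian_similarity` is the Pearson correlation between a peak and a Gaussian with the same intensity-weighted mean and variance. Neither should be read as the published zigzag index or Gaussian similarity metric.

```python
import math
from statistics import mean, pstdev

def zigzag_fraction(y):
    """Fraction of consecutive first differences that change sign
    (hypothetical stand-in for a zigzag measure; 0 = monotone, 1 = fully jagged)."""
    d = [b - a for a, b in zip(y, y[1:]) if b != a]
    if len(d) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(d, d[1:]) if a * b < 0)
    return flips / (len(d) - 1)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero spread."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

def gaussian_similarity(y):
    """Correlation between the peak and a moment-matched Gaussian.
    Assumes non-negative intensities that are not all zero."""
    total = sum(y)
    mu = sum(i * v for i, v in enumerate(y)) / total
    var = sum(v * (i - mu) ** 2 for i, v in enumerate(y)) / total
    model = [math.exp(-((i - mu) ** 2) / (2 * var)) for i in range(len(y))]
    return pearson(y, model)

# Synthetic peaks: a smooth Gaussian-shaped peak and a jagged copy of it.
smooth = [math.exp(-((i - 10) ** 2) / 8) for i in range(21)]
jagged = [v + (0.3 if i % 2 else 0.0) for i, v in enumerate(smooth)]
```

On these synthetic peaks the jagged trace gets a higher `zigzag_fraction` and a lower `gaussian_similarity` than the smooth one, which is the kind of separation a combined set of metrics relies on.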
Wei, Bih-Rong; Simpson, R Mark
Standardization of biorepository best practices will enhance the quality of translational biomedical research utilizing patient-derived biobank specimens. Harmonization of pathology quality assurance procedures for biobank accessions has lagged behind other avenues of biospecimen research and biobank development. Comprehension of the cellular content of biorepository specimens is important for discovery of tissue-specific clinically relevant biomarkers for diagnosis and treatment. While rapidly emerging technologies in molecular analyses and data mining create focus on appropriate measures for minimizing pre-analytic artifact-inducing variables, less attention gets paid to annotating the constituent makeup of biospecimens for more effective specimen selection by biobank clients. Both pre-analytic tissue processing and specimen composition influence acquisition of relevant macromolecules for downstream assays. Pathologist review of biorepository submissions, particularly tissues as part of quality assurance procedures, helps to ensure that the intended target cells are present and in sufficient quantity in accessioned specimens. This manual procedure can be tedious and subjective. Incorporating digital pathology into biobank quality assurance procedures, using automated pattern recognition morphometric image analysis to quantify tissue feature areas in digital whole slide images of tissue sections, can minimize variability and subjectivity associated with routine pathologic evaluations in biorepositories. Whole-slide images and pathologist-reviewed morphometric analyses can be provided to researchers to guide specimen selection. Harmonization of pathology quality assurance methods that minimize subjectivity and improve reproducibility among collections would facilitate research-relevant specimen selection by investigators and could facilitate information sharing in an integrated network approach to biobanking. Published by Elsevier Inc.
Wei, Bih-Rong; Simpson, R. Mark
Standardization of biorepository best practices will enhance the quality of translational biomedical research utilizing patient-derived biobank specimens. Harmonization of pathology quality assurance procedures for biobank accessions has lagged behind other avenues of biospecimen research and biobank development. Comprehension of the cellular content of biorepository specimens is important for discovery of tissue-specific clinically relevant biomarkers for diagnosis and treatment. While rapidly emerging technologies in molecular analyses and data mining create focus on appropriate measures for minimizing pre-analytic artifact-inducing variables, less attention gets paid to annotating the constituent makeup of biospecimens for more effective specimen selection by biobank clients. Both pre-analytic tissue processing and a specimen's composition influence acquisition of relevant macromolecules for downstream assays. Pathologist review of biorepository submissions, particularly tissues as part of quality assurance procedures, helps to ensure that the intended target cells are present and in sufficient quantity in accessioned specimens. This manual procedure can be tedious and subjective. Incorporating digital pathology into biobank quality assurance procedures, using automated pattern recognition morphometric image analysis to quantify tissue feature areas in digital whole slide images of tissue sections, can minimize variability and subjectivity associated with routine pathologic evaluations in biorepositories. Whole-slide images and pathologist-reviewed morphometric analyses can be provided to researchers to guide specimen selection. Harmonization of pathology quality assurance methods that minimize subjectivity and improve reproducibility among collections would facilitate research-relevant specimen selection by investigators and could facilitate information sharing in an integrated network approach to biobanking. PMID:24362266
Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
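For readers unfamiliar with the basic association rule learning methodology that the abstract uses as its baseline, the sketch below mines pairwise rules "term A implies term B" from protein annotations using support and confidence. It is a generic illustration, not the authors' improved algorithm; the protein identifiers, term labels, and thresholds are made up for the example.

```python
from collections import Counter
from itertools import permutations

def mine_rules(annotations, min_support=0.3, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) tuples.

    annotations: dict mapping a protein id to an iterable of function terms.
    support    = fraction of proteins annotated with both terms
    confidence = fraction of proteins with the antecedent that also have the consequent
    """
    n = len(annotations)
    single = Counter()
    pair = Counter()
    for terms in annotations.values():
        terms = set(terms)
        single.update(terms)
        pair.update(permutations(sorted(terms), 2))  # ordered pairs (a, b), a != b
    rules = []
    for (a, b), count in pair.items():
        support = count / n
        confidence = count / single[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Hypothetical toy annotation set (labels are illustrative, not real GO records).
toy = {
    "P1": ["kinase activity", "ATP binding"],
    "P2": ["kinase activity", "ATP binding"],
    "P3": ["ATP binding"],
    "P4": ["kinase activity", "ATP binding"],
    "P5": ["DNA binding"],
}
rules = mine_rules(toy)
```

In this toy set every protein with "kinase activity" also has "ATP binding", so that rule passes both thresholds, while the reverse rule has a confidence of 0.75 and is filtered out. A protein carrying the antecedent without the consequent would then be a candidate for curator review.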
Laiakis, Evagelia C; Trani, Daniela; Moon, Bo-Hyun; Strawn, Steven J; Fornace, Albert J
As space travel is expanding to include private tourism and travel beyond low-Earth orbit, so is the risk of exposure to space radiation. Galactic cosmic rays and solar particle events have the potential to expose space travelers to significant doses of radiation that can lead to increased cancer risk and other adverse health consequences. Metabolomics has the potential to assess an individual's risk by exploring the metabolic perturbations in a biofluid or tissue. In this study, C57BL/6 mice were exposed to 0.5 and 2 Gy of 1 GeV/nucleon of protons and the levels of metabolites were evaluated in urine at 4 h after radiation exposure through liquid chromatography coupled to time-of-flight mass spectrometry. Significant differences were identified in metabolites that map to the tricarboxylic acid (TCA) cycle and fatty acid metabolism, suggesting that energy metabolism is severely impacted after exposure to protons. Additionally, various pathways of amino acid metabolism (tryptophan, tyrosine, arginine and proline and phenylalanine) were affected with potential implications for DNA damage repair and cognitive impairment. Finally, presence of products of purine and pyrimidine metabolism points to direct DNA damage or increased apoptosis. Comparison of these metabolomic data to previously published data from our laboratory with gamma radiation strongly suggests a more pronounced effect on metabolism with protons. This is the first metabolomics study with space radiation in an easily accessible biofluid such as urine that further investigates and exemplifies the biological differences at early time points after exposure to different radiation qualities.
Sarapa, Nenad; Mortara, Justin L; Brown, Barry D; Isola, Lamberto; Badilini, Fabio
The US Food and Drug Administration recommends submission of digital electrocardiograms in the standard HL7 XML format into the electrocardiogram warehouse to support preapproval review of new drug applications. The Food and Drug Administration scrutinizes electrocardiogram quality by viewing the annotated waveforms and scoring electrocardiogram quality by the warehouse algorithms. Part of the Food and Drug Administration warehouse is commercially available to sponsors as the E-Scribe Warehouse. The authors tested the performance of E-Scribe Warehouse algorithms by quantifying electrocardiogram acquisition quality, adherence to QT annotation protocol, and T-wave signal strength in 2 data sets: "reference" (104 digital electrocardiograms from a phase I study with sotalol in 26 healthy subjects with QT annotations by computer-assisted manual adjustment) and "test" (the same electrocardiograms with an intentionally introduced predefined number of quality issues). The E-Scribe Warehouse correctly detected differences between the 2 sets expected from the number and pattern of errors in the "test" set (except for 1 subject with QT misannotated in different leads of serial electrocardiograms) and confirmed the absence of differences where none was expected. E-Scribe Warehouse scores below the threshold value identified individual electrocardiograms with questionable T-wave signal strength. The E-Scribe Warehouse showed satisfactory performance in detecting electrocardiogram quality issues that may impair reliability of QTc assessment in clinical trials in healthy subjects.
Robert A Morris
Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema.
Chagoyen, Mónica; López-Ibáñez, Javier; Pazos, Florencio
Metabolomics aims to characterize the repertoire of small chemical compounds in a biological sample. As experiments grow in scale and larger sets of compounds are detected, functional analysis is required to convert these raw lists of compounds into biological knowledge. The most common way of performing such analysis is "annotation enrichment analysis," also used in transcriptomics and proteomics. This approach extracts the annotations overrepresented in the set of chemical compounds arising from a given experiment. Here, we describe the protocols for performing such analysis as well as for visualizing a set of compounds in different representations of the metabolic networks, in both cases using freely accessible web tools.
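Annotation enrichment (over-representation) analysis is commonly implemented with a one-sided hypergeometric test, and the sketch below shows that calculation. The abstract does not say which test the described web tools use, so the choice of test, the function names, and the toy counts are assumptions made for illustration.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: compounds in the background set
    K: background compounds carrying the annotation
    n: compounds in the experimental (query) set
    k: query compounds carrying the annotation
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enrich(query, annotation_sets, background_size):
    """Return {annotation: p-value} for every annotation that hits the query.

    annotation_sets: dict mapping an annotation label to the set of
    background compounds that carry it (hypothetical input format).
    """
    query = set(query)
    results = {}
    for label, members in annotation_sets.items():
        k = len(query & members)
        if k:
            results[label] = hypergeom_pvalue(k, len(query), len(members), background_size)
    return results

# Toy example: 5 of 10 query compounds fall in a pathway that holds
# 10 of 100 background compounds.
p = hypergeom_pvalue(5, 10, 10, 100)
```

With these toy counts about one hit would be expected by chance, so five hits yield a small p-value. In practice the p-values from many annotations would also be corrected for multiple testing.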
Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke
A metabolome--the collection of comprehensive quantitative data on metabolites in an organism--has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.
Picone, Gianfranco; Engelsen, Søren Balling; Savorani, Francesco; Testi, Silvia; Badiani, Anna; Capozzi, Francesco
The molecular profiles of perchloric acid solutions extracted from the flesh of Sparus aurata fish specimens, produced according to different aquaculture systems, have been investigated. The 1H-NMR spectra of aqueous extracts are indicative of differences in the metabolite content of fish reared under different conditions that are already distinguishable at their capture, and substantially maintain the same differences in their molecular profiles after sixteen days of storage under ice. The fish metabolic profiles are studied by top-down chemometric analysis. The results of this exploratory investigation show that the fish metabolome accurately reflects the rearing conditions. The levels of many metabolites co-vary with the rearing conditions, and a few metabolites are quantified, including glycogen (stress indicator), histidine, alanine and glycine, which all display significant changes dependent on the aquaculture system and on the storage times. PMID:22254093
Overmars, L.; Siezen, R.J.; Francke, C.
The identification of translation initiation sites (TISs) constitutes an important aspect of sequence-based genome analysis. An erroneous TIS annotation can impair the identification of regulatory elements and N-terminal signal peptides, and also may flaw the determination of descent, for any
Kodani, Yoshinori; Miyakawa, Takuya; Komatsu, Tomohiko; Tanokura, Masaru
Analytical methodologies to comprehensively evaluate beef quality are increasingly needed to accelerate improvement in both breeding and post-mortem processing. Consumer palatability towards beef is generally attributed to tenderness, flavor, and/or juiciness. These primary qualities are modified by post-mortem aging and the crude content and fatty acid composition of intramuscular fat. In this study, we report nuclear magnetic resonance (NMR)-based metabolic profiles of Japanese Black cattle to evaluate the compositional attributes of intramuscular fat and the long-term post-mortem aging. The unsaturation degree of triacylglycerol was estimated from the 1H NMR spectra and was correlated with the content ratio of unsaturated fatty acids (R² = 0.944) and the melting point of intramuscular fat (R² = 0.871). NMR-detected profiles of water-soluble metabolites revealed overall metabolic change (R² = 0.951) and several metabolites (R² > 0.818) linearly correlated with long-term aging duration, which can be used to evaluate the aging rate and aging duration of beef. This approach also provided the pH profile during aging, which is related to the water-holding capacity of beef. Thus, NMR-based metabolomics has the potential to evaluate multiple parameters related to the beef qualities of Japanese Black cattle.
Royère, D; Feuerstein, P; Cadoret, V; Puard, V; Uzbekova, S; Dalbies-Tran, R; Teusan, R; Houlgatte, R; Labas, V; Guérif, F
Preimplantation embryo development is, together with implantation itself, one of the key steps in achieving a pregnancy. Assisted reproductive technologies in both humans and animals have improved our knowledge of these events, although predicting the potential of an embryo to result in a live birth remains elusive. Among the various ways to define embryo viability, noninvasive approaches have a serious advantage, since the embryo is ultimately transferred. Techniques that characterize the embryo secretome using proteomic or metabolomic approaches may be noninvasive. Based on direct identification of products of embryo metabolism or on assessment of profile(s) related to embryo viability, they have greatly improved in sensitivity, which would allow their use in clinical embryology once validated. The oocyte-cumulus dialogue, a key factor in oocyte competence for meiosis and embryo development, has been addressed in particular through genomic and proteomic assessment of cumulus cells. While it is not yet possible to say which of these approaches will be robust and cost-efficient enough to routinely help the clinical embryologist in assisted reproductive techniques (ART), one can predict that our ability to select the "right" embryo will combine the morphological criteria already available with validated biomarkers.
D'Alessandro, Angelo; Marrocco, Cristina; Zolla, Valerio; D'Andrea, Mariasilvia; Zolla, Lello
Longissimus lumborum muscles from high fat-deposing Casertana and lean meat Large White (LW) pigs were assayed for meat quality parameters, including early and ultimate post mortem pH, water holding capacity and Minolta L*a*b* values. These parameters were correlated to results from differential proteomic and targeted metabolomic analyses. Higher levels of glycolytic enzymes and lactate accumulation were related to slow pH drop in Casertana pigs, albeit not to rapid pH lowering in LW counterparts. On the other hand, the individuation of pyruvate kinase M1 and tropomyosin levels in LW were related to water holding capacity and Minolta values at 24 h after slaughter. Bioinformatic analyses strengthened the correlation between over-expression of structural proteins in LW and more accentuated growth aptitude in this breed. Conversely, enzymes taking part in branching glycolytic reactions, such as glycerol 3-phosphate and creatine kinase M, were related to accentuated lipogenesis and slower albeit prolonged glycolytic rate in Casertana, respectively. Breed-specific differences at the protein level were not only related to growth performances and fat accumulation tendency in vivo, but they also affected post mortem performances through a direct influence on the forcedly anaerobic behavior of pig muscles after slaughter. Copyright © 2011 Elsevier B.V. All rights reserved.
Anna A Vanyushkina
We present a systematic study of three bacterial species that belong to the class Mollicutes, the smallest and simplest bacteria: Spiroplasma melliferum, Mycoplasma gallisepticum, and Acholeplasma laidlawii. To understand the difference in the basic principles of metabolism regulation and adaptation to environmental conditions in the three species, we analyzed the metabolome of these bacteria. Metabolic pathways were reconstructed using the proteogenomic annotation data provided by our lab. The results of metabolome, proteome and genome profiling suggest a fundamental difference in the adaptation of the three closely related Mollicute species to stress conditions. As the transaldolase is not annotated in Mollicutes, we propose variants of the pentose phosphate pathway catalyzed by annotated enzymes for the three species. For metabolite detection we employed high performance liquid chromatography coupled with mass spectrometry. We used hydrophilic interaction liquid chromatography with a silica column, as it effectively separates highly polar cellular metabolites prior to their detection by the mass spectrometer.
Sundekilde, Ulrik; Larsen, Lotte Bach; Bertram, Hanne Christine S.
and processing capabilities of bovine milk is closely associated with milk composition. Metabolomics is ideal in the study of the low-molecular-weight compounds in milk, and this review focuses on the recent nuclear magnetic resonance (NMR)-based metabolomics trends in milk research, including applications linking...... the milk metabolite profiling with nutritional aspects, and applications which aim to link the milk metabolite profile to various technological qualities of milk. The metabolite profiling studies encompass the identification of novel metabolites, which potentially can be used as biomarkers or as bioactive...... compounds. Furthermore, metabolomics applications elucidating how the differentially regulated genes affect milk composition are also reported. This review will highlight the recent advances in NMR-based metabolomics on milk, as well as give a brief summary of when NMR spectroscopy can be useful for gaining...
A metabolome, the collection of comprehensive quantitative data on metabolites in an organism, has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.
Chagoyen, Monica; Pazos, Florencio
While many tools exist for performing enrichment analysis of transcriptomic and proteomic data in order to interpret them in biological terms, almost no equivalent tools exist for metabolomic data. We present Metabolite Biological Role (MBRole), a web server for carrying out over-representation analysis of biological and chemical annotations in arbitrary sets of metabolites (small chemical compounds) coming from metabolomic data of any organism or sample. The web server is freely available at http://csbg.cnb.csic.es/mbrole. It was tested in the main web browsers.
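As an illustration of the kind of over-representation analysis described above, the following minimal Python sketch scores a single annotation with a one-sided hypergeometric test. The abstract does not state which statistic MBRole uses, so the choice of test, the function name `enrichment_p_value`, and the example counts are assumptions made for illustration only.

```python
from math import comb


def enrichment_p_value(hits, query_size, annotated, background):
    """One-sided hypergeometric p-value, P(X >= hits).

    background : number of metabolites in the reference set
    annotated  : how many of those carry the annotation of interest
    query_size : number of metabolites in the user's set
    hits       : how many metabolites in the user's set carry the annotation
    """
    total = comb(background, query_size)
    upper = min(annotated, query_size)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, query_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 background metabolites, 50 of them in a pathway,
# and a query set of 20 metabolites of which 5 fall in that pathway.
p = enrichment_p_value(hits=5, query_size=20, annotated=50, background=1000)
print(f"p = {p:.4g}")
```

In practice one such p-value would be computed per annotation and then corrected for multiple testing; that step is omitted here.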
The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-...
Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.
The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855
Chagoyen, Monica; Pazos, Florencio
The so-called 'omics' approaches used in modern biology aim at massively characterizing the molecular repertoires of living systems at different levels. Metabolomics is one of the latest additions to the 'omics' family and it deals with the characterization of the set of metabolites in a given biological system. As metabolomic techniques become more massive and allow characterizing larger sets of metabolites, automatic methods for analyzing these sets in order to obtain meaningful biological information are required. Only recently have the first tools specifically designed for this task in metabolomics appeared. They are based on approaches previously used in transcriptomics and other 'omics', such as annotation enrichment analysis. These, together with generic tools for metabolic analysis and visualization not specifically designed for metabolomics, will surely be in the toolbox of researchers doing metabolomic experiments in the near future.
Metabolomic analysis of plants broadens understanding of how plants may benefit humans, animals and the environment, provide sustainable food and energy, and improve current agricultural, pharmacological and medicinal practices in order to bring about healthier and longer life. The quality...... and amount of the extractible biological information is largely determined by data acquisition, data processing and analysis methodologies of the plant metabolomics studies. This PhD study focused mainly on the development and implementation of new metabolomics methodologies for improved data acquisition...... and data processing. The study mainly concerned the three most commonly applied analytical techniques in plant metabolomics, GC-MS, LC-MS and NMR. In addition, advanced chemometrics methods e.g. PARAFAC2 and ASCA have been extensively used for development of complex metabolomics data processing...
Tohge, Takayuki; Fernie, Alisdair R
Tomato was one of the first plant species to be evaluated using metabolomics and remains one of the best characterized, with tomato fruit being both an important source of nutrition in the human diet and a valuable model system for the development of fleshy fruits. Additionally, given the broad habitat range of members of the tomato clade and the extensive use of exotic germplasm in tomato genetic research, it represents an excellent genetic model system for understanding both metabolism per se and the importance of various metabolites in conferring stress tolerance. This review summarizes technical approaches used to characterize the tomato metabolome to date and details insights into metabolic pathway structure and regulation that have been obtained via analysis of tissue samples taken under different developmental or environmental circumstances as well as following genetic perturbation. Particular attention is paid to compounds of importance for nutrition or the shelf-life of tomatoes. We furthermore propose how metabolomics information can be coupled to the burgeoning wealth of genome sequence data from the tomato clade to further enhance our understanding of (i) the shifts in metabolic regulation occurring during development and (ii) specialization of metabolism within the tomato clade as a consequence of either adaptive evolution or domestication. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved.
Ranjbar Sistani, Nima; Kaul, Hans-Peter; Desalegn, Getinet; Wienkoop, Stefanie
In field peas, ascochyta blight, caused by Didymella pinodes, is one of the most common fungal diseases. Despite the high diversity of pea cultivars, only little resistance has been developed to date, still leading to significant losses in grain yield. Rhizobia, as plant growth promoting endosymbionts, are the main partners for establishment of symbiosis with pea plants. The key role of Rhizobium as an effective nitrogen source for improving legume seed quality and quantity is in line with sustainable agriculture and food security programs. Besides these growth promoting effects, Rhizobium symbiosis has been shown to have a priming impact on the plant's immune system that enhances resistance against environmental perturbations. This is the first integrative study that investigates the effect of Rhizobium leguminosarum bv. viciae (Rlv) on phenotypic seed quality, quantity and fungal disease in pot grown pea (Pisum sativum) cultivars with two different resistance levels against D. pinodes through metabolomics and proteomics analyses. In addition, the pathogen effects on seed quantity components and quality are assessed at the morphological and molecular level. Rhizobium inoculation decreased disease severity by significantly reducing the seed infection level. The Rhizobium symbiont enhanced yield through increased seed fresh and dry weights based on better seed filling. Rhizobium inoculation also induced changes in the seed proteome and metabolome involved in enhanced P. sativum resistance against D. pinodes. Besides increased redox and cell wall adjustments, light is shed on the role of late embryogenesis abundant proteins and metabolites such as the seed triterpenoid Soyasapogenol. The results of this study open new insights into the significance of symbiotic Rhizobium interactions for crop yield, health and seed quality enhancement and reveal new metabolite candidates involved in pathogen resistance. PMID:29204150
Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália
Document annotation is a key task in the development of Text Mining methods and applications. High quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although the existing annotation tools offer good user interaction interfaces to domain experts, project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects, and to evaluate annotation quality throughout the project life cycle. At the core, Marky is a Web application based on the open source CakePHP framework. User interface relies on HTML5 and CSS3 technologies. Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and JQuery technologies are used to enhance user-system interaction. Marky grants solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all the annotations made are saved by the tracking system and may be further compared. So, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similar visually intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Progress in improving crop growth is an absolute goal despite the influence multifactorial components have on crop yield and quality. An Avalon × Cadenza doubled-haploid wheat mapping population was used to study the leaf metabolome of field grown wheat at weekly intervals during the time in which the canopy contributes to grain filling, i.e., from anthesis to 5 weeks post-anthesis. Wheat was grown under four different nitrogen supplies ranging from residual soil N to a luxury over-fertilization (0, 100, 200, and 350 kg N ha−1). Four lines from a segregating doubled haploid population derived from a cross of the wheat elite cvs. Avalon and Cadenza were chosen as they showed pairwise differences in either N utilization efficiency (NUtE) or senescence timing. 108 annotated metabolites of primary metabolism and ions were determined. The analysis did not provide genotype specific markers because of a remarkable stability of the metabolome between lines. We speculate that the failure to identify genotypic markers might be due to insufficient genetic diversity of the wheat parents and/or the known tendency of plants to keep metabolome homeostasis even under adverse conditions through multiple adaptations and rescue mechanisms. The data, however, provided a consistent catalogue of metabolites and their respective responses to environmental and developmental factors and may bode well for future systems biology approaches, and support plant breeding and crop improvement.
Blake, J A; Dolan, M; Drabkin, H; Hill, D P; Li, Ni; Sitnikov, D; Bridges, S; Burgess, S; Buza, T; McCarthy, F; Peddinti, D; Pillai, L; Carbon, S; Dietze, H; Ireland, A; Lewis, S E; Mungall, C J; Gaudet, P; Chrisholm, R L; Fey, P; Kibbe, W A; Basu, S; Siegele, D A; McIntosh, B K; Renfro, D P; Zweifel, A E; Hu, J C; Brown, N H; Tweedie, S; Alam-Faruque, Y; Apweiler, R; Auchinchloss, A; Axelsen, K; Bely, B; Blatter, M -C; Bonilla, C; Bouguerleret, L; Boutet, E; Breuza, L; Bridge, A; Chan, W M; Chavali, G; Coudert, E; Dimmer, E; Estreicher, A; Famiglietti, L; Feuermann, M; Gos, A; Gruaz-Gumowski, N; Hieta, R; Hinz, C; Hulo, C; Huntley, R; James, J; Jungo, F; Keller, G; Laiho, K; Legge, D; Lemercier, P; Lieberherr, D; Magrane, M; Martin, M J; Masson, P; Mutowo-Muellenet, P; O'Donovan, C; Pedruzzi, I; Pichler, K; Poggioli, D; Porras Millán, P; Poux, S; Rivoire, C; Roechert, B; Sawford, T; Schneider, M; Stutz, A; Sundaram, S; Tognolli, M; Xenarios, I; Foulgar, R; Lomax, J; Roncaglia, P; Khodiyar, V K; Lovering, R C; Talmud, P J; Chibucos, M; Giglio, M Gwinn; Chang, H -Y; Hunter, S; McAnulla, C; Mitchell, A; Sangrador, A; Stephan, R; Harris, M A; Oliver, S G; Rutherford, K; Wood, V; Bahler, J; Lock, A; Kersey, P J; McDowall, D M; Staines, D M; Dwinell, M; Shimoyama, M; Laulederkind, S; Hayman, T; Wang, S -J; Petri, V; Lowry, T; D'Eustachio, P; Matthews, L; Balakrishnan, R; Binkley, G; Cherry, J M; Costanzo, M C; Dwight, S S; Engel, S R; Fisk, D G; Hitz, B C; Hong, E L; Karra, K; Miyasato, S R; Nash, R S; Park, J; Skrzypek, M S; Weng, S; Wong, E D; Berardini, T Z; Huala, E; Mi, H; Thomas, P D; Chan, J; Kishore, R; Sternberg, P; Van Auken, K; Howe, D; Westerfield, M
The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.
Fujimura, Yoshinori; Kurihara, Kana; Ida, Megumi; Kosaka, Reia; Miura, Daisuke; Wariishi, Hiroyuki; Maeda-Yamamoto, Mari; Nesumi, Atsushi; Saito, Takeshi; Kanda, Tomomasa; Yamada, Koji; Tachibana, Hirofumi
BACKGROUND: Green tea has various health promotion effects. Although there are numerous tea cultivars, little is known about the differences in their nutraceutical properties. Metabolic profiling techniques can provide information on the relationship between the metabolome and factors such as phenotype or quality. Here, we performed metabolomic analyses to explore the relationship between the metabolome and health-promoting attributes (bioactivity) of diverse Japanese green tea cultivars. MET...
Donoghue, Mildred R.
This annotated bibliography of contemporary multicultural books for children is divided into sections on: (1) non-fiction, biography (12 citations); (2) non-fiction, information (18 citations); (3) contemporary realistic fiction (14 citations); (4) folklore (11 citations); (5) historical fiction (11 citations); (6) modern fantasy (10 citations);…
Evolution of potent odorants within the volatile metabolome of high-quality hazelnuts (Corylus avellana L.): evaluation by comprehensive two-dimensional gas chromatography coupled with mass spectrometry.
Rosso, Marta Cialiè; Liberto, Erica; Spigolon, Nicola; Fontana, Mauro; Somenzi, Marco; Bicchi, Carlo; Cordero, Chiara
Within the pattern of volatiles released by food products (volatilome), potent odorants are bio-active compounds that trigger aroma perception by activating a complex array of odor receptors (ORs) in the regio olfactoria. Their informative role is fundamental to select optimal post-harvest and storage conditions and preserve food sensory quality. This study addresses the volatile metabolome from high-quality hazelnuts (Corylus avellana L.) from the Ordu region (Turkey) and Tonda Romana from Italy, and investigates its evolution throughout the production chain (post-harvest, industrial storage, roasting) to find functional correlations between technological strategies and product quality. The volatile metabolome is analyzed by headspace solid-phase microextraction combined with comprehensive two-dimensional gas chromatography and mass spectrometry. Dedicated pattern recognition, based on 2D data (targeted fingerprinting), is used to mine analytical outputs, while principal component analysis (PCA), Fisher ratio, hierarchical clustering, and analysis of variance are used to find decision makers among the most informative chemicals. Low-temperature drying (18-20 °C) has a decisive effect on quality; it correlates negatively with bacteria and mold metabolic activity, nut viability, and lipid oxidation products (2-methyl-1-propanol, 3-methyl-1-butanol, 2-ethyl-1-hexanol, 2-octanol, 1-octen-3-ol, hexanal, octanal and (E)-2-heptenal). Protective atmosphere storage (99% N2-1% O2) effectively limits lipid oxidation for 9-12 months after nut harvest. The combination of optimal drying and storage preserves the aroma potential; after roasting at different shelf-lives, key odorants responsible for malty and buttery (2- and 3-methylbutanal, 2,3-butanedione and 2,3-pentanedione), earthy (methylpyrazine, 2-ethyl-5-methyl pyrazine and 3-ethyl-2,5-dimethyl pyrazine) and caramel-like and musty notes (2,5-dimethyl-4-hydroxy-3(2H)-furanone - furaneol and acetyl pyrrole) show no
The Metabolomics and Epidemiology (MetEpi) Working Group promotes metabolomics analyses in population-based studies, as well as advancement in the field of metabolomics for broader biomedical and public health research.
Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography, which contains almost 100 citations of articles/books/resources involving topics related to communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators, Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted past years' proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners
Reidsma, Dennis; op den Akker, Hendrikus J.A.; Artstein, R.; Boleda, G.; Keller, F.; Schulte im Walde, S.
Many interesting phenomena in conversation can only be annotated as a subjective task, requiring interpretative judgements from annotators. This leads to data which is annotated with lower levels of agreement not only due to errors in the annotation, but also due to the differences in how annotators
Hall, R.D.; Brouwer, I.D.; Fitzgerald, M.A.
With the growing interest in the use of metabolomic technologies for a wide range of biological targets, food applications related to nutrition and quality are rapidly emerging. Metabolomics offers us the opportunity to gain deeper insights into, and have better control of, the fundamental
Cox, James E; Thummel, Carl S; Tennessen, Jason M
Metabolomic analysis provides a powerful new tool for studies of Drosophila physiology. This approach allows investigators to detect thousands of chemical compounds in a single sample, representing the combined contributions of gene expression, enzyme activity, and environmental context. Metabolomics has been used for a wide range of studies in Drosophila, often providing new insights into gene function and metabolic state that could not be obtained using any other approach. In this review, we survey the uses of metabolomic analysis since its entry into the field. We also cover the major methods used for metabolomic studies in Drosophila and highlight new directions for future research. Copyright © 2017 by the Genetics Society of America.
Selenium (Se) is an essential nutrient for humans, due to its antioxidant properties, whereas, to date, its essentiality to plants remains to be demonstrated. Nevertheless, when added to the cultivation substrate, it enhances plant growth. However, the concentration of Se in agricultural soils is highly variable, ranging from 0.01 mg kg-1 up to 10 mg kg-1 in seleniferous areas. Therefore, several studies have aimed at biofortifying crops with Se, and the approaches exploited were mainly based on the application of Se fertilizers. The aim of the present research was to assess the biofortification potential of Se in hydroponically grown strawberry fruits and its effects on qualitative parameters and nutraceutical compounds. The supplementation with Se did not negatively affect the growth and the yield of strawberries, and induced an accumulation of Se in fruits. Furthermore, the metabolomic analyses highlighted an increase in flavonoid and polyphenol compounds, which contribute to the organoleptic features and antioxidant capacity of fruits; in addition, an increase in fruit sweetness was also detected in biofortified strawberries. In conclusion, based on our observations, strawberry plants seem a good target for Se biofortification, thus allowing an increase in the human intake of this essential micronutrient.
The right annotation tool does not always exist for processing a particular natural language task. In these scenarios, researchers are required to build new annotation tools to fit the tasks at hand. However, developing new annotation tools is difficult and inefficient. There has not been careful consideration of software complexity in current annotation tools. Due to the problems of complexity, new annotation tools must reimplement common annotation features despite the availability of imple...
Matthew N Benedict
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information
strategy influences the patterns identified as important for the nutritional question under study. Therefore, in depth understanding of the study design and the specific effects of the analytical technology on the produced data is extremely important to achieve high quality data handling. Besides data...
In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list of tags under a limited number (e.g., 3) that covers as much of the image's information as possible. The tags in such a short list should be representative and diverse: they must not only correspond to the contents of the image, but also differ from each other. To this end, we treat image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. In addition, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider representation and ignore diversity. We therefore propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.
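To make the subset-selection idea concrete, here is a small Python sketch of choosing a short, diverse tag list from a DPP-style kernel. It is not the authors' method: the abstract describes sampling from a learned conditional DPP, whereas this sketch substitutes a simple greedy determinant maximization, and the names `greedy_dpp` and `forbidden`, the quality scores, and the similarity values are all hypothetical.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def greedy_dpp(quality, similarity, k, forbidden=frozenset()):
    """Greedily pick up to k items maximizing det(L_S), L_ij = q_i * S_ij * q_j.

    `forbidden` holds frozenset pairs (e.g. synonyms) that must not co-occur.
    """
    n = len(quality)
    kernel = [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
              for i in range(n)]
    chosen = []
    while len(chosen) < k:
        best, best_det = None, 0.0
        for i in range(n):
            if i in chosen or any(frozenset((i, j)) in forbidden for j in chosen):
                continue
            idx = chosen + [i]
            d = det([[kernel[r][c] for c in idx] for r in idx])
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no admissible candidate improves the determinant
            break
        chosen.append(best)
    return chosen


# Hypothetical candidate tags with made-up relevance and similarity values.
tags = ["dog", "puppy", "grass", "ball"]
quality = [0.9, 0.85, 0.6, 0.5]
similarity = [
    [1.0, 0.95, 0.1, 0.1],
    [0.95, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
selected = greedy_dpp(quality, similarity, k=3,
                      forbidden={frozenset((0, 1))})  # dog/puppy as synonyms
print([tags[i] for i in selected])
```

The determinant rewards high-quality tags while penalizing near-duplicates, so the highly similar second tag loses out to less relevant but more distinct ones even before the explicit synonym constraint is applied.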
Kuhlisch, Constanze; Pohnert, Georg
Chemical ecology elucidates the nature and role of natural products as mediators of organismal interactions. The emerging techniques that can be summarized under the concept of metabolomics provide new opportunities to study such environmentally relevant signaling molecules. Especially comparative tools in metabolomics enable the identification of compounds that are regulated during interaction situations and that might play a role as e.g. pheromones, allelochemicals or in induced and activated defenses. This approach helps overcoming limitations of traditional bioassay-guided structure elucidation approaches. But the power of metabolomics is not limited to the comparison of metabolic profiles of interacting partners. Especially the link to other -omics techniques helps to unravel not only the compounds in question but the entire biosynthetic and genetic re-wiring, required for an ecological response. This review comprehensively highlights successful applications of metabolomics in chemical ecology and discusses existing limitations of these novel techniques. It focuses on recent developments in comparative metabolomics and discusses the use of metabolomics in the systems biology of organismal interactions. It also outlines the potential of large metabolomics initiatives for model organisms in the field of chemical ecology.
Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem
Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.
Moseley, Hunter N B
Error analysis plays a fundamental role in describing the uncertainty in experimental results. It has several fundamental uses in metabolomics including experimental design, quality control of experiments, the selection of appropriate statistical methods, and the determination of uncertainty in results. Furthermore, the importance of error analysis has grown with the increasing number, complexity, and heterogeneity of measurements characteristic of 'omics research. The increase in data complexity is particularly problematic for metabolomics, which has more heterogeneity than other omics technologies due to the much wider range of molecular entities detected and measured. This review introduces the fundamental concepts of error analysis as they apply to a wide range of metabolomics experimental designs and it discusses current methodologies for determining the propagation of uncertainty in appropriate metabolomics data analysis. These methodologies include analytical derivation and approximation techniques, Monte Carlo error analysis, and error analysis in metabolic inverse problems. Current limitations of each methodology with respect to metabolomics data analysis are also discussed.
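As a worked illustration of two of the approaches named above, the following Python sketch propagates measurement uncertainty through a ratio of two metabolite intensities, once by Monte Carlo simulation and once with the standard first-order analytical approximation. The choice of a ratio, the assumption of independent Gaussian errors, the function names, and the numbers are illustrative assumptions, not taken from the review.

```python
import math
import random
import statistics


def monte_carlo_ratio(mean_a, sd_a, mean_b, sd_b, draws=20000, seed=0):
    """Simulate A/B with independent Gaussian errors; return (mean, sd)."""
    rng = random.Random(seed)
    ratios = [
        rng.gauss(mean_a, sd_a) / rng.gauss(mean_b, sd_b)
        for _ in range(draws)
    ]
    return statistics.mean(ratios), statistics.stdev(ratios)


def first_order_ratio_sd(mean_a, sd_a, mean_b, sd_b):
    """First-order (delta method) standard deviation of A/B."""
    relative = math.sqrt((sd_a / mean_a) ** 2 + (sd_b / mean_b) ** 2)
    return abs(mean_a / mean_b) * relative


# Hypothetical intensities: A = 100 +/- 5, B = 50 +/- 2 (arbitrary units).
mc_mean, mc_sd = monte_carlo_ratio(100.0, 5.0, 50.0, 2.0)
approx_sd = first_order_ratio_sd(100.0, 5.0, 50.0, 2.0)
print(f"Monte Carlo: {mc_mean:.3f} +/- {mc_sd:.3f}")
print(f"First-order: {100.0 / 50.0:.3f} +/- {approx_sd:.3f}")
```

With small relative errors the two estimates should agree closely; they diverge as the denominator's uncertainty grows, which is where simulation becomes the safer choice.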
Burghardt, Manuel; Wolff, Christian
We first discuss the problem of (syntactic) annotation of diachronic corpora and then present an evaluation study of more than 50 annotation tools and frameworks against a functional and software-ergonomic requirements profile based on the quality model of ISO/IEC 9126-1:2001 (Software engineering – Product quality – Part 1: Quality model) and ISO/IEC 25000:2005 (Software Engineering – Software product Quality Requirements and Evaluat...
Kodani, Yoshinori; Miyakawa, Takuya; Komatsu, Tomohiko; Tanokura, Masaru
Analytical methodologies to comprehensively evaluate beef quality are increasingly needed to accelerate improvement in both breeding and post-mortem processing. Consumer palatability towards beef is generally attributed to tenderness, flavor, and/or juiciness. These primary qualities are modified by post-mortem aging and the crude content and fatty acid composition of intramuscular fat. In this study, we report nuclear magnetic resonance (NMR)-based metabolic profiles of Japanese Black catt...
Lingutla, Nikhil Tej; Preece, Justin; Todorovic, Sinisa; Cooper, Laurel; Moore, Laura; Jaiswal, Pankaj
Large quantities of digital images are now generated for biological collections, including those developed in projects premised on the high-throughput screening of genome-phenome experiments. These images often carry annotations on taxonomy and observable features, such as anatomical structures and phenotype variations often recorded in response to the environmental factors under which the organisms were sampled. At present, most of these annotations are described in free text, may involve limited use of non-standard vocabularies, and rarely specify precise coordinates of features on the image plane such that a computer vision algorithm could identify, extract and annotate them. Therefore, researchers and curators need a tool that can identify and demarcate features in an image plane and allow their annotation with semantically contextual ontology terms. Such a tool would generate data useful for inter and intra-specific comparison and encourage the integration of curation standards. In the future, quality annotated image segments may provide training data sets for developing machine learning applications for automated image annotation. We developed a novel image segmentation and annotation software application, "Annotation of Image Segments with Ontologies" (AISO). The tool enables researchers and curators to delineate portions of an image into multiple highlighted segments and annotate them with an ontology-based controlled vocabulary. AISO is a freely available Java-based desktop application and runs on multiple platforms. It can be downloaded at http://www.plantontology.org/software/AISO. AISO enables curators and researchers to annotate digital images with ontology terms in a manner which ensures the future computational value of the annotated images. We foresee uses for such data-encoded image annotations in biological data mining, machine learning, predictive annotation, semantic inference, and comparative analyses.
R. R. Furina
The review shows the results of metabolomic studies in pulmonology. The key idea of metabolomics is to detect specific biomarkers in a biological sample for the diagnosis of diseases of the bronchi and lung. The main methods for the separation and identification of volatile organic substances as biomarkers (gas chromatography, mass spectrometry, and nuclear magnetic resonance spectrometry) used in metabolomics are given. A solid-phase microextraction method used to prepare a sample is also covered. The results of laboratory tests for biomarkers for lung cancer, acute respiratory distress syndrome, chronic obstructive pulmonary disease, cystic fibrosis, chronic infections, and pulmonary tuberculosis are presented. In addition, emphasis is placed on the possibilities of metabolomics used in experimental medicine, including the study of asthma. The information is of interest to both theorists and practitioners.
Scalbert, Augustin; Brennan, Lorraine; Manach, Claudine
The food metabolome is defined as the part of the human metabolome directly derived from the digestion and biotransformation of foods and their constituents. With >25,000 compounds known in various foods, the food metabolome is extremely complex, with a composition varying widely according...... to the diet. By its very nature it represents a considerable and still largely unexploited source of novel dietary biomarkers that could be used to measure dietary exposures with a high level of detail and precision. Most dietary biomarkers currently have been identified on the basis of our knowledge of food...... by the recent identification of novel biomarkers of intakes for fruit, vegetables, beverages, meats, or complex diets. Moreover, examples also show how the scrutiny of the food metabolome can lead to the discovery of bioactive molecules and dietary factors associated with diseases. However, researchers still...
Olivon, Florent; Roussi, Fanny; Litaudon, Marc; Touboul, David
New omics sciences generate massive amounts of data that must be sorted, curated, and statistically analyzed by dedicated software. Data-dependent acquisition mode including inclusion and exclusion rules for tandem mass spectrometry is routinely used to perform such analyses. While acquisition parameters are well described for proteomics, no general rule is currently available to generate reliable metabolomic data for molecular networking analysis on the Global Natural Product Social Molecular Networking platform (GNPS). Following on from an exploration of key parameters influencing the quality of molecular networks, universal optimal acquisition conditions for metabolomic studies are suggested in the present paper. The benefit of data pre-clustering before initiating large datasets for GNPS analyses is also demonstrated. Moreover, an efficient workflow dedicated to Agilent Technologies instruments is described, making the dereplication process easier by unambiguously distinguishing isobaric isomers eluted at different retention times, annotating the molecular networks with chemical formulas, and giving access to semi-quantitative data. This specific workflow foreshadows future developments of the GNPS platform.
The COnsortium of METabolomics Studies (COMETS) is an extramural-intramural partnership that promotes collaboration among prospective cohort studies that follow participants for a range of outcomes and perform metabolomic profiling of individuals.
Biomedical annotation is a common and effective artifact for researchers to discuss, express opinions, and share discoveries. It is becoming increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users who need to get information efficiently. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate that knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.
Govindaraghavan, Suresh; Hennell, James R; Sucher, Nikolaus J
Fundamental to herbal medicine quality is the use of 'authentic' medicinal herb species. Species, however, 'represent more or less arbitrary and subjective man-made units'. Against this background, we discuss, with illustrative examples, the importance of defining species boundaries by accommodating both the fixed (shared) diagnostic and varying (within-species) traits in medicinal herb populations. We emphasize the role of taxonomy, floristic information and genomic profiling in authenticating medicinal herb species, in addition to the need to include within species phytochemical profile variations while developing herbal extract identification protocols. We outline the application of species-specific genomic and phytochemical markers, chemoprofiling and chemometrics as additional tools to develop qualifying herbal extract references. We list the diagnostic traits available subsequent to each step during the medicinal herb extract manufacturing process and delineate limits to qualification of extract references. Copyright © 2012 Elsevier B.V. All rights reserved.
Mihaleva, V.V.; Vorst, O.F.J.; Maliepaard, C.A.; Verhoeven, H.A.; Vos, de C.H.; Hall, R.D.; Ham, van R.C.H.J.
Compound identification and annotation in (untargeted) metabolomics experiments based on accurate mass require the highest possible accuracy of the mass determination. Experimental LC/TOF-MS platforms equipped with a time-to-digital converter (TDC) give the best mass estimate for those mass signals
Lindstrøm, Bo; Wells, Lisa Marie
a method which makes it possible to associate auxiliary information, called annotations, with tokens without modifying the colour sets of the CP-net. Annotations are pieces of information that are not essential for determining the behaviour of the system being modelled, but are rather added to support...... a certain use of the CP-net. We define the semantics of annotations by describing a translation from a CP-net and the corresponding annotation layers to another CP-net where the annotations are an integrated part of the CP-net....
Barbosa-Breda, João; Himmelreich, Uwe; Ghesquière, Bart; Rocha-Sousa, Amândio; Stalmans, Ingeborg
Glaucoma is one of the leading causes of irreversible blindness worldwide. However, there are no biomarkers that accurately help clinicians perform an early diagnosis or detect patients with a high risk of progression. Metabolomics is the study of all metabolites in an organism, and it has the potential to provide a biomarker. This review summarizes the findings of metabolomics in glaucoma patients and explains why this field is promising for new research. We identified published studies that focused on metabolomics and ophthalmology. After providing an overview of metabolomics in ophthalmology, we focused on human glaucoma studies. Five studies have been conducted in glaucoma patients and all compared patients to healthy controls. Using mass spectrometry, significant differences were found in blood plasma in the metabolic pathways that involve palmitoylcarnitine, sphingolipids, vitamin D-related compounds, and steroid precursors. For nuclear magnetic resonance spectroscopy, a high glutamine-glutamate/creatine ratio was found in the vitreous and lateral geniculate body; no differences were detected in the optic radiations, and a lower N-acetylaspartate/choline ratio was observed in the geniculocalcarine and striate areas. Metabolomics can move glaucoma care towards a personalized approach and provide new knowledge concerning the pathophysiology of glaucoma, which can lead to new therapeutic options. © 2017 S. Karger AG, Basel.
Understanding and harnessing the interactions between nanoparticles and biological molecules is at the forefront of applications of nanotechnology to modern biology. Metabolomics has emerged as a prominent player in systems biology as a complement to genomics, transcriptomics and proteomics. Its focus is the systematic study of metabolite identities and concentration changes in living systems. Despite significant progress over the recent past, important challenges in metabolomics remain, such as the deconvolution of the spectra of complex mixtures with strong overlaps, the sensitive detection of metabolites at low abundance, unambiguous identification of known metabolites, structure determination of unknown metabolites and standardized sample preparation for quantitative comparisons. Recent research has demonstrated that some of these challenges can be substantially alleviated with the help of nanoscience. Nanoparticles in particular have found applications in various areas of bioanalytical chemistry and metabolomics. Their chemical surface properties and increased surface-to-volume ratio endow them with a broad range of binding affinities to biomacromolecules and metabolites. The specific interactions of nanoparticles with metabolites or biomacromolecules help, for example, simplify metabolomics spectra, improve the ionization efficiency for mass spectrometry or reveal relationships between spectral signals that belong to the same molecule. Lessons learned from nanoparticle-assisted metabolomics may also benefit other emerging areas, such as nanotoxicity and nanopharmaceutics.
Rasmussen, Susanne; Parsons, Anthony J; Jones, Christopher S
Forage plant breeding is under increasing pressure to deliver new cultivars with improved yield, quality and persistence to the pastoral industry. New innovations in DNA sequencing technologies mean that quantitative trait loci analysis and marker-assisted selection approaches are becoming faster and cheaper, and are increasingly used in the breeding process with the aim to speed it up and improve its precision. High-throughput phenotyping is currently a major bottleneck and emerging technologies such as metabolomics are being developed to bridge the gap between genotype and phenotype; metabolomics studies on forages are reviewed in this article. Major challenges for pasture production arise from the reduced availability of resources, mainly water, nitrogen and phosphorus, and metabolomics studies on metabolic responses to these abiotic stresses in Lolium perenne and Lotus species will be discussed here. Many forage plants can be associated with symbiotic microorganisms such as legumes with nitrogen fixing rhizobia, grasses and legumes with phosphorus-solubilizing arbuscular mycorrhizal fungi, and cool temperate grasses with fungal anti-herbivorous alkaloid-producing Neotyphodium endophytes and metabolomics studies have shown that these associations can significantly affect the metabolic composition of forage plants. The combination of genetics and metabolomics, also known as genetical metabolomics, can be a powerful tool to identify genetic regions related to specific metabolites or metabolic profiles, but this approach has not been widely adopted for forages yet, and we argue here that more studies are needed to improve our chances of success in forage breeding. Metabolomics combined with other '-omics' technologies and genome sequencing can be invaluable tools for large-scale geno- and phenotyping of breeding populations, although the implementation of these approaches in forage breeding programmes still lags behind. The majority of studies using metabolomics
Carroll Adam J
Background: Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas-chromatography/mass-spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis. Description: Here, we present MetabolomeExpress, a new File Transfer Protocol (FTP) server and web-tool for the online storage, processing, visualisation and statistical re-analysis of publicly submitted GC/MS metabolomics datasets. Users may search a quality-controlled database of metabolite response statistics from publicly submitted datasets by a number of parameters (e.g. metabolite, species, organ/biofluid etc.). Users may also perform meta-analysis comparisons of multiple independent experiments or re-analyse public primary datasets via user-friendly tools for t-test, principal components analysis, hierarchical cluster analysis and correlation analysis. They may interact with chromatograms, mass spectra and peak detection results via an integrated raw data viewer. Researchers who register for a free account may upload (via FTP) their own data to the server for online processing via a novel raw data processing pipeline. Conclusions: MetabolomeExpress (https://www.metabolome-express.org) provides a new opportunity for the general metabolomics community to transparently present online the raw and processed GC/MS data underlying their metabolomics publications. Transparent sharing of these data will allow researchers to
There is a growing appreciation that metabolic processes and individual metabolites can shape the function of immune cells and thereby play important roles in the outcome of immune responses. In this respect, the use of MS- and NMR spectroscopy-based platforms to characterize and quantify metabolites in biological samples has recently yielded important novel insights into how our immune system functions and has contributed to the identification of biomarkers for immune-mediated diseases. Here, recent immunological studies in which metabolomics has been used and has made significant contributions will be discussed. In particular, the role of metabolomics in the rapidly advancing field of cellular immunometabolism will be highlighted, as well as the future prospects of such metabolomic tools in immunology.
Su, Qiao; Guan, Tianbing; Lv, Haitao
Uropathogenic Escherichia coli (UPEC) growth in women's bladders during urinary tract infection (UTI) incurs substantial chemical exchange, termed the "interactive metabolome", which primarily accounts for the metabolic costs (utilized metabolome) and metabolic donations (excreted metabolome) between UPEC and human urine. Here, we attempted to identify the individualized interactive metabolome between UPEC and human urine. We were able to distinguish UPEC from non-UPEC by employing a combination of metabolomics and genetics. Our results revealed that the interactive metabolome between UPEC and human urine was markedly different from that between non-UPEC and human urine, and that UPEC triggered much stronger perturbations in the interactive metabolome in human urine. Furthermore, siderophore biosynthesis coordinately modulated the individualized interactive metabolome, which we found to be a critical component of UPEC virulence. The individualized virulence-associated interactive metabolome contained 31 different metabolites and 17 central metabolic pathways that were annotated to host these different metabolites, including energetic metabolism, amino acid metabolism, and gut microbe metabolism. Changes in the activities of these pathways mechanistically pinpointed the virulent capability of siderophore biosynthesis. Together, our findings provide novel insights into UPEC virulence, and we propose that siderophores are potential targets for further discovery of drugs to treat UPEC-induced UTI.
Vieth, M; Quirke, P; Lambert, R; von Karsa, L; Risio, M
Multidisciplinary, evidence-based guidelines for quality assurance in colorectal cancer screening and diagnosis have been developed by experts in a project coordinated by the International Agency for Research on Cancer. The full guideline document covers the entire process of population-based screening. It consists of 10 chapters and over 250 recommendations, graded according to the strength of the recommendation and the supporting evidence. The 450-page guidelines and the extensive evidence base have been published by the European Commission. The chapter on quality assurance in pathology was supplemented by an annex describing in greater detail some issues raised in the chapter, particularly details of special interest to pathologists. The content of the annex is presented here to promote international discussion and collaboration by making the issues discussed in the guidelines known to a wider professional and scientific community. © Georg Thieme Verlag KG Stuttgart · New York.
Guo, Haihong; Na, Xu; Li, Jiao
Health question-answering (QA) systems have become a typical application scenario of Artificial Intelligence (AI). An annotated question corpus is a prerequisite for training machines to understand the health information needs of users. Thus, we aimed to develop an annotated classification corpus of Chinese health questions (Qcorp) and make it openly accessible. We developed a two-layered classification schema and corresponding annotation rules on the basis of our previous work. Using the schema, we annotated 5000 questions that were randomly selected from 5 Chinese health websites within 6 broad sections. Eight annotators participated in the annotation task, and the inter-annotator agreement was evaluated to ensure the corpus quality. Furthermore, the distribution and relationship of the annotated tags were measured by descriptive statistics and a social network map. The questions were annotated using 7101 tags that cover 29 topic categories in the two-layered schema. In our released corpus, the distribution of questions on the top-layered categories was treatment 64.22%, diagnosis 37.14%, epidemiology 14.96%, healthy lifestyle 10.38%, and health provider choice 4.54%. Both the annotated health questions and the annotation schema are openly accessible on the Qcorp website. Users can download the annotated Chinese questions in CSV, XML, and HTML format. We developed a Chinese health question corpus including 5000 manually annotated questions. It is openly accessible and should contribute to the development of intelligent health QA systems.
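The abstract does not say which statistic was used to evaluate inter-annotator agreement. As one common option, the sketch below computes Cohen's kappa for two annotators who each assign a single category per question. The labels are invented for illustration, and the single-label setup is a simplifying assumption (the corpus itself allows several tags per question).

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items once each."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical top-layer categories assigned to six questions.
annotator_1 = ["treatment", "treatment", "diagnosis", "diagnosis", "treatment", "epidemiology"]
annotator_2 = ["treatment", "diagnosis", "diagnosis", "diagnosis", "treatment", "epidemiology"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one category (here, treatment) dominates the label distribution.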
Gattiker, Alexandre; Michoud, Karine; Rivoire, Catherine; Auchincloss, Andrea H; Coudert, Elisabeth; Lima, Tania; Kersey, Paul; Pagni, Marco; Sigrist, Christian J A; Lachaize, Corinne; Veuthey, Anne Lise; Gasteiger, Elisabeth; Bairoch, Amos
Large-scale sequencing of prokaryotic genomes demands the automation of certain annotation tasks currently manually performed in the production of the SWISS-PROT protein knowledgebase. The HAMAP project, or 'High-quality Automated and Manual Annotation of microbial Proteomes', aims to integrate manual and automatic annotation methods in order to enhance the speed of the curation process while preserving the quality of the database annotation. Automatic annotation is only applied to entries that belong to manually defined orthologous families and to entries with no identifiable similarities (ORFans). Many checks are enforced in order to prevent the propagation of wrong annotation and to spot problematic cases, which are channelled to manual curation. The results of this annotation are integrated in SWISS-PROT, and a website is provided at http://www.expasy.org/sprot/hamap/.
Hansen, Frank Allan
Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general...... requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation...... systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations...
Gresham Cathy R
Background: Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that lets microarray researchers directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers time spent searching multiple GO databases for functional information. Results: We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion: Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via AgBase website and
Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E
Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. 
The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
Sartor, Maureen A; Ade, Alex; Wright, Zach; States, David; Omenn, Gilbert S; Athey, Brian; Karnovsky, Alla
Progress in high-throughput genomic technologies has led to the development of a variety of resources that link genes to functional information contained in the biomedical literature. However, tools that link small molecules to normal and diseased physiology and to published data relevant to biologists and clinical investigators are still lacking. With metabolomics rapidly emerging as a new omics field, the task of annotating small molecule metabolites becomes highly relevant. Our tool Metab2MeSH uses a statistical approach to reliably and automatically annotate compounds with concepts defined in Medical Subject Headings (MeSH), the National Library of Medicine's controlled vocabulary for biomedical concepts. These annotations provide links from compounds to biomedical literature and complement existing resources such as PubChem and the Human Metabolome Database.
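The abstract describes the approach only as "statistical". One plausible form such a compound-to-concept association test could take is a one-sided hypergeometric (Fisher-type) test on article co-occurrence counts, sketched below. The function name, the choice of test and all counts are assumptions made for illustration, not a description of the Metab2MeSH implementation.

```python
from math import comb


def enrichment_p_value(both, with_compound, with_term, total):
    """One-sided hypergeometric p-value for compound/term co-occurrence.

    Returns the probability of seeing at least `both` articles that carry the
    term among `with_compound` articles drawn at random from `total` articles,
    of which `with_term` carry the term.
    """
    max_overlap = min(with_compound, with_term)
    tail = sum(
        comb(with_term, k) * comb(total - with_term, with_compound - k)
        for k in range(both, max_overlap + 1)
    )
    return tail / comb(total, with_compound)


# Hypothetical counts: 1000 articles, 40 mention the compound,
# 50 are indexed with the term, and 12 have both.
p_value = enrichment_p_value(both=12, with_compound=40, with_term=50, total=1000)
print(f"p = {p_value:.3g}")
```

A small p-value suggests the compound and the concept co-occur more often than chance would predict; in practice many compound-term pairs are tested, so a multiple-testing correction would also be needed.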
Kráfová, Katarina; Jampílek, Josef; Ostrovský, Ivan
Pharmaceutical and food industries are increasingly focused on the great potential of plant secondary metabolites or natural substances which can be used as therapeutics or model compounds for development of new drugs. The paper is devoted to the use of metabolomics, metabolic profiling and metabolic "fingerprint" for the identification of individual active phyto-substances in plant extracts, in profiling of unique groups of plant secondary metabolites that can be used to improve the classification of several species of medicinal plants as well as for a better characterization and quality control of medicinal extracts, tinctures and phytotherapeutic products prepared from these plants. Combined analytical methods and multivariate statistical analysis are used for metabolite identification. Using this approach, medicinal plants are evaluated not only on the basis of a limited number of pharmacologically important metabolites but also based on the fingerprints of minor metabolites and bioactive molecules.
Song, Yuelin; Song, Qingqing; Liu, Yao; Li, Jun; Wan, Jian-Bo; Wang, Yitao; Jiang, Yong; Tu, Pengfei
Universal acquisition of reliable information regarding the qualitative and quantitative properties of complicated matrices is the premise for the success of a metabolomics study. Liquid chromatography-mass spectrometry (LC-MS) now serves as a workhorse for metabolomics; however, LC-MS-based non-targeted metabolomics still suffers from several shortcomings, even though some cutting-edge techniques have been introduced. To tackle, to some extent, the drawbacks of the conventional approaches, such as redundant information, detector saturation, low sensitivity, and inconstant signal number among different runs, a novel and flexible workflow consisting of three progressive steps is proposed to profile in depth the quantitative metabolome of plants. The roots of Peucedanum praeruptorum Dunn (Peucedani Radix, PR), which are rich in various coumarin isomers, were employed as a case study to verify the applicability. First, offline two-dimensional LC-MS was used for in-depth detection of metabolites in a pooled PR extract, termed the universal metabolome standard (UMS). Second, mass fragmentation rules, notably those concerning angular-type pyranocoumarins, the primary chemical homologues in PR, and available databases were integrated for signal assignment and structural annotation. Third, the optimum collision energy (OCE) and the ion transition for multiple reaction monitoring measurement were optimized online with a reference compound-free strategy for each annotated component, and large-scale relative quantification of all annotated components was accomplished by plotting calibration curves obtained by serially diluting the UMS. It is worth highlighting that the potential of OCE for isomer discrimination was described and that the linearity ranges of the primary ingredients were extended by suppressing their responses. The integrated workflow is expected to be qualified as a promising pipeline to clarify the quantitative metabolome of plants because it could not only
Karimi, Sarvnaz; Metke-Jimenez, Alejandro; Kemp, Madonna; Wang, Chen
CSIRO Adverse Drug Event Corpus (Cadec) is a new, richly annotated corpus of medical forum posts on patient-reported Adverse Drug Events (ADEs). The corpus is sourced from posts on social media, and contains text that is largely written in colloquial language and often deviates from formal English grammar and punctuation rules. Annotations contain mentions of concepts such as drugs, adverse effects, symptoms, and diseases linked to their corresponding concepts in controlled vocabularies, i.e., SNOMED Clinical Terms and MedDRA. The quality of the annotations is ensured by annotation guidelines, multi-stage annotations, measuring inter-annotator agreement, and final review of the annotations by a clinical terminologist. This corpus is useful for studies in the area of information extraction, or more generally text mining, from social media to detect possible adverse drug reactions from direct patient reports. The corpus is publicly available at https://data.csiro.au. Copyright © 2015 Elsevier Inc. All rights reserved.
Bean, Heather D; Hill, Jane E; Dimandja, Jean-Marie D
The potential of high-resolution analytical technologies like GC×GC/TOF MS in untargeted metabolomics and biomarker discovery has been limited by the lack of fully automated software that can efficiently align and extract information from multiple chromatographic data sets. In this work we report the first investigation on a peak-by-peak basis of the chromatographic factors that impact GC×GC data alignment. A representative set of 16 compounds of different chromatographic characteristics was followed through the alignment of 63 GC×GC chromatograms. We found that varying the mass spectral match parameter had a significant influence on the alignment for poorly-resolved peaks, especially those at the extremes of the detector linear range, and no influence on well-chromatographed peaks. Therefore, optimized chromatography is required for proper GC×GC data alignment. Based on these observations, a workflow is presented for the conservative selection of biomarker candidates from untargeted metabolomics analyses. Copyright © 2015 Elsevier B.V. All rights reserved.
Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W
WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.
Effects of boiling duration in processing of White Paeony Root on its overall quality evaluated by ultra-high performance liquid chromatography quadrupole/time-of-flight mass spectrometry based metabolomics analysis and high performance liquid chromatography quantification.
Ming, Kong; Xu, Jun; Liu, Huan-Huan; Xu, Jin-Di; Li, Xiu-Yang; Lu, Min; Wang, Chun-Ru; Chen, Hu-Biao; Li, Song-Lin
Boiling processing is commonly used in post-harvest handling of White Paeony Root (WPR), in order to whiten the herbal materials and preserve the bright color, since such WPR is empirically considered to possess a higher quality. The present study was designed to investigate whether and how the boiling processing affects overall quality of WPR. First, an ultra-high performance liquid chromatography quadrupole/time-of-flight mass spectrometry-based metabolomics approach coupled with multivariate statistical analysis was developed to compare the holistic quality of boiled and un-boiled WPR samples. Second, ten major components in WPR samples boiled for different durations were quantitatively determined using high performance liquid chromatography to further explore the effects of boiling time on the holistic quality of WPR, meanwhile the appearance of the processed herbal materials was observed. The results suggested that the boiling processing conspicuously affected the holistic quality of WPR by simultaneously and inconsistently altering the chemical compositions and that short-time boiling processing between 2 and 10 min could both make the WPR bright-colored and improve the contents of major bioactive components, which were not achieved either without boiling or with prolonged boiling. In conclusion, short-term boiling (2-10 min) is recommended for post-harvest handling of WPR. Copyright © 2017 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Werf, M.J.v.d.; Overkamp, K.M.; Muilwijk, B.; Coulier, L.; Hankemeier, T.
Achieving metabolome data with satisfactory coverage is a formidable challenge in metabolomics because metabolites are a chemically highly diverse group of compounds. Here we present a strategy for the development of an advanced analytical platform that allows the comprehensive analysis of microbial
Kale, Namrata S; Haug, Kenneth; Conesa, Pablo; Jayseelan, Kalaivani; Moreno, Pablo; Rocca-Serra, Philippe; Nainala, Venkata Chandrasekhar; Spicer, Rachel A; Williams, Mark; Li, Xuefei; Salek, Reza M; Griffin, Julian L; Steinbeck, Christoph
MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standards Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These studies are then automatically assigned a stable and unique identifier (e.g., MTBLS1) that can be used for publication reference. The MetaboLights Reference Layer associates metabolites with metabolomics studies in the archive and is extensively annotated with data fields such as structural and chemical information, NMR and MS spectra, target species, metabolic pathways, and reactions. The database is manually curated with no specific release schedules. MetaboLights is also recommended by journals for metabolomics data deposition. This unit provides a guide to using MetaboLights, downloading experimental data, and depositing metabolomics datasets using user-friendly submission tools. Copyright © 2016 John Wiley & Sons, Inc.
Hong, Jun; Yang, Litao; Zhang, Dabing; Shi, Jianxin
As genomes of many plant species have been sequenced, demand for functional genomics has dramatically accelerated the improvement of other omics including metabolomics. Despite a large amount of metabolites still remaining to be identified, metabolomics has contributed significantly not only to the understanding of plant physiology and biology from the view of small chemical molecules that reflect the end point of biological activities, but also in past decades to the attempts to improve plant behavior under both normal and stressed conditions. Hereby, we summarize the current knowledge on the genetic and biochemical mechanisms underlying plant growth, development, and stress responses, focusing further on the contributions of metabolomics to practical applications in crop quality improvement and food safety assessment, as well as plant metabolic engineering. We also highlight the current challenges and future perspectives in this inspiring area, with the aim to stimulate further studies leading to better crop improvement of yield and quality. PMID:27258266
Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese
Plant genomes vary in size and are highly complex, with a large number of repeats as well as genome and tandem duplications. Genes encode a wealth of information useful in studying an organism, so high-quality, stable gene annotation is critical. Thanks to advances in sequencing technology, the genomes of many plant species have been sequenced, along with their transcriptomes. To use these very large amounts of sequence data for timely gene annotation or re-annotation, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homologous peptides and transcript ORFs. See Methods for detail. Here we present genome annotations of JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for chlamy, which was done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front page snapshot are shown below.
Peyretaillade, Eric; Parisot, Nicolas; Polonais, Valérie; Terrat, Sébastien; Denonfoux, Jérémie; Dugat-Bony, Eric; Wawrzyniak, Ivan; Biderre-Petit, Corinne; Mahul, Antoine; Rimour, Sébastien; Gonçalves, Olivier; Bornes, Stéphanie; Delbac, Frédéric; Chebance, Brigitte; Duprat, Simone; Samson, Gaëlle; Katinka, Michael; Weissenbach, Jean; Wincker, Patrick; Peyret, Pierre
High-quality annotation of microsporidian genomes is essential for understanding the biological processes that govern the development of these parasites. Here we present an improved structural annotation method using transcriptional DNA signals. We apply this method to re-annotate four previously annotated genomes, which allow us to detect annotation errors and identify a significant number of unpredicted genes. We then annotate the newly sequenced genome of Anncaliia algerae. A comparative genomic analysis of A. algerae permits the identification of not only microsporidian core genes, but also potentially highly expressed genes encoding membrane-associated proteins, which represent good candidates involved in the spore architecture, the invasion process and the microsporidian-host relationships. Furthermore, we find that the ten-fold variation in microsporidian genome sizes is not due to gene number, size or complexity, but instead stems from the presence of transposable elements. Such elements, along with kinase regulatory pathways and specific transporters, appear to be key factors in microsporidian adaptive processes.
Strategy for comparative untargeted metabolomics reveals honey markers of different floral and geographic origins using ultrahigh-performance liquid chromatography-hybrid quadrupole-orbitrap mass spectrometry.
Li, Yi; Jin, Yue; Yang, Shupeng; Zhang, Wenwen; Zhang, Jinzhen; Zhao, Wen; Chen, Lanzhen; Wen, Yaqin; Zhang, Yongxin; Lu, Kaizhi; Zhang, Yaping; Zhou, Jinhui; Yang, Shuming
Honey discrimination based on floral and geographic origins is limited by the ability to determine reliable markers because developing hypothetical substances in advance considerably limits the throughput of metabolomics studies. Here, we present a novel approach to screen and elucidate honey markers based on comparative untargeted metabolomics using ultrahigh-performance liquid chromatography-hybrid quadrupole-orbitrap mass spectrometry (UHPLC-Q-Orbitrap). To reduce metabolite information losses during sample preparation, the honey samples were dissolved in water and centrifuged to remove insoluble particles prior to UHPLC-Q-Orbitrap analysis in positive and negative electrospray ionization modes. The data were pretreated using background subtraction, chromatographic peak extraction, normalization, transformation and scaling to remove interferences from unwanted biases and variance in the experimental data. The pretreated data were further processed using principal component analysis (PCA) and a three-stage approach (t-test, volcano plot and variable importance in projection (VIP) plot) to ensure marker authenticity. A correlation between the molecular and fragment ions with a mass accuracy of less than 1.0 ppm was used to annotate and elucidate the marker structures, and the marker responses in real samples were used to confirm the effectiveness of the honey discrimination. Moreover, we evaluated the data quality using blank and quality control (QC) samples based on PCA clustering, retention times, normalized levels and peak areas. This strategy will help guide standardized, comparative untargeted metabolomics studies of honey and other agro-products from different floral and geographic origins. Copyright © 2017 Elsevier B.V. All rights reserved.
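The 1.0 ppm mass accuracy criterion mentioned above can be made concrete with the standard parts-per-million error formula, (measured - theoretical) / theoretical × 10^6. The sketch below is only an illustration: the function names and the m/z values are hypothetical and do not come from the study.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=1.0):
    """True if the absolute mass error is below tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < tol_ppm


# Hypothetical ion: theoretical m/z 200.0000, measured m/z 200.0001.
error = ppm_error(200.0001, 200.0)
print(f"{error:.2f} ppm, accepted: {within_tolerance(200.0001, 200.0)}")
```

Because the tolerance is relative, the same ppm threshold allows a larger absolute deviation at higher m/z, which is why ppm rather than Da is the usual unit for high-resolution instruments.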
A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (6) A pilot study is recommended in which two to three people with Asperger's Syndrome annotate documents and the quality and throughput of their work are then evaluated relative to those of their neuro-typical peers.
Simó, Carolina; Ibáñez, Clara; Valdés, Alberto; Cifuentes, Alejandro; García-Cañas, Virginia
Metabolomic-based approaches are increasingly applied to analyse genetically modified organisms (GMOs) making it possible to obtain broader and deeper information on the composition of GMOs compared to that obtained from traditional analytical approaches. The combination in metabolomics of advanced analytical methods and bioinformatics tools provides wide chemical compositional data that contributes to corroborate (or not) the substantial equivalence and occurrence of unintended changes resulting from genetic transformation. This review provides insight into recent progress in metabolomics studies on transgenic crops focusing mainly in papers published in the last decade. PMID:25334064
Floros, Dimitrios J; Jensen, Paul R; Dorrestein, Pieter C; Koyama, Nobuhiro
Natural products from culture collections have enormous impact in advancing discovery programs for metabolites of biotechnological importance. These discovery efforts rely on the metabolomic characterization of strain collections. Many emerging approaches compare metabolomic profiles of such collections, but few enable the analysis and prioritization of thousands of samples from diverse organisms while delivering chemistry-specific readouts. In this work we utilize untargeted LC-MS/MS based metabolomics together with molecular networking to. This approach annotated 76 molecular families (a spectral match rate of 28%), including clinically and biotechnologically important molecules such as valinomycin, actinomycin D, and desferrioxamine E. Targeting a molecular family produced primarily by one microorganism led to the isolation and structure elucidation of two new molecules designated maridric acids A and B. Molecular networking-guided exploration of large culture collections allows for rapid dereplication of known molecules and can highlight producers of unique metabolites. These methods, together with large culture collections and growing databases, allow for data-driven strain prioritization with a focus on novel chemistries.
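Molecular networking groups MS/MS spectra by pairwise spectral similarity. The abstract gives no scoring details, so the sketch below is a simplified assumption: a plain cosine score on m/z-binned peaks, rather than whatever scoring the authors' workflow used. The names `binned` and `cosine_score`, the bin width, and the example spectra are all hypothetical.

```python
from math import sqrt


def binned(spectrum, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks that share an m/z bin."""
    bins = {}
    for mz, intensity in spectrum:
        key = round(mz / bin_width)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins


def cosine_score(spec_a, spec_b, bin_width=0.01):
    """Cosine similarity between two binned spectra (0 = no shared peaks)."""
    a = binned(spec_a, bin_width)
    b = binned(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical fragment spectra as (m/z, intensity) pairs.
spectrum_1 = [(100.0, 10.0), (150.0, 5.0)]
spectrum_2 = [(100.0, 8.0), (150.0, 4.0), (175.0, 1.0)]
print(round(cosine_score(spectrum_1, spectrum_2), 3))
```

In a network, spectra become nodes and an edge is drawn when the score exceeds a chosen threshold, so that connected components approximate molecular families.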
Volatile organic compounds (VOCs) in strawberry (Fragaria spp.) represent a large portion of the fruit secondary metabolome, and contribute significantly to aroma, flavor, disease resistance, pest resistance and overall fruit quality. Understanding the basis for volatile compound biosynthesis and it...
Denihan, Niamh M.; Boylan, Geraldine B.; Murray, Deirdre M.
Metabolomics, the latest “omic” technology, is defined as the comprehensive study of all low molecular weight biochemicals, “metabolites” present in an organism. As a systems biology approach, metabolomics has huge potential to progress our understanding of perinatal asphyxia and neonatal hypoxic-ischaemic encephalopathy, by uniquely detecting rapid biochemical pathway alterations in response to the hypoxic environment. The study of metabolomic biomarkers in the immediate neonatal period is not a trivial task and requires a number of specific considerations, unique to this disease and population. Recruiting a clearly defined cohort requires standardised multicentre recruitment with broad inclusion criteria and the participation of a range of multidisciplinary staff. Minimally invasive biospecimen collection is a priority for biomarker discovery. Umbilical cord blood presents an ideal medium as large volumes can be easily extracted and stored and the sample is not confounded by postnatal disease progression. Pristine biobanking and phenotyping are essential to ensure the validity of metabolomic findings. This paper provides an overview of the current state of the art in the field of metabolomics in perinatal asphyxia and neonatal hypoxic-ischaemic encephalopathy. We detail the considerations required to ensure high quality sampling and analysis, to support scientific progression in this important field. PMID:25802843
Iron (Fe) deficiency is an important agricultural concern leading to lower yields and crop quality. A better understanding of the condition, at the metabolome level, could contribute to the design of strategies to ameliorate Fe deficiency problems. Fe-sufficient and Fe-deficient soybean leaf extract...
Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.
Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162
Metabolomics is the scientific discipline that identifies and quantifies endogenous and exogenous metabolites in different biological samples. Metabolites are crucial components of a biological system and they are highly informative about its functional state, due to their closeness to the organism... focused on the analysis of various samples covering a wide range of fields, namely, food and nutraceutical sciences, cell metabolomics and medicine using a metabolomics approach. Indeed, the first part of the thesis describes two exploratory studies performed on Algerian extra virgin olive oil and apple juice from ancient Danish apple cultivars. Both studies revealed variety-related peculiarities that would have been difficult to detect by means of traditional analysis. The second part of the project includes four metabolomics studies performed on samples of biological origin. In particular, the first...
U.S. Department of Health & Human Services — The Metabolomics Program's Data Repository and Coordinating Center (DRCC), housed at the San Diego Supercomputer Center (SDSC), University of California, San Diego,...
Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)
The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation including. The annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".
Al Arab, Marwa; Höner Zu Siederdissen, Christian; Tout, Kifah; Sahyoun, Abdullah H; Stadler, Peter F; Bernt, Matthias
Mitochondrial genome sequences are available in large numbers, and new sequences are being published at an increasing pace. Fast, automatic, consistent, and high quality annotations are a prerequisite for downstream analyses. Therefore, we present an automated pipeline for fast de novo annotation of mitochondrial protein-coding genes. The annotation is based on enhanced phylogeny-aware hidden Markov models (HMMs). The pipeline builds taxon-specific enhanced multiple sequence alignments (MSA) of already annotated sequences and corresponding HMMs using an approximation of the phylogeny. The MSAs are enhanced by fixing unannotated frameshifts, purging wrong sequences, and removing non-conserved columns from both ends. A comparison with reference annotations highlights the high quality of the results. The frameshift correction method predicts a large number of frameshifts, many of which were previously unknown. A detailed analysis of the frameshifts in nad3 of the Archosauria-Testudines group has been conducted. Copyright © 2016 Elsevier Inc. All rights reserved.
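One of the MSA enhancement steps named above, removal of non-conserved columns from both ends, can be sketched as follows. The abstract does not define "conserved", so the sketch assumes a simple criterion (the share of sequences carrying the most common non-gap residue in a column); the function names and the 0.8 threshold are hypothetical.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of sequences that carry the most common non-gap residue."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def trim_ends(alignment, min_conservation=0.8):
    """Drop leading and trailing columns below the conservation threshold.

    alignment is a list of equal-length aligned strings; interior columns
    are kept even if they are poorly conserved.
    """
    if not alignment:
        return []
    columns = list(zip(*alignment))
    keep = [column_conservation(col) >= min_conservation for col in columns]
    if not any(keep):
        return ["" for _ in alignment]
    start = keep.index(True)
    end = len(keep) - keep[::-1].index(True)
    return [seq[start:end] for seq in alignment]


# Hypothetical toy alignment with ragged ends.
print(trim_ends(["-ATGA", "CATGT", "-ATG-"]))
```

Trimming only from the ends, rather than filtering every column, keeps the alignment coordinates contiguous, which matters when the alignment is later used to build a profile HMM.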
Guitton, Yann; Tremblay-Franco, Marie; Le Corguillé, Gildas; Martin, Jean-François; Pétéra, Mélanie; Roger-Mele, Pierrick; Delabrière, Alexis; Goulitquer, Sophie; Monsoor, Misharl; Duperier, Christophe; Canlet, Cécile; Servien, Rémi; Tardivel, Patrick; Caron, Christophe; Giacomoni, Franck; Thévenot, Etienne A
Metabolomics is a key approach in modern functional genomics and systems biology. Due to the complexity of metabolomics data, the variety of experimental designs, and the multiplicity of bioinformatics tools, providing experimenters with a simple and efficient resource to conduct comprehensive and rigorous analysis of their data is of utmost importance. In 2014, we launched the Workflow4Metabolomics (W4M; http://workflow4metabolomics.org) online infrastructure for metabolomics built on the Galaxy environment, which offers user-friendly features to build and run data analysis workflows including preprocessing, statistical analysis, and annotation steps. Here we present the new W4M 3.0 release, which contains twice as many tools as the first version, and provides two features which are, to our knowledge, unique among online resources. First, data from the four major metabolomics technologies (i.e., LC-MS, FIA-MS, GC-MS, and NMR) can be analyzed on a single platform. By using three studies in human physiology, alga evolution, and animal toxicology, we demonstrate how the 40 available tools can be easily combined to address biological issues. Second, the full analysis (including the workflow, the parameter values, the input data and output results) can be referenced with a permanent digital object identifier (DOI). Publication of data analyses is of major importance for robust and reproducible science. Furthermore, the publicly shared workflows are of high-value for e-learning and training. The Workflow4Metabolomics 3.0 e-infrastructure thus not only offers a unique online environment for analysis of data from the main metabolomics technologies, but it is also the first reference repository for metabolomics workflows. Copyright © 2017 Elsevier Ltd. All rights reserved.
Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.
We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete
Martinez Alonso, Hector
Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...
Stanley, N.E.; Thurow, T.L.; Russell, B.F.; Sullivan, J.F.
This annotated bibliography covers the following topics: algae, wetland ecosystems; institutional aspects; macrophytes - general, production rates, and mineral absorption; trace metal absorption; wetland soils; water quality; and other aspects of marsh ecosystems. (MHR)
Roessner, U.; Rolin, D.; Rijswijk, van M.E.C.; Hall, R.D.; Hankemeier, T.
In 2012 the Metabolomics Society established a more formal system for national and regional metabolomics initiatives, interest groups, societies and networks to become an International Affiliate of the Society. A number of groups (http://metabolomicssociety.org/international-affilia
Macel, M.; Dam, van N.M.; Keurentjes, J.J.B.
Metabolomics is a fast developing field of comprehensive untargeted chemical analyses. It has many applications and can in principle be used on any organism without prior knowledge of the metabolome or genome. The amount of functional information that is acquired with metabolomics largely depends on
Song, Dezhao; Chute, Christopher G; Tao, Cui
To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manually annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfactory. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.
Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)
The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.
Background: The genome sequencing projects have shown our limited knowledge regarding gene function, e.g. S. cerevisiae has 5–6,000 genes of which nearly 1,000 have an uncertain function. Their gross influence on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions. Description: MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML) are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/. Conclusion: The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.
Kwon, Hyuk Nam; Phan, Hong-Duc; Xu, Wen Jun; Ko, Yoon-Joo; Park, Sunghyouk
Herbal medicines have been used for a long time all around the world. Since the quality of herbal preparations depends on the source of herbal materials, there has been a strong need to develop methods to correctly identify the origin of materials. The objective was to develop a smartphone metabolomics platform as a simpler and low-cost alternative for the identification of herbal material source. Schisandra sinensis extracts from Korea and China were prepared. The visible spectra of all samples were measured by a smartphone spectrometer platform. This platform included all the necessary measures built-in for the metabolomics research: data acquisition, processing, chemometric analysis and visualisation of the results. The result of the smartphone metabolomics platform was compared to that of NMR-based metabolomics, suggesting the feasibility of the smartphone platform in metabolomics research. The smartphone metabolomics platform gave similar results to the NMR method, showing good separation between Korean and Chinese materials and correct predictability for all test samples. With its accuracy and advantages of affordability, user-friendliness, and portability, the smartphone metabolomics platform could be applied to the authentication of other medicinal plants. Copyright © 2016 John Wiley & Sons, Ltd.
Overbeek Ross A
Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.
Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A
With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Kovalchuk, Anna; Nersisyan, Lilit; Mandal, Rupasri; Wishart, David; Mancini, Maria; Sidransky, David; Kolb, Bryan; Kovalchuk, Olga
Cancer survivors experience numerous treatment side effects that negatively affect their quality of life. Cognitive side effects are especially insidious, as they affect memory, cognition, and learning. Neurocognitive deficits occur prior to cancer treatment, arising even before cancer diagnosis, and we refer to them as “tumor brain.” Metabolomics is a new area of research that focuses on metabolome profiles and provides important mechanistic insights into various human diseases, including cancer, neurodegenerative diseases, and aging. Many neurological diseases and conditions affect metabolic processes in the brain. However, the tumor brain metabolome has never been analyzed. In our study we used direct flow injection/mass spectrometry (DI-MS) analysis to establish the effects of the growth of lung cancer, pancreatic cancer, and sarcoma on the brain metabolome of TumorGraft™ mice. We found that the growth of malignant non-CNS tumors impacted metabolic processes in the brain, affecting protein biosynthesis, and amino acid and sphingolipid metabolism. The observed metabolic changes were similar to those reported for neurodegenerative diseases and brain aging, and may have potential mechanistic value for future analysis of the tumor brain phenomenon. PMID:29515623
Hoyle, David C.; Brass, Andrew
We present a statistical mechanical theory of the process of annotating an object with terms selected from an ontology. The term selection process is formulated as an ideal lattice gas model, but in a highly structured inhomogeneous field. The model enables us to explain patterns recently observed in real-world annotation data sets, in terms of the underlying graph structure of the ontology. By relating the external field strengths to the information content of each node in the ontology graph, the statistical mechanical model also allows us to propose a number of practical metrics for assessing the quality of both the ontology, and the annotations that arise from its use. Using the statistical mechanical formalism we also study an ensemble of ontologies of differing size and complexity; an analysis not readily performed using real data alone. Focusing on regular tree ontology graphs we uncover a rich set of scaling laws describing the growth in the optimal ontology size as the number of objects being annotated increases. In doing so we provide a further possible measure for assessment of ontologies.
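As a hedged illustration of the lattice gas picture in the abstract above (a textbook sketch, not necessarily the authors' formulation): if each ontology term i is treated as a site with occupation n_i ∈ {0, 1} (term selected or not) and the sites do not interact, the joint probability factorizes over terms. The symbols n_i and h_i, and the absorption of temperature and chemical potential into the site-dependent field h_i, are assumptions made here for illustration only.

```latex
P(n_1,\dots,n_N) \;=\; \prod_{i=1}^{N} \frac{e^{h_i n_i}}{1 + e^{h_i}},
\qquad n_i \in \{0,1\},
\qquad
\langle n_i \rangle \;=\; \frac{1}{1 + e^{-h_i}} .
```

Under this sketch a strongly negative field makes a term rarely used, and for h_i ≪ 0 the surprisal satisfies -log⟨n_i⟩ ≈ -h_i, which is one way a field strength could be related to the information content of a node.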
McDowell, Jillian Marie; Johnson, Gillian Margaret; Hetherington, Barbara Helen
Quality technique documentation is integral to the practice of manual therapy, ensuring uniform application and reproducibility of treatment. Manual therapy techniques are described by annotations utilizing a range of acronyms, abbreviations and universal terminology based on biomechanical and anatomical concepts. The various combinations of therapist and patient generated forces utilized in a variety of weight-bearing positions, which are synonymous with the Mulligan Concept, challenge practitioners' existing annotational skills. An annotation framework with recording rules adapted to the Mulligan Concept is proposed in which the abbreviations incorporate established manual therapy tenets and are detailed in the following sequence: starting position, side, joint/s, method of application, glide/s, Mulligan technique, movement (or function), whether an assistant is used, overpressure (and by whom) and numbers of repetitions or time and sets. Therapist or patient application of overpressure and utilization of treatment belts or manual techniques must be recorded to capture the complete description. The adoption of the Mulligan Concept annotation framework in this way for documentation purposes will provide uniformity and clarity of information transfer for the future purposes of teaching, clinical practice and audit for its practitioners. Copyright © 2014 Elsevier Ltd. All rights reserved.
Uziel, M.S.; Hannon, E.H.
This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title.
Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier
High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotations. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...
Mondul, Alison M; Weinstein, Stephanie J; Albanes, Demetrius
How micronutrients might influence risk of developing adenocarcinoma of the prostate has been the focus of a large body of research (especially regarding vitamins E, A, and D). Metabolomic profiling has the potential to discover molecular species relevant to prostate cancer etiology, early detection, and prevention, and may help elucidate the biologic mechanisms through which vitamins influence prostate cancer risk. Prostate cancer risk data related to vitamins E, A, and D and metabolomic profiling from clinical, cohort, and nested case-control studies, along with randomized controlled trials, are examined and summarized, along with recent metabolomic data of the vitamin phenotypes. Higher vitamin E serologic status is associated with lower prostate cancer risk, and vitamin E genetic variant data support this. By contrast, controlled vitamin E supplementation trials have had mixed results based on differing designs and dosages. Beta-carotene supplementation (in smokers) and higher circulating retinol and 25-hydroxy-vitamin D concentrations appear related to elevated prostate cancer risk. Our prospective metabolomic profiling of fasting serum collected 1-20 years prior to clinical diagnoses found reduced lipid and energy/TCA cycle metabolites, including inositol-1-phosphate, lysolipids, alpha-ketoglutarate, and citrate, significantly associated with lower risk of aggressive disease. Several active leads exist regarding the role of micronutrients and metabolites in prostate cancer carcinogenesis and risk. How vitamins D and A may adversely impact risk, and whether low-dose vitamin E supplementation remains a viable preventive approach, require further study.
Salek, Reza M.; Haug, Kenneth; Conesa, Pablo; Hastings, Janna; Williams, Mark; Mahendraker, Tejasvi; Maguire, Eamonn; González-Beltrán, Alejandra N.; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Steinbeck, Christoph
MetaboLights is the first general-purpose open-access curated repository for metabolomic studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Increases in the number of depositions, number of samples per study and the file size of data submitted to MetaboLights present a challenge for the objective of ensuring high-quality and standardized data in the context of diverse metabolomic workflows and data representations. Here, we describe the MetaboLights curation pipeline, its challenges and its practical application in quality control of complex data depositions. Database URL: http://www.ebi.ac.uk/metabolights PMID:23630246
Hamed Hassanzadeh; MohammadReza Keyvanpour
The Semantic Web is an extension of the current web in which information is given well-defined meaning. The aim of the Semantic Web is to improve the quality and intelligence of the current web by converting its contents into machine-understandable form. Therefore, semantic-level information is one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles to Semantic Annotation, such as ...
Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E
It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.
Childs Kevin L
Background: A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion: BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions: We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence.
Ida, Megumi; Kosaka, Reia; Miura, Daisuke; Wariishi, Hiroyuki; Maeda-Yamamoto, Mari; Nesumi, Atsushi; Saito, Takeshi; Kanda, Tomomasa; Yamada, Koji; Tachibana, Hirofumi
Background: Green tea has various health promotion effects. Although there are numerous tea cultivars, little is known about the differences in their nutraceutical properties. Metabolic profiling techniques can provide information on the relationship between the metabolome and factors such as phenotype or quality. Here, we performed metabolomic analyses to explore the relationship between the metabolome and health-promoting attributes (bioactivity) of diverse Japanese green tea cultivars. Methodology/Principal Findings: We investigated the ability of leaf extracts from 43 Japanese green tea cultivars to inhibit thrombin-induced phosphorylation of myosin regulatory light chain (MRLC) in human umbilical vein endothelial cells (HUVECs). This thrombin-induced phosphorylation is a potential hallmark of vascular endothelial dysfunction. Among the tested cultivars, Cha Chuukanbohon Nou-6 (Nou-6) and Sunrouge (SR) strongly inhibited MRLC phosphorylation. To evaluate the bioactivity of green tea cultivars using a metabolomics approach, the metabolite profiles of all tea extracts were determined by high-performance liquid chromatography-mass spectrometry (LC-MS). Multivariate statistical analyses, principal component analysis (PCA) and orthogonal partial least-squares-discriminant analysis (OPLS-DA), revealed differences among green tea cultivars with respect to their ability to inhibit MRLC phosphorylation. In the SR cultivar, polyphenols were associated with its unique metabolic profile and its bioactivity. In addition, using partial least-squares (PLS) regression analysis, we succeeded in constructing a reliable bioactivity-prediction model to predict the inhibitory effect of tea cultivars based on their metabolome. This model was based on certain identified metabolites that were associated with bioactivity. When added to an extract from the non-bioactive cultivar Yabukita, several metabolites enriched in SR were able to transform the extract into a bioactive extract
Mayorga Gross, Ana Lucía; Quirós Guerrero, Luis Manuel; Fourny, G.; Vaillant Barka, Fabrice
Fermentation is a critical step in the processing of high quality cocoa; however, the biochemistry behind it is still not well understood at a molecular level. In this research, using a non-targeted approach, the main metabolomic changes that occur throughout the fermentation process were explored. Genetically undefined cocoa varieties from Trinidad and Tobago (n = 3), Costa Rica (n = 1) and one clone IMC-67 (n = 3) were subjected to spontaneous fermentation using farm-based and pilot plant cont...
Grossman, Arthur R
Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual
Boot, P.; Haentjens Dekker, R.
Annotation in digital scholarly editions (of historical documents, literary works, letters, etc.) has long been recognized as an important desideratum, but has also proven to be an elusive ideal. In so far as annotation functionality is available, it is usually developed for a single edition and
Boot, P.; Boot, P.; Stronks, E.
From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material
A novel framework for automated elucidation of metabolite structures in liquid chromatography-mass spectrometer (LC-MS) metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method.
Tebani, Abdellah; Afonso, Carlos; Bekri, Soumeya
This work reports the second part of a review intended to give the state of the art of major metabolic phenotyping strategies. It particularly deals with inherent advantages and limits regarding data analysis issues and biological information retrieval tools, along with translational challenges. This part starts by introducing the main data preprocessing strategies for the different types of metabolomics data. Then, it describes the main data analysis techniques, including univariate and multivariate aspects. It also addresses the challenges related to metabolite annotation and characterization. Finally, functional analysis, including pathway and network strategies, is discussed. The last section of this review is devoted to practical considerations, current challenges and pathways to bring metabolomics into clinical environments.
Fanos, Vassilios; Atzori, Luigi; Makarenko, Karina; Melis, Gian Benedetto; Ferrazzi, Enrico
Metabolomics in maternal-fetal medicine is still an “embryonic” science. However, there is already an increasing interest in the metabolome of normal and complicated pregnancies, and in neonatal outcomes. Tissues used for metabolomics interrogations of pregnant women, fetuses and newborns are amniotic fluid, blood, plasma, cord blood, placenta, urine, and vaginal secretions. All published papers highlight the strong correlation between biomarkers found in these tissues and fetal malformations, prete...
Endogenous mechanisms for successful resolution of an acute inflammatory response and the local return to homeostasis are of interest because excessive inflammation underlies many human diseases. In this review, we provide an update and overview of functional metabolomics that identified a new bioactive metabolome of docosahexaenoic acid (DHA). Systematic studies revealed that DHA was converted to DHEA-derived novel bioactive products as well as aspirin-triggered (AT) forms of protectins. The new oxygenated DHEA-derived products blocked PMN chemotaxis, reduced P-selectin expression and platelet-leukocyte adhesion, and showed organ protection in ischemia/reperfusion injury. These products activated cannabinoid CB2 receptors and not CB1 receptors. The AT-PD1 reduced neutrophil (PMN) recruitment in murine peritonitis. With human cells, AT-PD1 decreased transendothelial PMN migration as well as enhanced efferocytosis of apoptotic human PMN by macrophages. The recent findings reviewed here indicate that DHEA oxidative metabolism and aspirin-triggered conversion of DHA produce potent novel molecules with anti-inflammatory and organ-protective properties, opening up the functional roles of the DHA metabolome.
Ramirez, Tzutzuy; Daneshian, Mardas; Kamp, Hennicke; Bois, Frederic Y.; Clench, Malcolm R.; Coen, Muireann; Donley, Beth; Fischer, Steven M.; Ekman, Drew R.; Fabian, Eric; Guillou, Claude; Heuer, Joachim; Hogberg, Helena T.; Jungnickel, Harald; Keun, Hector C.; Krennrich, Gerhard; Krupp, Eckart; Luch, Andreas; Noor, Fozia; Peter, Erik; Riefke, Bjoern; Seymour, Mark; Skinner, Nigel; Smirnova, Lena; Verheij, Elwin; Wagner, Silvia; Hartung, Thomas; van Ravenzwaay, Bennard; Leist, Marcel
Summary Metabolomics, the comprehensive analysis of metabolites in a biological system, provides detailed information about the biochemical/physiological status of a biological system, and about the changes caused by chemicals. Metabolomics analysis is used in many fields, ranging from the analysis of the physiological status of genetically modified organisms in safety science to the evaluation of human health conditions. In toxicology, metabolomics is the -omics discipline that is most closely related to classical knowledge of disturbed biochemical pathways. It allows rapid identification of the potential targets of a hazardous compound. It can give information on target organs and often can help to improve our understanding regarding the mode-of-action of a given compound. Such insights aid the discovery of biomarkers that either indicate pathophysiological conditions or help the monitoring of the efficacy of drug therapies. The first toxicological applications of metabolomics were for mechanistic research, but different ways to use the technology in a regulatory context are being explored. Ideally, further progress in that direction will position the metabolomics approach to address the challenges of toxicology of the 21st century. To address these issues, scientists from academia, industry, and regulatory bodies came together in a workshop to discuss the current status of applied metabolomics and its potential in the safety assessment of compounds. We report here on the conclusions of three working groups addressing questions regarding 1) metabolomics for in vitro studies 2) the appropriate use of metabolomics in systems toxicology, and 3) use of metabolomics in a regulatory context. PMID:23665807
Bromke, Mariusz A.; Sabir, Jamal S.; Alfassi, Fahad A.; Hajarah, Nahid H.; Kabli, Saleh A.; Al-Malki, Abdulrahman L.; Ashworth, Matt P.; Méret, Michaël; Jansen, Robert K.; Willmitzer, Lothar
Diatoms are very efficient in their use of available nutrients. Changes in nutrient availability influence the metabolism and the composition of the cell constituents. Since diatoms are valuable candidates to search for oil producing algae, measurements of diatom-produced compounds can be very useful for biotechnology. In order to explore the diversity of lipophilic compounds produced by diatoms, we describe the results from an analysis of 13 diatom strains. With the help of a lipidomics platform, which combines an UPLC separation with a high resolution/high mass accuracy mass spectrometer, we were able to measure and annotate 142 lipid species. Out of these, 32 were present in all 13 cultures. The annotated lipid features belong to six classes of glycerolipids. The data obtained from the measurements were used to create lipidomic profiles. The metabolomic overview of analysed cultures is amended by the measurement of 96 polar compounds. To further increase the lipid diversity and gain insight into metabolomic adaptation to nitrogen limitation, diatoms were cultured in media with high and low concentrations of nitrate. The growth in nitrogen-deplete or nitrogen-replete conditions affects metabolite accumulation but has no major influence on the species-specific metabolomic profile. Thus, the genetic component is stronger in determining metabolic patterns than nitrogen levels. Therefore, lipid profiling is powerful enough to be used as a molecular fingerprint for diatom cultures. Furthermore, an increase of triacylglycerol (TAG) accumulation was observed in low nitrogen samples, although this trend was not consistent across all 13 diatom strains. Overall, our results expand the current understanding of metabolomics diversity in diatoms and confirm their potential value for producing lipids for either bioenergy or as feed stock. PMID:26440112
Lv, Mengying; Huang, Wanqiu; Chen, Zhipeng; Jiang, Hulin; Chen, Jiaqing; Tian, Yuan; Zhang, Zunjian; Xu, Fengguo
Nanomaterials are commonly defined as engineered structures with at least one dimension of 100 nm or less. Investigations of their potential toxicological impact on biological systems and the environment have yet to catch up with the rapid development of nanotechnology and extensive production of nanoparticles. High-throughput methods are necessary to assess the potential toxicity of nanoparticles. The omics techniques are well suited to evaluate toxicity in both in vitro and in vivo systems. Besides genomic, transcriptomic and proteomic profiling, metabolomics holds great promise for globally evaluating and understanding the molecular mechanism of nanoparticle-organism interaction. This manuscript presents a general overview of metabolomics techniques, summarizes their early application in nanotoxicology and finally discusses the opportunities and challenges faced in nanotoxicology.
Rasmiena, Aliki A; Ng, Theodore W; Meikle, Peter J
Ischaemic heart disease accounts for nearly half of the global cardiovascular disease burden. Aetiologies relating to heart disease are complex, but dyslipidaemia, oxidative stress and inflammation are cardinal features. Despite preventative measures and advancements in treatment regimens with lipid-lowering agents, the high prevalence of heart disease and the residual risk of recurrent events continue to be a significant burden to the health sector and to the affected individuals and their families. The development of improved risk models for the early detection and prevention of cardiovascular events in addition to new therapeutic strategies to address this residual risk are required if we are to continue to make inroads into this most prevalent of diseases. Metabolomics and lipidomics are modern disciplines that characterize the metabolite and lipid complement respectively, of a given system. Their application to ischaemic heart disease has demonstrated utilities in population profiling, identification of multivariate biomarkers and in monitoring of therapeutic response, as well as in basic mechanistic studies. Although advances in magnetic resonance and mass spectrometry technologies have given rise to the fields of metabolomics and lipidomics, the plethora of data generated presents challenges requiring specific statistical and bioinformatics applications, together with appropriate study designs. Nonetheless, the predictive and re-classification capacity of individuals with various degrees of risk by the plasma lipidome has recently been demonstrated. In the present review, we summarize evidence derived exclusively by metabolomic and lipidomic studies in the context of ischaemic heart disease. We consider the potential role of plasma lipid profiling in assessing heart disease risk and therapeutic responses, and explore the potential mechanisms. Finally, we highlight where metabolomic studies together with complementary -omic disciplines may make further
Interstitial cystitis (IC), also known as painful bladder syndrome or bladder pain syndrome, is a chronic lower urinary tract syndrome characterized by pelvic pain, urinary urgency, and increased urinary frequency in the absence of bacterial infection or identifiable clinicopathology. IC can lead to long-term adverse effects on the patient's quality of life. Therefore, early diagnosis and better understanding of the mechanisms underlying IC are needed. Metabolomic studies of biofluids have become a powerful method for assessing disease mechanisms and biomarker discovery, which potentially address these important clinical needs. However, limited intensive metabolic profiles have been elucidated in IC. The article is a short review on metabolomic analyses that provide a unique fingerprint of IC with a focus on its use in determining a potential diagnostic biomarker associated with symptoms, a response predictor of therapy, and a prognostic marker. PMID:25279237
Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie
We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system, and documentation is available at http://www.gnpannot.org/content/chado-controller-doc. The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form. Supplementary data are available at Bioinformatics online.
Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and from highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other
Li, Yanyun; Chen, Minjian; Liu, Cuiping; Xia, Yankai; Xu, Bo; Hu, Yanhui; Chen, Ting; Shen, Meiping; Tang, Wei
Papillary thyroid carcinoma (PTC) is the most common thyroid cancer. The nuclear magnetic resonance (NMR)-based metabolomic technique is the gold standard in metabolite structural elucidation, and can provide different coverage of information compared with other metabolomic techniques. Here, we first conducted an NMR-based metabolomics study of detailed metabolic changes, especially metabolic pathway changes, related to PTC pathogenesis. A 1H NMR-based metabolomic technique was adopted in conjunction with multivariate analysis to analyze matched tumor and normal thyroid tissues obtained from 16 patients. The results were further annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Human Metabolome Database, and were then analyzed using the pathway analysis and enrichment analysis modules of MetaboAnalyst 3.0. Based on the analytical techniques, we established models of principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and orthogonal partial least-squares discriminant analysis (OPLS-DA) that could discriminate PTC from normal thyroid tissue, and found 15 robust differentiated metabolites from two OPLS-DA models. We identified 8 KEGG pathways and 3 pathways of the small molecular pathway database that were significantly related to PTC by using pathway analysis and enrichment analysis, respectively, through which we identified metabolisms related to PTC including branched chain amino acid metabolism (leucine and valine), other amino acid metabolism (glycine and taurine), glycolysis (lactate), tricarboxylic acid cycle (citrate), choline metabolism (choline, ethanolamine and glycerophosphocholine) and lipid metabolism (very-low-density lipoprotein and low-density lipoprotein). In conclusion, PTC was characterized by increased glycolysis and an inhibited tricarboxylic acid cycle, increased oncogenic amino acids as well as abnormal choline and lipid metabolism. The findings in this study provide new
Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Background: Generating good training datasets is essential for machine learning-based nuclei detection methods. However, creating exhaustive nuclei contour annotations, from which to derive optimal training data, is often infeasible. Methods: We compared different approaches for training nuclei detection methods solely based on nucleus center markers. Such markers contain less accurate information, especially with regard to nuclear boundaries, but can be produced much more easily and in greater quantities. The approaches use different automated sample extraction methods to derive image positions and class labels from nucleus center markers. In addition, the approaches use different automated sample selection methods to improve the detection quality of the classification algorithm and reduce the run time of the training process. We evaluated the approaches based on a previously published generic nuclei detection algorithm and a set of Ki-67-stained breast cancer images. Results: A Voronoi tessellation-based sample extraction method produced the best performing training sets. However, subsampling of the extracted training samples was crucial. Even simple class balancing improved the detection quality considerably. The incorporation of active learning led to a further increase in detection quality. Conclusions: With appropriate sample extraction and selection methods, nuclei detection algorithms trained on the basis of simple center marker annotations can produce comparable quality to algorithms trained on conventionally created training sets.
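The "simple class balancing" step mentioned in the Results can be illustrated with a minimal sketch. It assumes random undersampling of every class to the size of the smallest one; the abstract does not state which balancing scheme was used, and the function name `balance_classes` and the labels are hypothetical.

```python
import random
from collections import defaultdict


def balance_classes(samples, labels, seed=0):
    """Randomly subsample every class down to the size of the smallest class.

    Assumption: plain random undersampling; the abstract does not specify
    the balancing scheme actually used.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the rarest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    # Shuffle so that classes are interleaved in the training set.
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Hypothetical toy data: 2 nucleus samples and 8 background samples.
    samples = list(range(10))
    labels = ["nucleus"] * 2 + ["background"] * 8
    print(balance_classes(samples, labels))
```

After balancing, each class contributes the same number of samples, so a classifier is not dominated by the far more numerous background positions.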
van Rijswijk, Merlijn; Beirnaert, Charlie; Caron, Christophe; Cascante, Marta; Dominguez, Victoria; Dunn, Warwick B; Ebbels, Timothy M D; Giacomoni, Franck; Gonzalez-Beltran, Alejandra; Hankemeier, Thomas; Haug, Kenneth; Izquierdo-Garcia, Jose L; Jimenez, Rafael C; Jourdan, Fabien; Kale, Namrata; Klapa, Maria I; Kohlbacher, Oliver; Koort, Kairi; Kultima, Kim; Le Corguillé, Gildas; Moreno, Pablo; Moschonas, Nicholas K; Neumann, Steffen; O'Donovan, Claire; Reczko, Martin; Rocca-Serra, Philippe; Rosato, Antonio; Salek, Reza M; Sansone, Susanna-Assunta; Satagopam, Venkata; Schober, Daniel; Shimmo, Ruth; Spicer, Rachel A; Spjuth, Ola; Thévenot, Etienne A; Viant, Mark R; Weber, Ralf J M; Willighagen, Egon L; Zanetti, Gianluigi; Steinbeck, Christoph
Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the "Future of metabolomics in ELIXIR" was organised at Frankfurt Airport in Germany. This one-day strategic workshop involved representatives of ELIXIR Nodes, members of the PhenoMeNal consortium developing an e-infrastructure that supports workflow-based metabolomics analysis pipelines, and experts from the international metabolomics community. The workshop established metabolite identification as the critical area, where a maximal impact of computational metabolomics and data management on other fields could be achieved. In particular, the existing four ELIXIR Use Cases, where the metabolomics community - both industry and academia - would benefit most, and which could be exhaustively mapped onto the current five ELIXIR Platforms were discussed. This opinion article is a call for support for a new ELIXIR metabolomics Use Case, which aligns with and complements the existing and planned ELIXIR Platforms and Use Cases.
Andrade, M A; Brown, N P; Leroy, C; Hoersch, S; de Daruvar, A; Reich, C; Franchini, A; Tamames, J; Valencia, A; Ouzounis, C; Sander, C
Large-scale genome projects generate a rapidly increasing number of sequences, most of them biochemically uncharacterized. Research in bioinformatics contributes to the development of methods for the computational characterization of these sequences. However, the installation and application of these methods require experience and are time consuming. We present here an automatic system for preliminary functional annotation of protein sequences that has been applied to the analysis of sets of sequences from complete genomes, both to refine overall performance and to make new discoveries comparable to those made by human experts. The GeneQuiz system includes a Web-based browser that allows examination of the evidence leading to an automatic annotation and offers additional information, views of the results, and links to biological databases that complement the automatic analysis. System structure and operating principles concerning the use of multiple sequence databases, underlying sequence analysis tools, lexical analyses of database annotations and decision criteria for functional assignments are detailed. The system makes automatic quality assessments of results based on prior experience with the underlying sequence analysis tools; overall error rates in functional assignment are estimated at 2.5-5% for cases annotated with highest reliability ('clear' cases). Sources of over-interpretation of results are discussed with proposals for improvement. A conservative definition for reporting 'new findings' that takes account of database maturity is presented along with examples of possible kinds of discoveries (new function, family and superfamily) made by the system. System performance in relation to sequence database coverage, database dynamics and database search methods is analysed, demonstrating the inherent advantages of an integrated automatic approach using multiple databases and search methods applied in an objective and repeatable manner. The GeneQuiz system
van der Greef, J.; Smilde, A. K.
Metabolomics is a growing area in the field of systems biology. Metabolomics already has a long history, and its connection with chemometrics also goes back some time. This review discusses the symbiosis of metabolomics and chemometrics with emphasis on the medical domain, puts the
Metabolomics is an “omic” science that is now emerging with the purpose of elaborating a comprehensive analysis of the metabolome, which is the complete set of metabolites (i.e., small molecules intermediates) in an organism, tissue, cell, or biofluid. In the past decade, metabolomics has already proved to be useful for the characterization of several pathological conditions and offers promises as a clinical tool. A metabolomics investigation of coeliac disease (CD) revealed that a metabolic fingerprint for CD can be defined, which accounts for three different but complementary components: malabsorption, energy metabolism, and alterations in gut microflora and/or intestinal permeability. In this review, we will discuss the major advancements in metabolomics of CD, in particular with respect to the role of gut microbiome and energy metabolism. PMID:24665364
Adkins, Daniel E.; McClay, Joseph L.; Vunck, Sarah A.; Batman, Angela M.; Vann, Robert E.; Clark, Shaunna L.; Souza, Renan P.; Crowley, James J.; Sullivan, Patrick F.; van den Oord, Edwin J.C.G.; Beardsley, Patrick M.
Behavioral sensitization has been widely studied in animal models and is theorized to reflect neural modifications associated with human psychostimulant addiction. While the mesolimbic dopaminergic pathway is known to play a role, the neurochemical mechanisms underlying behavioral sensitization remain incompletely understood. In the present study, we conducted the first metabolomics analysis to globally characterize neurochemical differences associated with behavioral sensitization. Methamphetamine-induced sensitization measures were generated by statistically modeling longitudinal activity data for eight inbred strains of mice. Subsequent to behavioral testing, nontargeted liquid and gas chromatography-mass spectrometry profiling was performed on 48 brain samples, yielding 301 metabolite levels per sample after quality control. Association testing between metabolite levels and three primary dimensions of behavioral sensitization (total distance, stereotypy and margin time) showed four robust, significant associations at a stringent metabolome-wide significance threshold (false discovery rate < 0.05). Results implicated homocarnosine, a dipeptide of GABA and histidine, in total distance sensitization, GABA metabolite 4-guanidinobutanoate and pantothenate in stereotypy sensitization, and myo-inositol in margin time sensitization. Secondary analyses indicated that these associations were independent of concurrent methamphetamine levels and, with the exception of the myo-inositol association, suggest a mechanism whereby strain-based genetic variation produces specific baseline neurochemical differences that substantially influence the magnitude of MA-induced sensitization. These findings demonstrate the utility of mouse metabolomics for identifying novel biomarkers, and developing more comprehensive neurochemical models, of psychostimulant sensitization. PMID:24034544
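A metabolome-wide threshold of the kind described above (false discovery rate < 0.05) can be sketched with the Benjamini-Hochberg step-up procedure. This is an assumption for illustration only: the abstract does not say which FDR method was used, and the function name and p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True if significant at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the study may have
    used a different FDR estimator.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # All tests ranked at or below max_k are declared significant.
    significant = [False] * m
    for idx in order[:max_k]:
        significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical p-values for four metabolite-phenotype association tests.
    print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5]))
```

With several hundred metabolites tested against several phenotypes, a per-test threshold of 0.05 would admit many false positives; controlling the FDR bounds the expected fraction of false positives among the associations that are reported.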
Previous studies have shown that calcium stressed Saccharomyces cerevisiae, challenged with immunosuppressant drugs FK506 and Cyclosporin A, responds with comprehensive gene expression changes and attenuation of the generalized calcium stress response. Here, we describe a global metabolomics workflow for investigating the utility of tracking corresponding phenotypic changes. This was achieved by efficiently analyzing relative abundance differences between intracellular metabolite pools from wild-type and calcium stressed cultures, with and without prior immunosuppressant drug exposure. We used pathway database content from WikiPathways and YeastCyc to facilitate the projection of our metabolomics profiling results onto biological pathways. A key challenge was to increase the coverage of the detected metabolites. This was achieved by applying both reverse phase (RP) and aqueous normal phase (ANP) chromatographic separations, as well as electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI) sources for detection in both ion polarities. Unsupervised principal component analysis (PCA) and ANOVA results revealed differentiation between wild-type controls, calcium stressed and immunosuppressant/calcium challenged cells. Untargeted data mining resulted in 247 differentially expressed, annotated metabolites, across at least one pair of conditions. A separate, targeted data mining strategy identified 187 differential, annotated metabolites. All annotated metabolites were subsequently mapped onto curated pathways from YeastCyc and WikiPathways for interactive pathway analysis and visualization. Dozens of pathways showed differential responses to stress conditions based on one or more matches to the list of annotated metabolites or to metabolites that had been identified further by MS/MS. The purine salvage, pantothenate and sulfur amino acid pathways were flagged as being enriched, which is consistent with previously published
Holt, Carson; Yandell, Mark
Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.
Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua
Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so that they are inevitably trapped into suboptimal performance of these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems based on which a variety of loss functions with respect to objective-guided measures are defined. And then, we formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. According to the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper, we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
Metabolomics is a promising avenue for biomarker discovery. Although the quality of metabolomic analyses, especially global metabolomics (G-Met) using mass spectrometry (MS), largely depends on the instrumentation, potential bottlenecks still exist at several basic levels in the metabolomics workflow. Therefore, we established a precise protocol, initially for the G-Met analyses of human blood plasma, to overcome some of these difficulties. In our protocol, samples are deproteinized in a 96-well plate using an automated liquid-handling system, and analyses are conducted using either a UHPLC-QTOF/MS system equipped with a reverse phase column or a LC-FTMS system equipped with a normal phase column. A normalization protocol for G-Met data was also developed to compensate for intra- and inter-batch differences, and the variations were significantly reduced by our normalization, especially for the UHPLC-QTOF/MS data with a C18 reverse-phase column for positive ions. Secondly, we examined the changes in metabolomic profiles caused by the storage of EDTA-blood specimens to identify quality markers for the evaluation of the specimens' pre-analytical conditions. Forty quality markers, including lysophospholipids, dipeptides, fatty acids, succinic acid, amino acids, glucose, and uric acid, were identified by G-Met for the evaluation of plasma sample quality, and an equation for calculating the quality score was established. We applied our quality markers to a small-scale study to evaluate the quality of clinical samples. The G-Met protocols and quality markers established here should prove useful for the discovery and development of biomarkers for a wider range of diseases.
Seyler, L. M.; Rempfert, K. R.; Kraus, E. A.; Spear, J. R.; Templeton, A. S.; Schrenk, M. O.
Environmental metabolomics is an emerging approach used to study ecosystem properties. Through bioinformatic comparisons to metagenomic data sets, metabolomics can be used to study microbial adaptations and responses to varying environmental conditions. Since the techniques are highly parallel to organic geochemistry approaches, metabolomics can also provide insight into biogeochemical processes. These analyses are a reflection of metabolic potential and intersection with other organisms and environmental components. Here, we used an untargeted metabolomics approach to characterize dissolved organic carbon and aqueous metabolites from groundwater obtained from an actively serpentinizing habitat. Serpentinites are known to support microbial communities that feed off of the products of serpentinization (such as methane and H2 gas), while adapted to harsh environmental conditions such as high pH and low DIC availability. However, the biochemistry of microbial populations that inhabit these environments is understudied and is complicated by overlapping biotic and abiotic processes. The aim of this study was to identify potential sources of carbon in an environment that is depleted of soluble inorganic carbon, and to characterize the flow of metabolites and describe overlapping biogenic and abiogenic processes impacting carbon cycling in serpentinizing rocks. We applied untargeted metabolomics techniques to groundwater taken from a series of wells drilled into the Semail Ophiolite in Oman. Samples were analyzed via quadrupole time-of-flight liquid chromatography tandem mass spectrometry (QToF-LC/MS/MS). Metabolomes and metagenomic data were imported into Progenesis QI software for statistical analysis and correlation, and metabolic networks were constructed using the Genome-Linked Application for Metabolic Maps (GLAMM), a web interface tool. Further multivariate statistical analyses and quality control were performed using EZinfo. Pools of dissolved organic carbon could
Hiller, Karsten; Metallo, Christian; Stephanopoulos, Gregory
Metabolomics and metabolic flux analysis (MFA) are powerful tools in the arsenal of methodologies of systems biology. Currently, metabolomics techniques are applied routinely for biomarker determination. However, standard metabolomics techniques only provide static information about absolute or relative metabolite amounts. The application of stable-isotope tracers has opened up a new dimension to metabolomics by providing dynamic information of intracellular fluxes and, by extension, enzyme activities. In the first part of the manuscript we review experimental and computational technologies applicable for metabolomics analyses. In the second part we present current technologies based on the use of stable isotopes and their applications to the analysis of cellular metabolism. Beginning with the determination of mass isotopomer distributions (MIDs), we review technologies for metabolic flux analysis (MFA) and conclude with the presentation of a new methodology for the non-targeted analysis of stable-isotope labeled metabolomics data.
Background: Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model. Results: Here, probabilistic principal component analysis (PPCA), which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data. Conclusions: The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight into the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
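As a rough illustration of the PPCA model this record reviews, the sketch below computes the well-known closed-form maximum-likelihood estimates for PPCA (Tipping and Bishop): the noise variance is the mean of the discarded covariance eigenvalues, and the loadings are the leading eigenvectors scaled by the square root of their eigenvalue minus that noise variance. This is not code from MetabolAnalyze; the function name `ppca_ml` and its arguments are hypothetical, and the PPCCA extension, mixture models, jackknife intervals and modified BIC described in the abstract are not covered.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form maximum-likelihood PPCA fit (illustrative sketch).

    X : (n_samples, n_features) data matrix.
    q : number of latent components, with 0 < q < n_features.
    Returns (W, sigma2, mu): loadings of shape (n_features, q),
    the isotropic noise variance, and the feature means.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if not 0 < q < n_features:
        raise ValueError("q must satisfy 0 < q < n_features")
    mu = X.mean(axis=0)
    # Maximum-likelihood covariance estimate (divides by n, not n - 1).
    S = np.cov(X - mu, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Noise variance: average of the eigenvalues that are not retained.
    sigma2 = eigvals[q:].mean()
    # Loadings: leading eigenvectors scaled by sqrt(eigenvalue - sigma2).
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale
    return W, sigma2, mu
```

The fitted model covariance is `W @ W.T + sigma2 * np.eye(n_features)`; comparing it with the sample covariance is a quick sanity check on a fit.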
A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.
James P Balhoff
Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.
Kumar, Arun; Mosa, Kareem A; Ji, Liyao; Kage, Udaykumar; Dhokane, Dhananjay; Karre, Shailesh; Madalageri, Deepa; Pathania, Neemisha
Today, the dramatic changes in the types of food consumed have led to an increased burden of chronic diseases. Therefore, the emphasis of food research is not only to ensure quality food that can supply adequate nutrients to prevent nutrition-related diseases, but also to ensure overall physical and mental health. This has led to the concept of functional foods and nutraceuticals (FFNs), which can be ideally produced and delivered through plants. Metabolomics can help in getting the most relevant functional information, and thus has been considered the greatest -OMICS technology to date. However, metabolomics has not been exploited to its full potential in plant sciences. The technology can be leveraged to identify the health-promoting compounds and metabolites that can be used for the development of FFNs. This article reviews (i) plant-based FFNs-related metabolites and their health benefits; (ii) use of different analytic platforms for targeted and non-targeted metabolite profiling along with experimental considerations; (iii) exploitation of metabolomics to develop FFNs in plants using various biotechnological tools; and (iv) potential use of metabolomics in plant breeding. We have also provided some insights into integration of metabolomics with the latest genome editing tools for metabolic pathway regulation in plants.
Brooke N. Dulka
Acute social defeat represents a naturalistic form of conditioned fear and is an excellent model in which to investigate the biological basis of stress resilience. While there is growing interest in identifying biomarkers of stress resilience, until recently, it has not been feasible to associate levels of large numbers of neurochemicals and metabolites to stress-related phenotypes. The objective of the present study was to use an untargeted metabolomics approach to identify known and unknown neurochemicals in select brain regions that distinguish susceptible and resistant individuals in two rodent models of acute social defeat. In the first experiment, male mice were first phenotyped as resistant or susceptible. Then, mice were subjected to acute social defeat, and tissues were immediately collected from the ventromedial prefrontal cortex (vmPFC), basolateral/central amygdala (BLA/CeA), nucleus accumbens (NAc), and dorsal hippocampus (dHPC). Ultra-high performance liquid chromatography coupled with high resolution mass spectrometry (UPLC-HRMS) was used for the detection of water-soluble neurochemicals. In the second experiment, male Syrian hamsters were paired in daily agonistic encounters for 2 weeks, during which they formed stable dominant-subordinate relationships. Then, 24 h after the last dominance encounter, animals were exposed to acute social defeat stress. Immediately after social defeat, tissue was collected from the vmPFC, BLA/CeA, NAc, and dHPC for analysis using UPLC-HRMS. Although no single biomarker characterized stress-related phenotypes in both species, commonalities were found. For instance, in both model systems, animals resistant to social defeat stress also show increased concentration of molecules to protect against oxidative stress in the NAc and vmPFC. Additionally, in both mice and hamsters, unidentified spectral features were preliminarily annotated as potential targets for future experiments. Overall, these findings
Guard cells represent a unique single cell-type system for the study of cellular responses to abiotic and biotic perturbations that affect stomatal movement. Decades of effort through both classical physiological and functional genomics approaches have generated an enormous amount of information on the roles of individual metabolites in stomatal guard cell function and physiology. Recent application of metabolomics methods has produced a substantial amount of new information on metabolome control of stomatal movement. In conjunction with other ‘omics’ approaches, the knowledge-base is growing to reach a systems-level description of this single cell-type. Here we summarize current knowledge of the guard cell metabolome and highlight critical metabolites that bear significant impact on future engineering and breeding efforts to generate plants/crops that are resistant to environmental challenges and produce high yield and quality products for food and energy security.
Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas
Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Metabolomics in maternal-fetal medicine is still an “embryonic” science. However, there is already an increasing interest in the metabolome of normal and complicated pregnancies, and neonatal outcomes. Tissues used for metabolomics interrogations of pregnant women, fetuses and newborns are amniotic fluid, blood, plasma, cord blood, placenta, urine, and vaginal secretions. All published papers highlight the strong correlation between biomarkers found in these tissues and fetal malformations, preterm delivery, premature rupture of membranes, gestational diabetes mellitus, preeclampsia, neonatal asphyxia, and hypoxic-ischemic encephalopathy. The aim of this review is to summarize and comment on original data available in relevant published works in order to emphasize the clinical potential of metabolomics in obstetrics in the immediate future.
Young, Jasmine Y.; Feng, Zukang; Dimitropoulos, Dimitris; Sala, Raul; Westbrook, John; Zhuravleva, Marina; Shao, Chenghua; Quesada, Martha; Peisach, Ezra; Berman, Helen M.
Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality. Database URL: http://wwpdb.org PMID:24291661
Chakrabarti, Chayan; Jones, Thomas B; Luger, George F; Xu, Jiawei F; Turner, Matthew D; Laird, Angela R; Turner, Jessica A
Ontologies encode relationships within a domain in robust data structures that can be used to annotate data objects, including scientific papers, in ways that ease tasks such as search and meta-analysis. However, the annotation process requires significant time and effort when performed by humans. Text mining algorithms can facilitate this process, but they render an analysis mainly based upon keyword, synonym and semantic matching. They do not leverage information embedded in an ontology's structure. We present a probabilistic framework that facilitates the automatic annotation of literature by indirectly modeling the restrictions among the different classes in the ontology. Our research focuses on annotating human functional neuroimaging literature within the Cognitive Paradigm Ontology (CogPO). We use an approach that combines the stochastic simplicity of naïve Bayes with the formal transparency of decision trees. Our data structure is easily modifiable to reflect changing domain knowledge. We compare our results across naïve Bayes, Bayesian Decision Trees, and Constrained Decision Tree classifiers that keep a human expert in the loop, in terms of the quality measure of the F1-micro score. Unlike traditional text mining algorithms, our framework can model the knowledge encoded by the dependencies in an ontology, albeit indirectly. We successfully exploit the fact that CogPO has explicitly stated restrictions, and implicit dependencies in the form of patterns in the expert curated annotations.
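For readers unfamiliar with the micro-averaged F1 score named as the quality measure above, the following sketch shows the standard definition for multi-label annotation: true positives, false positives and false negatives are pooled over all labels before F1 is computed. The macro average (mean of per-label F1) is returned alongside it for contrast. The function name `f1_micro_macro` and the toy label matrices are hypothetical and are not taken from the CogPO study.

```python
import numpy as np

def f1_micro_macro(y_true, y_pred):
    """Micro- and macro-averaged F1 for binary multi-label indicator matrices.

    y_true, y_pred : (n_items, n_labels) arrays of 0/1 values.
    Labels with no true and no predicted instances get F1 = 0 here
    (a convention chosen for this sketch).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = (y_true & y_pred).sum(axis=0)    # per-label true positives
    fp = (~y_true & y_pred).sum(axis=0)   # per-label false positives
    fn = (y_true & ~y_pred).sum(axis=0)   # per-label false negatives

    def f1(tp, fp, fn):
        denom = 2 * tp + fp + fn
        return np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)

    micro = float(f1(tp.sum(), fp.sum(), fn.sum()))  # pool counts, then score
    macro = float(f1(tp, fp, fn).mean())             # score per label, then average
    return micro, macro

# Toy case: a frequent label predicted perfectly, a rare label missed entirely.
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [0, 0]]
micro, macro = f1_micro_macro(y_true, y_pred)
```

In the toy case the micro average is 6/7 while the macro average is 0.5, because a single missed instance of the rare label counts as one error among seven pooled decisions in the former but zeroes out half of the labels in the latter.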
Markowitz, Victor M.; Mavromatis, Konstantinos; Ivanova, Natalia N.; Chen, I-Min A.; Chu, Ken; Kyrpides, Nikos C.
A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct. We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.
Poli, Rosario, Comp.
An annotated bibliography lists 74 articles and reports on instructional materials centers (IMC) which appeared from 1967-70. The articles deal with such topics as the purposes of an IMC, guidelines for setting up an IMC, and the relationship of an IMC to technology. Most articles deal with use of an IMC on an elementary or secondary level, but…
F.-M. Nack (Frank); W. Putz
This paper considers the automated and semi-automated annotation of audiovisual media in a new type of production framework, A4SM (Authoring System for Syntactic, Semantic and Semiotic Modelling). We present the architecture of the framework and outline the underlying XML-Schema based
T. Tsikrika (Theodora); C. Diou; A.P. de Vries (Arjen); A. Delopoulos
Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the
Tolstikov, Vladimir; Nikolayev, Alexander; Dong, Sucai; Zhao, Genshi; Kuo, Ming-Shang
Nicotinamide phosphoribosyltransferase (NAMPT) plays an important role in cellular bioenergetics. It is responsible for converting nicotinamide to nicotinamide adenine dinucleotide, an essential molecule in cellular metabolism. NAMPT has been extensively studied over the past decade due to its role as a key regulator of nicotinamide adenine dinucleotide-consuming enzymes. NAMPT is also known as a potential target for therapeutic intervention due to its involvement in disease. In the current study, we used a global mass spectrometry-based metabolomic approach to investigate the effects of FK866, a small molecule inhibitor of NAMPT currently in clinical trials, on metabolic perturbations in human cancer cells. We treated A2780 (ovarian cancer) and HCT-116 (colorectal cancer) cell lines with FK866 in the presence and absence of nicotinic acid. Significant changes were observed in the amino acids metabolism and the purine and pyrimidine metabolism. We also observed metabolic alterations in glycolysis, the citric acid cycle (TCA), and the pentose phosphate pathway. To expand the range of the detected polar metabolites and improve data confidence, we applied a global metabolomics profiling platform by using both non-targeted and targeted hydrophilic (HILIC)-LC-MS and GC-MS analysis. We used Ingenuity Knowledge Base to facilitate the projection of metabolomics data onto metabolic pathways. Several metabolic pathways showed differential responses to FK866 based on several matches to the list of annotated metabolites. This study suggests that global metabolomics can be a useful tool in pharmacological studies of the mechanism of action of drugs at a cellular level.
Konyushkova, Ksenia; Uijlings, Jasper; Lampert, Christoph; Ferrari, Vittorio
We introduce Intelligent Annotation Dialogs for bounding box annotation. We train an agent to automatically choose a sequence of actions for a human annotator to produce a bounding box in a minimal amount of time. Specifically, we consider two actions: box verification, where the annotator verifies a box generated by an object detector, and manual box drawing. We explore two kinds of agents, one based on predicting the probability that a box will be positively verified, and the other bas...
Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.
Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search
Background: Metabolomics is a promising tool for cardiovascular biomarker discovery. We systematically reviewed the literature on comprehensive metabolomic profiling in association with incident cardiovascular disease (CVD). Methods and Results: We searched MEDLINE and EMBASE from inception to Janua...
Hall, R.D.; Beale, M.; Fiehn, O.; Hardy, N.; Summer, L.; Bino, R.
After the establishment of technologies for high-throughput DNA sequencing (genomics), gene expression analysis (transcriptomics), and protein analysis (proteomics), the remaining functional genomics challenge is that of metabolomics. Metabolomics is the term coined for essentially comprehensive,
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were
Yuliya V Karpievitch
Liquid chromatography mass spectrometry has become one of the analytical platforms of choice for metabolomics studies. However, LC-MS metabolomics data can suffer from the effects of various systematic biases. These include batch effects, day-to-day variations in instrument performance, signal intensity loss due to time-dependent effects of the LC column performance, accumulation of contaminants in the MS ion source and MS sensitivity, among others. In this study we aimed to test a singular value decomposition-based method, called EigenMS, for normalization of metabolomics data. We analyzed a clinical human dataset where LC-MS serum metabolomics data and physiological measurements were collected from thirty-nine healthy subjects and forty with type 2 diabetes, and applied EigenMS to detect and correct for any systematic bias. EigenMS works in several stages. First, EigenMS preserves the treatment group differences in the metabolomics data by estimating treatment effects with an ANOVA model (multiple fixed effects can be estimated). Singular value decomposition of the residuals matrix is then used to determine bias trends in the data. The number of bias trends is then estimated via a permutation test and the effects of the bias trends are eliminated. EigenMS removed bias of unknown complexity from the LC-MS metabolomics data, allowing for increased sensitivity in differential analysis. Moreover, normalized samples better correlated with both other normalized samples and corresponding physiological data, such as blood glucose level, glycated haemoglobin, exercise central augmentation pressure normalized to heart rate of 75, and total cholesterol. We were able to report 2578 discriminatory metabolite peaks in the normalized data (p<0.05) as compared to only 1840 metabolite signals in the raw data. Our results support the use of singular value decomposition-based normalization for metabolomics data.
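The stages described above can be sketched in a few lines of NumPy. This is a deliberately simplified illustration, not the EigenMS implementation: it assumes a single treatment factor (so the ANOVA fit reduces to per-group means), it takes the number of bias trends as a given argument instead of estimating it with a permutation test, and the name `remove_bias_trends` and its parameters are hypothetical.

```python
import numpy as np

def remove_bias_trends(Y, groups, n_trends):
    """Simplified SVD-based normalization in the spirit of EigenMS.

    Y        : (n_features, n_samples) matrix of log-scale intensities.
    groups   : length n_samples array of treatment-group labels.
    n_trends : number of bias trends to remove (assumed known here;
               the abstract describes estimating it by permutation test).
    """
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    # Step 1: one-way ANOVA fit. For each feature, the fitted value of a
    # sample is the mean of that feature within the sample's group.
    fitted = np.empty_like(Y)
    for g in np.unique(groups):
        cols = groups == g
        fitted[:, cols] = Y[:, cols].mean(axis=1, keepdims=True)
    residuals = Y - fitted
    # Step 2: SVD of the residual matrix. Rows of Vt are sample-wise
    # trends, ordered by how much residual variation they explain.
    _, _, Vt = np.linalg.svd(residuals, full_matrices=False)
    trends = Vt[:n_trends]
    # Step 3: subtract the part of the residuals explained by the trends.
    bias = residuals @ trends.T @ trends
    return Y - bias
```

Because the trends are estimated from residuals whose group means are zero, subtracting them leaves each feature's group means unchanged, which is how the treatment differences are preserved in this sketch.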
David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.
The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0, framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively, such as implicit multithreading and auto-documenting capabilities, while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to the framework by the use of specific APIs and/or data types, they can more easily be reused both within the framework and outside of it. To compare the effectiveness of an annotation-based framework approach with other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. In a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the
McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo
Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.
Lee, Jang-Eun; Lee, Bum-Jin; Chung, Jin-Oh; Kim, Hak-Nam; Kim, Eun-Hee; Jung, Sungheuk; Lee, Hyosang; Lee, Sang-Jun; Hong, Young-Shick
Numerous factors such as geographical origin, cultivar, climate, cultural practices, and manufacturing processes influence the chemical compositions of tea, in the same way as growing conditions and grape variety affect wine quality. However, the relationships between these factors and tea chemical compositions are not well understood. In this study, a new approach for non-targeted or global analysis, i.e., metabolomics, which is highly reproducible and statistically effective in analysing a diverse range of compounds, was used to better understand the metabolome of Camellia sinensis and determine the influence of environmental factors, including geography, climate, and cultural practices, on tea-making. We found a strong correlation between environmental factors and the metabolome of green, white, and oolong teas from China, Japan, and South Korea. In particular, multivariate statistical analysis revealed strong inter-country and inter-city relationships in the levels of theanine and catechin derivatives found in green and white teas. This information might be useful for assessing tea quality or producing distinct tea products across different locations, and highlights simultaneous identification of diverse tea metabolites through an NMR-based metabolomics approach. Copyright © 2014 Elsevier Ltd. All rights reserved.
Coene, Karlien L M; Kluijtmans, Leo A J; van der Heeft, Ed; Engelke, Udo F H; de Boer, Siebolt; Hoegen, Brechtje; Kwast, Hanneke J T; van de Vorst, Maartje; Huigen, Marleen C D G; Keularts, Irene M L W; Schreuder, Michiel F; van Karnebeek, Clara D M; Wortmann, Saskia B; de Vries, Maaike C; Janssen, Mirian C H; Gilissen, Christian; Engel, Jasper; Wevers, Ron A
The implementation of whole-exome sequencing in clinical diagnostics has generated a need for functional evaluation of genetic variants. In the field of inborn errors of metabolism (IEM), a diverse spectrum of targeted biochemical assays is employed to analyze a limited number of metabolites. We now present a single-platform, high-resolution liquid chromatography quadrupole time of flight (LC-QTOF) method that can be applied for holistic metabolic profiling in plasma of individual IEM-suspected patients. This method, which we termed "next-generation metabolic screening" (NGMS), can detect >10,000 features in each sample. In the NGMS workflow, features identified in patient and control samples are aligned using the "various forms of chromatography mass spectrometry (XCMS)" software package. Subsequently, all features are annotated using the Human Metabolome Database, and statistical testing is performed to identify significantly perturbed metabolite concentrations in a patient sample compared with controls. We propose three main modalities to analyze complex, untargeted metabolomics data. First, a targeted evaluation can be done based on identified genetic variants of uncertain significance in metabolic pathways. Second, we developed a panel of IEM-related metabolites to filter untargeted metabolomics data. Based on this IEM-panel approach, we provided the correct diagnosis for 42 of 46 IEMs. As a last modality, metabolomics data can be analyzed in an untargeted setting, which we term "open the metabolome" analysis. This approach identifies potential novel biomarkers in known IEMs and leads to identification of biomarkers for as yet unknown IEMs. We are convinced that NGMS is the way forward in laboratory diagnostics of IEMs.
Victoria Dominguez Del Angel
As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).
Zhang, Aihua; Sun, Hui; Wang, Zhigang; Sun, Wenjun; Wang, Ping; Wang, Xijun
Metabolomics represents a global understanding of the metabolite complement of integrated living systems and their dynamic responses to changes in both endogenous and exogenous factors, and it has many potential applications and advantages for the research of complex systems. As a systemic approach, metabolomics adopts a "top-down" strategy to reflect the function of organisms from the end products of the metabolic network and to understand metabolic changes of a complete system caused by interventions in a holistic context. This property agrees with the holistic thinking of Traditional Chinese Medicine (TCM), a complex medical science, suggesting that metabolomics has the potential to impact our understanding of the theory behind the evidence-based Chinese medicine. Consequently, the development of robust metabolomic platforms will greatly facilitate, for example, the understanding of the action mechanisms of TCM formulae and the analysis of Chinese herbal (CHM) and mineral medicine, acupuncture, and Chinese medicine syndromes. This review summarizes some of the applications of metabolomics in special TCM issues with an emphasis on metabolic biomarker discovery. © Georg Thieme Verlag KG Stuttgart · New York.
Courant, Frédérique; Antignac, Jean-Philippe; Dervilly-Pinel, Gaud; Le Bizec, Bruno
The emerging field of metabolomics, aiming to characterize small molecule metabolites present in biological systems, promises immense potential for different areas such as medicine, environmental sciences, agronomy, etc. The purpose of this article is to guide the reader through the history of the field, then through the main steps of the metabolomics workflow, from study design to structure elucidation, and help the reader to understand the key phases of a metabolomics investigation and the rationale underlying the protocols and techniques used. This article is not intended to give standard operating procedures as several papers related to this topic were already provided, but is designed as a tutorial aiming to help beginners understand the concept and challenges of MS-based metabolomics. A real case example is taken from the literature to illustrate the application of the metabolomics approach in the field of doping analysis. Challenges and limitations of the approach are then discussed along with future directions in research to cope with these limitations. This tutorial is part of the International Proteomics Tutorial Programme (IPTP18). © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Berardini, Tanya Z; Li, Donghui; Muller, Robert; Chetty, Raymond; Ploetz, Larry; Singh, Shanker; Wensel, April; Huala, Eva
As the scientific literature grows, leading to an increasing volume of published experimental data, so does the need to access and analyze this data using computational tools. The most commonly used method to convert published experimental data on gene function into controlled vocabulary annotations relies on a professional curator, employed by a model organism database or a more general resource such as UniProt, to read published articles and compose annotation statements based on the articles' contents. A more cost-effective and scalable approach capable of capturing gene function data across the whole range of biological research organisms in computable form is urgently needed. We have analyzed a set of ontology annotations generated through collaborations between the Arabidopsis Information Resource and several plant science journals. Analysis of the submissions entered using the online submission tool shows that most community annotations were well supported and the ontology terms chosen were at an appropriate level of specificity. Of the 503 individual annotations that were submitted, 97% were approved and community submissions captured 72% of all possible annotations. This new method for capturing experimental results in a computable form provides a cost-effective way to greatly increase the available body of annotations without sacrificing annotation quality. Database URL: www.arabidopsis.org.
Stubbs, Amber; Uzuner, Özlem
The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1304 longitudinal medical records describing 296 patients for risk factors and the times they were present. We designed the annotation task for this track with the goal of balancing annotation load and time with quality, so as to generate a gold standard corpus that can benefit a clinically-relevant task. We applied light annotation procedures and determined the gold standard using majority voting. On average, the agreement of annotators with the gold standard was above 0.95, indicating high reliability. The resulting document-level annotations generated for each record in each longitudinal EMR in this corpus provide information that can support studies of progression of heart disease risk factors in the included patients over time. These annotations were used in the Risk Factor track of the 2014 i2b2/UTHealth shared task. Participating systems achieved a mean micro-averaged F1 measure of 0.815 and a maximum F1 measure of 0.928 for identifying these risk factors in patient records. Copyright © 2015 Elsevier Inc. All rights reserved.
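The two quantitative steps named in this abstract, deriving a gold standard by majority voting and scoring systems with a micro-averaged F1 measure, can be illustrated with a minimal Python sketch. The record identifiers, labels, and tag names below are hypothetical toy data, and the functions are a generic illustration rather than the shared task's actual evaluation code.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by a strict majority of annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-record sets of tags.

    True positives, false positives, and false negatives are pooled
    across all records before precision and recall are computed.
    """
    tp = fp = fn = 0
    for record_id in gold.keys() | predicted.keys():
        g = gold.get(record_id, set())
        p = predicted.get(record_id, set())
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: three annotators label one risk factor per record.
annotations = {
    "rec1": ["present", "present", "absent"],
    "rec2": ["absent", "absent", "absent"],
}
gold_labels = {rid: majority_vote(votes) for rid, votes in annotations.items()}

# Hypothetical gold-standard and system tag sets per record.
gold = {"rec1": {"hypertension", "diabetes"}, "rec2": {"smoker"}}
pred = {"rec1": {"hypertension"}, "rec2": {"smoker", "obesity"}}
score = micro_f1(gold, pred)
```

Pooling the counts before computing precision and recall is what makes the measure "micro-averaged": records with many tags weigh more than records with few.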
Zeng, Min; Hao, Wenlong; Zou, Yongdong; Shi, Mengliang; Jiang, Yongguang; Xiao, Peng; Lei, Anping; Hu, Zhangli; Zhang, Weiwen; Zhao, Liqing; Wang, Jiangxin
Microalgae have been recognized as a good food source of natural biologically active ingredients. Among them, the green microalga Euglena is a very promising food and nutritional supplement, providing high value-added poly-unsaturated fatty acids, paramylon and proteins. Different culture conditions could affect the chemical composition and food quality of microalgal cells. However, little information is available for distinguishing the different cellular changes, especially in active ingredients including poly-unsaturated fatty acids and other metabolites, under different culture conditions such as light and dark. In this study, together with fatty acid profiling, we applied a gas chromatography-mass spectrometry (GC-MS)-based metabolomics approach to differentiate heterotrophic and mixotrophic culture conditions. This study suggests metabolomics can shed light on metabolomic changes under different culture conditions and provides a theoretical basis for industrial applications of microalgae as food with higher-quality active ingredients.
Dairy products are an important component in the Western diet and represent a valuable source of nutrients for humans. However, a reliable dairy intake assessment in nutrition research is crucial to correctly elucidate the link between dairy intake and human health. Metabolomics is considered a potential tool for assessment of dietary intake instead of traditional methods, such as food frequency questionnaires, food records, and 24-h recalls. Metabolomics has been successfully applied to discriminate between consumption of different dairy products under different experimental conditions. Moreover, potential metabolites related to dairy intake were identified, although these metabolites need to be further validated in other intervention studies before they can be used as valid biomarkers of dairy consumption. Therefore, this review provides an overview of metabolomics for assessment of dairy intake in order to better clarify the role of dairy products in human nutrition and health.
Although the expression of multiple genes and proteins has been extensively profiled in human pulmonary arterial hypertension (PAH), the mechanism for the development and progression of pulmonary hypertension remains elusive. Analysis of the global metabolomic heterogeneity within the pulmonary vascular system leads to a better understanding of disease progression. Using a combination of high-throughput liquid-and-gas-chromatography-based mass spectrometry, we showed unbiased metabolomic profiles of disrupted glycolysis, increased TCA cycle, and fatty acid metabolites with altered oxidation pathways in the human PAH lung. The results suggest that PAH has specific metabolic pathways contributing to increased ATP synthesis for the vascular remodeling process in severe pulmonary hypertension. These identified metabolites may serve as potential biomarkers for the diagnosis of PAH. By profiling metabolomic alterations of the PAH lung, we reveal new pathogenic mechanisms of PAH, opening an avenue of exploration for therapeutics that target metabolic pathway alterations in the progression of PAH.
Inflammatory Bowel Disease (IBD) is a multifactorial disorder that conceptually occurs as a result of altered immune responses to commensal and/or pathogenic gut microbes in individuals most susceptible to the disease. During Crohn’s Disease (CD) or Ulcerative Colitis (UC), two components of the human IBD, distinct stages define the disease onset, severity, progression and remission. Epigenetic, environmental (microbiome, metabolome) and nutritional factors are important in IBD pathogenesis. While the dysbiotic microbiota has been proposed to play a role in disease pathogenesis, the data on IBD and diet are still less convincing. Nonetheless, studies are ongoing to examine the effect of pre/probiotics and/or FODMAP reduced diets on both the gut microbiome and its metabolome in an effort to define the healthy diet in patients with IBD. Knowledge of a unique metabolomic fingerprint in IBD could be useful for diagnosis, treatment and detection of disease pathogenesis.
Hanna, Mina H; Brophy, Patrick D
Metabolomics, the latest of the “omics” sciences, refers to the systematic study of metabolites and their changes in biological samples due to physiological stimuli and/or genetic modification. Because metabolites represent the downstream expression of genome, transcriptome and proteome, they can closely reflect the phenotype of an organism at a specific time. As an emerging field in analytical biochemistry, metabolomics has the potential to play a major role in monitoring real-time kidney function and detecting adverse renal events. Additionally, small molecule metabolites can provide mechanistic insights for novel biomarkers of kidney diseases, given the limitations of the current traditional markers. The clinical utility of metabolomics in the field of pediatric nephrology includes biomarker discovery, defining as yet unrecognized biologic therapeutic targets, linking of metabolites to relevant standard indices and clinical outcomes, and providing a window of opportunity to investigate the intricacies of environment/genetic interplay in specific disease states. PMID:25027575
Jovanović, Jelena; Bagheri, Ebrahim
The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
Pinoli, Pietro; Chicco, Davide; Masseroli, Marco
Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from being definitive and it rapidly evolves, and some erroneous annotations may be present. Since the curation process of novel annotations is a costly procedure, both in economical and time terms, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis), and propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose the improvement of these algorithms by weighting the annotations in the input set. We tested our methods and their weighted variants on the Gene Ontology annotation sets of three model organism genes (Bos taurus, Danio rerio and Drosophila melanogaster). The methods showed their ability in predicting novel gene annotations, and the weighting procedures were shown to lead to a valuable improvement, although the obtained results vary according to the dimension of the input annotation set and the considered algorithm. Out of the three considered methods, the Semantic IMproved Latent Semantic Analysis is the one that provides better results. In particular, when coupled with a proper weighting policy, it is able to predict a
Liu, Jinding; Xiao, Huamei; Huang, Shuiqing; Li, Fei
Insects are one of the largest classes of animals on Earth and constitute more than half of all living species. The i5k initiative has begun sequencing of more than 5,000 insect genomes, which should greatly help in exploring insect resources and pest control. Insect genome annotation remains challenging because many insects have high levels of heterozygosity. To improve the quality of insect genome annotation, we developed a pipeline, named Optimized Maker-Based Insect Genome Annotation (OMIGA), to predict protein-coding genes from insect genomes. We first mapped RNA-Seq reads to genomic scaffolds to determine transcribed regions using Bowtie, and the putative transcripts were assembled using Cufflinks. We then selected highly reliable transcripts with intact coding sequences to train de novo gene prediction software, including Augustus. The re-trained software was used to predict genes from insect genomes. Exonerate was used to refine gene structure and to determine near exact exon/intron boundaries in the genome. Finally, we used the software Maker to integrate data from RNA-Seq, de novo gene prediction, and protein alignment to produce an official gene set. The OMIGA pipeline was used to annotate the draft genome of an important insect pest, Chilo suppressalis, yielding 12,548 genes. Different strategies were compared, which demonstrated that OMIGA had the best performance. In summary, we present a comprehensive pipeline for identifying genes in insect genomes that can be widely used to improve the annotation quality in insects. OMIGA is provided at http://ento.njau.edu.cn/omiga.html.
The comprehensive experimental analysis of a metabolic constitution plays a central role in approaches of organismal systems biology. Quantifying the impact of a changing environment on the homeostasis of cellular metabolism has been the focus of numerous studies applying various metabolomics techniques. It has been proven that approaches which integrate different analytical techniques, e.g. LC-MS, GC-MS, CE-MS and H-NMR, can provide a comprehensive picture of a certain metabolic homeostasis. Identification of metabolic compounds and quantification of metabolite levels represent the groundwork for the analysis of regulatory strategies in cellular metabolism. This significantly promotes our current understanding of the molecular organization and regulation of cells, tissues and whole organisms. Nevertheless, it is demanding to elicit the pertinent information which is contained in metabolomics data sets. Based on the central dogma of molecular biology, metabolite levels and their fluctuations are the result of a directed flux of information from gene activation over transcription to translation and posttranslational modification. Hence, metabolomics data represent the summed output of a metabolic system comprising various levels of molecular organization. As a consequence, the inverse assignment of metabolomics data to underlying regulatory processes should yield information which, if deciphered correctly, provides comprehensive insight into a metabolic system. Yet, the deduction of regulatory principles is complex not only due to the high number of metabolic compounds, but also because of a high level of cellular compartmentalization and differentiation. Motivated by the question how metabolomics approaches can provide a representative view on regulatory biochemical processes, this article intends to present and discuss current metabolomics applications, strategies of data analysis and their limitations with respect to the interpretability in context of
Tulipani, Sara; Mora-Cubillos, Ximena; Jáuregui, Olga; Llorach, Rafael; García-Fuentes, Eduardo; Tinahones, Francisco J; Andres-Lacueva, Cristina
Although LC-MS untargeted metabolomics continues to expand into exciting research domains, methodological issues have not been solved yet by the definition of unbiased, standardized and globally accepted analytical protocols. In the present study, the response of the plasma metabolome coverage to specific methodological choices of the sample preparation (two SPE technologies, three sample-to-solvent dilution ratios) and the LC-ESI-MS data acquisition steps of the metabolomics workflow (four RP columns, four elution solvent combinations, two solvent quality grades, postcolumn modification of the mobile phase) was investigated in a pragmatic and decision tree-like performance evaluation strategy. Quality control samples, reference plasma and human plasma from a real nutrimetabolomic study were used for intermethod comparisons. Uni- and multivariate data analysis approaches were independently applied. The highest method performance was obtained by combining the plasma hybrid extraction with the highest solvent proportion during sample preparation, the use of a RP column compatible with 100% aqueous polar phase (Atlantis T3), and the ESI enhancement by using UHPLC-MS purity grade methanol as both organic phase and postcolumn modifier. Results led to the following considerations: submit plasma samples to hybrid extraction for removal of interfering components to minimize the major sample-dependent matrix effects; avoid solvent evaporation following sample extraction if loss in detection and peak shape distortion of early eluting metabolites are not noticed; opt for a RP column for superior retention of highly polar species when analysis fractionation is not feasible; use ultrahigh quality grade solvents and "vintage" analytical tricks such as postcolumn organic enrichment of the mobile phase to enhance ESI efficiency. The final proposed protocol offers an example of how novel and old-fashioned analytical solutions may fruitfully cohabit in untargeted metabolomics
Xia, Jianguo; Sinelnikov, Igor V; Han, Beomsoo; Wishart, David S
MetaboAnalyst (www.metaboanalyst.ca) is a web server designed to permit comprehensive metabolomic data analysis, visualization and interpretation. It supports a wide range of complex statistical calculations and high quality graphical rendering functions that require significant computational resources. First introduced in 2009, MetaboAnalyst has experienced more than a 50X growth in user traffic (>50 000 jobs processed each month). In order to keep up with the rapidly increasing computational demands and a growing number of requests to support translational and systems biology applications, we performed a substantial rewrite and major feature upgrade of the server. The result is MetaboAnalyst 3.0. By completely re-implementing the MetaboAnalyst suite using the latest web framework technologies, we have been able to substantially improve its performance, capacity and user interactivity. Three new modules have also been added including: (i) a module for biomarker analysis based on the calculation of receiver operating characteristic curves; (ii) a module for sample size estimation and power analysis for improved planning of metabolomics studies and (iii) a module to support integrative pathway analysis for both genes and metabolites. In addition, popular features found in existing modules have been significantly enhanced by upgrading the graphical output, expanding the compound libraries and by adding support for more diverse organisms. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
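The biomarker module mentioned above rests on receiver operating characteristic (ROC) curves. As a rough illustration of that calculation, the following Python sketch builds ROC points from a candidate biomarker's scores and computes the area under the curve with the trapezoidal rule. It is a generic textbook construction with hypothetical data, not MetaboAnalyst's own implementation, whose internals the abstract does not describe.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    The decision threshold is swept from the highest score downwards;
    samples with tied scores are added to the counts together.
    `labels` holds 1 for cases and 0 for controls.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one case and one control")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical concentrations of one candidate biomarker.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]  # 1 = case, 0 = control
curve = roc_points(scores, labels)
area = auc(curve)
```

With these toy values, 8 of the 9 case/control pairs are ranked correctly, so the area works out to 8/9. An area of 0.5 corresponds to a marker no better than chance, and 1.0 to perfect separation.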
Pannkuk, Evan L; Fornace, Albert J; Laiakis, Evagelia C
Exposure of the general population to ionizing radiation has increased in the past decades, primarily due to long distance travel and medical procedures. On the other hand, accidental exposures, nuclear accidents, and elevated threats of terrorism with the potential detonation of a radiological dispersal device or improvised nuclear device in a major city, all have led to increased needs for rapid biodosimetry and assessment of exposure to different radiation qualities and scenarios. Metabolomics, the qualitative and quantitative assessment of small molecules in a given biological specimen, has emerged as a promising technology to allow for rapid determination of an individual's exposure level and metabolic phenotype. Advancements in mass spectrometry techniques have led to untargeted (discovery phase, global assessment) and targeted (quantitative phase) methods not only to identify biomarkers of radiation exposure, but also to assess general perturbations of metabolism with potential long-term consequences, such as cancer, cardiovascular, and pulmonary disease. Metabolomics of radiation exposure has provided a highly informative snapshot of metabolic dysregulation. Biomarkers in easily accessible biofluids and biospecimens (urine, blood, saliva, sebum, fecal material) from mouse, rat, and minipig models, to non-human primates and humans have provided the basis for determination of a radiation signature to assess the need for medical intervention. Here we provide a comprehensive description of the current status of radiation metabolomic studies for the purpose of rapid high-throughput radiation biodosimetry in easily accessible biofluids and discuss future directions of radiation metabolomics research.
Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria
We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations.
López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F
Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Pfaff, Claas-Thido; Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian
Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.
Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian
Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines. PMID:29023519
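The combination of full text search and faceted filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' framework: the `Dataset` class, the facet names (`biome`, `year`) and the catalog entries are hypothetical, chosen only to show how categorical and numeric facets narrow a full text result set.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A search object with free text plus fine-grained facet metadata (hypothetical)."""
    title: str
    description: str
    facets: dict = field(default_factory=dict)


def full_text_search(datasets, query):
    """Naive case-insensitive substring search over title and description."""
    q = query.lower()
    return [d for d in datasets
            if q in d.title.lower() or q in d.description.lower()]


def apply_facets(datasets, categorical=None, numeric=None):
    """Keep datasets matching every categorical value and every numeric range."""
    categorical = categorical or {}
    numeric = numeric or {}
    results = []
    for d in datasets:
        ok_cat = all(d.facets.get(k) == v for k, v in categorical.items())
        ok_num = all(k in d.facets and lo <= d.facets[k] <= hi
                     for k, (lo, hi) in numeric.items())
        if ok_cat and ok_num:
            results.append(d)
    return results


# Invented example catalog, for illustration only.
CATALOG = [
    Dataset("Leaf traits of temperate trees", "Specific leaf area measurements",
            {"biome": "temperate forest", "year": 2012}),
    Dataset("Leaf litter decomposition", "Litter bags in tropical plots",
            {"biome": "tropical forest", "year": 2009}),
    Dataset("Soil respiration", "Chamber measurements of CO2 flux",
            {"biome": "temperate forest", "year": 2015}),
]

hits = full_text_search(CATALOG, "leaf")
refined = apply_facets(hits,
                       categorical={"biome": "temperate forest"},
                       numeric={"year": (2010, 2020)})
print([d.title for d in refined])
```

The full text query alone returns two datasets; selecting a biome and a year range then leaves only the one that is relevant in that context.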
Chatzimitakos, Theodoros G; Stalikas, Constantine D
Metal nanoparticles (NPs) have proven to be more toxic than bulk analogues of the same chemical composition due to their unique physical properties. The NPs, lately, have drawn the attention of researchers because of their antibacterial and biocidal properties. In an effort to shed light on the mechanism through which the bacteria elimination is achieved and the metabolic changes they undergo, an untargeted metabolomic fingerprint study was carried out on Gram-positive (Staphylococcus aureus) and Gram-negative (Escherichia coli) bacteria species. The (1)H NMR spectroscopy, in conjunction with high resolution mass-spectrometry (HRMS) and an unsophisticated data processing workflow were implemented. The combined NMR/HRMS data, supported by an open-access metabolomic database, proved to be efficacious in the process of assigning a putative annotation to a wide range of metabolite signals and is a useful tool to appraise the metabolome alterations, as a consequence of bacterial response to NPs. Interestingly, not all the NPs diminished the intracellular metabolites; bacteria treated with iron NPs produced metabolites not present in the nonexposed bacteria sample, implying the activation of previously inactive metabolic pathways. In contrast, copper and iron-copper NPs reduced the annotated metabolites, alluding to the conclusion that the metabolic pathways (mainly alanine, aspartate, and glutamate metabolism, beta-alanine metabolism, glutathione metabolism, and arginine and proline metabolism) were hindered by the interactions of NPs with the intracellular metabolites.
Tracking occluded objects at different depths has become an extremely important component of the study of video sequences, with wide applications in object tracking, scene recognition, coding, video editing and mosaicking. The paper studies the ability of annotation to track an occluded object based on pyramids with variation in depth, and further establishes a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied to 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60 cm, 80 cm and 100 cm, respectively. Another experiment is performed on tracking humans at similar depths to authenticate the results. The paper also computes the frame-by-frame error incurred by the system, supported by detailed simulations. This system can be used effectively to analyze the error in motion tracking and to correct that error, leading to flawless tracking. This can be of great interest to computer scientists designing surveillance systems, etc.
Müller, Constanze; Dietz, Inga; Tziotis, Dimitrios; Moritz, Franco; Rupp, Jan; Schmitt-Kopplin, Philippe
Infections with Chlamydia pneumoniae cause several respiratory diseases, such as community-acquired pneumonia, bronchitis or sinusitis. Here, we present an integrated non-targeted metabolomics analysis applying ultra-high-resolution mass spectrometry and ultra-performance liquid chromatography mass spectrometry to determine metabolite alterations in C. pneumoniae-infected HEp-2 cells. Most important permutations are elaborated using uni- and multivariate statistical analysis, logD retention time regression and mass defect-based network analysis. Classes of metabolites showing high variations upon infection are lipids, carbohydrates and amino acids. Moreover, we observed several non-annotated compounds as predominantly abundant after infection, which are promising biomarker candidates for drug-target and diagnostic research.
Barkal, Layla J.; Theberge, Ashleigh B.; Guo, Chun-Jun; Spraker, Joe; Rappert, Lucas; Berthier, Jean; Brakke, Kenneth A.; Wang, Clay C. C.; Beebe, David J.; Keller, Nancy P.; Berthier, Erwin
The microbial secondary metabolome encompasses great synthetic diversity, empowering microbes to tune their chemical responses to changing microenvironments. Traditional metabolomics methods are ill-equipped to probe a wide variety of environments or environmental dynamics. Here we introduce a class of microscale culture platforms to analyse chemical diversity of fungal and bacterial secondary metabolomes. By leveraging stable biphasic interfaces to integrate microculture with small molecule isolation via liquid–liquid extraction, we enable metabolomics-scale analysis using mass spectrometry. This platform facilitates exploration of culture microenvironments (including rare media typically inaccessible using established methods), unusual organic solvents for metabolite isolation and microbial mutants. Utilizing Aspergillus, a fungal genus known for its rich secondary metabolism, we characterize the effects of culture geometry and growth matrix on secondary metabolism, highlighting the potential use of microscale systems to unlock unknown or cryptic secondary metabolites for natural products discovery. Finally, we demonstrate the potential for this class of microfluidic systems to study interkingdom communication between fungi and bacteria. PMID:26842393
Pinto, Rui Climaco
Chemometrics has been a fundamental discipline for the development of metabolomics, while symbiotically growing with it. From design of experiments, through data processing, to data analysis, chemometrics tools are used to design, process, visualize, explore and analyse metabolomics data. In this chapter, the most commonly used chemometrics methods for data analysis and interpretation of metabolomics experiments will be presented, with focus on multivariate analysis. These are projection-based linear methods, like principal component analysis (PCA) and orthogonal projection to latent structures (OPLS), which facilitate interpretation of the causes behind the observed sample trends, correlation with outcomes or group discrimination analysis. Validation procedures for multivariate methods will be presented and discussed. Univariate analysis is briefly discussed in the context of correlation-based linear regression methods to find associations to outcomes or in analysis of variance-based and logistic regression methods for class discrimination. These methods rely on frequentist statistics, with the determination of p-values and corresponding multiple correction procedures. Several strategies of design-analysis of metabolomics experiments will be discussed, in order to guide the reader through different setups, adopted to better address some experimental issues and to better test the scientific hypotheses.
Vis, D.J.; Westerhuis, J.A.; Jacobs, D.M.; van Duynhoven, J.P.M.; Wopereis, S.; van Ommen, B.; Hendriks, M.M.W.B.; Smilde, A.K.
Challenge tests are used to assess the resilience of human beings to perturbations by analyzing responses to detect functional abnormalities. Well known examples are allergy tests and glucose tolerance tests. Increasingly, metabolomics analysis of blood or serum samples is used to analyze the
Hendriks, M.M.W.B.; Eeuwijk, van F.A.; Jellema, R.H.; Westerhuis, J.A.; Reijmers, T.H.; Hoefsloot, H.C.J.; Smilde, A.K.
Metabolomics studies aim at a better understanding of biochemical processes by studying relations between metabolites and between metabolites and other types of information (e.g., sensory and phenotypic features). The objectives of these studies are diverse, but the types of data generated and the
Finnegan, Tarryn; Steenkamp, Paul A; Piater, Lizelle A; Dubery, Ian A
Lipopolysaccharides (LPSs), as MAMP molecules, trigger the activation of signal transduction pathways involved in defence. Currently, plant metabolomics is providing new dimensions into understanding the intracellular adaptive responses to external stimuli. The effect of LPS on the metabolomes of Arabidopsis thaliana cells and leaf tissue was investigated over a 24 h period. Cellular metabolites and those secreted into the medium were extracted with methanol and liquid chromatography coupled to mass spectrometry was used for quantitative and qualitative analyses. Multivariate statistical data analyses were used to extract interpretable information from the generated multidimensional LC-MS data. The results show that LPS perception triggered differential changes in the metabolomes of cells and leaves, leading to variation in the biosynthesis of specialised secondary metabolites. Time-dependent changes in metabolite profiles were observed and biomarkers associated with the LPS-induced response were tentatively identified. These include the phytohormones salicylic acid and jasmonic acid, and also the associated methyl esters and sugar conjugates. The induced defensive state resulted in increases in indole-and other glucosinolates, indole derivatives, camalexin as well as cinnamic acid derivatives and other phenylpropanoids. These annotated metabolites indicate dynamic reprogramming of metabolic pathways that are functionally related towards creating an enhanced defensive capacity. The results reveal new insights into the mode of action of LPS as an activator of plant innate immunity, broadens knowledge about the defence metabolite pathways involved in Arabidopsis responses to LPS, and identifies specialised metabolites of functional importance that can be employed to enhance immunity against pathogen infection.
Alonso, Cristina; Fernández-Ramos, David; Varela-Rey, Marta; Martínez-Arranz, Ibon; Navasa, Nicolás; Van Liempd, Sebastiaan M; Lavín Trueba, José L; Mayo, Rebeca; Ilisso, Concetta P; de Juan, Virginia G; Iruarrizaga-Lejarreta, Marta; delaCruz-Villar, Laura; Mincholé, Itziar; Robinson, Aaron; Crespo, Javier; Martín-Duce, Antonio; Romero-Gómez, Manuel; Sann, Holger; Platon, Julian; Van Eyk, Jennifer; Aspichueta, Patricia; Noureddin, Mazen; Falcón-Pérez, Juan M; Anguita, Juan; Aransay, Ana M; Martínez-Chantar, María Luz; Lu, Shelly C; Mato, José M
Nonalcoholic fatty liver disease (NAFLD) is a consequence of defects in diverse metabolic pathways that involve hepatic accumulation of triglycerides. Features of these aberrations might determine whether NAFLD progresses to nonalcoholic steatohepatitis (NASH). We investigated whether the diverse defects observed in patients with NAFLD are caused by different NAFLD subtypes with specific serum metabolomic profiles, and whether these can distinguish patients with NASH from patients with simple steatosis. We collected liver and serum from methionine adenosyltransferase 1a knockout (MAT1A-KO) mice, which have chronically low levels of hepatic S-adenosylmethionine (SAMe) and spontaneously develop steatohepatitis, as well as C57Bl/6 mice (controls); the metabolomes of all samples were determined. We also analyzed serum metabolomes of 535 patients with biopsy-proven NAFLD (353 with simple steatosis and 182 with NASH) and compared them with serum metabolomes of mice. MAT1A-KO mice were also given SAMe (30 mg/kg/day for 8 weeks); liver samples were collected and analyzed histologically for steatohepatitis. Livers of MAT1A-KO mice were characterized by high levels of triglycerides, diglycerides, fatty acids, ceramides, and oxidized fatty acids, as well as low levels of SAMe and downstream metabolites. There was a correlation between liver and serum metabolomes. We identified a serum metabolomic signature associated with MAT1A-KO mice that also was present in 49% of the patients; based on this signature, we identified 2 NAFLD subtypes. We identified specific panels of markers that could distinguish patients with NASH from patients with simple steatosis for each subtype of NAFLD. Administration of SAMe reduced features of steatohepatitis in MAT1A-KO mice. In an analysis of serum metabolomes of patients with NAFLD and MAT1A-KO mice with steatohepatitis, we identified 2 major subtypes of NAFLD and markers that differentiate steatosis from NASH in each subtype. These might be
Msizi Innocent Mhlongo
Metabolomics has developed into a valuable tool for advancing our understanding of plant metabolism. Plant innate immune defenses can be activated and enhanced so that, subsequent to being pre-sensitized, plants are able to launch a stronger and faster defense response upon exposure to pathogenic microorganisms, a phenomenon known as priming. Here, three contrasting chemical activators, namely acibenzolar-S-methyl, azelaic acid and riboflavin, were used to induce a primed state in Nicotiana tabacum cells. Identified biomarkers were then compared to responses induced by three phytohormones: abscisic acid, methyljasmonate and salicylic acid. Altered metabolomes were studied using a metabolite fingerprinting approach based on liquid chromatography and mass spectrometry. Multivariate data models indicated that these inducers cause time-dependent metabolic perturbations in the cultured cells and revealed biomarkers whose levels are affected by these agents. A total of 34 metabolites were annotated from the mass spectral data and online databases. Venn diagrams were used to identify common biomarkers as well as those unique to a specific agent. Results implicate 20 cinnamic acid derivatives conjugated to (i) quinic acid (chlorogenic acids), (ii) tyramine, (iii) polyamines or (iv) glucose as discriminatory biomarkers of priming in tobacco cells. Functional roles for most of these metabolites in plant defense responses could thus be proposed. Metabolites induced by the activators belong to the early phenylpropanoid pathway, which indicates that different stimuli can activate similar pathways but with different metabolite fingerprints. Possible linkages to phytohormone-dependent pathways at a metabolomic level were indicated in the case of cells treated with salicylic acid and methyljasmonate. The results contribute to a better understanding of the priming phenomenon and advance our knowledge of cinnamic acid derivatives as versatile defense
Wohlgemuth, Gert; Haldiya, Pradeep Kumar; Willighagen, Egon; Kind, Tobias; Fiehn, Oliver
Summary: Metabolomic publications and databases use different database identifiers or even trivial names which disable queries across databases or between studies. The best way to annotate metabolites is by chemical structures, encoded by the International Chemical Identifier code (InChI) or InChIKey. We have implemented a web-based Chemical Translation Service that performs batch conversions of the most common compound identifiers, including CAS, CHEBI, compound formulas, Human Metabolome Database HMDB, InChI, InChIKey, IUPAC name, KEGG, LipidMaps, PubChem CID+SID, SMILES and chemical synonym names. Batch conversion downloads of 1410 CIDs are performed in 2.5 min. Structures are automatically displayed. Implementation: The software was implemented in Groovy and JAVA, the web frontend was implemented in GRAILS and the database used was PostgreSQL. Availability: The source code and an online web interface are freely available. Chemical Translation Service (CTS): http://cts.fiehnlab.ucdavis.edu Contact: email@example.com PMID:20829444
Even with the widespread use of liquid chromatography mass spectrometry (LC/MS)-based metabolomics, there are still a number of challenges facing this promising technique. Many diverse experimental workflows exist, yet there is a lack of infrastructure and systems for tracking and sharing information. Here, we describe the Metabolite Atlas framework and interface, which provides highly efficient, web-based access to raw mass spectrometry data in concert with assertions about chemicals detected, to help address some of these challenges. This integration, by design, enables experimentalists to explore their raw data and to specify and refine feature annotations such that they can be leveraged for future experiments. Fast queries of the data through the web using SciDB, a parallelized database for high performance computing, make this process operate quickly. By using scripting containers, such as IPython or Jupyter, to analyze the data, scientists can utilize a wide variety of freely available graphing, statistics, and information management resources. In addition, the interfaces facilitate integration with systems biology tools to ultimately link metabolomics data with biological models.
Koek, M.M.; Jellema, R.H.; Greef, J. van der; Tas, A.C.; Hankemeier, T.
Metabolomics involves the unbiased quantitative and qualitative analysis of the complete set of metabolites present in cells, body fluids and tissues (the metabolome). By analyzing differences between metabolomes using biostatistics (multivariate data analysis; pattern recognition), metabolites
Schmid, Ralf; Blaxter, Mark L
The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.
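The parse-and-retrieve step described in this abstract (BLAST results are parsed, then annotations are looked up in a reference database) might look roughly like the Python sketch below. It is not annot8r's own code: the accessions, GO labels and the E-value cutoff are placeholders, and the reference database is reduced to an in-memory dictionary. The only external convention assumed is the standard 12-column BLAST tabular output (`-outfmt 6`), in which column 1 is the query, column 2 the subject and column 11 the E-value.

```python
import csv
import io

# Placeholder reference database: subject accession -> annotation terms.
# Real data would come from UniProt entries and their GO/EC/KEGG annotation.
REFERENCE_GO = {
    "UNIPROT_A": ["GO:term_a", "GO:term_b"],
    "UNIPROT_B": ["GO:term_c"],
}

# Invented BLAST tabular output (-outfmt 6); columns are
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
BLAST_TSV = (
    "EST001\tUNIPROT_A\t78.5\t200\t43\t0\t1\t200\t5\t204\t1e-50\t180\n"
    "EST001\tUNIPROT_B\t35.0\t120\t78\t2\t10\t129\t3\t122\t0.5\t30\n"
    "EST002\tUNIPROT_B\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-70\t250\n"
)


def annotate(blast_text, reference, max_evalue=1e-10):
    """Map each query to the annotations of its significant BLAST hits."""
    annotations = {}
    for row in csv.reader(io.StringIO(blast_text), delimiter="\t"):
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue  # skip hits weaker than the (assumed) cutoff
        annotations.setdefault(query, set()).update(reference.get(subject, []))
    return {q: sorted(terms) for q, terms in annotations.items()}


result = annotate(BLAST_TSV, REFERENCE_GO)
print(result)
```

In this toy input the weak second hit for `EST001` is discarded, so each EST inherits annotations only from its significant match.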
Rabinowitz, Joshua D [Princeton Univ., NJ (United States)]; Aristilde, Ludmilla [Cornell Univ., Ithaca, NY (United States)]; Amador-Noguez, Daniel [Univ. of Wisconsin, Madison, WI (United States)]
Members of the genus Clostridium collectively have the ideal set of the metabolic capabilities for fermentative biofuel production: cellulose degradation, hydrogen production, and solvent excretion. No single organism, however, can effectively convert cellulose into biofuels. Here we developed, using metabolomics and isotope tracers, basic science knowledge of Clostridial metabolism of utility for future efforts to engineer such an organism. In glucose fermentation carried out by the biofuel producer Clostridium acetobutylicum, we observed a remarkably ordered series of metabolite concentration changes as the fermentation progressed from acidogenesis to solventogenesis. In general, high-energy compounds decreased while low-energy species increased during solventogenesis. These changes in metabolite concentrations were accompanied by large changes in intracellular metabolic fluxes, with pyruvate directed towards acetyl-CoA and solvents instead of oxaloacetate and amino acids. Thus, the solventogenic transition involves global remodeling of metabolism to redirect resources from biomass production into solvent production. In contrast to C. acetobutylicum, which is an avid fermenter, C. cellulolyticum metabolizes glucose only slowly. We find that glycolytic intermediate concentrations are radically different from fast fermenting organisms. Associated thermodynamic and isotope tracer analysis revealed that the full glycolytic pathway in C. cellulolyticum is reversible. This arises from changes in cofactor utilization for phosphofructokinase and an alternative pathway from phosphoenolpyruvate to pyruvate. The net effect is to increase the high-energy phosphate bond yield of glycolysis by 150% (from 2 to 5) at the expense of lower net flux. Thus, C. cellulolyticum prioritizes glycolytic energy efficiency over speed. Degradation of cellulose results in other sugars in addition to glucose. Simultaneous feeding of stable isotope-labeled glucose and unlabeled pentose sugars
Kleinstreuer, N.C.; Smith, A.M.; West, P.R.; Conard, K.R.; Fontaine, B.R.; Weir-Hauptman, A.M.; Palmer, J.A.; Knudsen, T.B.; Dix, D.J.; Donley, E.L.R.; Cezar, G.G.
Metabolomics analysis was performed on the supernatant of human embryonic stem (hES) cell cultures exposed to a blinded subset of 11 chemicals selected from the chemical library of EPA's ToxCast™ chemical screening and prioritization research project. Metabolites from hES cultures were evaluated for known and novel signatures that may be indicative of developmental toxicity. Significant fold changes in endogenous metabolites were detected for 83 putatively annotated mass features in response to the subset of ToxCast chemicals. The annotations were mapped to specific human metabolic pathways. This revealed strong effects on pathways for nicotinate and nicotinamide metabolism, pantothenate and CoA biosynthesis, glutathione metabolism, and arginine and proline metabolism pathways. Predictivity for adverse outcomes in mammalian prenatal developmental toxicity studies used ToxRefDB and other sources of information, including Stemina Biomarker Discovery's predictive DevTox® model trained on 23 pharmaceutical agents of known developmental toxicity and differing potency. The model initially predicted developmental toxicity from the blinded ToxCast compounds in concordance with animal data with 73% accuracy. Retraining the model with data from the unblinded test compounds at one concentration level increased the predictive accuracy for the remaining concentrations to 83%. These preliminary results on an 11-chemical subset of the ToxCast chemical library indicate that metabolomics analysis of the hES secretome provides information valuable for predictive modeling and mechanistic understanding of mammalian developmental toxicity. -- Highlights: ► We tested 11 environmental compounds in a hESC metabolomics platform. ► Significant changes in secreted small molecule metabolites were observed. ► Perturbed mass features map to pathways critical for normal development and pregnancy. ► Arginine, proline, nicotinate, nicotinamide and glutathione pathways were affected.
Kleinstreuer, N.C., E-mail: email@example.com [NCCT, US EPA, RTP, NC 27711 (United States); Smith, A.M.; West, P.R.; Conard, K.R.; Fontaine, B.R. [Stemina Biomarker Discovery, Inc., Madison, WI 53719 (United States); Weir-Hauptman, A.M. [Covance, Inc., Madison, WI 53704 (United States); Palmer, J.A. [Stemina Biomarker Discovery, Inc., Madison, WI 53719 (United States); Knudsen, T.B.; Dix, D.J. [NCCT, US EPA, RTP, NC 27711 (United States); Donley, E.L.R. [Stemina Biomarker Discovery, Inc., Madison, WI 53719 (United States); Cezar, G.G. [Stemina Biomarker Discovery, Inc., Madison, WI 53719 (United States); University of Wisconsin-Madison, Madison, WI 53706 (United States)
Metabolomics analysis was performed on the supernatant of human embryonic stem (hES) cell cultures exposed to a blinded subset of 11 chemicals selected from the chemical library of EPA's ToxCast™ chemical screening and prioritization research project. Metabolites from hES cultures were evaluated for known and novel signatures that may be indicative of developmental toxicity. Significant fold changes in endogenous metabolites were detected for 83 putatively annotated mass features in response to the subset of ToxCast chemicals. The annotations were mapped to specific human metabolic pathways. This revealed strong effects on pathways for nicotinate and nicotinamide metabolism, pantothenate and CoA biosynthesis, glutathione metabolism, and arginine and proline metabolism pathways. Predictivity for adverse outcomes in mammalian prenatal developmental toxicity studies used ToxRefDB and other sources of information, including Stemina Biomarker Discovery's predictive DevTox® model trained on 23 pharmaceutical agents of known developmental toxicity and differing potency. The model initially predicted developmental toxicity from the blinded ToxCast compounds in concordance with animal data with 73% accuracy. Retraining the model with data from the unblinded test compounds at one concentration level increased the predictive accuracy for the remaining concentrations to 83%. These preliminary results on an 11-chemical subset of the ToxCast chemical library indicate that metabolomics analysis of the hES secretome provides information valuable for predictive modeling and mechanistic understanding of mammalian developmental toxicity. -- Highlights: ► We tested 11 environmental compounds in a hESC metabolomics platform. ► Significant changes in secreted small molecule metabolites were observed. ► Perturbed mass features map to pathways critical for normal
Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J
Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.
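The provisional-term workflow described in this abstract can be pictured with a short Python sketch. This is not Phenex or ORB code: the function name, the annotation records and the identifier prefixes (`TEMP:`, `ONT:`) are all hypothetical, and the broker is reduced to a dictionary of provisional identifiers that have since received permanent ones.

```python
def resolve_provisional_terms(annotations, resolved):
    """Return new annotation records with provisional IDs swapped for
    permanent ones where a mapping exists; other terms are left untouched."""
    return [
        {**a, "term": resolved.get(a["term"], a["term"])}
        for a in annotations
    ]


# Invented annotations: two use provisional terms, one a permanent term.
annotations = [
    {"character": "dorsal fin shape", "term": "TEMP:0001"},
    {"character": "pelvic fin presence", "term": "ONT:0000123"},
    {"character": "scale texture", "term": "TEMP:0002"},
]

# Hypothetical mapping: only TEMP:0001 has been added to the ontology so far.
resolved = {"TEMP:0001": "ONT:0000456"}

updated = resolve_provisional_terms(annotations, resolved)
for record in updated:
    print(record["character"], "->", record["term"])
```

Terms still awaiting an ontology entry keep their provisional identifier, so curation can continue without waiting on ontology development.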
This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…
This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...
Leopold, H.; Meilicke, C.; Fellmann, M.; Pittke, F.; Stuckenschmidt, H.; Mendling, J.
Many techniques for the advanced analysis of process models build on the annotation of process models with elements from predefined vocabularies such as taxonomies. However, the manual annotation of process models is cumbersome and sometimes even hardly manageable taking the size of taxonomies into
Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana
Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.
An increase in the number and size of GO groups without any noticeable decrease of the link density within the groups indicated that this expansion significantly broadens the public GO annotation without diluting its quality. We revealed that functional GO annotation correlates mostly with clustering in a physical interaction protein network, while its overlap with indirect regulatory network communities is two to three times smaller. Conclusion: Protein functional annotations extracted by the NLP technology expand and enrich the existing GO annotation system. The GO functional modularity correlates mostly with the clustering in the physical interaction network, suggesting an essential role for the structural organization maintained by these interactions. Reciprocally, clustering of proteins in physical interaction networks can serve as evidence for their functional similarity.
Mardanbeigi, Diako; Qvarfordt, Pernilla
To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation, the user simply captures an image using the HMD’s camera, looks at an object of interest in the image, and speaks out the information to be associated with the object. The gaze location is recorded and visualized with a marker. The voice is transcribed using speech recognition. Gaze annotations can be shared. Our study showed that users found that gaze annotations add precision and expressiveness compared to annotations of the image as a whole...
Ting, R.N.; Subramanyam, K.
Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping such as thermal diffusion and epitaxy, in view of its advantages such as high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented have also been listed separately
van der Pluijm, B.
What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement standard lecture format provide new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking.
We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements. The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine onset of head movements and to suggest which kind of head movement is taking place....
Rácz, Anita; Andrić, Filip; Bajusz, Dávid; Héberger, Károly
Contemporary metabolomic fingerprinting is based on multiple spectrometric and chromatographic signals, used either alone or combined with structural and chemical information of metabolic markers at the qualitative and semiquantitative level. However, signal shifting, convolution, and matrix effects may compromise metabolomic patterns. Recent increase in the use of qualitative metabolomic data, described by the presence (1) or absence (0) of particular metabolites, demonstrates great potential in the field of metabolomic profiling and fingerprint analysis. The aim of this study is a comprehensive evaluation of binary similarity measures for the elucidation of patterns among samples of different botanical origin and various metabolomic profiles. Nine qualitative metabolomic data sets covering a wide range of natural products and metabolomic profiles were applied to assess 44 binary similarity measures for the fingerprinting of plant extracts and natural products. The measures were analyzed by the novel sum of ranking differences method (SRD), searching for the most promising candidates. Baroni-Urbani-Buser (BUB) and Hawkins-Dotson (HD) similarity coefficients were selected as the best measures by SRD and analysis of variance (ANOVA), while Dice (Di1), Yule, Russel-Rao, and Consonni-Todeschini 3 ranked the worst. ANOVA revealed that concordantly and intermediately symmetric similarity coefficients are better candidates for metabolomic fingerprinting than the asymmetric and correlation based ones. The fingerprint analysis based on the BUB and HD coefficients and qualitative metabolomic data performed as well as the quantitative metabolomic profile analysis. Fingerprint analysis based on the qualitative metabolomic profiles and binary similarity measures proved to be a reliable way of finding the same or similar patterns in metabolomic data as those extracted from quantitative data.
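As an illustration of the two coefficients named above, the following is a minimal Python sketch, not the authors' code. It assumes the commonly cited definitions based on the 2x2 contingency counts a (present in both samples), b and c (present in only one) and d (absent in both): BUB = (sqrt(a*d) + a) / (sqrt(a*d) + a + b + c) and HD = 0.5 * (a / (a + b + c) + d / (b + c + d)). The function names and the convention of returning 0.0 for an empty denominator are choices made here for the example.

```python
from math import sqrt


def contingency(x, y):
    """Return counts (a, b, c, d) for two equal-length presence/absence vectors."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # present in both
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # only in x
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # only in y
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # absent in both
    return a, b, c, d


def _ratio(numerator, denominator):
    # Convention assumed for this sketch: an undefined ratio counts as 0.0.
    return numerator / denominator if denominator else 0.0


def baroni_urbani_buser(x, y):
    """Baroni-Urbani-Buser similarity (assumed definition, see lead-in)."""
    a, b, c, d = contingency(x, y)
    root = sqrt(a * d)
    return _ratio(root + a, root + a + b + c)


def hawkins_dotson(x, y):
    """Hawkins-Dotson similarity (assumed definition, see lead-in)."""
    a, b, c, d = contingency(x, y)
    return 0.5 * (_ratio(a, a + b + c) + _ratio(d, b + c + d))


# Hypothetical fingerprints of two extracts over six metabolites.
sample_1 = [1, 1, 0, 0, 1, 0]
sample_2 = [1, 0, 0, 0, 1, 1]
bub = baroni_urbani_buser(sample_1, sample_2)
hd = hawkins_dotson(sample_1, sample_2)
```

Both coefficients use the joint absences d, which is what distinguishes them from asymmetric measures such as Jaccard that ignore metabolites missing from both samples.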
Metabolomics is a powerful technology with broad applications in life science that, like other -omics approaches, requires high-quality samples to achieve reliable results and ensure reproducibility. Therefore, along with quality assurance, methods to assess sample quality regarding pre-analytical confounders are urgently needed. In this study, we analyzed the response of the human serum metabolome to pre-analytical variations comprising prolonged blood incubation and extended serum storage at room temperature by using gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based metabolomics. We found that the prolonged incubation of blood results in a statistically significant 20% increase and 4% decrease of 225 tested serum metabolites. Extended serum storage affected 21% of the analyzed metabolites (14% increased, 7% decreased). Amino acids and nucleobases showed the highest percentage of changed metabolites in both confounding conditions, whereas lipids were remarkably stable. Interestingly, the amounts of taurine and O-phosphoethanolamine, which have both been discussed as biomarkers for various diseases, were 1.8- and 2.9-fold increased after 6 h of blood incubation. Since we found that both are more stable in ethylenediaminetetraacetic acid (EDTA) blood, EDTA plasma should be the preferred metabolomics matrix.
Tebani, Abdellah; Afonso, Carlos; Bekri, Soumeya
Metabolites are small molecules produced by enzymatic reactions in a given organism. Metabolomics or metabolic phenotyping is a well-established omics aimed at comprehensively assessing metabolites in biological systems. These comprehensive analyses use analytical platforms, mainly nuclear magnetic resonance spectroscopy and mass spectrometry, along with associated separation methods to gather qualitative and quantitative data. Metabolomics holistically evaluates biological systems in an unbiased, data-driven approach that may ultimately support generation of hypotheses. The approach inherently allows the molecular characterization of a biological sample with regard to both internal (genetics) and environmental (exposome, microbiome) influences. Metabolomics workflows are based on whether the investigator knows a priori what kind of metabolites to assess. Thus, a targeted metabolomics approach is defined as a quantitative analysis (absolute concentrations are determined) or a semiquantitative analysis (relative intensities are determined) of a set of metabolites that are possibly linked to common chemical classes or a selected metabolic pathway. An untargeted metabolomics approach is a semiquantitative analysis of the largest possible number of metabolites contained in a biological sample. This is part I of a review intending to give an overview of the state of the art of major metabolic phenotyping technologies. Furthermore, their inherent analytical advantages and limits regarding experimental design, sample handling, standardization and workflow challenges are discussed.
Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina
Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most
Michaelis, J.; Zednik, S.; West, P.; Fox, P. A.; McGuinness, D. L.
eScience based systems generate provenance of their data products, related to such things as: data processing, data collection conditions, expert evaluation, and data product quality. Recent advances in web-based technology offer users the possibility of making annotations to both data products and steps in accompanying provenance traces, thereby expanding the utility of such provenance for others. These contributing users may have varying backgrounds, ranging from system experts to outside domain experts to citizen scientists. Furthermore, such users may wish to make varying types of annotations - ranging from documenting the purpose of a provenance step to raising concerns about the quality of data dependencies. Semantic Web technologies allow for such kinds of rich annotations to be made to provenance through the use of ontology vocabularies for (i) organizing provenance, and (ii) organizing user/annotation classifications. Furthermore, through Linked Data practices, Semantic linkages may be made from provenance steps to external data of interest. A desire for Semantically-annotated provenance has been motivated by data management issues in the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photometer-based readings are taken of solar activity and subsequently processed into final data products consumable by end users. At intermediate stages of ACOS processing, factors such as evaluations by human experts and weather conditions are logged, which could impact data product quality. If such factors are linked via user-submitted annotations to provenance, it could be significantly beneficial for other users. Likewise, the background of a user could impact the credibility of their annotations. For example, an annotation made by a citizen scientist describing the purpose of a provenance step may not be as reliable as a similar annotation made by an ACOS project member. For this work, we have developed a software package that
Pedersen, Helle Krogh; Gudmundsdottir, Valborg; Nielsen, Henrik Bjørn
Insulin resistance is a forerunner state of ischaemic cardiovascular disease and type 2 diabetes. Here we show how the human gut microbiome impacts the serum metabolome and associates with insulin resistance in 277 non-diabetic Danish individuals. The serum metabolome of insulin-resistant individ...
Sanderson, Robert [Los Alamos National Laboratory]; Van De Sompel, Herbert [Los Alamos National Laboratory]
As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
Good, Benjamin M; Nanis, Max; Wu, Chunlei; Su, Andrew I
Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses. Many biological natural language processing (BioNLP) projects attempt to address this challenge, but the state of the art still leaves much room for improvement. Progress in BioNLP research depends on large, annotated corpora for evaluating information extraction systems and training machine learning models. Traditionally, such corpora are created by small numbers of expert annotators often working over extended periods of time. Recent studies have shown that workers on microtask crowdsourcing platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of AMT in capturing disease mentions in PubMed abstracts. We used the NCBI Disease corpus as a gold standard for refining and benchmarking our crowdsourcing protocol. After several iterations, we arrived at a protocol that reproduced the annotations of the 593 documents in the 'training set' of this gold standard with an overall F measure of 0.872 (precision 0.862, recall 0.883). The output can also be tuned to optimize for precision (max = 0.984 when recall = 0.269) or recall (max = 0.980 when precision = 0.436). Each document was completed by 15 workers, and their annotations were merged based on a simple voting method. In total 145 workers combined to complete all 593 documents in the span of 9 days at a cost of $0.066 per abstract per worker. The quality of the annotations, as judged with the F measure, increases with the number of workers assigned to each task; however, minimal performance gains were observed beyond 8 workers per task. These results add further evidence that microtask crowdsourcing can be a valuable tool for generating well-annotated corpora in BioNLP. Data produced for this analysis are available at http://figshare.com/articles/Disease_Mention_Annotation_with_Mechanical_Turk/1126402.
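The voting merge and the F measure reported above can be sketched in a few lines of Python. This is an illustrative sketch only: the abstract does not specify the exact voting rule, so the `min_votes` threshold, the representation of a mention as a (start, end) character span, and all function names are assumptions made here.

```python
from collections import Counter


def merge_by_vote(worker_annotations, min_votes):
    """Keep every span marked by at least `min_votes` workers.

    `worker_annotations` is a list with one set of (start, end) spans per
    worker; this span representation is an assumption of the sketch.
    """
    votes = Counter()
    for spans in worker_annotations:
        votes.update(set(spans))  # each worker votes at most once per span
    return {span for span, count in votes.items() if count >= min_votes}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F measure)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f(predicted, gold):
    """Score a set of predicted spans against a set of gold-standard spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical toy input: three workers annotating one abstract.
workers = [{(0, 8), (20, 28)}, {(0, 8)}, {(0, 8), (40, 45)}]
merged = merge_by_vote(workers, min_votes=2)
scores = precision_recall_f(merged, gold={(0, 8), (20, 28)})

# The reported precision and recall are consistent with the reported F measure.
reported_f = round(f_measure(0.862, 0.883), 3)
```

Raising `min_votes` trades recall for precision, which matches the tunable behaviour the abstract describes.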
Wright, T.; Tsao, H.J.
The success or failure of any sample survey of a finite population is largely dependent upon the condition and adequacy of the list or frame from which the probability sample is selected. Much of the published survey sampling related work has focused on the measurement of sampling errors and, more recently, on nonsampling errors to a lesser extent. Recent studies on data quality for various types of data collection systems have revealed that the extent of the nonsampling errors far exceeds that of the sampling errors in many cases. While much of this nonsampling error, which is difficult to measure, can be attributed to poor frames, relatively little effort or theoretical work has focused on this contribution to total error. The objective of this paper is to present an annotated bibliography on frames with the hope that it will bring together, for experimenters, a number of suggestions for action when sampling from imperfect frames and that more attention will be given to this area of survey methods research
Bro, Rasmus; Nielsen, Hans Jørgen; Savorani, Francesco
We have recently shown that fluorescence spectroscopy of plasma samples has promising abilities regarding early detection of colorectal cancer. In the present paper, these results were further developed by combining fluorescence with the biomarkers CEA and TIMP-1 and traditional metabolomic measurements in the form of (1)H NMR spectroscopy. The results indicate that using an extensive profile established by combining such measurements together with the biomarkers is better than using single markers....
Fromreide, Hege; Hovy, Dirk; Søgaard, Anders
We present two new NER datasets for Twitter: a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets, (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible...
Kortesniemi, Maaria; Vuorinen, Anssi L; Sinkkonen, Jari; Yang, Baoru; Rajala, Ari; Kallio, Heikki
The oilseeds of the commercially important oilseed rape (Brassica napus) and turnip rape (Brassica rapa) were investigated with (1)H NMR metabolomics. The compositions of ripened (cultivated in field trials) and developing seeds (cultivated in controlled conditions) were compared in multivariate models using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Differences in the major lipids and the minor metabolites between the two species were found. A higher content of polyunsaturated fatty acids and sucrose were observed in turnip rape, while the overall oil content and sinapine levels were higher in oilseed rape. The genotype traits were negligible compared to the effect of the growing site and concomitant conditions on the oilseed metabolome. This study demonstrates the applicability of NMR-based analysis in determining the species, geographical origin, developmental stage, and quality of oilseed Brassicas. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zivkovic, Angela M; German, J Bruce
The current rise in diet-related diseases continues to be one of the most significant health problems facing both the developed and the developing world. The use of metabolomics - the accurate and comprehensive measurement of a significant fraction of important metabolites in accessible biological fluids - for the assessment of nutritional status is a promising way forward. The basic toolset, targets and knowledge are all being developed in the emerging field of metabolomics, yet important knowledge and technology gaps will need to be addressed in order to bring such assessment to practice. Dysregulation within the principal metabolic organs (e.g. intestine, adipose, skeletal muscle and liver) is at the center of a diet-disease paradigm that includes metabolic syndrome, type 2 diabetes and obesity. The assessment of both essential nutrient status and the more comprehensive systemic metabolic response to dietary, lifestyle and environmental influences (e.g. metabolic phenotype) is necessary for the evaluation of status in individuals that can identify the multiple targets of intervention needed to address metabolic disease. The first proofs of principle building the knowledge to bring actionable metabolic diagnostics to practice through metabolomics are now appearing.
Haug, Kenneth; Salek, Reza M; Steinbeck, Christoph
Chemical Biology employs chemical synthesis, analytical chemistry and other tools to study biological systems. Recent advances in molecular biology, such as next generation sequencing (NGS), have led to unprecedented insights into the evolution of organisms' biochemical repertoires. Because of the specific data sharing culture in Genomics, genomes from all kingdoms of life become readily available for further analysis by other researchers. While the genome expresses the potential of an organism to adapt to external influences, the Metabolome presents a molecular phenotype that allows us to assess the external influences under which an organism exists and develops in a dynamic way. Steady advancements in instrumentation towards high-throughput and high-resolution methods have led to a revival of analytical chemistry methods for the measurement and analysis of the metabolome of organisms. This steady growth of metabolomics as a field is leading to a similar accumulation of big data across laboratories worldwide as can be observed in all of the other omics areas. This calls for the development of methods and technologies for handling and dealing with such large datasets, for efficiently distributing them and for enabling re-analysis. Here we describe the recently emerging ecosystem of global open-access databases and data exchange efforts between them, as well as the foundations and obstacles that enable or prevent the data sharing and reanalysis of this data. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Kronk, Gary W
Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...
Recent advances in metabolomics technologies have resulted in high-quality (time-resolved) metabolic profiles with an increasing coverage of metabolic pathways. These data profiles represent read-outs from often non-linear dynamics of metabolic networks. Yet, metabolic profiles have largely been explored with regression-based approaches that only capture linear relationships, rendering it difficult to determine the extent to which the data reflect the underlying reaction rates and their couplings. Here we propose an approach termed Stoichiometric Correlation Analysis (SCA) based on correlation between positive linear combinations of log-transformed metabolic profiles. The log-transformation is due to the evidence that metabolic networks can be modeled by mass action law and kinetics derived from it. Unlike the existing approaches which establish a relation between pairs of metabolites, SCA facilitates the discovery of higher-order dependence between more than two metabolites. By using a paradigmatic model of the tricarboxylic acid cycle we show that the higher-order dependence reflects the coupling of concentration of reactant complexes, capturing the subtle difference between the employed enzyme kinetics. Using time-resolved metabolic profiles from Arabidopsis thaliana and Escherichia coli, we show that SCA can be used to quantify the difference in coupling of reactant complexes, and hence, reaction rates, underlying the stringent response in these model organisms. By using SCA with data from natural variation of wild and domesticated wheat and tomato accessions, we demonstrate that the domestication is accompanied by loss of such couplings in these species. Therefore, application of SCA to metabolomics data from natural variation in wild and domesticated populations provides a mechanistic way of understanding domestication and its relation to metabolic networks.
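The core quantity described above, a correlation between positive linear combinations of log-transformed profiles, can be illustrated with a small Python sketch. This is not the published SCA procedure: the weights are fixed by hand here rather than searched for, and the function names, metabolite labels and data are hypothetical. It only shows why a product of concentrations, as in a mass-action rate term, becomes a positive linear combination after the log transform.

```python
from math import log, sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_u * var_v)


def log_combination(profiles, weights):
    """Positive linear combination of log-transformed profiles.

    `profiles` maps a metabolite name to its positive concentrations over
    samples or time points; `weights` maps names to non-negative coefficients.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    length = len(profiles[next(iter(weights))])
    return [
        sum(w * log(profiles[name][t]) for name, w in weights.items())
        for t in range(length)
    ]


# Hypothetical data in which C tracks the product A * B.
profiles = {
    "A": [1.0, 2.0, 4.0, 8.0],
    "B": [3.0, 1.0, 5.0, 2.0],
    "C": [3.0, 2.0, 20.0, 16.0],
}
left = log_combination(profiles, {"C": 1.0})
right = log_combination(profiles, {"A": 1.0, "B": 1.0})
higher_order = pearson(left, right)            # coupling of C with the complex A + B
pairwise = pearson(left, log_combination(profiles, {"A": 1.0}))  # C with A alone
```

In this toy case the three-metabolite relation is exact while the pairwise correlation is weaker, which is the kind of higher-order dependence that pairwise analyses would miss.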
Background: In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. Results: In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. Conclusions: This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.
This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…
Cannataro, Mario; Hiram Guzzi, Pietro; Veltri, Pierangelo
Biological databases have been developed with a special focus on the efficient retrieval of single records or the efficient computation of specialized bioinformatics algorithms against the overall database, such as in sequence alignment. The continuous production of biological knowledge spread over several biological databases and ontologies, such as Gene Ontology, and the availability of efficient techniques to handle such knowledge, such as annotation and semantic similarity measures, enable the development of novel bioinformatics applications that explicitly use and integrate such knowledge. After introducing the annotation process and the main semantic similarity measures, this paper shows how annotations and semantic similarity can be exploited to improve the extraction and analysis of biologically relevant data from protein interaction databases. As case studies, the paper presents two novel software tools, OntoPIN and CytoSeVis, both based on the use of Gene Ontology annotations, for the advanced querying of protein interaction databases and for the enhanced visualization of protein interaction networks.
Yuan, Pingpeng; Wang, Guoyin; Zhang, Qin; Jin, Hai
Because of ambiguity, search engines for scientific literature may not return the right results. One efficient solution to this problem is to annotate the literature automatically and attach semantic information to it. Generally, semantic annotation requires identifying entities before attaching semantic information to them. However, because of abbreviations and other factors, it is very difficult to identify entities correctly. The paper presents a Semantic Annotation System for Literature (SASL), which uses Wikipedia as a knowledge base to annotate the literature. SASL mainly attaches semantics to terminology, academic institutions, conferences, journals, etc. Many of these are usually abbreviated, which induces ambiguity. SASL uses regular expressions to extract the mapping between the full names of entities and their abbreviations. Since the full names of several entities may map to a single abbreviation, SASL introduces a Hidden Markov Model to implement name disambiguation. Finally, the paper presents experimental results, which confirm that SASL performs well.
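The regular-expression step described above can be illustrated with a minimal sketch. The abstract does not give SASL's actual patterns, so the pattern and the function name `extract_abbreviations` below are hypothetical: the sketch assumes a capitalized multi-word name immediately followed by an upper-case abbreviation in parentheses, and keeps a pair only when the initials of the trailing words spell the abbreviation. The Hidden Markov Model disambiguation step is not shown.

```python
import re

# Hypothetical pattern: capitalized words followed by "(ABBR)",
# e.g. "Hidden Markov Model (HMM)".
PATTERN = re.compile(r"((?:[A-Z][a-z]+\s+){1,6}[A-Z][a-z]+)\s+\(([A-Z]{2,})\)")


def extract_abbreviations(text):
    """Return {abbreviation: [full names]} for pairs whose initials match."""
    mapping = {}
    for full_name, abbr in PATTERN.findall(text):
        words = full_name.split()
        # Keep only the trailing words, one per letter of the abbreviation.
        candidate = words[-len(abbr):]
        initials = "".join(word[0] for word in candidate)
        if len(candidate) == len(abbr) and initials == abbr:
            mapping.setdefault(abbr, []).append(" ".join(candidate))
    return mapping
```

When one abbreviation maps to several full names, the returned list has more than one entry; that is the ambiguous case a disambiguation model would then have to resolve.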
Styler, William F.; Bethard, Steven; Finan, Sean; Palmer, Martha; Pradhan, Sameer; de Groen, Piet C; Erickson, Brad; Miller, Timothy; Lin, Chen; Savova, Guergana; Pustejovsky, James
This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task. PMID:29082229
Pararas-Carayannis, G.; Dong, B.; Farmer, R.
This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts.
Kalkatawi, Manal M.
Genome annotation is an important topic because it provides the foundation for downstream genomic and biological research. It can be regarded as a way of summarizing part of the existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying the functions of these regions is known as functional annotation. In silico approaches can facilitate both tasks, which would otherwise be difficult and time-consuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature set able to characterize properties of the genomic region surrounding the PAS, enabling development of high-accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no generic and automated method is available for such a task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameter calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high-accuracy prediction results. Finally
Theodore R Sana
Malaria is a global infectious disease that threatens the lives of millions of people. Transcriptomics, proteomics and functional genomics studies, as well as sequencing of the Plasmodium falciparum and Homo sapiens genomes, have shed new light on this host-parasite relationship. Recent advances in accurate mass measurement mass spectrometry, sophisticated data analysis software, and availability of biological pathway databases have converged to facilitate our global, untargeted biochemical profiling study of in vitro P. falciparum-infected (IRBC) and uninfected (NRBC) erythrocytes. In order to expand the number of detectable metabolites, several key analytical steps in our workflows were optimized. Untargeted and targeted data mining resulted in detection of over one thousand features or chemical entities. Untargeted features were annotated via matching to the METLIN metabolite database. For targeted data mining, we queried the data using a compound database derived from a metabolic reconstruction of the P. falciparum genome. In total, over one hundred and fifty differential annotated metabolites were observed. To corroborate the representation of known biochemical pathways from our data, an inferential pathway analysis strategy was used to map annotated metabolites onto the BioCyc pathway collection. This hypothesis-generating approach resulted in over-representation of many metabolites onto several IRBC pathways, most prominently glycolysis. In addition, components of the "branched" TCA cycle, partial urea cycle, and nucleotide, amino acid, chorismate, sphingolipid and fatty acid metabolism were found to be altered in IRBCs. Interestingly, we detected and confirmed elevated levels for cyclic ADP ribose and phosphoribosyl AMP in IRBCs, a novel observation. These metabolites may play a role in regulating the release of intracellular Ca(2+) during P. falciparum infection. Our results support a strategy of global metabolite profiling by untargeted
Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning
Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment...... to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required....
Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.
Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832
Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently of the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content
Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L
The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
SE41_AM1 PowerGet annotation In annotation process, KEGG, KNApSAcK and LipidMAPS ar..., predicted molecular formulas are used for the annotation. MS/MS patterns were used to suggest functional gr...-MS Fragment Viewer (http://webs2.kazusa.or.jp/msmsfragmentviewer/) are used for annotation and identification of the compounds. ...
Kell, Douglas B; Oliver, Stephen G
The term 'metabolome' was introduced to the scientific literature in September 1998. To mark its 18-year-old 'coming of age', two of the co-authors of that paper review the genesis of metabolomics, whence it has come and where it may be going.
Miller, Marion G
Metabolomic approaches have the potential to make an exceptional contribution to understanding how chemicals and other environmental stressors can affect both human and environmental health. However, the application of metabolomics to environmental exposures, although getting underway, has not yet been extensively explored. This review will use a SWOT analysis model to discuss some of the strengths, weaknesses, opportunities, and threats that are apparent to an investigator venturing into this relatively new field. SWOT has been used extensively in business settings to uncover new outlooks and identify problems that would impede progress. The field of environmental metabolomics provides great opportunities for discovery, and this is recognized by a high level of interest in potential applications. However, understanding the biological consequence of environmental exposures can be confounded by inter- and intra-individual differences. Metabolomic profiles can yield a plethora of data, the interpretation of which is complex and still being evaluated and researched. The development of the field will depend on the availability of technologies that handle data and permit ready access to metabolomic databases. Understanding the relevance of metabolomic endpoints to organism health vs adaptation vs variation is an important step in understanding what constitutes a substantive environmental threat. Metabolomic applications in reproductive research are discussed. Overall, the development of a comprehensive mechanistic-based interpretation of metabolomic changes offers the possibility of providing information that will significantly contribute to the protection of human health and the environment.
Dragsted, L. O.; Kristensen, M.; Ravn-Haren, Gitte
Metabolomics is a promising tool for searching out new biomarkers and the development of hypotheses in nutrition research. This chapter will describe the design of human dietary intervention studies where samples are collected for metabolomics analyses as well as the analytical issues and data...
Marques, Ana Patrícia; Serralheiro, Maria Luisa; Ferreira, António E. N.; Freire, Ana Ponces; Cordeiro, Carlos; Silva, Marta Sousa
Metabolomics is a key discipline in systems biology, together with genomics, transcriptomics, and proteomics. In this omics cascade, the metabolome represents the biochemical products that arise from cellular processes and is often regarded as the final response of a biological system to environmental or genetic changes. The overall screening…
cancer or a history of transurethral resection of the prostate (TURP) for benign prostatic hypertrophy are excluded. Somewhat surprisingly... Award Number: W81XWH-11-1-0451. Title: Metabolomic Profiling of Prostate Cancer Progression During Active Surveillance... 29 September 2012
Koek, Maud Marijtje
Metabolomics involves the unbiased quantitative and qualitative analysis of the complete set of metabolites present in cells, body fluids and tissues. Gas chromatography coupled to mass spectrometry (GC-MS) is very suitable for metabolomics analysis, as it combines high separation power with
Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L
Background: The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results: We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion: The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of the human genome. PMID:19594883
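The recall and precision figures quoted above follow the standard set-based definitions. As a minimal sketch, assuming that both the predicted and the reference annotations are represented as sets of (gene, disease) pairs (a representation chosen here for illustration, not described in the abstract), the two rates can be computed as follows; the function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted pairs against a reference set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted pairs that are in the reference set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of reference pairs that were predicted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Seen this way, the OMIM figures (98% precision, 22% recall) describe an annotation set that is almost always right but covers only a small part of the reference collection.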
Goede, Patricia A.; Lauman, Jason R.; Cochella, Christopher; Katzman, Gregory L.; Morton, David A.; Albertine, Kurt H.
Use of digital medical images has become common over the last several years, coincident with the release of inexpensive, mega-pixel quality digital cameras and the transition to digital radiology operation by hospitals. One problem that clinicians, medical educators, and basic scientists encounter when handling images is the difficulty of using business and graphic arts commercial-off-the-shelf (COTS) software in multicontext authoring and interactive teaching environments. The authors investigated and developed software-supported methodologies to help clinicians, medical educators, and basic scientists become more efficient and effective in their digital imaging environments. The software that the authors developed provides the ability to annotate images based on a multispecialty methodology for annotation and visual knowledge representation. This annotation methodology is designed by consensus, with contributions from the authors and physicians, medical educators, and basic scientists in the Departments of Radiology, Neurobiology and Anatomy, Dermatology, and Ophthalmology at the University of Utah. The annotation methodology functions as a foundation for creating, using, reusing, and extending dynamic annotations in a context-appropriate, interactive digital environment. The annotation methodology supports the authoring process as well as output and presentation mechanisms. The annotation methodology is the foundation for a Windows implementation that allows annotated elements to be represented as structured eXtensible Markup Language and stored separate from the image(s). PMID:14527971
Spicer, Rachel A; Salek, Reza; Steinbeck, Christoph
The Metabolomics Standards Initiative (MSI) guidelines were first published in 2007. These guidelines provided reporting standards for all stages of metabolomics analysis: experimental design, biological context, chemical analysis and data processing. Since 2012, a series of public metabolomics databases and repositories, which accept the deposition of metabolomic datasets, have arisen. In this study, the compliance of 399 public data sets, from four major metabolomics data repositories, to the biological context MSI reporting standards was evaluated. None of the reporting standards were complied with in every publicly available study, although adherence rates varied greatly, from 0 to 97%. The plant minimum reporting standards were the most complied with and the microbial and in vitro were the least. Our results indicate the need for reassessment and revision of the existing MSI reporting standards.
Background: The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets pose a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results: annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion: annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non
Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C
The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of
Ibáñez, Clara; Simó, Carolina; García-Cañas, Virginia; Cifuentes, Alejandro; Castro-Puyana, María
Highlights: Foodomics allows studying food and nutrition through the application of advanced omics approaches. CE-MS plays a crucial role as an analytical platform to carry out omics studies. CE-MS applications for food metabolomics, proteomics and peptidomics are presented. Abstract: In the current post-genomic era, Foodomics has been defined as a discipline that studies food and nutrition through the application of advanced omics approaches. Foodomics involves the use of genomics, transcriptomics, epigenetics, proteomics, peptidomics, and/or metabolomics to investigate food quality, safety, traceability and bioactivity. In this context, capillary electrophoresis-mass spectrometry (CE-MS) has been applied mainly in food proteomics, peptidomics and metabolomics. The aim of this review is to present an overview of the most recent developments and applications of CE-MS as an analytical platform for Foodomics, covering the relevant works published from 2008 to 2012. The review also provides information about the integration of several omics approaches in the new Foodomics field.
Amrita K Cheema
Tissue consequences of radiation exposure depend on radiation quality, and high linear energy transfer (high-LET) radiation, such as heavy ions in space, is known to deposit higher energy in tissues and cause greater damage than low-LET γ radiation. While radiation exposure has been linked to intestinal pathologies, there are very few studies on long-term effects of radiation, fewer involved a therapeutically relevant γ radiation dose, and none explored persistent tissue metabolomic alterations after heavy ion space radiation exposure. Using a metabolomics approach, we report long-term metabolomic markers of radiation injury and perturbation of signaling pathways linked to metabolic alterations in mice after heavy ion or γ radiation exposure. Intestinal tissues (C57BL/6J, female, 6 to 8 wks) were analyzed using ultra performance liquid chromatography coupled with electrospray quadrupole time-of-flight mass spectrometry (UPLC-QToF-MS) two months after 2 Gy γ radiation and results were compared to an equitoxic ⁵⁶Fe (1.6 Gy) radiation dose. The biological relevance of the metabolites was determined using Ingenuity Pathway Analysis, immunoblots, and immunohistochemistry. Metabolic profile analysis showed radiation-type-dependent spatial separation of the groups. Decreased adenine and guanosine and increased inosine and uridine suggested perturbed nucleotide metabolism. While both radiation types affected amino acid metabolism, the ⁵⁶Fe radiation preferentially altered dipeptide metabolism. Furthermore, ⁵⁶Fe radiation caused upregulation of 'prostanoid biosynthesis' and 'eicosanoid signaling', which are interlinked events related to cellular inflammation and have implications for nutrient absorption and inflammatory bowel disease during space missions and after radiotherapy. In conclusion, our data showed for the first time that metabolomics can not only be used to distinguish between heavy ion and γ radiation exposures, but
Gille, Christoph; Hübner, Katrin; Hoppe, Andreas; Holzhütter, Hermann-Georg
Semantic annotations of the biochemical entities constituting a biological reaction network are indispensable for creating biologically meaningful networks. They also facilitate the efficient exchange, reuse and merging of existing models, which are increasingly a concern of present-day systems biology research. Two types of tools for the reconstruction of biological networks currently exist: (i) several sophisticated programs support graphical network editing and visualization; (ii) data management systems permit reconstruction and curation of huge networks by a team of scientists, including data integration, annotation and cross-referencing. We sought ways to combine the advantages of both approaches. Metannogen, which was previously developed for network reconstruction, has been considerably improved. Metannogen now provides SBML import and annotation of networks created elsewhere. This permits users of other network reconstruction platforms or modeling software to annotate their networks using Metannogen's advanced information management. We implemented word autocompletion, multipattern highlighting, spell check, brace expansion and publication management, and improved annotation, cross-referencing and support for team work. Unspecific enzymes and transporters acting on a spectrum of different substrates are handled efficiently. The network can be exported in SBML format, in which the annotations are embedded in line with the MIRIAM standard. For more comfort, Metannogen may be tightly coupled with the network editor such that Metannogen becomes an additional view for the focused reaction in the network editor. Finally, Metannogen provides local single-user, shared password-protected multiuser or public access to the annotation data. Metannogen is available free of charge at: http://www.bioinformatics.org/strap/metannogen/ or http://3d-alignment.eu/metannogen/. Supplementary data are available at Bioinformatics online.
This work elaborates the semi-semantic part-of-speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label parts of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part-of-speech labels. The corpus contains text on local and international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise contributed a part-of-speech annotation to the URDU.KON-TB treebank. Twenty-two main part-of-speech categories are divided into subcategories, which include the morphological and semantic information encoded in them. This article mainly reports the annotation guidelines; however, it also briefly describes the development of the URDU.KON-TB treebank, which includes the raw corpus collection, the design and application of the annotation scheme and, finally, its statistical evaluation and results. The guidelines presented will be useful for the linguistic community in annotating sentences not only in the national language, Urdu, but also in other indigenous languages such as Punjabi, Sindhi, Pashto, etc.
The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and to output a Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. To visualize the resulting output file, a modified version of FigTree is used. Certain popular programs that lack good built-in visualization tools, for example MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors, either because colors must be added manually to each node or because of other limitations, for example coloring only by a number, such as branch length, or by taxonomy. The MixtureTree Annotator is fast and easy to use, while still allowing the user full control over the coloring and annotating process.
Palama, Tony Lionel; Fock, Isabelle; Choi, Young Hae; Verpoorte, Robert; Kodja, Hippolyte
The metabolomic analysis of Vanilla planifolia leaves collected at different developmental stages was carried out using ¹H nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis in order to evaluate their variation. Ontogenic changes of the metabolome were considered since leaves of different ages were collected at two different times of the day and in two different seasons. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) of ¹H NMR data provided a clear separation according to leaf age, time of the day and season of collection. Young leaves were found to have higher levels of glucose, bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-isopropyltartrate (glucoside A) and bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-(2-butyl)-tartrate (glucoside B), whereas older leaves had more sucrose, acetic acid, homocitric acid and malic acid. Results obtained from PLS-DA showed that leaves collected in March 2008 had higher levels of glucosides A and B than those collected in August 2007. However, the relative standard deviation (RSD) of the individual values of glucosides A and B showed that those compounds vary more according to developmental stage (50%) than according to the time of day or the season in which they were collected (19%). Although morphological variations of the V. planifolia accessions were observed, no clear separation of the accessions was determined from the analysis of the NMR spectra. The results obtained in this study show that this method, based on ¹H NMR spectroscopy in combination with multivariate analysis, has great potential for further applications in the study of the vanilla leaf metabolome. Copyright 2009 Elsevier Ltd. All rights reserved.
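The RSD comparison above (50% versus 19%) rests on a simple ratio of standard deviation to mean. A minimal sketch of that calculation follows, assuming the sample standard deviation is used; the function name and the input values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD (sample standard deviation / |mean|) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


# Hypothetical glucoside levels (arbitrary units) grouped by leaf age.
levels_by_leaf_age = [1.0, 2.0, 3.0]
rsd_leaf_age = relative_standard_deviation(levels_by_leaf_age)
print(f"RSD across leaf ages: {rsd_leaf_age:.1f}%")
```

A larger RSD for one grouping factor than for another indicates that the compound varies more along that factor, which is the sense in which the abstract compares developmental stage with time of day and season.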
Collet, Tinh-Hai; Sonoyama, Takuhiro; Henning, Elana; Keogh, Julia M; Ingram, Brian; Kelway, Sarah; Guo, Lining; Farooqi, I Sadaf
The experimental paradigm of acute caloric restriction (CR) followed by refeeding (RF) can be used to study the homeostatic mechanisms that regulate energy homeostasis, which are relevant to understanding the adaptive response to weight loss. Metabolomics, the measurement of hundreds of small molecule metabolites, their precursors, derivatives, and degradation products, has emerged as a useful tool for the study of physiology and disease and was used here to study the metabolic response to acute CR. We used four ultra high-performance liquid chromatography-tandem mass spectrometry methods to characterize changes in carbohydrates, lipids, amino acids, and steroids in eight normal weight men at baseline, after 48 hours of CR (10% of energy requirements) and after 48 hours of ad libitum RF in a tightly controlled environment. We identified a distinct metabolomic signature associated with acute CR characterized by the expected switch from carbohydrate to fat utilization with increased lipolysis and β-fatty acid oxidation. We found an increase in ω-fatty acid oxidation and levels of endocannabinoids, which are known to promote food intake. These changes were reversed with RF. Several plasmalogen phosphatidylethanolamines (endogenous antioxidants) significantly decreased with CR (all P ≤ 0.0007). Additionally, acute CR was associated with an increase in the branched chain amino acids (all P ≤ 1.4 × 10⁻⁷) and dehydroepiandrosterone sulfate (P = 0.0006). We identified a distinct metabolomic signature associated with acute CR. Further studies are needed to characterize the mechanisms that mediate these changes and their potential contribution to the adaptive response to dietary restriction. Copyright © 2017 Endocrine Society
Irshad, H; Montaser-Kouhsari, L; Waltz, G; Bucur, O; Nowak, J A; Dong, F; Knoblauch, N W; Beck, A H
The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F-M = 93.68%), followed by the crowdsourced contributor levels 1, 2, and 3 and the automated method, which showed relatively similar performance (F-M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist
Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick
Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.
Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony
To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
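The abstract does not name the query strategies that were compared. As an illustration only, the sketch below shows least-confidence sampling, one common active learning query strategy: the sequences whose most probable label has the lowest model confidence are sent to the annotator first. The function name, sequence ids, and probabilities are hypothetical.

```python
def least_confidence_query(probabilities, batch_size):
    """Pick the sequences the model is least sure about.

    probabilities: dict mapping a sequence id to the model's predicted
    class probabilities for that sequence (hypothetical input format).
    Returns up to batch_size ids, most uncertain first.
    """
    # Uncertainty is one minus the probability of the most likely label.
    uncertainty = {sid: 1.0 - max(p) for sid, p in probabilities.items()}
    # Sort by descending uncertainty; break ties by id for determinism.
    ranked = sorted(uncertainty, key=lambda sid: (-uncertainty[sid], sid))
    return ranked[:batch_size]


# Hypothetical model outputs for three unlabeled sequences.
model_output = {
    "s1": [0.9, 0.1],
    "s2": [0.5, 0.5],
    "s3": [0.7, 0.3],
}
to_annotate = least_confidence_query(model_output, batch_size=2)
print(to_annotate)
```

In a pre-annotation setting such as the one described, the model's predicted labels for the selected sequences would be shown to the annotator for review rather than being annotated from scratch.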
Zhao, Yanni; Hao, Zhiqiang; Zhao, Chunxia; Zhao, Jieyu; Zhang, Junjie; Li, Yanli; Li, Lili; Huang, Xin; Lin, Xiaohui; Zeng, Zhongda; Lu, Xin; Xu, Guowang
Metabolomics is increasingly applied to discover and validate metabolite biomarkers and illuminate biological variations. Large-scale and long-term metabolomics studies commonly combine multiple analytical batches to generate robust data, but gross and systematic errors are often observed, so appropriate calibration methods are required before statistical analysis. Here, we develop a novel correction strategy for large-scale and long-term metabolomics studies, which can integrate metabolomics data from multiple batches and different instruments by calibrating gross and systematic errors. The gross error calibration method applied various statistical and fitting models of the feature ratios between two adjacent quality control (QC) samples to screen and calibrate outlier variables. A virtual QC for each sample was produced by a linear fitting model of the feature intensities between two neighboring QCs to obtain a correction factor and remove the systematic bias. The suggested method was applied to metabolic profiling data of 1197 plant samples in nine batches analyzed by two gas chromatography-mass spectrometry instruments. The method was evaluated by the relative standard deviations of all the detected peaks, the average Pearson correlation coefficients, and the Euclidean distance of QCs and non-QC replicates. The results showed that the established approach outperforms the commonly used internal standard correction and total intensity signal correction methods, that it can be used to integrate metabolomics data from multiple analytical batches and instruments, and that it allows the QC frequency to be reduced to one injection per 20 real samples. The suggested method makes large-scale metabolomics analysis practicable.
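The virtual-QC step can be sketched as a linear interpolation between the two QC injections that bracket a sample. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and dividing a chosen reference QC intensity by the virtual QC is only one plausible way to form the correction factor.

```python
def virtual_qc(position, qc_before, qc_after):
    """Interpolate a feature's QC intensity at a sample's injection position.

    qc_before and qc_after are (injection_position, intensity) pairs for
    the two neighboring QC injections that bracket the sample.
    """
    (p0, i0), (p1, i1) = qc_before, qc_after
    if not p0 < position < p1:
        raise ValueError("sample must lie between the two QC injections")
    fraction = (position - p0) / (p1 - p0)
    return i0 + fraction * (i1 - i0)


def correct_intensity(raw, position, qc_before, qc_after, reference):
    """Scale a raw intensity by reference / virtual QC (assumed factor)."""
    factor = reference / virtual_qc(position, qc_before, qc_after)
    return raw * factor


# Hypothetical drift: the QC signal rises from 100 to 200 over 10 injections.
vqc = virtual_qc(5, (0, 100.0), (10, 200.0))
corrected = correct_intensity(300.0, 5, (0, 100.0), (10, 200.0), reference=100.0)
print(vqc, corrected)
```

Applied per feature and per sample, a correction of this kind removes the slow instrumental drift that the bracketing QCs capture, which is why the spacing between QC injections matters.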
Smirnov, Kirill S; Maier, Tanja V; Walker, Alesia; Heinzmann, Silke S; Forcisi, Sara; Martinez, Inés; Walter, Jens; Schmitt-Kopplin, Philippe
The review highlights the role of metabolomics in studying human gut microbial metabolism. Microbial communities in our gut exert a multitude of functions with huge impact on human health and disease. Within the meta-omics discipline, gut microbiome is studied by (meta)genomics, (meta)transcriptomics, (meta)proteomics and metabolomics. The goal of metabolomics research applied to fecal samples is to perform their metabolic profiling, to quantify compounds and classes of interest, to characterize small molecules produced by gut microbes. Nuclear magnetic resonance spectroscopy and mass spectrometry are main technologies that are applied in fecal metabolomics. Metabolomics studies have been increasingly used in gut microbiota related research regarding health and disease with main focus on understanding inflammatory bowel diseases. The elucidated metabolites in this field are summarized in this review. We also addressed the main challenges of metabolomics in current and future gut microbiota research. The first challenge reflects the need of adequate analytical tools and pipelines, including sample handling, selection of appropriate equipment, and statistical evaluation to enable meaningful biological interpretation. The second challenge is related to the choice of the right animal model for studies on gut microbiota. We exemplified this using NMR spectroscopy for the investigation of cross-species comparison of fecal metabolite profiles. Finally, we present the problem of variability of human gut microbiota and metabolome that has important consequences on the concepts of personalized nutrition and medicine. Copyright © 2016 Elsevier GmbH. All rights reserved.
Washio, Jumpei; Takahashi, Nobuhiro
Oral diseases are known to be closely associated with oral biofilm metabolism, while cancer tissue is reported to possess specific metabolism such as the 'Warburg effect'. Metabolomics might be a useful method for clarifying the whole metabolic systems that operate in oral biofilm and oral cancer; however, technical limitations have hampered such research. Fortunately, metabolomics techniques have developed rapidly in the past decade, which has helped to solve these difficulties. In vivo metabolomic analyses of the oral biofilm have produced various findings. Some of these findings agreed with the in vitro results obtained in conventional metabolic studies using representative oral bacteria, while others differed markedly from them. Metabolomic analyses of oral cancer tissue have not only revealed differences between the metabolomic profiles of cancer and normal tissue, but have also suggested that a specific metabolic system operates in oral cancer tissue. Saliva contains a variety of metabolites, some of which might be associated with oral or systemic disease; therefore, metabolomics analysis of saliva could be useful for identifying disease-specific biomarkers. Metabolomic analyses of the oral biofilm, oral cancer, and saliva could contribute to the development of accurate diagnostic techniques, safe and effective treatments, and preventive strategies for oral and systemic diseases.
Roberts, Kirk; Demner-Fushman, Dina
This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.
McCauley, Stephen; de Groot, Saskia; Mailund, Thomas
- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping...... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...... obtain an annotation of the coding regions, as well as a posterior probability for each site of the strength of selection acting on it. From this we may deduce the average posterior selection acting on the different genes. Whilst we are encouraged to see in HIV2, that the known to be conserved genes gag...
Deborde, Catherine; Jacob, Daniel
Plant primary metabolites are organic compounds that are common to all or most plant species and are essential for plant growth, development, and reproduction. They are intermediates and products of metabolism involved in photosynthesis and other biosynthetic processes. Primary metabolites belong to different compound families, mainly carbohydrates, organic acids, amino acids, nucleotides, fatty acids, steroids, or lipids. Until recently, unlike the Human Metabolome Database (http://www.hmdb.ca) dedicated to human metabolism, there was no centralized database or repository dedicated exclusively to the plant kingdom that contained information on metabolites and their concentrations in a detailed experimental context. MeRy-B is the first platform for plant ¹H-NMR metabolomic profiles (MeRy-B, http://bit.ly/meryb), designed to provide a knowledge base of curated plant profiles and metabolites obtained by NMR, together with the corresponding experimental and analytical metadata. MeRy-B contains lists of plant metabolites, mostly primary metabolites and unknown compounds, with information about experimental conditions, the factors studied, and metabolite concentrations for 19 different plant species (Arabidopsis, broccoli, daphne, grape, maize, barrel clover, melon, Ostreococcus tauri, palm date, palm tree, peach, pine tree, eucalyptus, plantain, rice, strawberry, sugar beet, tomato, vanilla), compiled from more than 2,300 annotated NMR profiles for various organs or tissues deposited by 30 different private or public contributors as of September 2013. Currently, about half of the data deposited in MeRy-B is publicly available. In this chapter, readers will be shown how to (1) navigate through and retrieve data of publicly available projects on the MeRy-B website; (2) visualize lists of experimentally identified metabolites and their concentrations in all plant species present in MeRy-B; (3) get the primary metabolite list for a particular plant species in MeRy-B; and for a
Li, Dapeng; Heiling, Sven; Baldwin, Ian T; Gaquerel, Emmanuel
Secondary metabolite diversity is considered an important fitness determinant for plants' biotic and abiotic interactions in nature. This diversity can be examined in two dimensions. The first one considers metabolite diversity across plant species. A second way of looking at this diversity is by considering the tissue-specific localization of pathways underlying secondary metabolism within a plant. Although these cross-tissue metabolite variations are increasingly regarded as important readouts of tissue-level gene function and regulatory processes, they have rarely been comprehensively explored by nontargeted metabolomics. As such, important questions have remained superficially addressed. For instance, which tissues exhibit prevalent signatures of metabolic specialization? Reciprocally, which metabolites contribute most to this tissue specialization in contrast to those metabolites exhibiting housekeeping characteristics? Here, we explore tissue-level metabolic specialization in Nicotiana attenuata, an ecological model with rich secondary metabolism, by combining tissue-wide nontargeted mass spectral data acquisition, information theory analysis, and tandem MS (MS/MS) molecular networks. This analysis was conducted for two different methanolic extracts of 14 tissues and deconvoluted 895 nonredundant MS/MS spectra. Using information theory analysis, anthers were found to harbor the most specialized metabolome, and most unique metabolites of anthers and other tissues were annotated through MS/MS molecular networks. Tissue-metabolite association maps were used to predict tissue-specific gene functions. Predictions for the function of two UDP-glycosyltransferases in flavonoid metabolism were confirmed by virus-induced gene silencing. The present workflow allows biologists to amortize the vast amount of data produced by modern MS instrumentation in their quest to understand gene function.
Smrithi Sugumaran Menon
Human exposure to ionizing radiation disrupts normal metabolic processes in cells and organs by inducing complex biological responses that interfere with gene and protein expression. Conventional dosimetry, monitoring of prodromal symptoms and peripheral lymphocyte counts are of limited value as organ and tissue specific biomarkers for personnel exposed to radiation, particularly, weeks or months after exposure. Analysis of metabolites generated in known stress-responsive pathways by molecular profiling helps to predict the physiological status of an individual in response to environmental or genetic perturbations. Thus, a multi-metabolite profile obtained from a high resolution mass spectrometry-based metabolomics platform offers potential for identification of robust biomarkers to predict radiation toxicity of organs and tissues resulting from exposures to therapeutic or non-therapeutic ionizing radiation. Here, we review the status of radiation metabolomics and explore applications as a standalone technology, as well as its integration in systems biology, to facilitate a better understanding of the molecular basis of radiation response. Finally, we draw attention to the identification of specific pathways that can be targeted for the development of therapeutics to alleviate or mitigate harmful effects of radiation exposure.
Schaub, Jochen; Schiesling, Carola; Reuss, Matthias; Dauner, Michael
Metabolome analysis, the analysis of large sets of intracellular metabolites, has become an important systems analysis method in biotechnological and pharmaceutical research. In metabolic engineering, the integration of metabolome data with fluxome and proteome data into large-scale mathematical models promises to foster rational strategies for strain and cell line improvement. However, the development of reproducible sampling procedures for quantitative analysis of intracellular metabolite concentrations represents a major challenge, accomplishing (i) fast transfer of sample, (ii) efficient quenching of metabolism, (iii) quantitative metabolite extraction, and (iv) optimum sample conditioning for subsequent quantitative analysis. In addressing these requirements, we propose an integrated sampling procedure. Simultaneous quenching and quantitative extraction of intracellular metabolites were realized by short-time exposure of cells to temperatures unit operations into a one unit operation, (ii) the avoidance of any alteration of the sample due to chemical reagents in quenching and extraction, and (iii) automation. A sampling frequency of 5 s⁻¹ and an overall individual sample processing time faster than 30 s allow observing responses of intracellular metabolite concentrations to extracellular stimuli on a subsecond time scale. Recovery and reliability of the unit operations were analyzed. Impact of sample conditioning on subsequent IC-MS analysis of metabolites was examined as well. The integrated sampling procedure was validated through consistent results from steady-state metabolite analysis of Escherichia coli cultivated in a chemostat at D = 0.1 h⁻¹.
Burkitt lymphoma (BL) is a rare and highly aggressive type of non-Hodgkin lymphoma. The mortality rate of BL patients is very high due to the rapid growth rate and frequent systemic spread of the disease. A better understanding of the pathogenesis, more sensitive diagnostic tools and effective treatment methods for BL are essential. Metabolomics, an important aspect of systems biology, allows the comprehensive analysis of global, dynamic and endogenous biological metabolites based on nuclear magnetic resonance (NMR) and mass spectrometry (MS). It has already been used to investigate the pathogenesis of disease and to discover new biomarkers for disease diagnosis and prognosis. In this study, we analyzed differences in serum metabolites between BL mice and normal mice by NMR-based metabolomics. We found that metabolites associated with energy metabolism, amino acid metabolism, fatty acid metabolism and choline phospholipid metabolism were altered in BL mice. The diagnostic potential of the metabolite differences was investigated in this study. Glutamate, glycerol and choline had a high diagnostic accuracy; in contrast, isoleucine, leucine, pyruvate, lysine, α-ketoglutarate, betaine, glycine, creatine, serine, lactate, tyrosine, phenylalanine, histidine and formate enabled the accurate differentiation of BL mice from normal mice. The discovery of abnormal metabolism and relevant differential metabolites may provide useful clues for developing novel, noninvasive approaches for the diagnosis and prognosis of BL based on these potential biomarkers.
Vossen, Piek; Ilievski, Filip; Postma, Marten; Roxane, Segers
In this paper, we present a new method to obtain large volumes of high-quality text corpora with event data for studying identity and reference relations. We report on the current methods to create event reference data by annotating texts and deriving the event data a posteriori. Our method starts
In this article, we will provide a description of metabolomics in comparison with other, better known “omics” disciplines such as genomics and proteomics. In addition, we will review the current rationale for the implementation of metabolomics in cardiology, its basic methodology and the available data from human studies in this discipline. The topics covered will delineate the importance of being able to use the metabolomic information to understand the mechanisms of diseases from the perspective of systems biology, and as a non-invasive approach to the diagnosis, grading and treatment of cardiovascular diseases.
Vertes, Akos [George Washington Univ., Washington, DC (United States)
Small molecules constitute a large part of the world around us, including fossil and some renewable energy sources. Solar energy harvested by plants and bacteria is converted into energy rich small molecules on a massive scale. Some of the worst contaminants of the environment and compounds of interest for national security also fall in the category of small molecules. The development of large scale metabolomic analysis methods lags behind the state of the art established for genomics and proteomics. This is commonly attributed to the diversity of molecular classes included in a metabolome. Unlike nucleic acids and proteins, metabolites do not have standard building blocks, and, as a result, their molecular properties exhibit a wide spectrum. This impedes the development of dedicated separation and spectroscopic methods. Mass spectrometry (MS) is a strong contender in the quest for a quantitative analytical tool with extensive metabolite coverage. Although various MS-based techniques are emerging for metabolomics, many of these approaches include extensive sample preparation that make large scale studies resource intensive and slow. New ionization methods are redefining the range of analytical problems that can be solved using MS. This project developed new approaches for the direct analysis of small molecules in unprocessed samples, as well as pushed the limits of ultratrace analysis in volume limited complex samples. The projects resulted in techniques that enabled metabolomics investigations with enhanced molecular coverage, as well as the study of cellular response to stimuli on a single cell level. Effectively individual cells became reaction vessels, where we followed the response of a complex biological system to external perturbation. We established two new analytical platforms for the direct study of metabolic changes in cells and tissues following external perturbation. For this purpose we developed a novel technique, laser ablation electrospray
Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko
This paper describes a learning assistant system that uses motion capture data and annotation to teach "Naginata-jutsu" (the skill of practicing with the Japanese halberd). Some video annotation tools exist, such as YouTube's. However, these video-based tools offer only a single angle of view. Our approach uses motion-captured data, which allows the performance to be viewed from any angle. A lecturer can write annotations related to parts of the body. We compared the effectiveness of the YouTube annotation tool and the proposed system. The experimental results showed that our system triggered more annotations than the YouTube annotation tool.
Stegmann, Mikkel Bille
Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
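The overlap detection and coverage calculation mentioned above can be illustrated with a short sketch. This is not the Bioconductor API (those packages are written for R); the function names `find_overlaps` and `coverage`, and the choice of closed, 1-based integer intervals, are assumptions made for illustration. The overlap search here is a simple sorted scan, not the more efficient algorithms the abstract refers to.

```python
from bisect import bisect_right
from typing import List, Tuple

# Assumption: a range is a closed, 1-based interval (start, end).
Range = Tuple[int, int]


def find_overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs whose intervals overlap."""
    # Sort subject indices by start so candidates can be cut off with bisect.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    starts = [subject[i][0] for i in order]
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        # Only subjects that start at or before the query end can overlap it.
        upper = bisect_right(starts, q_end)
        for k in range(upper):
            si = order[k]
            if subject[si][1] >= q_start:
                hits.append((qi, si))
    return hits


def coverage(ranges: List[Range], length: int) -> List[int]:
    """Per-position depth over positions 1..length (ranges must end <= length)."""
    delta = [0] * (length + 2)
    for start, end in ranges:
        delta[start] += 1      # depth rises where a range begins
        delta[end + 1] -= 1    # and falls just after it ends
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += delta[pos]
        depth.append(running)
    return depth


if __name__ == "__main__":
    reads = [(5, 10), (20, 25)]
    genes = [(1, 5), (8, 12), (11, 19), (25, 30)]
    print(find_overlaps(reads, genes))
    print(coverage([(1, 3), (2, 5)], 6))
```

The scan is quadratic in the worst case; it is meant only to show what "overlap" and "coverage" compute on annotated ranges, under the interval convention stated above.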
Lipopolysaccharides (LPSs), as MAMP molecules, trigger the activation of signal transduction pathways involved in defence. Currently, plant metabolomics is providing new dimensions for understanding the intracellular adaptive responses to external stimuli. The effect of LPS on the metabolomes of Arabidopsis thaliana cells and leaf tissue was investigated over a 24 h period. Cellular metabolites and those secreted into the medium were extracted with methanol, and liquid chromatography coupled to mass spectrometry was used for quantitative and qualitative analyses. Multivariate statistical data analyses were used to extract interpretable information from the generated multidimensional LC-MS data. The results show that LPS perception triggered differential changes in the metabolomes of cells and leaves, leading to variation in the biosynthesis of specialised secondary metabolites. Time-dependent changes in metabolite profiles were observed and biomarkers associated with the LPS-induced response were tentatively identified. These include the phytohormones salicylic acid and jasmonic acid, and also the associated methyl esters and sugar conjugates. The induced defensive state resulted in increases in indole and other glucosinolates, indole derivatives, camalexin as well as cinnamic acid derivatives and other phenylpropanoids. These annotated metabolites indicate dynamic reprogramming of metabolic pathways that are functionally related towards creating an enhanced defensive capacity. The results reveal new insights into the mode of action of LPS as an activator of plant innate immunity, broaden knowledge about the defence metabolite pathways involved in Arabidopsis responses to LPS, and identify specialised metabolites of functional importance that can be employed to enhance immunity against pathogen infection.
It is a challenging problem to efficiently interpret the large volumes of remotely sensed image data being collected in the current age of remote sensing “big data”. Although human visual interpretation can yield accurate annotation of remote sensing images, it demands considerable expert knowledge and is always time-consuming, which strongly hinders its efficiency. Alternatively, intelligent approaches (e.g., supervised classification and unsupervised clustering) can speed up the annotation process through the application of advanced image analysis and data mining technologies. However, high-quality expert-annotated samples are still a prerequisite for intelligent approaches to achieve accurate results. Thus, how to efficiently annotate remote sensing images with little expert knowledge is an important and unavoidable problem. To address this issue, this paper introduces a novel active clustering method for the annotation of high-resolution remote sensing images. More precisely, given a set of remote sensing images, we first build a graph based on these images and then gradually optimize the structure of the graph using a cut-collect process, which relies on a graph-based spectral clustering algorithm and pairwise constraints that are incrementally added via active learning. The pairwise constraints are simply similarity/dissimilarity relationships between the most uncertain pairwise nodes on the graph, which can be easily determined by non-expert human oracles. Furthermore, we also propose a strategy to adaptively update the number of classes in the clustering algorithm. In contrast with existing methods, our approach can achieve high accuracy in the task of remote sensing image annotation with relatively little expert knowledge, thereby greatly lightening the workload and reducing the requirements regarding expert knowledge. Experiments on several datasets of remote sensing images show that our algorithm achieves state
Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.; Porwollik, Steffen; Jones, Marcus B.; Yoon, Hyunjin; Payne, Samuel H.; Martin, Jessica L.; Burnet, Meagan C.; Monroe, Matthew E.; Venepally, Pratap; Smith, Richard D.; Peterson, Scott; Heffron, Fred; Mcclelland, Michael; Adkins, Joshua N.
Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example systems biology-oriented genome scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop proper functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and importantly detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provides protein-level experimental confirmation for 44% of predicted protein-coding genes, suggests revisions to 48 genes assigned incorrect translational start sites, and uncovers 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regards to the annotation of mature protein products.
Thomas, Funmilola Clara; Mudaliar, Manikhandan; Tassi, Riccardo; McNeilly, Tom N; Burchmore, Richard; Burgess, Karl; Herzyk, Pawel; Zadoks, Ruth N; Eckersall, P David
Intramammary infection leading to bovine mastitis is the leading disease problem affecting dairy cows and has marked effects on the milk produced by infected udder quarters. An experimental model of Streptococcus uberis mastitis has previously been investigated for clinical, immunological and pathophysiological alteration in milk, and has been the subject of peptidomic and quantitative proteomic investigation. The same sample set has now been investigated with a metabolomics approach using liquid chromatography and mass spectrometry. The analysis revealed over 3000 chromatographic peaks, of which 690 were putatively annotated with a metabolite. Hierarchical clustering analysis and principal component analysis demonstrated that metabolite changes due to S. uberis infection were maximal at 81 hours post challenge with metabolites in the milk from the resolution phase at 312 hours post challenge being closest to the pre-challenge samples. Metabolic pathway analysis revealed that the majority of the metabolites mapped to carbohydrate and nucleotide metabolism show a decreasing trend in concentration up to 81 hours post-challenge whereas an increasing trend was found in lipid metabolites and di-, tri- and tetra-peptides up to the same time point. The increase in these peptides coincides with an increase in larger peptides found in the previous peptidomic analysis and is likely to be due to protease degradation of milk proteins. Components of bile acid metabolism, linked to the FXR pathway regulating inflammation, were also increased. Metabolomic analysis of the response in milk during mastitis provides an essential component to the full understanding of the mammary gland's response to infection.
Chassagne, François; Haddad, Mohamed; Amiel, Aurélien; Phakeovilay, Chiobouaphong; Manithip, Chanthanom; Bourdy, Geneviève; Deharo, Eric; Marti, Guillaume
Liver cancer is a major health burden in Southeast Asia, and most patients turn towards the use of medicinal plants to alleviate their symptoms. The aim of this work was to apply, to Southeast Asian plants traditionally used to treat liver disorders, a successive ranking strategy based on a comprehensive review of the literature and metabolomic data in order to relate ethnopharmacological relevance to chemical entities of interest. We analyzed 45 publications resulting in a list of 378 plant species, and our point system based on the frequency of citation in the literature allowed the selection of the 10 top-ranked species for further collection and extraction. Extracts of these plants were tested for their in vitro anti-proliferative activities on HepG2 cells. Ethanolic extracts of Andrographis paniculata, Oroxylum indicum, Orthosiphon aristatus and Willughbeia edulis showed the highest anti-proliferative effects (IC50 = 195.9, 64.1, 71.3 and 66.7 μg/ml, respectively). A metabolomic ranking model was used to annotate compounds responsible for the anti-proliferative properties of A. paniculata (andrographolactone and dehydroandrographolide), O. indicum (baicalein, chrysin, oroxylin A and scutellarein), O. aristatus (5-desmethylsinensetin) and W. edulis (parabaroside C and procyanidin). Overall, our dereplicative approach combined with a bibliographic scoring system allowed us to rapidly decipher the molecular basis of traditionally used medicinal plants.
Chetwynd, Andrew J; Abdul-Sada, Alaa; Holt, Stephen G; Hill, Elizabeth M
Metabolomics analyses of urine have the potential to provide new information on the detection and progression of many disease processes. However, urine samples can vary significantly in total solute concentration and this presents a challenge to achieve high quality metabolomic datasets and the detection of biomarkers of disease or environmental exposures. This study investigated the efficacy of pre- and post-analysis normalisation methods to analyse metabolomic datasets obtained from neat and diluted urine samples from five individuals. Urine samples were extracted by solid phase extraction (SPE) prior to metabolomic analyses using a sensitive nanoflow/nanospray LC-MS technique and the data analysed by principal component analyses (PCA). Post-analysis normalisation of the datasets to either creatinine or osmolality concentration, or to mass spectrum total signal (MSTS), revealed that sample discrimination was driven by the dilution factor of urine rather than the individual providing the sample. Normalisation of urine samples to equal osmolality concentration prior to LC-MS analysis resulted in clustering of the PCA scores plot according to sample source and significant improvements in the number of peaks common to samples of all three dilutions from each individual. In addition, the ability to identify discriminating markers, using orthogonal partial least squared-discriminant analysis (OPLS-DA), was greatly improved when pre-analysis normalisation to osmolality was compared with post-analysis normalisation to osmolality and non-normalised datasets. Further improvements for peak area repeatability were observed in some samples when the pre-analysis normalisation to osmolality was combined with a post-analysis mass spectrum total useful signal (MSTUS) or MSTS normalisation. Future adoption of such normalisation methods may reduce the variability in metabolomics analyses due to differing urine concentrations and improve the discovery of discriminating metabolites
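The two kinds of normalisation compared above can be sketched as follows. This is a minimal illustration, not the study's procedure: the function names, the example peak areas, and the osmolality values are all hypothetical. Post-analysis normalisation to mass spectrum total signal (MSTS) is modelled as dividing each peak area by the sample's summed signal, and pre-analysis normalisation as computing the dilution needed to bring a sample to a common target osmolality before injection.

```python
from typing import List


def normalise_to_total_signal(peak_areas: List[float]) -> List[float]:
    """Post-analysis: express each peak as a fraction of the sample's total signal."""
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("total signal must be positive")
    return [area / total for area in peak_areas]


def dilution_factor(sample_osmolality: float, target_osmolality: float) -> float:
    """Pre-analysis: fold-dilution that brings a sample down to the target osmolality."""
    if target_osmolality <= 0 or sample_osmolality < target_osmolality:
        raise ValueError("target must be positive and not exceed the sample value")
    return sample_osmolality / target_osmolality


if __name__ == "__main__":
    neat = [2.0, 6.0, 2.0]      # hypothetical peak areas, neat urine
    diluted = [1.0, 3.0, 1.0]   # same sample, idealised two-fold dilution
    print(normalise_to_total_signal(neat))
    print(normalise_to_total_signal(diluted))
    print(dilution_factor(800.0, 200.0))  # hypothetical osmolality values
```

In this idealised example every peak scales perfectly with dilution, so total-signal normalisation makes the neat and diluted profiles identical. The abstract reports that real datasets did not behave this way after post-analysis normalisation alone, which is the motivation it gives for normalising osmolality before LC-MS analysis.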
Oceanic dissolved organic matter (DOM) is an assemblage of reduced carbon compounds, which results from biotic and abiotic processes. The biotic processes consist of either release or uptake of specific molecules by marine organisms. Heterotrophic bacteria have been mostly considered to influence the DOM composition by preferential uptake of certain compounds. However, they also secrete a variety of molecules depending on physiological state, environmental and growth conditions, but so far the full set of compounds secreted by these bacteria has never been investigated. In this study, we analyzed the exo-metabolome, metabolites secreted into the environment, of the heterotrophic marine bacterium Pseudovibrio sp. FO-BEG1 via ultra-high resolution mass spectrometry, comparing phosphate limited with phosphate surplus growth conditions. Bacteria belonging to the Pseudovibrio genus have been isolated worldwide, mainly from marine invertebrates and were described as metabolically versatile Alphaproteobacteria. We show that the exo-metabolome is unexpectedly large and diverse, consisting of hundreds of compounds that differ by their molecular formulae. It is characterized by a dynamic recycling of molecules, and it is drastically affected by the physiological state of the strain. Moreover, we show that phosphate limitation greatly influences both the amount and the composition of the secreted molecules. By assigning the detected masses to general chemical categories, we observed that under phosphate surplus conditions the secreted molecules were mainly peptides and highly unsaturated compounds. In contrast, under phosphate limitation the composition of the exo-metabolome changed during bacterial growth, showing an increase in highly unsaturated, phenolic, and polyphenolic compounds. Finally, we annotated the detected masses using multiple metabolite databases. These analyses suggested the presence of several masses analogous to masses of known bioactive
Castillo Luis F.
Gene annotation is a process that encompasses multiple approaches to the analysis of nucleic acid or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism's genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by the Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained from the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created, in which detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility of visualizing complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathway analysis.
Baldock, Richard A; Armit, Chris
"The Atlas of Mouse Development" by Kaufman is a classic paper atlas that is the de facto standard for the definition of mouse embryo anatomy in the context of standard histological images. We have re-digitised the original H&E stained tissue sections used for the book at high resolution and transferred the hand-drawn annotations to digital form. We have augmented the annotations with standard ontological assignments (EMAPA anatomy) and made the data freely available via an online viewer (eHistology) and from the University of Edinburgh DataShare archive. The dataset captures and preserves the definitive anatomical knowledge of the original atlas, provides a core image set for deeper community annotation and teaching, and delivers a unique high-quality set of high-resolution histological images through mammalian development for manual and automated analysis.
Martens, J.; Berden, G.; van Outersterp, R.E.; Kluijtmans, L.A.J.; Engelke, U.F.; van Karnebeek, C.D.M.; Wevers, R.A.; Oomens, J.
Small molecule identification is a continually expanding field of research and represents the core challenge in various areas of (bio) analytical science, including metabolomics. Here, we unequivocally differentiate enantiomeric N-acetylhexosamines in body fluids using infrared ion spectroscopy,
Misra, Biswapriya B; Assmann, Sarah M; Chen, Sixue
In conjunction with genomics, transcriptomics, and proteomics, plant metabolomics is providing large data sets that are paving the way towards a comprehensive and holistic understanding of plant growth, development, defense, and productivity. However, dilution effects from organ- and tissue-based sampling of metabolomes have limited our understanding of the intricate regulation of metabolic pathways and networks at the cellular level. Recent advances in metabolomics methodologies, along with the post-genomic expansion of bioinformatics knowledge and functional genomics tools, have allowed the gathering of enriched information on individual cells and single cell types. Here we review progress, current status, opportunities, and challenges presented by single cell-based metabolomics research in plants.
Metabolomics datasets, by definition, comprise measurements of large numbers of metabolites. Both technical (analytical) and biological factors will induce variation within these measurements that is not consistent across all metabolites. Consequently, criteria are required to...
Koek, M.M.; Muilwijk, B.; Werf, M.J. van der; Hankemeier, T.
An analytical method was set up suitable for the analysis of microbial metabolomes, consisting of an oximation and silylation derivatization reaction and subsequent analysis by gas chromatography coupled to mass spectrometry. Microbial matrixes contain many compounds that potentially interfere with
Pierleoni, Paola; Maurizi, Lorenzo; Palma, Lorenzo; Belli, Alberto; Valenti, Simone; Marroni, Alessandro
The use of precordial Doppler monitoring to prevent decompression sickness (DS) is well known by the scientific community as an important instrument for early diagnosis of DS. However, the timely and correct diagnosis of DS without assistance from diving medical specialists is unreliable. Thus, a common protocol for the manual annotation of echo Doppler signals and a tool for their automated recording and annotation are necessary. We have implemented original software for efficient bubble appearance annotation and proposed a unified annotation protocol. The tool auto-sets the response time of human "bubble examiners," performs playback of the Doppler file by rendering it independent of the specific audio player, and enables the annotation of individual bubbles or multiple bubbles known as "showers." The tool provides a report with an optimized data structure and estimates the embolic risk level according to the Extended Spencer Scale. The tool is built in accordance with ISO/IEC 9126 on software quality and has been projected and tested with assistance from the Divers Alert Network (DAN) Europe Foundation, which employs this tool for its diving data acquisition campaigns.
Spicer, Rachel A; Steinbeck, Christoph
Data sharing is being increasingly required by journals and has been heralded as a solution to the 'replication crisis'. (i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals' policies to those that publish the most metabolomics papers. A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications. Journals that support data sharing are not necessarily those with the most papers associated to open metabolomics data. Further efforts are required to improve data sharing in metabolomics.
Schwartz, David Charles; Severin, Jessica
There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.
Bakke, Peter; Carney, Nick; DeLoache, Will
Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited...... in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology...... and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus...
Barrios, Ernie, Ed.
More than 300 books and articles published from 1920 to 1971 are reviewed in this annotated bibliography of literature on the Chicano. The citations and reviews are categorized by subject area and deal with contemporary Chicano history, education, health, history of Mexico, literature, native Americans, philosophy, political science, pre-Columbian…
Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis
This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups...
L. Rutledge (Lloyd); J.R. van Ossenbruggen (Jacco); L. Hardman (Lynda)
The Semantic Web envisions a Web that is both human readable and machine processible. In practice, however, there is still a large conceptual gap between annotated content repositories on the one hand, and coherent, human readable Web pages on the other. To bridge this conceptual gap,
Hardman, L.; Obrenović, Ž.; Nack, F.; Troncy, R.; Huet, B.; Schenk, S.
While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed way of sharing these among systems. This chapter identifies a small number of fundamental processes of media production, which the author terms canonical processes, which can be
L. Hardman (Lynda); Z. Obrenovic; F.-M. Nack (Frank); B. Kerhervé; K. Piersol
While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed-upon way of sharing these among systems. As an initial step within the multimedia community, we identify a small number of fundamental processes of media production, which we
Hardman, L.; Obrenović, Ž.; Nack, F.; Kerhervé, B.; Piersol, K.
While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed-upon way of sharing these among systems. As an initial step within the multimedia community, we identify a small number of fundamental processes of media production, which we term
NHSA Dialog, 2008
This article provides an annotated bibliography of various children's books. It includes listings of books that illustrate the dynamic relationships within the natural environment, economic context, racial and cultural identities, cross-group similarities and differences, gender, different abilities and stories of injustice and resistance.
Bishop, Wendy; And Others
Focusing on pedagogical issues in creative writing, this annotated bibliography reviews 149 books, articles, and dissertations in the fields of creative writing and composition, and, selectively, feminist and literary theory. Anthologies of original writing and reference books are not included. (MM)
Dorothy B. Durband
The following annotated bibliography contains a summary of articles and websites, as well as a list of books related to financial therapy. The resources were compiled through e-mail solicitation from members of the Financial Therapy Forum in November 2008. Members of the forum are marked with an asterisk.
J.C. van de Pol (Jaco)
A simple kind of strategy annotations is investigated, giving rise to a class of strategies, including leftmost-innermost. It is shown that under certain restrictions, an interpreter can be written which computes the normal form of a term in a bottom-up traversal. The main contribution
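As a rough illustration of the leftmost-innermost strategy named above, the sketch below normalises terms bottom-up for a tiny rewrite system. It does not model strategy annotations themselves, and it is not the interpreter from the paper; the term encoding (nested tuples), the function name `normalise`, and the two rules for Peano addition are assumptions chosen only to make the traversal order visible.

```python
from typing import Tuple, Union

# Assumption: a term is the constant "0" or a tuple (symbol, arg, ...).
Term = Union[str, Tuple]


def normalise(term: Term) -> Term:
    """Compute a normal form by rewriting arguments first, left to right."""
    if isinstance(term, str):
        return term  # constants are already in normal form
    head, *args = term
    # Innermost: every argument is normalised before the root is inspected.
    args = [normalise(arg) for arg in args]
    if head == "add":
        left, right = args
        if left == "0":
            # Rule: add(0, y) -> y
            return right
        if isinstance(left, tuple) and left[0] == "s":
            # Rule: add(s(x), y) -> s(add(x, y))
            return ("s", normalise(("add", left[1], right)))
    return (head, *args)


if __name__ == "__main__":
    one = ("s", "0")
    two = ("s", one)
    # add(2, add(1, 0)) reduces to s(s(s(0)))
    print(normalise(("add", two, ("add", one, "0"))))
```

Because the arguments are already in normal form when a rule fires at the root, only the newly built subterm needs further rewriting, which is what makes a single bottom-up pass sufficient for this small example.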
Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.
Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.
Stamou, G.; Ossenbruggen, J.R.; Pan, J.; Schreiber, A.T.
Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich descriptions of
G. Stamou; J.R. van Ossenbruggen (Jacco); J.Z. Pan (Jeff); G. Schreiber (Guus)
Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich
Chapa, Evey, Ed.; And Others
Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and…
Strachan, J.D.; Corrigan, G.
This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables
This annotated bibliography contains over 500 sources on the historical and contemporary development and expression of male and female sexuality. There are 68 topic headings which provide easy access for subject areas. A major portion of the bibliography is devoted to contemporary male-female sexuality. These materials consist of research findings…
Tutunjian, Beth Ann
This annotated publications list on homelessness contains citations for 19 publications, most of which deal with problems of alcohol or drug abuse among homeless persons. Citations are listed alphabetically by author and cover the topics of homelessness and alcoholism, drug abuse, public policy, research methodologies, mental illness, alcohol- and…
From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…
Lamy, Philippe; Andersen, Claus Lindbjerg; Wikman, Friedrik
allows us to annotate SNPs that have poor performance, either because of poor experimental conditions or because for one of the alleles the probes do not behave in a dose-response manner. Generally, our method agrees well with a method developed by Affymetrix. When both methods make a call they agree...
Li, Shengting; Ma, Lijia; Li, Heng
Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes based on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...
Heylen, Dirk K.J.; Reidsma, Dennis; Ordelman, Roeland J.F.; Devillers, L.; Martin, J-C.; Cowie, R.; Batliner, A.
We discuss the annotation procedure for mental state and emotion that is under development for the AMI (Augmented Multiparty Interaction) corpus. The categories that were found to be most appropriate relate not only to emotions but also to (meta-)cognitive states and interpersonal variables. The
new applications for the ePNK and, in particular, visualizing the result of an application in the graphical editor of the ePNK by using annotations, and interacting with the end user using these annotations. In this paper, we give an overview of the concepts of ePNK applications by discussing the implementation...
Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.
This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group
Popovich, Mark, Comp.; And Others
The purposes of this bibliography are to bring together materials that relate to the history of newspapers in Indiana and to assess, in a general way, the value of the material. The bibliography contains 415 entries, with descriptive annotations, arranged in seven sections: books; special materials; general newspaper histories and lists of…
Liu, Weifeng; Tao, Dacheng
The rapid development of computer hardware and Internet technology makes large scale data dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) has therefore received intensive attention in recent years and has been successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smooths the conditional distribution for classification along the manifold encoded in the graph Laplacian. However, it is observed that LR biases the classification function toward a constant function, which possibly results in poor generalization. In addition, LR is developed to handle uniformly distributed data (or single-view data), although instances or objects, such as images and videos, are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address the above two problems in LR-based image annotation. In particular, mHR optimally combines multiple HR, each of which is obtained from a particular view of instances, and steers the classification function that varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.
Allix, Beverley, Comp.
This annotated bibliography covers the following types of materials of use to teachers of English for Special Purposes: (1) books, monographs, reports, and conference papers; (2) periodical articles and essays in collections; (3) theses and dissertations; (4) bibliographies; (5) dictionaries; and (6) textbooks in series by publisher. Section (1)…
E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen
This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...
Tykodi, R. J.
Urges chemistry teachers to have students annotate the chemical reactions in aqueous-solutions that they see in their textbooks and witness in the laboratory. Suggests this will help students recognize the reaction type more readily. Examples are given for gas formation, precipitate formation, redox interaction, acid-base interaction, and…
Fan, Teresa W-M.; Lorkiewicz, Pawel; Sellers, Katherine; Moseley, Hunter N.B.; Higashi, Richard M.; Lane, Andrew N.
Advances in analytical methodologies, principally nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS), during the last decade have made large-scale analysis of the human metabolome a reality. This is leading to the reawakening of the importance of metabolism in human diseases, particularly cancer. The metabolome is the functional readout of the genome, functional genome, and proteome; it is also an integral partner in molecular regulations for homeostasis. The interrogation of the metabolome, or metabolomics, is now being applied to numerous diseases, largely by metabolite profiling for biomarker discovery, but also in pharmacology and therapeutics. Recent advances in stable isotope tracer-based metabolomic approaches enable unambiguous tracking of individual atoms through compartmentalized metabolic networks directly in human subjects, which promises to decipher the complexity of the human metabolome at an unprecedented pace. This knowledge will revolutionize our understanding of complex human diseases, clinical diagnostics, as well as individualized therapeutics and drug response. In this review, we focus on the use of stable isotope tracers with metabolomics technologies for understanding metabolic network dynamics in both model systems and in clinical applications. Atom-resolved isotope tracing via the two major analytical platforms, NMR and MS, has the power to determine novel metabolic reprogramming in diseases, discover new drug targets, and facilitates ADME studies. We also illustrate new metabolic tracer-based imaging technologies, which enable direct visualization of metabolic processes in vivo. We further outline current practices and future requirements for biochemoinformatics development, which is an integral part of translating stable isotope-resolved metabolomics into clinical reality. PMID:22212615
Metabolomics is the comprehensive assessment of low molecular weight organic metabolites within a biological system. The identification and characterization of several chemical species, or metabolic fingerprinting, is an emergent approach in the metabolomics field that provides a valuable “snapshot” of metabolic profiles. This approach is finding an increasing number of applications in many areas including cancer research, drug discovery and food science. The combined use of NMR spectroscopy, data ...
De Livera, Alysha M.; Sysi-Aho, Marko; Jacob, Laurent; Gagnon-Bartsch, Johann A.; Castillo, Sandra; Simpson, Julie A; Speed, Terence P.
Metabolomics experiments are inevitably subject to a component of unwanted variation, due to factors such as batch effects, long runs of samples, and confounding biological variation. Although the removal of this unwanted variation is a vital step in the analysis of metabolomics data, it is considered a gray area in which there is a recognised need to develop a better understanding of the procedures and statistical methods required to achieve statistically relevant optimal biological outcomes...
This paper presents an annotation model for harmonic structure of a piece of music, and a rule system that supports the automatic generation of harmonic annotations. Musical structure has so far received relatively little attention in the context of musical metadata and annotation, although it is highly relevant for musicians, musicologists and indirectly for music listeners. Activities in semantic annotation of music have so far mostly concentrated on features derived from audio data and fil...
Kalkatawi, Manal M.
Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B
Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
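As a rough illustration of what "generating extended annotations through combination of individual ones" can mean, the sketch below merges per-gene function calls from several annotation methods and keeps the first informative call for each gene. This is not BEACON's algorithm, which the abstract does not describe; the merge rule, the `UNINFORMATIVE` label set, and the gene and method names are all assumptions made for the example.

```python
# Hypothetical set of labels treated as "no function assigned".
UNINFORMATIVE = {"", "hypothetical protein", "unknown function"}


def is_informative(function):
    """True if the function string says something beyond 'unknown'."""
    return function.strip().lower() not in UNINFORMATIVE


def extend_annotation(annotations):
    """Combine annotations from several methods.

    annotations: dict mapping method name -> {gene_id: function string}.
    Returns {gene_id: (function, method)} using the first informative call
    in method order, or None when no method assigns a function.
    """
    genes = set()
    for calls in annotations.values():
        genes.update(calls)
    extended = {}
    for gene in sorted(genes):
        extended[gene] = None
        for method, calls in annotations.items():
            function = calls.get(gene, "")
            if is_informative(function):
                extended[gene] = (function, method)
                break
    return extended


def count_annotated(calls):
    """Number of genes with an informative function in one annotation."""
    return sum(1 for f in calls.values() if is_informative(f))


# Invented example data: two methods, three genes.
ams = {
    "AM1": {"g1": "DNA gyrase subunit A", "g2": "hypothetical protein"},
    "AM2": {"g2": "ABC transporter permease", "g3": "hypothetical protein"},
}
extended = extend_annotation(ams)
```

In this toy input, `g2` has no function under `AM1` alone but gains one in the combined annotation, which is the kind of gain the abstract reports. A real comparison would also have to reconcile gene coordinates and conflicting calls between methods.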
Recent technological developments in metabolomics research have enabled in-depth characterization of complex metabolite mixtures in a wide range of biological, biomedical, environmental, agricultural, and nutritional research fields. Nuclear magnetic resonance spectroscopy and mass spectrometry are the two main platforms for performing metabolomics studies. Given their broad applicability and the systemic insight into metabolism that can be obtained, it is not surprising that metabolomics is becoming increasingly popular in basic biological research. In this review, we provide an overview on key metabolites, recent studies, and future opportunities for metabolomics in studying autophagy regulation. Metabolites play a pivotal role in autophagy regulation and are therefore key targets for autophagy research. Given the recent success of metabolomics, it can be expected that metabolomics approaches will contribute significantly to deciphering the complex regulatory mechanisms involved in autophagy in the near future and promote understanding of autophagy and autophagy-related diseases in living cells and organisms.
Lamichhane, Santosh; Yde, Christian C; Forssten, Sofia; Ouwehand, Arthur C; Saarinen, Markku; Jensen, Henrik Max; Gibson, Glenn R; Rastall, Robert; Fava, Francesca; Bertram, Hanne Christine
The aim of the present study was to elucidate the impact of polydextrose (PDX), a soluble fiber, on the human fecal metabolome by high-resolution nuclear magnetic resonance (NMR) spectroscopy-based metabolomics in a dietary intervention study (n = 12). Principal component analysis (PCA) revealed a strong effect of PDX consumption on the fecal metabolome, which could be mainly ascribed to the presence of undigested fiber and oligosaccharides formed from partial degradation of PDX. Our results demonstrate that NMR-based metabolomics is a useful technique for metabolite profiling of feces and for testing compliance to dietary fiber intake in such trials. In addition, novel associations between PDX and the levels of the fecal metabolites acetate and propionate could be identified. The establishment of a correlation between the fecal metabolome and levels of Bifidobacterium (R² = 0.66) and Bacteroides (R² = 0.46) demonstrates the potential of NMR-based metabolomics to elucidate metabolic activity of bacteria in the gut.
Zhang, Peixu; Zhang, Weiguanliu; Lang, Yue; Qu, Yan; Chu, Fengna; Chen, Jiafeng; Cui, Li
Tuberculosis meningitis (TBM) is a prevalent form of extra-pulmonary tuberculosis that causes substantial morbidity and mortality. Diagnosis of TBM is difficult because of the limited sensitivity of existing laboratory techniques. A metabolomics approach can be used to investigate the sets of metabolites of both bacteria and host, and has been used to clarify the mechanisms underlying disease development and to identify metabolic changes, leading to improved methods for diagnosis, treatment, and prognostication. Mass spectrometry (MS) is a major analysis platform used in metabolomics, and MS-based metabolomics provides wide metabolite coverage, because of its high sensitivity, and is useful for the investigation of Mycobacterium tuberculosis (Mtb) and related diseases. It has been used to investigate TBM diagnosis; however, the processes involved in the MS-based metabolomics approach are complex and flexible, and often consist of several steps, and small changes in the methods used can have a huge impact on the final results. Here, the process of MS-based metabolomics is summarized and its applications in Mtb and Mtb-related diseases discussed. Moreover, the current status of TBM metabolomics is described. Copyright © 2018. Published by Elsevier B.V.
Xiao, Chaoni; Wu, Man; Chen, Yongyong; Zhang, Yajun; Zhao, Xinfeng; Zheng, Xiaohui
The distribution of metabolites in the different root parts of Cortex Moutan (the root bark of Paeonia suffruticosa Andrews) is not well understood; therefore, scientific evidence is not available for quality assessment of Cortex Moutan. The objective was to reveal metabolomic variations in Cortex Moutan in order to gain deeper insights to enable quality control. Metabolomic variations in the different root parts of Cortex Moutan were characterised using high-performance liquid chromatography combined with mass spectrometry (HPLC-MS) and multivariate data analysis. The discriminating metabolites in different root parts were evaluated by one-way analysis of variance and a fold change parameter. The metabolite profiles of Cortex Moutan were largely dominated by five primary and 41 secondary metabolites. Higher levels of malic acid, gallic acid and mudanoside-B were mainly observed in the second lateral roots, whereas dihydroxyacetophenone, benzoyloxypaeoniflorin, suffruticoside-A, kaempferol dihexoside, mudanpioside E and mudanpioside J accumulated in the first lateral and axial roots. The highest contents of paeonol, galloyloxypaeoniflorin and procyanidin B were detected in the axial roots. Accordingly, metabolite compositions of Cortex Moutan were found to vary among different root parts. The axial roots have higher quality than the lateral roots in Cortex Moutan due to the accumulation of bioactive secondary metabolites associated with plant physiology. These findings provided important scientific evidence for grading Cortex Moutan on the general market. Copyright © 2014 John Wiley & Sons, Ltd.
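The two screening statistics named in this abstract, one-way analysis of variance and fold change, can be sketched in a few lines. The code below is a minimal illustration using the textbook definitions, not the authors' pipeline. The intensity values are invented, and the choice to report fold change as a plain ratio of group means is an assumption.

```python
from statistics import mean


def fold_change(group_a, group_b):
    """Ratio of mean intensity in group_a to mean intensity in group_b."""
    return mean(group_a) / mean(group_b)


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements.

    F = (between-group mean square) / (within-group mean square),
    with k - 1 and n - k degrees of freedom.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented peak intensities of one metabolite in three root parts.
second_lateral = [1.0, 2.0, 3.0]
first_lateral = [2.0, 3.0, 4.0]
axial = [5.0, 6.0, 7.0]

f_stat = anova_f([second_lateral, first_lateral, axial])
fc_axial_vs_second = fold_change(axial, second_lateral)
```

A metabolite would typically be flagged as discriminating when the F statistic is significant and the fold change exceeds a chosen cutoff. The abstract does not state the thresholds used.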
Anđelković, Boban; Vujisić, Ljubodrag; Vučković, Ivan; Tešević, Vele; Vajs, Vlatka; Gođevac, Dejan
Herein, we propose rapid and simple spectroscopic methods to determine the chemical composition of propolis derived from various Populus species using a metabolomics approach. In order to correlate variability in Populus type propolis composition with the altitude of its collection, NMR, IR, and UV spectroscopy followed by OPLS was conducted. The botanical origin of propolis was established by comparing propolis spectral data to those of buds of various Populus species. An O2PLS method was utilized to integrate two blocks of data. According to OPLS and O2PLS, the major compounds in propolis samples, collected from temperate continental climate above 500m, were phenolic glycerides originating from P. tremula buds. Flavonoids were predominant in propolis samples collected below 400m, originating from P. nigra and P. x euramericana buds. Samples collected at 400-500m were of mixed origin, with variable amounts of all detected metabolites. Copyright © 2016 Elsevier B.V. All rights reserved.
Boot, P.; Braungart, Georg; Jannidis, Fotis; Gendolla, Peter
Robinson and others have recently called for dynamic and collaborative digital scholarly editions. Annotation is a key component for editions that are not merely passive, read-only repositories of knowledge. Annotation facilities (both annotation creation and display), however, require complex
van Son, C.M.; Caselli, T.; Fokkens, A.S.; Maks, E.; Morante Vallejo, R.; Aroyo, L.M.; Vossen, P.T.J.M.
This paper presents a framework and methodology for the annotation of perspectives in text. In the last decade, different aspects of linguistic encoding of perspectives have been targeted as separated phenomena through different annotation initiatives. We propose an annotation scheme that integrates
SE22_AM1 Annotation based on a grading system. Collected mass spectral features, together with predicted molecular formulae and putative structures, were provided as metabolite annotations. Comparison with public databases was performed. A grading system was introduced to describe the evidence supporting the annotations. ...
Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.
Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...
Millar, Katherine D L; Kiss, John Z
Characterization of phototropism and gravitropism has been through gene expression studies, assessment of curvature response, and protein expression experiments. To our knowledge, the current study is the first to determine how the metabolome, the complete set of small-molecule metabolites within a plant, is impacted during these tropisms. We have determined the metabolic profile of plants during gravitropism and phototropism. Seedlings of Arabidopsis thaliana wild type (WT) and phyB mutant were exposed to unidirectional light (red or blue) or reoriented to induce a tropistic response, and small-molecule metabolites were assayed and quantified. A subset of the WT was analyzed using microarray experiments to obtain gene profiling data. Analyses of the metabolomic data using principal component analysis showed a common profile in the WT during the different tropistic curvatures, but phyB mutants produced a distinctive profile for each tropism. Interestingly, the gravity treatment elicited the greatest changes in gene expression of the WT, followed by blue light, then by red light treatments. For all tropisms, we identified genes that were downregulated by a large magnitude in carbohydrate metabolism and secondary metabolism. These included ATCSLA15, CELLULOSE SYNTHASE-LIKE, and ATCHS/SHS/TT4, CHALCONE SYNTHASE. In addition, genes involved in amino acid biosynthesis were strongly upregulated, and these included THA1 (THREONINE ALDOLASE 1) and ASN1 (DARK INDUCIBLE asparagine synthase). We have established the first metabolic profile of tropisms in conjunction with transcriptomic analyses. This approach has been useful in characterizing the similarities and differences in the molecular mechanisms involved with phototropism and gravitropism.
Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang
The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.
Koch, Lisa M; Rajchl, Martin; Bai, Wenjia; Baumgartner, Christian F; Tong, Tong; Passerat-Palmbach, Jonathan; Aljabar, Paul; Rueckert, Daniel
Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images and investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed aimed at (1) recreating existing segmentation techniques with the proposed framework and (2) demonstrating the potential of employing sparsely annotated atlas data for multi-atlas segmentation.
Banfield Jillian F
Background Mass spectrometry-based metabolomics analyses have the potential to complement sequence-based methods of genome annotation, but only if raw mass spectral data can be linked to specific metabolic pathways. In untargeted metabolomics, the measured mass of a detected compound is used to define the location of the compound in chemical space, but uncertainties in mass measurements lead to "degeneracies" in chemical space since multiple chemical formulae correspond to the same measured mass. We compare two methods to eliminate these degeneracies. One method relies on natural isotopic abundances, and the other relies on the use of stable-isotope labeling (SIL) to directly determine C and N atom counts. Both depend on combinatorial explorations of the "chemical space" comprised of all possible chemical formulae comprised of biologically relevant chemical elements. Results Of 1532 metabolic pathways curated in the MetaCyc database, 412 contain a metabolite having a chemical formula unique to that metabolic pathway. Thus, chemical formulae alone can suffice to infer the presence of some metabolic pathways. Of 248,928 unique chemical formulae selected from the PubChem database, more than 95% had at least one degeneracy on the basis of accurate mass information alone. Consideration of natural isotopic abundance reduced degeneracy to 64%, but mainly for formulae less than 500 Da in molecular weight, and only if the error in the relative isotopic peak intensity was less than 10%. Knowledge of exact C and N atom counts as determined by SIL enabled reduced degeneracy, allowing for determination of unique chemical formula for 55% of the PubChem formulae. Conclusions To facilitate the assignment of chemical formulae to unknown mass-spectral features, profiling can be performed on cultures uniformly labeled with stable isotopes of nitrogen (15N) or carbon (13C). This makes it possible to accurately count the number of carbon and nitrogen atoms in
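The "degeneracy" described in this abstract can be made concrete with a small enumeration. The sketch below lists every C/H/N/O composition whose monoisotopic mass lies within a ppm window of a measured neutral mass, then filters by known C and N counts as stable-isotope labeling would allow. It is an illustration under stated assumptions, not the authors' method: only four elements are searched, the count limits are arbitrary, and no chemical plausibility rules are applied.

```python
# Monoisotopic masses (Da) of the lightest stable isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def candidate_formulae(mass, ppm=5.0, max_c=20, max_n=6, max_o=10, max_h=40):
    """Return (C, H, N, O) count tuples within +/- ppm of a neutral mass.

    For each C/N/O combination the only H count that can match is the
    nearest integer, so hydrogen is solved for rather than looped over.
    """
    tol = mass * ppm * 1e-6
    hits = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                h = round((mass - heavy) / MONO["H"])
                if 0 <= h <= max_h:
                    if abs(heavy + h * MONO["H"] - mass) <= tol:
                        hits.append((c, h, n, o))
    return hits


def with_known_counts(hits, c=None, n=None):
    """Keep only candidates matching C and/or N counts known from labeling."""
    return [
        f for f in hits
        if (c is None or f[0] == c) and (n is None or f[2] == n)
    ]


# Example query: the monoisotopic mass of C6H12O6 (about 180.06339 Da).
hits = candidate_formulae(180.06339, ppm=5.0)
labelled = with_known_counts(hits, c=6, n=0)
```

Widening the `ppm` window can only add candidates, which is why lower mass accuracy increases degeneracy, and fixing the C and N counts can only remove them.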
Pace, Roberto; Martinelli, Ernesto Marco; Sardone, Nicola; D E Combarieu, Eric
Ginseng is any one of the eleven species belonging to the genus Panax of the family Araliaceae and is found in North America and in eastern Asia. Ginseng is characterized by the presence of ginsenosides. Panax ginseng and Panax quinquefolius are the principal adaptogenic herbs and are commonly distributed in health food markets. In the present study, high performance liquid chromatography has been used to identify and quantify ginsenosides in the two subject species and the different parts of the plant (roots, neck, leaves, flowers, fruits). The power of this chromatographic technique to evaluate the identity of botanical material and to distinguish different parts of the plant has been investigated with metabolomic techniques such as principal component analysis. Metabolomics provides a good opportunity for mining useful chemical information from the chromatographic data set, making it an important tool for evaluating the authenticity, consistency and efficacy of medicinal plants. Copyright © 2015 Elsevier B.V. All rights reserved.
Analysis of natural product pattern (metabolites; metabolomics) and its formation (pathway; biosynthesis) in plants, especially in non-model or crop plants such as medicinal and aromatic plants (MAPs), is a research field with significant potential for breeders, growers and consumers. Constant and sustainable quality of MAP final products is increasingly important. Polyphenols are among the most important compounds for the antioxidant properties of MAPs and are often, if not identified as the active principle, used as lead compounds in quality assessment of herbal drugs and related preparations (herbal tea, alcoholic extracts, etc.). Therefore, offering an efficient, robust, reliable and fast tool to determine these quality features of MAPs will protect growers, industrial users and consumers from possible fraud.
Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori
Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
Kveladze, Irma; Kraak, Menno-Jan
Movement data is collected by nearly everyone at any time. This data is not limited to the trajectories of people; today’s technology also allows the simultaneous collection of trip-related annotations like photos, videos, voice, and texts. The combination of trajectories and annotations is a rich...... source to monitor movement in a context and discover known and unknown patterns. Often the annotations are implicitly geotagged by the GPS-enabled devices, like phones and cameras, which are used to collect the annotations. This allows a match between the track and annotation based on coordinates....... Otherwise the trajectories and annotations can be matched based on their respective time stamps. The geotagged material is often used on social media sites to exchange the whereabouts of people. The annotations are placed on dedicated sites such as Flickr and Panoramio. Via mash-ups it is also possible...
and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...
japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... biology and genetics studies. We present an improved Lotus genome assembly and annotation, a catalog of natural variation based on re-sequencing of 29 accessions, and describe the involvement of small RNAs in the plant-bacteria symbiosis. Blueberries contain anthocyanins, other pigments and various...... polyphenolic compounds, which have been linked to protection against diabetes, cardiovascular disease and age-related cognitive decline. We present the first genome-guided approach in blueberry to identify genes involved in the synthesis of health-protective compounds. Using RNA-Seq data from five stages...
Nawrocki, Eric P
Many different types of functional non-coding RNAs participate in a wide range of important cellular functions but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence, and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.
Nicole L Washington
Scientists and clinicians who study genetic alterations and disease have traditionally described phenotypes in natural language. The considerable variation in these free-text descriptions has posed a hindrance to the important task of identifying candidate genes and models for human diseases and indicates the need for a computationally tractable method to mine data resources for mutant phenotypes. In this study, we tested the hypothesis that ontological annotation of disease phenotypes will facilitate the discovery of new genotype-phenotype relationships within and across species. To describe phenotypes using ontologies, we used an Entity-Quality (EQ) methodology, wherein the affected entity (E) and how it is affected (Q) are recorded using terms from a variety of ontologies. Using this EQ method, we annotated the phenotypes of 11 gene-linked human diseases described in Online Mendelian Inheritance in Man (OMIM). These human annotations were loaded into our Ontology-Based Database (OBD) along with other ontology-based phenotype descriptions of mutants from various model organism databases. Phenotypes recorded with this EQ method can be computationally compared based on the hierarchy of terms in the ontologies and the frequency of annotation. We utilized four similarity metrics to compare phenotypes and developed an ontology of homologous and analogous anatomical structures to compare phenotypes between species. Using these tools, we demonstrate that we can identify, through the similarity of the recorded phenotypes, other alleles of the same gene, other members of a signaling pathway, and orthologous genes and pathway members across species. We conclude that EQ-based annotation of phenotypes, in conjunction with a cross-species ontology, and a variety of similarity metrics can identify biologically meaningful similarities between genes by comparing phenotypes alone. This annotation and search method provides a novel and efficient means to identify
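To make the idea of comparing EQ annotations through the term hierarchy concrete, the sketch below scores two (entity, quality) pairs by the Jaccard overlap of their ancestor sets. It is an illustration under stated assumptions only: the toy ontology, the term names and the choice of Jaccard similarity are all hypothetical, and the abstract does not say which four metrics the authors used.

```python
# Toy ontology (hypothetical terms): each term maps to its direct parents.
PARENTS = {
    "retina": ["eye"],
    "lens": ["eye"],
    "eye": ["sense organ"],
    "sense organ": [],
    "small": ["size"],
    "size": [],
}


def with_ancestors(term):
    """Return the term together with all of its ancestors in the toy ontology."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def eq_similarity(eq_a, eq_b):
    """Jaccard similarity of two (entity, quality) annotations over their ancestor closures."""
    closure_a = with_ancestors(eq_a[0]) | with_ancestors(eq_a[1])
    closure_b = with_ancestors(eq_b[0]) | with_ancestors(eq_b[1])
    return len(closure_a & closure_b) / len(closure_a | closure_b)
```

With this toy hierarchy, `("retina", "small")` and `("lens", "small")` share the ancestors "eye", "sense organ", "small" and "size", so they score well above zero even though the annotated entities differ. A frequency-aware metric would additionally weight rare terms more heavily than common ones.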
Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P
Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.
Background: Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes, which consists of detecting the position and structure of genes and inferring their function (as well as that of other features of genomes). Structural and functional annotation both require the complex chaining of numerous different software tools, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage the huge amounts of data released by sequencing projects. Several pipelines already automate some of this complex chaining but still require a substantial contribution from biologists to supervise and control the results at various steps. Results: Here we propose an innovative automated platform, FIGENIX, which includes an expert system capable of substituting for human expertise at several key steps. FIGENIX currently automates complex pipelines of structural and functional annotation under the supervision of the expert system (which makes it possible, for example, to make key decisions, check intermediate results or refine the dataset). The quality of the results produced by FIGENIX is comparable to that obtained by expert biologists, with a drastic gain in terms of time costs and avoidance of errors due to the human manipulation of data. Conclusion: The core engine and expert system of the FIGENIX platform currently handle complex annotation processes of broad interest for the genomic community. They could be easily adapted to new or more specialized pipelines, such as the annotation of miRNAs, the classification of complex multigenic families, and the annotation of regulatory elements and other genomic features of interest.
The recent thriving development of biobanks and associated high-throughput phenotyping studies requires the elaboration of large-scale approaches for monitoring biological sample quality and compliance with standard protocols. We present a metabolomic investigation of human blood samples that delineates pitfalls and guidelines for the collection, storage and handling procedures for serum and plasma. A series of eight pre-processing technical parameters is systematically investigated along variable ranges commonly encountered across clinical studies. While metabolic fingerprints, as assessed by nuclear magnetic resonance, are not significantly affected by altered centrifugation parameters or delays between sample pre-processing (blood centrifugation) and storage, our metabolomic investigation highlights that both the delay and storage temperature between blood draw and centrifugation are the primary parameters impacting serum and plasma metabolic profiles. Storing the blood drawn at 4 °C is shown to be a reliable routine to confine variability associated with idle time prior to sample pre-processing. Based on their fine sensitivity to pre-analytical parameters and protocol variations, metabolic fingerprints could be exploited as valuable ways to determine compliance with standard procedures and quality assessment of blood samples within large multi-omic clinical and translational cohort studies.
An annotated summary of 204 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified as are the materials deburred.
Rehfeld, Jens F
Gastrointestinal hormones are peptides released from neuroendocrine cells in the digestive tract. More than 30 hormone genes are currently known to be expressed in the gut, which makes it the largest hormone-producing organ in the body. Modern biology makes it feasible to conceive the hormones un......, but also constitute regulatory systems operating in the whole organism. This overview of gut hormone biology is supplemented with an annotation on some Scandinavian contributions to gastrointestinal hormone research....
Jiu, Mingyuan; Sahbi, Hichem
Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each of which involves a combination of several elementary or intermediate kernels and results in a positive semi-definite deep kernel. We propose four different frameworks to learn the weights of these networks: supervised, unsupervised, kernel-based semi-supervised and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show a clear gain compared to several shallow kernels for the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database and the Banana dataset validate the effectiveness of the proposed method.
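The recursive definition of a deep multiple kernel can be illustrated with a small evaluation routine. This is a sketch under stated assumptions, not the authors' method: the two elementary kernels, the use of `exp` as the activation, the `layers` weight format and the function names are all chosen here for illustration, and the weights are fixed by hand rather than learned. The sketch relies on the standard facts that a non-negative combination of positive semi-definite kernels is positive semi-definite and that applying `exp` to such a kernel preserves this property.

```python
import math


def linear_kernel(x, y):
    """Elementary kernel: dot product."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Elementary kernel: Gaussian radial basis function."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def deep_kernel(x, y, layers, base_kernels=(linear_kernel, rbf_kernel)):
    """Evaluate a layered kernel on one pair of inputs.

    `layers` is a list of layers; each layer is a list of weight rows, one row
    per unit, with one non-negative weight per kernel value of the previous layer.
    Each unit applies exp() to the weighted sum of the previous layer's values.
    """
    values = [kernel(x, y) for kernel in base_kernels]
    for weight_rows in layers:
        for row in weight_rows:
            if len(row) != len(values) or any(w < 0 for w in row):
                raise ValueError("each row needs one non-negative weight per input kernel")
        values = [
            math.exp(sum(w * v for w, v in zip(row, values)))
            for row in weight_rows
        ]
    return values[0]  # the final layer is expected to contain a single output unit
```

For example, `deep_kernel((1.0, 2.0), (0.5, 1.5), [[[0.5, 0.5], [1.0, 0.0]], [[0.3, 0.7]]])` evaluates a two-layer network with two intermediate kernels. Learning the weights, which is the actual contribution described in the abstract, is outside the scope of this sketch.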
Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert
As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advances in high-throughput gene expression microarray technology have inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access to data from many experiments have caused inconsistencies, which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard Microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to the breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to index breast cancer microarray samples from GEO using BCM-CO and the MGED ontology, and developed a prototype system with a web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108
Moco, S.I.A.; Bino, R.J.; Vorst, O.F.J.; Verhoeven, H.A.; Groot, de J.C.W.; Beek, van T.A.; Vervoort, J.J.M.; Vos, de C.H.
For the description of the metabolome of an organism, the development of common metabolite databases is of utmost importance. Here we present the Metabolome Tomato Database (MoTo DB), a metabolite database dedicated to liquid chromatography-mass spectrometry (LC-MS)-based metabolomics of tomato
Alessandro M. Varani
The Xylella fastidiosa comparative genomic database is a scientific resource that aims to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high-quality genomic annotation, the functional and comparative genomic analysis and the identification and mapping of prophage-like elements. It is available at http://www.xylella.lncc.br.
Zhao, Jian; Glueck, Michael; Breslav, Simon; Chevalier, Fanny; Khan, Azam
User-authored annotations of data can support analysts in the activity of hypothesis generation and sensemaking, where it is not only critical to document key observations, but also to communicate insights between analysts. We present annotation graphs, a dynamic graph visualization that enables meta-analysis of data based on user-authored annotations. The annotation graph topology encodes annotation semantics, which describe the content of and relations between data selections, comments, and tags. We present a mixed-initiative approach to graph layout that integrates an analyst's manual manipulations with an automatic method based on similarity inferred from the annotation semantics. Various visual graph layout styles reveal different perspectives on the annotation semantics. Annotation graphs are implemented within C8, a system that supports authoring annotations during exploratory analysis of a dataset. We apply principles of Exploratory Sequential Data Analysis (ESDA) in designing C8, and further link these to an existing task typology in the visualization literature. We develop and evaluate the system through an iterative user-centered design process with three experts, situated in the domain of analyzing HCI experiment data. The results suggest that annotation graphs are effective as a method of visually extending user-authored annotations to data meta-analysis for discovery and organization of ideas.
Burnett, N.; Jeffries, J.; Mach, J.; Robson, M.; Pajot, D.; Harrigan, J.; Lebsack, T.; Mullen, D.; Rat, F.; Theys, P.
What is quality? How do you achieve it? How do you keep it once you have got it? The answer for industry at large is the three-step hierarchy of quality control, quality assurance and Total Quality Management. An overview is given of the history of the quality movement, illustrated with examples from Schlumberger operations, as well as the oil industry's approach to quality. An introduction to Schlumberger's quality-associated ClientLink program is presented. 15 figs., 4 ills., 16 refs
Genome sequences are annotated by computational prediction of coding sequences, followed by similarity searches such as BLAST, which provide a layer of possible functional information. While the existence of processes such as alternative splicing complicates matters for eukaryote genomes, the view of bacterial genomes as a linear series of closely spaced genes leads to the assumption that computational annotations that predict such arrangements completely describe the coding capacity of bacterial genomes. We undertook a proteomic study to identify proteins expressed by Pseudomonas fluorescens Pf0-1 from genes that were not predicted during the genome annotation. Mapping peptides to the Pf0-1 genome sequence identified sixteen non-annotated protein-coding regions, of which nine were antisense to predicted genes, six were intergenic, and one read in the same direction as an annotated gene but in a different frame. The expression of all but one of the newly discovered genes was verified by RT-PCR. Few clues as to the function of the new genes were gleaned from informatic analyses, but potential orthologs in other Pseudomonas genomes were identified for eight of the new genes. The 16 newly identified genes improve the quality of the Pf0-1 genome annotation, and the detection of antisense protein-coding genes indicates the under-appreciated complexity of bacterial genome organization.
Tellgren-Roth, Christian; Baudo, Charles D.; Kennell, John C.; Sun, Sheng; Billmyre, R. Blake; Schröder, Markus S.; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L.; Heitman, Joseph
Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. PMID:28100699
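The two percentage gains quoted in this abstract follow directly from the gene counts it reports, as the short check below shows. The helper name `percent_increase` and the variable names are illustrative; the three counts are the ones given in the abstract.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Protein-coding gene counts reported in the abstract.
transcriptomics_only = 3612
with_proteomics = 4113
after_manual_curation = 4493

gain_from_proteomics = percent_increase(transcriptomics_only, with_proteomics)
gain_from_curation = percent_increase(with_proteomics, after_manual_curation)

print(round(gain_from_proteomics))  # 14
print(round(gain_from_curation))    # 9
```

Note that the second percentage is relative to the 4113 genes obtained after adding proteomics evidence, not to the original 3612.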
Nagana Gowda, G. A.; Raftery, Daniel
The field of metabolomics continues to witness rapid growth driven by fundamental studies, methods development, and applications in a number of disciplines that include biomedical science, plant and nutrition sciences, drug development, energy and environmental sciences, toxicology, etc. NMR spectroscopy is one of the two most widely used analytical platforms in the metabolomics field, along with mass spectrometry (MS). NMR's excellent reproducibility and quantitative accuracy, its ability to identify structures of unknown metabolites, its capacity to generate metabolite profiles using intact bio-specimens with no need for separation, and its capabilities for tracing metabolic pathways using isotope labeled substrates offer unique strengths for metabolomics applications. However, NMR's limited sensitivity and resolution continue to pose a major challenge and have restricted both the number and the quantitative accuracy of metabolites analyzed by NMR. Further, the analysis of highly complex biological samples has increased the demand for new methods with improved detection, better unknown identification, and more accurate quantitation of larger numbers of metabolites. Recent efforts have contributed significant improvements in these areas, and have thereby enhanced the pool of routinely quantifiable metabolites. Additionally, efforts focused on combining NMR and MS promise opportunities to exploit the combined strength of the two analytical platforms for direct comparison of the metabolite data, unknown identification and reliable biomarker discovery that continue to challenge the metabolomics field. This article presents our perspectives on the emerging trends in NMR-based metabolomics and NMR's continuing role in the field with an emphasis on recent and ongoing research from our laboratory.
A variety of chemicals produced by plants, often referred to as 'phytochemicals', have been used as medicines, food, fuels and industrial raw materials. Recent advances in the study of genomics and metabolomics in plant science have accelerated our understanding of the mechanisms, regulation and evolution of the biosynthesis of specialized plant products. We can now address such questions as how the metabolomic diversity of plants originates at the genome level, and how we should apply this knowledge to drug discovery, industry and agriculture. Our research group has focused on metabolomics-based functional genomics over the last 15 years and we have developed a new research area called 'Phytochemical Genomics'. In this review, the development of a research platform for plant metabolomics is discussed first, to provide a better understanding of the chemical diversity of plants. Then, representative applications of metabolomics to functional genomics in a model plant, Arabidopsis thaliana, are described. The extension of integrated multi-omics analyses to non-model specialized plants, e.g., medicinal plants, is presented, including the identification of novel genes, metabolites and networks for the biosynthesis of flavonoids, alkaloids, sulfur-containing metabolites and terpenoids. Further, functional genomics studies on a variety of medicinal plants are presented. I also discuss future trends in pharmacognosy and related sciences.
Xenobiotic exposure, especially high-dose or repeated exposure to xenobiotics, can elicit detrimental effects on biological systems through diverse mechanisms. Changes in metabolic systems, including formation of reactive metabolites and disruption of endogenous metabolism, are not only the common consequences of toxic xenobiotic exposure, but in many cases are the major causes behind development of xenobiotic-induced toxicities (XIT). Therefore, examining the metabolic events associated with XIT generates mechanistic insights into the initiation and progression of XIT, and provides guidance for prevention and treatment. Traditional bioanalytical platforms that target only a few suspected metabolites are capable of validating the expected outcomes of xenobiotic exposure. However, these approaches lack the capacity to define global changes and to identify unexpected events in the metabolic system. Recent developments in high-throughput metabolomics have dramatically expanded the scope and potential of metabolite analysis. Among all analytical techniques adopted for metabolomics, liquid chromatography-mass spectrometry (LC-MS) has been most widely used for metabolomic investigations of XIT due to its versatility and sensitivity in metabolite analysis. In this review, the technical platform of LC-MS-based metabolomics, including experimental model, sample preparation, instrumentation, and data analysis, is discussed. Applications of LC-MS-based metabolomics in exploratory and hypothesis-driven investigations of XIT are illustrated by case studies of xenobiotic metabolism and endogenous metabolism associated with xenobiotic exposure.
Menni, Cristina; Zierer, Jonas; Valdes, Ana M; Spector, Tim D
Metabolomics is an exciting field in systems biology that provides a direct readout of the biochemical activities taking place within an individual at a particular point in time. Metabolite levels are influenced by many factors, including disease status, environment, medications, diet and, importantly, genetics. Thanks to their dynamic nature, metabolites are useful for diagnosis and prognosis, as well as for predicting and monitoring the efficacy of treatments. At the same time, the strong links between an individual's metabolic and genetic profiles enable the investigation of pathways that underlie changes in metabolite levels. Thus, for the field of metabolomics to yield its full potential, researchers need to take into account the genetic factors underlying the production of metabolites, and the potential role of these metabolites in disease processes. In this Review, the methodological aspects related to metabolomic profiling and any potential links between metabolomics and the genetics of some of the most common rheumatic diseases are described. Links between metabolomics, genetics and emerging fields such as the gut microbiome and proteomics are also discussed.
Jeffrey S Breunig
Metabolism, the conversion of nutrients into usable energy and biochemical building blocks, is an essential feature of all cells. The genetic factors responsible for inter-individual metabolic variability remain poorly understood. To investigate genetic causes of metabolome variation, we measured the concentrations of 74 metabolites across ~100 segregants from a Saccharomyces cerevisiae cross by liquid chromatography-tandem mass spectrometry. We found 52 quantitative trait loci for 34 metabolites. These included linkages due to overt changes in metabolic genes, e.g., linking pyrimidine intermediates to the deletion of ura3. They also included linkages not directly related to metabolic enzymes, such as those for five central carbon metabolites to ira2, a Ras/PKA pathway regulator, and for the metabolites S-adenosyl-methionine and S-adenosyl-homocysteine to slt2, a MAP kinase involved in cell wall integrity. The variant of ira2 that elevates metabolite levels also increases glucose uptake and ethanol secretion. These results highlight specific examples of genetic variability, including in genes without prior known metabolic regulatory function, that impact yeast metabolism.
Bovo, S; Mazzoni, G; Calò, D G; Galimberti, G; Fanelli, F; Mezzullo, M; Schiavo, G; Scotti, E; Manisi, A; Samoré, A B; Bertolini, F; Trevisi, P; Bosi, P; Dall'Olio, S; Pagotto, U; Fontanesi, L
Metabolomics has opened new possibilities to investigate metabolic differences among animals. In this study, we applied a targeted metabolomic approach to deconstruct the pig sex metabolome as defined by castrated males and entire gilts. Plasma from 545 performance-tested Italian Large White pigs (172 castrated males and 373 females) sampled at about 160 kg live weight were analyzed for 186 metabolites using the Biocrates AbsoluteIDQ p180 Kit. After filtering, 132 metabolites (20 AA, 11 biogenic amines, 1 hexose, 13 acylcarnitines, 11 sphingomyelins, 67 phosphatidylcholines, and 9 lysophosphatidylcholines) were retained for further analyses. The multivariate approach of the sparse partial least squares discriminant analysis was applied, together with a specifically designed statistical pipeline that included a permutation test and a 10-fold cross-validation procedure that produced stability and effect size statistics for each metabolite. Using this approach, we identified 85 biomarkers (with metabolites from all analyzed chemical families) that contributed to the differences between the 2 groups of pigs ( metabolic shift in castrated males toward energy storage and lipid production. Similar general patterns were observed for most sphingomyelins, phosphatidylcholines, and lysophosphatidylcholines. Metabolomic pathway analysis and pathway enrichment identified several differences between the 2 sexes. This metabolomic overview offers new clues to the biochemical mechanisms underlying sexual dimorphism that, on one hand, might explain differences in terms of economic traits between castrated male pigs and entire gilts and, on the other hand, could strengthen the pig as a model to define metabolic mechanisms related to fat deposition.
Gonzalez-Franquesa, Alba; Burkart, Alison M; Isganaitis, Elvira
and mathematical modeling approaches, have provided the scientific community with new tools to describe the T2D metabolome. The metabolomics signatures associated with T2D and obesity include increased levels of lactate, glycolytic intermediates, branched-chain and aromatic amino acids, and long-chain fatty acids......Type 2 diabetes (T2D) is increasing worldwide, making identification of biomarkers for detection, staging, and effective prevention strategies an especially critical scientific and medical goal. Fortunately, advances in metabolomics techniques, together with improvements in bioinformatics....... Conversely, tricarboxylic acid cycle intermediates, betaine, and other metabolites decrease. Future studies will be required to fully integrate these and other findings into our understanding of diabetes pathophysiology and to identify biomarkers of disease risk, stage, and responsiveness to specific...
Fearnley, Liam G; Inouye, Michael
Metabolomics is becoming feasible for population-scale studies of human disease. In this review, we survey epidemiological studies that leverage metabolomics and multi-omics to gain insight into disease mechanisms. We outline key practical, technological and analytical limitations while also highlighting recent successes in integrating these data. The use of multi-omics to infer reaction rates is discussed as a potential future direction for metabolomics research, as a means of identifying biomarkers as well as inferring causality. Furthermore, we highlight established analysis approaches as well as simulation-based methods currently used in single- and multi-cell levels in systems biology. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
Wang, Zhiyi; Ma, Jianshe; Zhang, Meiling; Wen, Congcong; Huang, Xueli; Sun, Fa; Wang, Shuanghu; Hu, Lufeng; Lin, Guanyang; Wang, Xianqin
Paraquat is one of the most widely used herbicides in the world and is highly toxic to humans and animals. In this study, we developed a serum metabolomic method based on GC/MS to evaluate the effects of acute paraquat poisoning on rats. Pattern recognition analysis, including both principal component analysis and partial least squares-discriminant analysis, revealed that acute paraquat poisoning induced metabolic perturbations. Compared with the control group, the levels of octadecanoic acid, L-serine, L-threonine, L-valine, and glycerol in the acute paraquat poisoning group (36 mg/kg) increased, while the levels of hexadecanoic acid, D-galactose, and decanoic acid decreased. These findings provide an overview of systematic responses to paraquat exposure and metabolomic insight into the toxicological mechanism of paraquat. Our results indicate that metabolomic methods based on GC/MS may be useful to elucidate the mechanism of acute paraquat poisoning through the exploration of biomarkers.
Gibbons, Helena; Carr, Eibhlin; McNulty, Breige A; Nugent, Anne P; Walton, Janette; Flynn, Albert; Gibney, Michael J; Brennan, Lorraine
Classification of subjects into dietary patterns generally relies on self-reporting dietary data which are prone to error. The aim of the present study was to develop a model for objective classification of people into dietary patterns based on metabolomic data. Dietary and urinary metabolomic data from the National Adult Nutrition Survey (NANS) was used in the analysis (n = 567). Two-step cluster analysis was applied to the urinary data to identify clusters. The subsequent model was used in an independent cohort to classify people into dietary patterns. Two distinct dietary patterns were identified. Cluster 1 was characterized by significantly higher intakes of breakfast cereals, low fat and skimmed milks, potatoes, fruit, fish and fish dishes (p patterns based on metabolomics data. Future applications of this approach could be developed for rapid and objective assignment of subjects into dietary patterns. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Background: Magnaporthe oryzae is the causal agent of blast disease, the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6 (http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html). However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of M. oryzae genome assembly. Methods: A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results: In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins
Schock, Tracey B; Duke, Jessica; Goodson, Abby; Weldon, Daryl; Brunson, Jeff; Leffler, John W; Bearden, Daniel W
Success of the shrimp aquaculture industry requires technological advances that increase production and environmental sustainability. Indoor, superintensive, aquaculture systems are being developed that permit year-round production of farmed shrimp at high densities. These systems are intended to overcome problems of disease susceptibility and of water quality issues from waste products, by operating as essentially closed systems that promote beneficial microbial communities (biofloc). The resulting biofloc can assimilate and detoxify wastes, may provide nutrition for the farmed organisms resulting in improved growth, and may aid in reducing disease initiated from external sources. Nuclear magnetic resonance (NMR)-based metabolomic techniques were used to assess shrimp health during a full growout cycle from the nursery phase through harvest in a minimal-exchange, superintensive, biofloc system. Aberrant shrimp metabolomes were detected from a spike in total ammonia nitrogen in the nursery, from a reduced feeding period that was a consequence of surface scum build-up in the raceway, and from the stocking transition from the nursery to the growout raceway. The biochemical changes in the shrimp that were induced by the stressors were essential for survival and included nitrogen detoxification and energy conservation mechanisms. Inosine and trehalose may be general biomarkers of stress in Litopenaeus vannamei. This study demonstrates one aspect of the practicality of using NMR-based metabolomics to enhance the aquaculture industry by providing physiological insight into common environmental stresses that may limit growth or better explain reduced survival and production. PMID:23555690
Bayram, Mustafa; Gökırmaklı, Çağlar
Food and engineering sciences have tended to neglect the importance of human nutrition sciences and clinical study of new molecules discovered by food engineering community, and vice versa. Yet, the value of systems thinking and use of omics technologies in food engineering are rapidly emerging. Foodomics is a new concept and practice to bring about "precision nutrition" and integrative bioengineering studies of food composition, quality, and safety, and applications to improve health of humans, animals, and other living organisms on the planet. Foodomics signals a three-way convergence among (1) food engineering; (2) omics systems science technologies such as proteomics, metabolomics, glycomics; and (3) medical/life sciences. This horizon scanning expert review aims to challenge the current practices in food sciences and bioengineering so as to adopt foodomics and systems thinking in foodstuff analysis, with a focus on possible applications of metabolomics. Among the omics biotechnologies, metabolomics is one of the prominent analytical platforms of interest to both food engineers and medical researchers engaged in nutritional sciences, precision medicine, and systems medicine diagnostics. Medical and omics system scientists, and bioengineering scholars can mutually learn from their respective professional expertise. Moving forward, establishment of "Foodomics Think Tanks" is one conceivable strategy to integrate medical and food sciences innovation at a systems scale. With its rich history in food sciences and tradition of interdisciplinary scholarship, the Silk Road countries offer notable potential for synthesis of diverse knowledge strands necessary to realize the prospects of foodomics from Asia and Middle East to Europe.
Spicer, Rachel; Salek, Reza M; Moreno, Pablo; Cañueto, Daniel; Steinbeck, Christoph
The field of metabolomics has expanded greatly over the past two decades, both as an experimental science with applications in many areas and with regard to data standards and bioinformatics software tools. The diversity of experimental designs and instrumental technologies used for metabolomics has led to the need for distinct data analysis methods and the development of many software tools. The aim was to compile a comprehensive list of the most widely used freely available software and tools that are used primarily in metabolomics. The most widely used tools were selected for inclusion in the review on the basis of either ≥ 50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey. Tools were then categorised by the type of instrumental data (i.e. LC-MS, GC-MS or NMR) and the functionality (i.e. pre- and post-processing, statistical analysis, workflow and other functions) they are designed for. A comprehensive list of the most used tools was compiled. Each tool is discussed within the context of its application domain and in relation to comparable tools of the same domain. An extended list including additional tools is available at https://github.com/RASpicer/MetabolomicsTools which is classified and searchable via a simple controlled vocabulary. This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. As future work, we suggest a direct comparison of tools' abilities to perform specific data analysis tasks e.g. peak picking.
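The inclusion rule stated in the abstract above (at least 50 Web of Science citations, or use reported in the Metabolomics Society survey) followed by grouping on instrumental data type can be sketched as below. This is an illustration only: the `Tool` record, its field names, and the example entries are hypothetical and do not come from the review or its GitHub list.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    """Hypothetical record for one software tool (field names are assumptions)."""
    name: str
    citations: int    # Web of Science citation count
    in_survey: bool   # use reported in the Metabolomics Society survey
    data_type: str    # "LC-MS", "GC-MS" or "NMR"


CITATION_THRESHOLD = 50  # the ">= 50 citations" criterion from the abstract


def is_widely_used(tool):
    """A tool qualifies on either criterion, not necessarily both."""
    return tool.citations >= CITATION_THRESHOLD or tool.in_survey


def group_by_data_type(tools):
    """Keep qualifying tools and group their names by instrumental data type."""
    groups = {}
    for tool in tools:
        if is_widely_used(tool):
            groups.setdefault(tool.data_type, []).append(tool.name)
    return groups


# Made-up entries, not real tools or real citation counts.
tools = [
    Tool("tool_a", 120, False, "LC-MS"),
    Tool("tool_b", 10, True, "NMR"),
    Tool("tool_c", 49, False, "GC-MS"),
]
selected = group_by_data_type(tools)
```

In this toy input, `tool_c` is excluded because it misses the citation threshold by one and was not reported in the survey.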
Wurtele, Eve Syrkin; Chappell, Joe; Jones, A Daniel; Celiz, Mary Dawn; Ransom, Nick; Hur, Manhoi; Rizshsky, Ludmila; Crispin, Matthew; Dixon, Philip; Liu, Jia; P Widrlechner, Mark; Nikolau, Basil J
Specialized compounds from photosynthetic organisms serve as rich resources for drug development. From aspirin to atropine, plant-derived natural products have had a profound impact on human health. Technological advances provide new opportunities to access these natural products in a metabolic context. Here, we describe a database and platform for storing, visualizing and statistically analyzing metabolomics data from fourteen medicinal plant species. The metabolomes and associated transcriptomes (RNAseq) for each plant species, gathered from up to twenty tissue/organ samples that have experienced varied growth conditions and developmental histories, were analyzed in parallel. Three case studies illustrate different ways that the data can be integrally used to generate testable hypotheses concerning the biochemistry, phylogeny and natural product diversity of medicinal plants. Deep metabolomics analysis of Camptotheca acuminata exemplifies how such data can be used to inform metabolic understanding of natural product chemical diversity and begin to formulate hypotheses about their biogenesis. Metabolomics data from Prunella vulgaris, a species that contains a wide range of antioxidant, antiviral, tumoricidal and anti-inflammatory constituents, provide a case study of obtaining biosystematic and developmental fingerprint information from metabolite accumulation data in a little studied species. Digitalis purpurea, well known as a source of cardiac glycosides, is used to illustrate how integrating metabolomics and transcriptomics data can lead to identification of candidate genes encoding biosynthetic enzymes in the cardiac glycoside pathway. Medicinal Plant Metabolomics Resource (MPM)  provides a framework for generating experimentally testable hypotheses about the metabolic networks that lead to the generation of specialized compounds, identifying genes that control their biosynthesis and establishing a basis for modeling metabolism in less studied species. The
Vincent, Isabel M.; Ehmann, David E.; Mills, Scott D.; Perros, Manos
Deciphering the mode of action (MOA) of new antibiotics discovered through phenotypic screening is of increasing importance. Metabolomics offers a potentially rapid and cost-effective means of identifying modes of action of drugs whose effects are mediated through changes in metabolism. Metabolomics techniques also collect data on off-target effects and drug modifications. Here, we present data from an untargeted liquid chromatography-mass spectrometry approach to identify the modes of action of eight compounds: 1-[3-fluoro-4-(5-methyl-2,4-dioxo-pyrimidin-1-yl)phenyl]-3-[2-(trifluoromethyl)phenyl]urea (AZ1), 2-(cyclobutylmethoxy)-5′-deoxyadenosine, triclosan, fosmidomycin, CHIR-090, carbonyl cyanide m-chlorophenylhydrazone (CCCP), 5-chloro-2-(methylsulfonyl)-N-(1,3-thiazol-2-yl)-4-pyrimidinecarboxamide (AZ7), and ceftazidime. Data analysts were blind to the compound identities but managed to identify the target as thymidylate kinase for AZ1, isoprenoid biosynthesis for fosmidomycin, acyl-transferase for CHIR-090, and DNA metabolism for 2-(cyclobutylmethoxy)-5′-deoxyadenosine. Changes to cell wall metabolites were seen in ceftazidime treatments, although other changes, presumably relating to off-target effects, dominated spectral outputs in the untargeted approach. Drugs which do not work through metabolic pathways, such as the proton carrier CCCP, have no discernible impact on the metabolome. The untargeted metabolomics approach also revealed modifications to two compounds, namely, fosmidomycin and AZ7. An untreated control was also analyzed, and changes to the metabolome were seen over 4 h, highlighting the necessity for careful controls in these types of studies. Metabolomics is a useful tool in the analysis of drug modes of action and can complement other technologies already in use. PMID:26833150
Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda
The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis
Mungall Christopher J
Background: The Gene Ontology project supports categorization of gene products according to their location of action, the molecular functions that they carry out, and the processes that they are involved in. Although the ontologies are intentionally developed to be taxon neutral, and to cover all species, there are inherent taxon specificities in some branches. For example, the process 'lactation' is specific to mammals and the location 'mitochondrion' is specific to eukaryotes. The lack of an explicit formalization of these constraints can lead to errors and inconsistencies in automated and manual annotation. Results: We have formalized the taxonomic constraints implicit in some GO classes, and specified these at various levels in the ontology. We have also developed an inference system that can be used to check for violations of these constraints in annotations. Using the constraints in conjunction with the inference system, we have detected and removed errors in annotations and improved the structure of the ontology. Conclusions: Detection of inconsistencies in taxon-specificity enables gradual improvement of the ontologies, the annotations, and the formalized constraints. This is progressively improving the quality of our data. The full system is available for download, and new constraints or proposed changes to constraints can be submitted online at https://sourceforge.net/tracker/?atid=605890&group_id=36855.
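The idea of checking annotations against taxon constraints can be shown with a toy example. The sketch below is not the authors' inference system: it assumes a tiny hand-written taxonomy, encodes only the two constraints named in the abstract ('lactation' restricted to mammals, 'mitochondrion' restricted to eukaryotes), and ignores propagation of constraints through the term hierarchy. The gene identifiers and function names are hypothetical.

```python
# Toy taxonomy as child -> parent links (an assumption of this sketch).
PARENT = {
    "Homo sapiens": "Mammalia",
    "Mammalia": "Eukaryota",
    "Saccharomyces cerevisiae": "Eukaryota",
    "Escherichia coli": "Bacteria",
}

# Term -> taxon the term is restricted to; both examples come from the abstract.
ONLY_IN_TAXON = {
    "lactation": "Mammalia",
    "mitochondrion": "Eukaryota",
}


def lineage(taxon):
    """Return the taxon followed by all of its ancestors in the toy taxonomy."""
    result = []
    while taxon is not None:
        result.append(taxon)
        taxon = PARENT.get(taxon)
    return result


def find_violations(annotations):
    """Return the (gene, term, taxon) triples that break an only-in-taxon constraint."""
    violations = []
    for gene, term, taxon in annotations:
        required = ONLY_IN_TAXON.get(term)
        if required is not None and required not in lineage(taxon):
            violations.append((gene, term, taxon))
    return violations


# Hypothetical annotations: two are consistent, two are not.
annotations = [
    ("geneA", "lactation", "Homo sapiens"),
    ("geneB", "lactation", "Saccharomyces cerevisiae"),
    ("geneC", "mitochondrion", "Escherichia coli"),
    ("geneD", "mitochondrion", "Saccharomyces cerevisiae"),
]
bad = find_violations(annotations)
```

A fuller check would also apply each constraint to every descendant of the constrained term, which this sketch leaves out.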
Goetz, Michael; Weber, Christian; Binczyk, Franciszek; Polanska, Joanna; Tarnawski, Rafal; Bobek-Billewicz, Barbara; Koethe, Ullrich; Kleesiek, Jens; Stieltjes, Bram; Maier-Hein, Klaus H
We propose a new method that employs transfer learning techniques to effectively correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation. The practicality of current learning-based automated tissue classification approaches is severely impeded by their dependency on manually segmented training databases that need to be recreated for each scenario of application, site, or acquisition setup. The comprehensive annotation of reference datasets can be highly labor-intensive, complex, and error-prone. The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations and employs domain adaptation techniques for effectively correcting sampling selection errors introduced by the sparse sampling. The new approach is validated on labeled, multi-modal MR images of 19 patients with malignant gliomas and by comparative analysis on the BraTS 2013 challenge data sets. Compared to training on fully labeled data, we reduced the time for labeling and training by a factor greater than 70 and 180 respectively without sacrificing accuracy. This dramatically eases the establishment and constant extension of large annotated databases in various scenarios and imaging setups and thus represents an important step towards practical applicability of learning-based approaches in tissue classification.
Roullier-Gall, Chloé; Witting, Michael; Gougeon, Régis; Schmitt-Kopplin, Philippe
An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to markers identification. UPLC-Q-ToF-MS data was enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS². In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (Pinot noir) and white (Chardonnay) wines from various geographic origins in Burgundy.
Huang, Daisie I; Cronk, Quentin C B
Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
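The interval-shifting step described above can be sketched in a few lines. Plann itself is a Perl script and its matching method is not described in the abstract, so the following is only an illustration in Python that assumes exact substring matching as a stand-in for sequence comparison; the function name, coordinate convention (0-based, end-exclusive), and sequences are hypothetical.

```python
def transfer_features(ref_seq, features, new_seq):
    """Move feature intervals from a reference sequence onto a new sequence.

    features: list of (name, start, end) tuples, 0-based and end-exclusive.
    Exact substring matching is an assumption of this sketch; features whose
    sequence is not found in new_seq are silently skipped.
    """
    transferred = []
    for name, start, end in features:
        fragment = ref_seq[start:end]
        position = new_seq.find(fragment)
        if position != -1:
            transferred.append((name, position, position + len(fragment)))
    return transferred


# Made-up sequences: the new sequence has two extra bases at the front
# and lacks the last four bases of the reference.
reference = "AAACCCGGGTTTACGT"
new_plastome = "TTAAACCCGGGTTT"
features = [("geneX", 3, 9), ("geneY", 12, 16)]

shifted = transfer_features(reference, features, new_plastome)
```

Here `geneX` moves two positions downstream, while `geneY` has no match in the new sequence and is therefore not annotated.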
Bruno, C; Patin, F; Bocca, C; Nadal-Desbarats, L; Bonnier, F; Reynier, P; Emond, P; Vourc'h, P; Joseph-Delafont, K; Corcia, P; Andres, C R; Blasco, H
Metabolomics is an emerging science based on diverse high throughput methods that are rapidly evolving to improve metabolic coverage of biological fluids and tissues. Technical progress has led researchers to combine several analytical methods without reporting the impact on metabolic coverage of such a strategy. The objective of our study was to develop and validate several analytical techniques (mass spectrometry coupled to gas or liquid chromatography and nuclear magnetic resonance) for the metabolomic analysis of small muscle samples and evaluate the impact of combining methods for more exhaustive metabolite coverage. We evaluated the muscle metabolome from the same pool of mouse muscle samples after 2 metabolite extraction protocols. Four analytical methods were used: targeted flow injection analysis coupled with mass spectrometry (FIA-MS/MS), gas chromatography coupled with mass spectrometry (GC-MS), liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS), and nuclear magnetic resonance (NMR) analysis. We evaluated the global variability of each compound, i.e., analytical (from quality controls) and extraction variability (from muscle extracts). We determined the best extraction method and we reported the common and distinct metabolites identified based on the number and identity of the compounds detected with low analytical variability (variation coefficient ... mass spectrometry methods and nuclear magnetic resonance to explore muscle samples. This study reports the validation of several analytical methods, based on nuclear magnetic resonance and several mass spectrometry methods, to explore the muscle metabolome from a small amount of tissue, comparable to that obtained during a clinical trial. The combination of several techniques may be relevant for the exploration of muscle metabolism, with acceptable analytical variability and overlap between methods. However, the difficult and time-consuming data pre-processing, processing, and
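Analytical variability from quality controls is commonly summarised as a coefficient of variation (CV), and a minimal sketch of that filter follows. The cutoff used in the study is lost from the abstract above, so `MAX_CV_PERCENT` is a purely hypothetical value, as are the metabolite names and intensities.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation in percent: sample standard deviation / mean * 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical cutoff; the threshold used in the study is not recoverable here.
MAX_CV_PERCENT = 30.0

# Made-up repeated quality-control measurements per metabolite.
qc_measurements = {
    "metabolite_1": [100.0, 102.0, 98.0],
    "metabolite_2": [50.0, 80.0, 20.0],
}

# Keep only metabolites whose QC variability is below the cutoff.
low_variability = [
    name
    for name, values in qc_measurements.items()
    if cv_percent(values) < MAX_CV_PERCENT
]
```

With these made-up numbers only `metabolite_1` passes the filter; the same function could be applied to muscle extracts to estimate extraction variability.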
Cuykx, Matthias; Negreira, Noelia; Beirnaert, Charlie; Van den Eede, Nele; Rodrigues, Robim; Vanhaecke, Tamara; Laukens, Kris; Covaci, Adrian
Metabolomics protocols are often combined with Liquid Chromatography-Mass Spectrometry (LC-MS) using mostly reversed phase chromatography coupled to accurate mass spectrometry, e.g. quadrupole time-of-flight (QTOF) mass spectrometers to measure as many metabolites as possible. In this study, we optimised the LC-MS separation of cell extracts after fractionation in polar and non-polar fractions. Both phases were analysed separately in a tailored approach in four different runs (two for the non-polar and two for the polar-fraction), each of them specifically adapted to improve the separation of the metabolites present in the extract. This approach improves the coverage of a broad range of the metabolome of the HepaRG cells and the separation of intra-class metabolites. The non-polar fraction was analysed using a C18-column with end-capping, mobile phase compositions were specifically adapted for each ionisation mode using different co-solvents and buffers. The polar extracts were analysed with a mixed mode Hydrophilic Interaction Liquid Chromatography (HILIC) system. Acidic metabolites from glycolysis and the Krebs cycle, together with phosphorylated compounds, were best detected with a method using ion pairing (IP) with tributylamine and separation on a phenyl-hexyl column. Accurate mass detection was performed with the QTOF in MS-mode only using an extended dynamic range to improve the quality of the dataset. Parameters with the greatest impact on the detection were the balance between mass accuracy and linear range, the fragmentor voltage, the capillary voltage, the nozzle voltage, and the nebuliser pressure. By using a tailored approach for the intracellular HepaRG metabolome, consisting of three different LC techniques, over 2200 metabolites can be measured with a high precision and acceptable linear range. The developed method is suited for qualitative untargeted LC-MS metabolomics studies. Copyright © 2017 Elsevier B.V. All rights reserved.
O'Brien, Katie A; Griffin, Julian L; Murray, Andrew J; Edwards, Lindsay M
Humans are capable of survival in a remarkable range of environments, including the extremes of temperature and altitude as well as zero gravity. Investigation into physiological function in response to such environmental stresses may help further our understanding of human (patho-) physiology both at a systems level and in certain disease states, making it a highly relevant field of study. This review focuses on the application of metabolomics in assessing acclimatisation to these states, particularly the insights this approach can provide into mitochondrial function. It includes an overview of metabolomics and the associated analytical tools and also suggests future avenues of research.
Højer-Pedersen, Jesper Juul
increased amounts of data generated in high resolution. One major limitation though is the digestion of data, converting the information into a format that can be interpreted in a biological context and taking metabolomics beyond the principle of guilt-by-association. To analyze the data there is a general need....... Statistical analysis of the footprinting data revealed discriminating ions, which could be assigned using the in silico metabolome. By this approach metabolic footprinting can advance from a classification method that is used to derive biological information based on guilt-by-association, to a tool...
Overgaard, Anne Julie; Kaur, Simranjeet; Pociot, Flemming
Metabolomics is the snapshot of all detectable metabolites and lipids in biological materials and has potential in reflecting genetic and environmental factors contributing to the development of complex diseases, such as type 1 diabetes. The progression to seroconversion to development of type 1...... diabetes has been studied using this technique, although in relatively small cohorts and at limited time points. Overall, three observations have been consistently reported; phospholipids at birth are lower in children developing type 1 diabetes early in childhood, methionine levels are lower in children...... at seroconversion, and triglycerides are increased at seroconversion and associated to microbiome diversity, indicating an association between the metabolome and microbiome in type 1 diabetes progression....
Sun, Zhijian; Qiu, Guixing; Zhao, Yu
Metabolomics is the systematic, qualitative and quantitative analysis of all metabolites in an organism, and is applied to finding biomarkers and studying the pathogenesis of diseases. Metabolomic study procedures include data acquisition by spectroscopic/spectrometric techniques, multivariate statistical analysis and projection of the acquired metabolomic information. In recent years, metabolomics has gained popularity in the orthopedic field. Osteoarthritis was the first orthopedic disease to be studied by metabolomics and has been the most widely investigated. Metabolite profiles of different samples, including serum/plasma, urine, synovial fluid and synovial tissue, were studied, and dozens of differential metabolites and several disturbed metabolic pathways were found. In addition, metabolomic studies of osteoporosis, ankylosing spondylitis and bone tumors were also conducted, which identified many potential biomarkers and furthered the understanding of the pathogenesis of the corresponding diseases. However, metabolomic studies in the orthopedic field have only just begun. More orthopedic diseases will be investigated thanks to the satisfactory results of previous reports.
Juan A. Galarza
In this paper we report the public availability of transcriptome resources for the aposematic wood tiger moth (Parasemia plantaginis). Comprehensive assembly methods, quality statistics, and annotation are provided. This reference transcriptome may serve as a useful resource for investigating functional gene activity in aposematic Lepidopteran species. All data are freely available at the European Nucleotide Archive (http://www.ebi.ac.uk/ena) under study accession number PRJEB14172.
Amin, Shorash; Prentis, Peter J; Gilding, Edward K; Pavasovic, Ana
The sequencing, de novo assembly and annotation of transcriptome datasets generated with next generation sequencing (NGS) has enabled biologists to answer genomic questions in non-model species with unprecedented ease. Reliable and accurate de novo assembly and annotation of transcriptomes, however, is a critically important step for transcriptome assemblies generated from short read sequences. Typical benchmarks for assembly and annotation reliability have been performed with model species. To address the reliability and accuracy of de novo transcriptome assembly in non-model species, we generated an RNAseq dataset for an intertidal gastropod mollusc species, Nerita melanotragus, and compared the assembly produced by four different de novo transcriptome assemblers; Velvet, Oases, Geneious and Trinity, for a number of quality metrics and redundancy. Transcriptome sequencing on the Ion Torrent PGM™ produced 1,883,624 raw reads with a mean length of 133 base pairs (bp). Both the Trinity and Oases de novo assemblers produced the best assemblies based on all quality metrics including fewer contigs, increased N50 and average contig length and contigs of greater length. Overall the BLAST and annotation success of our assemblies was not high with only 15-19% of contigs assigned a putative function. We believe that any improvement in annotation success of gastropod species will require more gastropod genome sequences, but in particular an increase in mollusc protein sequences in public databases. Overall, this paper demonstrates that reliable and accurate de novo transcriptome assemblies can be generated from short read sequencers with the right assembly algorithms.
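The N50 metric mentioned above can be illustrated with a minimal sketch. It assumes the common definition (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size); the contig lengths below are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Assumed definition: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs (illustration only).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

A higher N50 at a similar total assembly size generally indicates a more contiguous assembly, which is why it is reported alongside contig count and mean contig length.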
Morusiewicz, Linda; Valett, Jon D.
An annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory is given. More than 100 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. All materials have been grouped into eight general subject areas for easy reference: The Software Engineering Laboratory; The Software Engineering Laboratory: Software Development Documents; Software Tools; Software Models; Software Measurement; Technology Evaluations; Ada Technology; and Data Collection. Subject and author indexes further classify these documents by specific topic and individual author.
Background: Liquid chromatography coupled to mass spectrometry (LC-MS) has become a prominent tool for the analysis of complex proteomics and metabolomics samples. In many applications multiple LC-MS measurements need to be compared, e.g. to improve reliability or to combine results from different samples in a statistical comparative analysis. As in all physical experiments, LC-MS data are affected by uncertainties, and variability of retention time is encountered in all data sets. It is therefore necessary to estimate and correct the underlying distortions of the retention time axis to search for corresponding compounds in different samples. To this end, a variety of so-called LC-MS map alignment algorithms have been developed during the last four years. Most of these approaches are well documented, but they are usually evaluated on very specific samples only. So far, no publication has assessed different alignment algorithms using a standard LC-MS sample along with commonly used quality criteria. Results: We propose two LC-MS proteomics as well as two LC-MS metabolomics data sets that represent typical alignment scenarios. Furthermore, we introduce a new quality measure for the evaluation of LC-MS alignment algorithms. Using the four data sets to compare six freely available alignment algorithms proposed for the alignment of metabolomics and proteomics LC-MS measurements, we found significant differences with respect to alignment quality, running time, and usability in general. Conclusion: The multitude of available alignment methods necessitates the generation of standard data sets and quality measures that allow users as well as developers to benchmark and compare their map alignment tools on a fair basis. Our study represents a first step in this direction. Currently, the installation and evaluation of the "correct" parameter settings can be quite a time-consuming task, and the success of a particular method is still highly
Guijarro-Díez, Miguel; Nozal, Leonor; Marina, María Luisa; Crego, Antonio Luis
An untargeted metabolomic approach using liquid chromatography coupled to electrospray ionization time-of-flight mass spectrometry was developed in this work to identify novel markers for saffron authenticity, which is an important matter related to consumer protection, quality assurance, active properties, and also economic impact (saffron is the most expensive spice). Metabolic fingerprints of authentic and suspicious saffron samples from different geographical origins were obtained and analyzed. Different extraction protocols and chromatographic methodologies were evaluated to obtain the most adequate extraction and separation conditions. Using an ethanol/water mixture at pH 9.0 and an elution gradient with a fused core C18 column yielded the highest number of significant components between authentic and adulterated saffron. By using multivariate statistical analysis, predictive classification models for authenticity and geographical origin were obtained. Moreover, 84 and 29 significant metabolites were detected as candidates for markers of authenticity and geographical origin, respectively, from which only 34 metabolites were tentatively identified as authenticity markers of saffron, but none related to its geographical origin. Six characteristic compounds of saffron (kaempferol 3-O-glucoside, kaempferol 3-O-sophoroside, kaempferol 3,7-O-diglucoside, kaempferol 3,7,4'-O-triglucoside, kaempferol 3-O-sophoroside-7-O-glucoside, and geranyl-O-glucoside) were confirmed by comparing experimental MS/MS fragmentation patterns with those provided in the scientific literature, and are proposed as novel markers of authenticity.
Heavy metal contamination of soil and water causing toxicity/stress has become one important constraint to crop productivity and quality. This situation has further worsened by the increasing population growth and inherent food demand. It has been reported in several studies that counterbalancing toxicity due to heavy metal requires complex mechanisms at molecular, biochemical, physiological, cellular, tissue and whole plant level, which might manifest in terms of improved crop productivity. Recent advances in various disciplines of biological sciences such as metabolomics, transcriptomics, proteomics etc. have assisted in the characterization of metabolites, transcription factors, stress-inducible proteins involved in heavy metal tolerance, which in turn can be utilized for generating heavy metal tolerant crops. This review summarizes various tolerance strategies of plants under heavy metal toxicity, covering the role of metabolites (metabolomics), trace elements (ionomics), transcription factors (transcriptomics), various stress-inducible proteins (proteomics) as well as the role of plant hormones. We also provide a glance at strategies adopted by metal accumulating plants, also known as metallophytes.
Singh, Samiksha; Parihar, Parul; Singh, Rachana; Singh, Vijay P; Prasad, Sheo M
Heavy metal contamination of soil and water causing toxicity/stress has become one important constraint to crop productivity and quality. This situation has further worsened by the increasing population growth and inherent food demand. It has been reported in several studies that counterbalancing toxicity due to heavy metal requires complex mechanisms at molecular, biochemical, physiological, cellular, tissue, and whole plant level, which might manifest in terms of improved crop productivity. Recent advances in various disciplines of biological sciences such as metabolomics, transcriptomics, proteomics, etc., have assisted in the characterization of metabolites, transcription factors, and stress-inducible proteins involved in heavy metal tolerance, which in turn can be utilized for generating heavy metal-tolerant crops. This review summarizes various tolerance strategies of plants under heavy metal toxicity covering the role of metabolites (metabolomics), trace elements (ionomics), transcription factors (transcriptomics), various stress-inducible proteins (proteomics) as well as the role of plant hormones. We also provide a glance of some strategies adopted by metal-accumulating plants, also known as "metallophytes."
Sundekilde, Ulrik K; Gustavsson, Frida; Poulsen, Nina A; Glantz, Maria; Paulsson, Marie; Larsen, Lotte B; Bertram, Hanne C
The milk metabolomes of 407 individual Swedish Red dairy cows were analyzed by nuclear magnetic resonance spectroscopy as part of the Danish-Swedish Milk Genomics Initiative. By relating these metabolite profiles to total milk protein concentration and rheological measurements of rennet-induced milk coagulation together using multivariate data analysis techniques, we were able to identify several different associations of the milk metabolome to technological properties of milk. Several novel correlations of milk metabolites to protein content and rennet-induced coagulation properties were demonstrated. Metabolites associated with the prediction of total protein content included choline, N-acetyl hexosamines, creatinine, glycerophosphocholine, glutamate, glucose 1-phosphate, galactose 1-phosphate, and orotate. In addition, levels of lactate, acetate, glutamate, creatinine, choline, carnitine, galactose 1-phosphate, and glycerophosphocholine were significantly different when comparing noncoagulating and well-coagulating milks. These findings suggest that the mentioned metabolites are associated with milk protein content and rennet-induced coagulation properties and may act as quality markers for cheese milk. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Lopez-Sanchez, Patricia; de Vos, R C H; Jonker, H H; Mumm, R; Hall, R D; Bialek, L; Leenman, R; Strassburg, K; Vreeken, R; Hankemeier, T; Schumm, S; van Duynhoven, J
The effects of conventional industrial processing steps on global phytochemical composition of broccoli, tomato and carrot purees were investigated by using a range of complementary targeted and untargeted metabolomics approaches including LC-PDA for vitamins, (1)H NMR for polar metabolites, accurate mass LC-QTOF MS for semi-polar metabolites, LC-MRM for oxylipins, and headspace GC-MS for volatile compounds. An initial exploratory experiment indicated that the order of blending and thermal treatments had the highest impact on the phytochemicals in the purees. This blending-heating order effect was investigated in more depth by performing alternate blending-heating sequences in triplicate on the same batches of broccoli, tomato and carrot. For each vegetable and particularly in broccoli, a large proportion of the metabolites detected in the purees was significantly influenced by the blending-heating order, amongst which were potential health-related phytochemicals and flavour compounds like vitamins C and E, carotenoids, flavonoids, glucosinolates and oxylipins. Our metabolomics data indicates that during processing the activity of a series of endogenous plant enzymes, such as lipoxygenases, peroxidases and glycosidases, including myrosinase in broccoli, is key to the final metabolite composition and related quality of the purees. Copyright © 2014 Elsevier Ltd. All rights reserved.
Li, Xiang; Lu, Xin; Tian, Jing; Gao, Peng; Kong, Hongwei; Xu, Guowang
Fuzzy c-means (FCM) clustering is an unsupervised method derived from fuzzy logic that is suitable for solving multiclass and ambiguous clustering problems. In this study, FCM clustering is applied to cluster metabolomics data. FCM is performed directly on the data matrix to generate a membership matrix which represents the degree of association the samples have with each cluster. The method is parametrized with the number of clusters (C) and the fuzziness coefficient (m), which denotes the degree of fuzziness in the algorithm. Both have been optimized by combining FCM with partial least-squares (PLS) using the membership matrix as the Y matrix in the PLS model. The quality parameters R(2)Y and Q(2) of the PLS model have been used to monitor and optimize C and m. Data of metabolic profiles from three gene types of Escherichia coli were used to demonstrate the method above. Different multivariable analysis methods have been compared. Principal component analysis failed to model the metabolite data, while partial least-squares discriminant analysis yielded results with overfitting. On the basis of the optimized parameters, the FCM was able to reveal main phenotype changes and individual characters of three gene types of E. coli. Coupled with PLS, FCM provides a powerful research tool for metabolomics with improved visualization, accurate classification, and outlier estimation.
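The FCM procedure described above (a membership matrix parametrized by the number of clusters C and the fuzziness coefficient m) can be sketched as follows. This is a minimal standard-library illustration of the textbook FCM update rules, not the authors' implementation; the function name, the random initialisation, the stopping rule and the toy data are assumptions, and the PLS-based optimisation of C and m is not reproduced.

```python
import math
import random


def fuzzy_c_means(data, c, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length numeric rows.

    Returns (centers, u) where u[i][k] is the membership of sample i
    in cluster k; each row of u sums to 1. Requires m > 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, normalised per sample.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = [[0.0] * dim for _ in range(c)]
    for _ in range(n_iter):
        # Centers are means of the samples weighted by u^m.
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers[k] = [
                sum(w[i] * data[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ]
        # Memberships are proportional to distance^(-2/(m-1)).
        new_u = []
        for i in range(n):
            dists = [max(math.dist(data[i], centers[k]), 1e-12) for k in range(c)]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical two-group toy data (not metabolite profiles from the study).
toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, memberships = fuzzy_c_means(toy, c=2, m=2.0)
for row in memberships:
    print([round(v, 3) for v in row])
```

In the workflow the abstract describes, the returned membership matrix would serve as the Y matrix of a PLS model, and C and m would be tuned by monitoring R(2)Y and Q(2).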
Singh, Samiksha; Parihar, Parul; Singh, Rachana; Singh, Vijay P.; Prasad, Sheo M.
Heavy metal contamination of soil and water causing toxicity/stress has become one important constraint to crop productivity and quality. This situation has further worsened by the increasing population growth and inherent food demand. It has been reported in several studies that counterbalancing toxicity due to heavy metal requires complex mechanisms at molecular, biochemical, physiological, cellular, tissue, and whole plant level, which might manifest in terms of improved crop productivity. Recent advances in various disciplines of biological sciences such as metabolomics, transcriptomics, proteomics, etc., have assisted in the characterization of metabolites, transcription factors, and stress-inducible proteins involved in heavy metal tolerance, which in turn can be utilized for generating heavy metal-tolerant crops. This review summarizes various tolerance strategies of plants under heavy metal toxicity covering the role of metabolites (metabolomics), trace elements (ionomics), transcription factors (transcriptomics), various stress-inducible proteins (proteomics) as well as the role of plant hormones. We also provide a glance of some strategies adopted by metal-accumulating plants, also known as “metallophytes.” PMID:26904030
Background: Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. Results: GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules - gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. Conclusions: GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
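The pair-wise kappa relationship between annotation gene sets can be illustrated with a short sketch. The abstract only says "kappa statistics"; the version below assumes Cohen's kappa computed on binary membership of each gene in two annotation sets over a shared gene universe, which may differ from GARNET's exact formulation. The gene identifiers are hypothetical.

```python
def annotation_kappa(set_a, set_b, universe):
    """Cohen's kappa for the agreement of two gene sets over a universe.

    Each gene in `universe` is scored as in/out of each set; kappa
    compares the observed agreement with the agreement expected by chance.
    """
    universe = set(universe)
    n = len(universe)
    if n == 0:
        raise ValueError("universe must not be empty")
    a = set(set_a) & universe
    b = set(set_b) & universe
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    p_a, p_b = len(a) / n, len(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both sets are empty or both cover the whole universe.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical gene identifiers (illustration only).
genes = [f"g{i}" for i in range(10)]
term_1 = {"g0", "g1", "g2", "g3"}
term_2 = {"g2", "g3", "g4", "g5"}
print(round(annotation_kappa(term_1, term_2, genes), 4))
```

Pairs of annotation terms with a kappa above a chosen cut-off could then be linked as edges in an annotation network; the cut-off itself is not stated in the abstract.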
The eutherian comparative genomic analysis protocol annotated the most comprehensive eutherian lysozyme gene data set. Among 209 potential coding sequences, the third-party annotation gene data set of eutherian lysozyme genes included 116 complete coding sequences that first described seven major gene clusters. As a new framework for future experiments, the present integrated gene annotations, phylogenetic analysis and protein molecular evolution analysis proposed a new classification and nomenclature of eutherian lysozyme genes.
Miller, R. Baxter; Butts, Tracy; Jones, Sharon
Contains an annotated bibliography of African American literature (published between 1989 and 1994), including anthologies, fiction, poetry, drama, criticism, cultural studies, biography, interviews, and letters. (TB)
Ames, L.L.; Rai, D.; Serne, R.J.
The annotated bibliography is divided into sections on chemistry and geochemistry, migration and accumulation, cultural distributions, natural distributions, and bibliographies and annual reviews. (LK)
SE40_AM1 PowerGet annotation In annotation process, KEGG, KNApSAcK and LipidMAPS ar...can assign, predicted molecular formulas are used for the annotation. MS/MS patterns were used to suggest fun...p/) and MS-MS Fragment Viewer (http://webs2.kazusa.or.jp/msmsfragmentviewer/) are used for ann...lcone, Nicotinamide, Nicotinate, Pantothenate, Phloretin, Prunin, Rutin, S-Adenosyl-L-methionine, Tomatine, UMP, Uridine) are used for annotation and identification of the compounds. ...
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content with the location of protein-coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.
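The third codon position GC content (GC3) used above can be computed per reading frame with a few lines. This is a minimal sketch of the quantity itself, assuming forward-strand frames only and a single fixed window; it is not the Artemis or MICheck implementation, and the example sequence is hypothetical.

```python
def gc3_by_frame(seq):
    """Return GC content at third codon positions for the three forward frames.

    For frame f (0, 1 or 2), third positions are f+2, f+5, f+8, ...
    Only positions that close a complete codon are counted.
    """
    seq = seq.upper()
    result = {}
    for frame in range(3):
        third_bases = seq[frame + 2::3]
        if not third_bases:
            result[frame] = 0.0
            continue
        gc = sum(1 for base in third_bases if base in "GC")
        result[frame] = gc / len(third_bases)
    return result


# Hypothetical short sequence (illustration only).
print(gc3_by_frame("ATGATCATG"))
```

In a GC-rich genome, the frame whose GC3 peaks within a region is the likely coding frame, so sliding this calculation along the genome gives the kind of frame plot that can be compared against annotated gene positions.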
, some program properties are beyond reach of such analysis for theoretical and practical reasons - but can be described by programmers. Three aspects are explored. The first is annotation of the source code. Two annotations are introduced. These allow more accurate modeling of parallelism...... and communication in embedded programs. Runtime checks are developed to ensure that annotations correctly describe observable program behavior. The performance impact of runtime checking is evaluated on several benchmark kernels and is negligible in all cases. The second aspect is compilation feedback. Annotations...
Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B
Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
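The sequence of steps described above (per-base signal, smoothing, segmentation into small blocks, grouping into larger annotations) can be sketched as follows. The moving-average smoother, the fixed threshold and the gap-based merging are illustrative assumptions chosen for simplicity; the review does not prescribe these particular methods, and the signal values are hypothetical.

```python
def smooth(signal, window=3):
    """Centered moving average; windows are truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def segment(signal, threshold):
    """Return half-open (start, end) blocks where signal >= threshold."""
    blocks, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            blocks.append((start, i))
            start = None
    if start is not None:
        blocks.append((start, len(signal)))
    return blocks


def merge_blocks(blocks, max_gap):
    """Join consecutive blocks separated by at most max_gap positions."""
    merged = []
    for start, end in blocks:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Hypothetical per-base signal (illustration only).
raw = [0, 0, 5, 6, 5, 0, 0, 4, 5, 0, 0, 0, 0, 0, 6, 6, 0]
initial = segment(smooth(raw, window=3), threshold=2.0)
print(merge_blocks(initial, max_gap=2))
```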
Background: Maize is a major crop plant, grown for human and animal nutrition, as well as a renewable resource for bioenergy. When looking at the problems of limited fossil fuels, the growth of the world’s population or the world’s climate change, it is important to find ways to increase the yield and biomass of maize and to study how it reacts to specific abiotic and biotic stress situations. Within the OPTIMAS systems biology project maize plants were grown under a large set of controlled stress conditions and phenotypically characterised, and plant material was harvested to analyse the effect of specific environmental conditions or developmental stages. Transcriptomic, metabolomic, ionomic and proteomic parameters were measured from the same plant material, allowing the comparison of results across different omics domains. A data warehouse was developed to store experimental data as well as analysis results of the performed experiments. Description: The OPTIMAS Data Warehouse (OPTIMAS-DW) is a comprehensive data collection for maize and integrates data from different data domains such as transcriptomics, metabolomics, ionomics, proteomics and phenomics. Within the OPTIMAS project, a 44K oligo chip was designed and annotated to describe the functions of the selected unigenes. Several treatment and plant growth stage experiments were performed, and measured data were filled into data templates and imported into the data warehouse by a Java-based import tool. A web interface allows users to browse through all stored experiment data in OPTIMAS-DW including all data domains. Furthermore, the user can filter the data to extract information of particular interest. All data can be exported into different file formats for further data analysis and visualisation. The data analysis integrates data from different data domains and enables the user to find answers to different systems biology questions. Finally, maize-specific pathway information is
Devisetty, Upendra Kumar; Covington, Michael F; Tat, An V; Lekkala, Saradadevi; Maloof, Julin N
The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in the development of a marker system for this population and the subsequent development of a high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate estimation of mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes, R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety), using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes, with an average frequency of one SNP in every 200 bases. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3537 novel transcripts, 23,754 gene models had structural modifications, and 3655 annotated proteins changed. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/. Copyright © 2014 Devisetty et al.
Neuhauser, Nadin; Michalski, Annette; Cox, Jürgen; Mann, Matthias
An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology for implementing a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptide sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions. PMID:22888147
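The idea of estimating a false discovery rate from inserted foreign peaks can be sketched as follows. This is a simplified illustration under stated assumptions, not the Expert System in MaxQuant: annotation is reduced to matching a peak against a list of theoretical fragment m/z values within a ppm tolerance, the decoy peaks are passed in explicitly rather than drawn at random from other spectra, and the FDR formula (chance annotation rate of decoys scaled to the number of real peaks, divided by the number of annotated real peaks) is an assumed estimator. All m/z values are hypothetical.

```python
def is_annotated(mz, theoretical_mzs, tol_ppm):
    """True if mz lies within tol_ppm of any theoretical fragment m/z."""
    return any(abs(mz - t) / t * 1e6 <= tol_ppm for t in theoretical_mzs)


def estimate_fdr(real_peaks, decoy_peaks, theoretical_mzs, tol_ppm=20.0):
    """Estimate the fraction of annotated real peaks expected to be false.

    Decoy peaks stand in for peaks inserted from unrelated spectra; the
    rate at which they get annotated approximates the chance hit rate.
    """
    if not decoy_peaks:
        raise ValueError("need at least one decoy peak")
    real_hits = sum(is_annotated(p, theoretical_mzs, tol_ppm) for p in real_peaks)
    decoy_hits = sum(is_annotated(p, theoretical_mzs, tol_ppm) for p in decoy_peaks)
    if real_hits == 0:
        return 0.0
    chance_rate = decoy_hits / len(decoy_peaks)
    expected_false = chance_rate * len(real_peaks)
    return min(1.0, expected_false / real_hits)


# Hypothetical m/z values (illustration only).
theoretical = [100.0, 200.0, 300.0]
observed = [100.0005, 200.001, 250.0, 300.0]
decoys = [100.001, 150.0, 175.0, 225.0]
print(round(estimate_fdr(observed, decoys, theoretical), 3))
```

Adding more annotation rules enlarges the set of theoretical values, which raises the chance that decoys are matched; tracking this estimate while rules are added is one way to notice over-annotation.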
Web services allow permanent access to music from all over the world. Especially in the case of web services with user-supplied content, e.g., YouTube™, the available metadata is often incomplete or erroneous. On the other hand, a vast amount of high-quality and musically relevant metadata has been annotated in research areas such as Music Information Retrieval (MIR). Although they have great potential, these musical annotations are often inaccessible to users outside the academic world. With our contribution, we want to bridge this gap by enriching publicly available multimedia content with musical annotations available in research corpora, while maintaining easy access to the underlying data. Our web-based tools offer researchers and music lovers novel possibilities to interact with and navigate through the content. In this paper, we consider a research corpus called the Weimar Jazz Database (WJD) as an illustrating example scenario. The WJD contains various annotations related to famous jazz solos. First, we establish a link between the WJD annotations and corresponding YouTube videos employing existing retrieval techniques. With these techniques, we were able to identify 988 corresponding YouTube videos for 329 solos out of 456 solos contained in the WJD. We then embed the retrieved videos in a recently developed web-based platform and enrich the videos with solo transcriptions that are part of the WJD. Furthermore, we integrate publicly available data resources from the Semantic Web in order to extend the presented information, for example, with a detailed discography or artist-related information. Our contribution illustrates the potential of modern web-based technologies for the digital humanities, and novel ways for improving access and interaction with digitized multimedia content.
Joseph A Rothwell
Coffee contains various bioactives implicated in human health and disease risk. To accurately assess the effects of overall consumption upon health and disease, individual intake must be measured in large epidemiological studies. Metabolomics has emerged as a powerful approach to discover biomarkers of intake for a large range of foods. Here we report the profiling of the urinary metabolome of cohort study subjects to search for new biomarkers of coffee intake. Using repeated 24-hour dietary records and a food frequency questionnaire, 20 high coffee consumers (183-540 mL/d) and 19 low consumers were selected from the French SU.VI.MAX2 cohort. Morning spot urine samples from each subject were profiled by high-resolution mass spectrometry. Partial least-squares discriminant analysis of multidimensional liquid chromatography-mass spectrometry data clearly distinguished high consumers from low via 132 significant (p-value<0.05) discriminating features. Ion clusters whose intensities were most elevated in the high consumers were annotated using online and in-house databases and their identities checked using commercial standards and MS-MS fragmentation. The best discriminants, and thus potential markers of coffee consumption, were the glucuronide of the diterpenoid atractyligenin, the diketopiperazine cyclo(isoleucyl-prolyl), and the alkaloid trigonelline. Some caffeine metabolites, such as 1-methylxanthine, were also among the discriminants; however, caffeine may be consumed from other sources and its metabolism is subject to inter-individual variation. Receiver operating characteristic curve analysis showed that the biomarkers identified could be used effectively in combination for increased sensitivity and specificity. Once validated in other cohorts or intervention studies, these specific single or combined biomarkers will become a valuable alternative to assessment of coffee intake by dietary survey and finally lead to a better understanding of
Misra, Biswapriya B.; de Armas, Evaldo; Tong, Zhaohui; Chen, Sixue
Anthropogenic CO2, presently at 400 ppm, is expected to reach 550 ppm in 2050, an increment expected to affect plant growth and productivity. Paired stomatal guard cells (GCs) are the gateway for water, CO2, and pathogens, while mesophyll cells (MCs) represent the bulk cell type of green leaves, mainly for photosynthesis. We used the two different cell types, i.e., GCs and MCs from canola (Brassica napus), to profile metabolomic changes upon increased CO2 through supplementation with bicarbonate (HCO3-). Two metabolomics platforms enabled quantification of 268 metabolites in a time-course study to reveal short-term responses. The HCO3- responsive metabolomes of the cell types differed in their responsiveness. The MCs demonstrated increased amino acids, phenylpropanoids, redox metabolites, auxins and cytokinins, all of which were decreased in GCs in response to HCO3-. In addition, the GCs showed differential increases of primary C-metabolites, N-metabolites (e.g., purines and amino acids), and defense-responsive pathways (e.g., alkaloids, phenolics, and flavonoids) as compared to the MCs, indicating differential C/N homeostasis in the cell types. The metabolomics results provide insights into plant responses and crop productivity under future climatic changes where elevated CO2 conditions are to take center stage. PMID:26641455
Hageman, J. A.; van den Berg, R. A.; Westerhuis, J. A.; Hoefsloot, H. C. J.; Smilde, A. K.
Clustering of metabolomics data can be hampered by noise originating from biological variation, physical sampling error and analytical error. Using data analysis methods which are not specifically suited for dealing with noisy data will yield suboptimal solutions. Bootstrap aggregating (bagging) is a
The maidenhair tree, Ginkgo biloba, is highly resistant to a wide spectrum of biotic and abiotic stress conditions. It hardly seems to be attacked by any herbivore or microbe. In spite of its strong resistance to a wide range of stress conditions, only little research has been carried out at the genomics and metabolomics level to ...
S.K. Davies (Sarah); J.E. Ang (Joo Ern); V.L. Revell (Victoria); B. Holmes (Ben); A. Mann (Anuska); F.P. Robertson (Francesca); N. Cui (Nanyi); B. Middleton (Benita); K. Ackermann (Katrin); M.H. Kayser (Manfred); A.E. Thumser (Alfred); P. Raynaud (Philippe); D.J. Skene (Debra)
Sleep restriction and circadian clock disruption are associated with metabolic disorders such as obesity, insulin resistance, and diabetes. The metabolic pathways involved in human sleep, however, have yet to be investigated with the use of a metabolomics approach. Here we have used
Saccenti, E.; Hoefsloot, H.C.J.; Smilde, A.K.; Westerhuis, J.A.; Hendriks, M.M.W.B.
Metabolomics experiments usually result in a large quantity of data. Univariate and multivariate analysis techniques are routinely used to extract relevant information from the data with the aim of providing biological knowledge on the problem studied. Despite the fact that statistical tools like
Agric Food Chem. 2011 Sep 14;59(17):9366-77. Epub 2011 Aug 16. Metabolomics and food processing: from semolina to pasta. Beleggia R, Platani C, Papa R, Di Chio A, Barros E, Mashaba C, Wirth J, Fammartino A, Sautter C, Conner S, Rauscher J, Stewart D...
Hageman, J.A.; van den Berg, R.A.; Westerhuis, J.A.; van der Werf, M.J.; Smilde, A.K.
Metabolomics and other omics tools are generally characterized by large data sets with many variables obtained under different environmental conditions. Clustering methods and more specifically two-mode clustering methods are excellent tools for analyzing this type of data. Two-mode clustering
Patejko, Małgorzata; Jacyna, Julia; Markuszewski, Michał J
Bacteria are remarkably diverse in terms of their size, structure and biochemical properties. Because of this diversity, it is hard to develop a universal method for handling bacterial cultures during metabolomic analysis. The choice of suitable processing methods constitutes a key element in any analysis, because only an appropriate selection of procedures can provide accurate results, leading to reliable conclusions. Consequently, every analytical experiment concerning bacteria requires an individually and very carefully planned research methodology. Although every study varies in terms of sample preparation, there are a few general steps to follow when planning an experiment: sampling, separation of cells from the growth medium, stopping their metabolism, and extraction. As a result of extraction, all intracellular metabolites should be washed out from the cell environment. Moreover, the extraction method used must not cause any chemical decomposition or degradation of the metabolome. Furthermore, the chosen extraction method should be compatible with the analytical technique, so that it does not disturb or prolong the subsequent sample preparation steps. For those reasons, we see a need to summarize the sample preparation procedures currently used in microbial metabolomic studies. In the presented overview, papers concerning analysis of extra- and intracellular metabolites, published over the last decade, are discussed. The presented work gives some basic guidelines that might be useful when planning experiments in microbial metabolomics. Copyright © 2016 Elsevier B.V. All rights reserved.
Bariatric surgery was born in the 1950s at the University of Minnesota. Since then, it has continued to evolve and, by the same token, offers new or better possibilities to treat not only obesity but also associated comorbidities. Metabolomics is also a relatively young scientific discipline, and similarly, it shows great potential for the comprehensive study of the dynamic alterations of the metabolome. It has been widely used in medicine, biology studies, biomarker discovery, and prognostic evaluations. To date, several dozen metabolomics studies have been performed to study the effects of bariatric surgery. LC-MS and NMR are the techniques most frequently used to study the main effects of RYGB or SG. Research has yielded many interesting results involving not only clinical parameters but also molecular modulations. Detected changes pertain to amino acid, lipid, carbohydrate, or gut microbiota alterations. These results indicate that including bariatric surgery within metabolic surgery is warranted. However, many molecular modulations after those procedures remain unexplained. Therefore, applying metabolomics to this field seems to be an appropriate solution. New findings can suggest new directions for modifying surgical techniques, contribute to broadening knowledge about obesity and the diseases related to it, and perhaps lead to nonsurgical methods of treatment in the future.
Sridharan, Gautham Vivek; Bruinsma, Bote; Bale, Shyam Sundhar; Swaminathan, Anandh; Saeidi, Nima; Yarmush, Martin L; Uygun, Korkut
Large-scale -omics data are now ubiquitously utilized to capture and interpret global responses to perturbations in biological systems, such as the impact of disease states on cells, tissues, and whole organs. Metabolomics data, in particular, are difficult to interpret for providing physiological insight because predefined biochemical pathways used for analysis are inherently biased and fail to capture more complex network interactions that span multiple canonical pathways. In this study, we introduce a novel approach coined Metabolomic Modularity Analysis (MMA) as a graph-based algorithm to systematically identify metabolic modules of reactions enriched with metabolites flagged to be statistically significant. A defining feature of the algorithm is its ability to determine modularity that highlights interactions between reactions mediated by the production and consumption of cofactors and other hub metabolites. As a case study, we evaluated the metabolic dynamics of discarded human livers using time-course metabolomics data and MMA to identify modules that explain the observed physiological changes leading to liver recovery during subnormothermic machine perfusion (SNMP). MMA was performed on a large-scale liver-specific human metabolic network that was weighted based on metabolomics data and identified cofactor-mediated modules that would not have been discovered by traditional metabolic pathway analyses.
Abdel-Farid Ali, Ibrahim Bayoumi
It has been shown by this thesis that plant metabolomics is a promising tool for studying the interaction between B. rapa and pathogenic fungi. It gives a picture of the plant metabolites during the interaction. Brassica rapa has many defense related compounds such as glucosinolates, IAA,
Type 2 diabetes mellitus (T2DM) develops over many years, providing an opportunity to consider early prognostic tools that guide interventions to thwart disease. Advancements in analytical chemistry enable quantitation of hundreds of metabolites in biofluids and tissues (metabolomics), providing in...
Darghouth, D.; Koehl, B.; Heilier, J.F.; Madalinski, G.; Bovee, P.H.; Bosman, G.J.C.G.M.; Delaunay, J.; Junot, C.; Romeo, P.H.
Overhydrated hereditary stomatocytosis, clinically characterized by hemolytic anemia, is a rare disorder of the erythrocyte membrane permeability to monovalent cations, associated with mutations in the Rh-associated glycoprotein gene. We assessed the red blood cell metabolome of 4 patients with this
Fattuoni, Claudia; Mandò, Chiara; Palmas, Francesco; Anelli, Gaia Maria; Novielli, Chiara; Parejo Laudicina, Estefanìa; Savasi, Valeria Maria; Barberini, Luigi; Dessì, Angelica; Pintus, Roberta; Fanos, Vassilios; Noto, Antonio; Cetin, Irene
Metabolomics identifies phenotypical groups with specific metabolic profiles and is increasingly applied to several pregnancy conditions. This is the first preliminary study analyzing placental metabolomics in normal weight (NW) and obese (OB) pregnancies. Twenty NW (18.5 ≤ BMI < 25 kg/m²) and eighteen OB (BMI ≥ 30 kg/m²) pregnancies were studied. Placental biopsies were collected at elective caesarean section. The metabolite extraction method was optimized for hydrophilic and lipophilic phases, which were then analyzed with GC-MS. Univariate and PLS-DA multivariate analyses were applied. Univariate analysis showed increased uracil levels, while multivariate PLS-DA analysis revealed lower levels of LC-PUFA derivatives in the lipophilic phase and several metabolites with significantly different levels in the hydrophilic phase of OB vs NW. Placental metabolome analysis of obese pregnancies showed differences in metabolites involved in antioxidant defenses, nucleotide production, as well as lipid synthesis and energy production, supporting a shift towards higher placental metabolism. OB placentas also showed a specific fatty acid profile suggesting a disruption of LC-PUFA biomagnification. This study can lay the foundation for further metabolomic placental characterization in maternal obesity. Metabolic signatures in obese placentas may reflect changes occurring in the intrauterine metabolic environment, which may affect the development of adult diseases. Copyright © 2017 Elsevier Ltd. All rights reserved.
Qifang, Pan
The thesis aims at combining metabolomics with other methods to investigate the regulation of the TIA biosynthesis and how this is connected with other pathways and the plant’s physiology and development. It reviews the biosynthesis studies of Catharanthus roseus. An HPLC method is described for
Roager, Henrik Munch; Zhang, Li; Frandsen, Henrik Lauritz
in the gliadin mice. Also, Maillard reaction products and β-oxidized tocopherols were observed in higher levels in the urine of gliadin mice, suggesting increased oxidative stress in the gliadin mice. Indisputably, gliadin affected the urine metabolome. However, the mechanisms behind the observed metabolite...
Mattoli, L; Burico, M; Fodaroni, G; Tamimi, S; Bedont, S; Traldi, P; Stocchero, M
Natural substances, particularly medicinal plants and their extracts, are still regarded today as a source of new Active Pharmaceutical Ingredients (APIs). Alternatively, they can be validly employed to prepare medicines, food supplements or medical devices. The analytical approach most widely adopted to verify the quality of natural substances like medicinal plants is still based on the traditional quantitative determination of marker compounds and/or active ingredients, along with the acquisition of a fingerprint by TLC, NIR, HPLC, or GC. Here a new analytical approach based on untargeted metabolomic fingerprinting by means of Mass Spectrometry (MS) is proposed to verify the quality of grinTuss adulti syrup, a complex product based on medicinal plants. Recently, untargeted metabolomics has been successfully applied to assess the quality of natural substances, plant extracts, and the corresponding formulated products, the complexity being a resource and not necessarily a limit. Untargeted metabolomic fingerprinting includes the monitoring of the main constituents, giving weighted relevance to the most abundant ones, but also considers minor components that might be notable in view of an integrated, often synergistic, effect on the biological system. Two different years of production were investigated. The collected samples were analyzed by Flow Injection ElectroSpray Ionization Mass Spectrometry Analysis (FIA-ESI-MS) and a suitable data processing procedure was developed to transform the MS spectra into robust fingerprints. Multivariate Statistical Process Control (MSPC) was applied in order to obtain multivariate control charts that were validated to prove the effectiveness of the proposed method. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Omheni, Nizar; Kalboussi, Anis; Mazhoud, Omar; Kacem, Ahmed Hadj
Researchers in distance education are interested in observing and modeling learners' personality profiles, and adapting their learning experiences accordingly. When learners read and interact with their reading materials, they perform unselfconscious activities such as annotation, which may be a key feature of their personalities. Annotation activity…
Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max
In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…
In this paper we compare the predictions of two of the nonconsensus methods, namely GeneScan and GLIMMER with annotation of three completely sequenced genomes of the organisms Haemophilus influenzae, Helicobacter pylori, and Campylobacter jejuni. All these organisms have been annotated previously using the ...
da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C
The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
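The re-ranking idea described in the abstract above (using network neighbours and structural similarity to reorder in silico candidates) might be sketched as follows. This is a simplified illustration, not the NAP implementation: the function `rerank`, the `weight` parameter, the use of Tanimoto similarity over hypothetical fingerprint bit sets, and the linear score combination are all assumptions.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rerank(candidates, neighbours, fingerprints, weight=0.5):
    """Re-rank each node's candidates using its neighbours' top candidates.

    candidates:   node -> list of (structure_id, in_silico_score)
    neighbours:   node -> list of adjacent nodes in the molecular network
    fingerprints: structure_id -> set of fingerprint bits (hypothetical)
    weight:       share of the final score given to neighbour similarity
    """
    # The initial best candidate per node serves as the reference structure.
    top = {n: max(c, key=lambda x: x[1])[0] for n, c in candidates.items() if c}
    reranked = {}
    for node, cands in candidates.items():
        refs = [top[m] for m in neighbours.get(node, []) if m in top]
        scored = []
        for sid, score in cands:
            if refs:
                sim = sum(tanimoto(fingerprints[sid], fingerprints[r])
                          for r in refs) / len(refs)
            else:
                sim = 0.0  # isolated node: original ordering is preserved
            scored.append((sid, (1 - weight) * score + weight * sim))
        reranked[node] = sorted(scored, key=lambda x: x[1], reverse=True)
    return reranked


# Hypothetical two-node network, used only to exercise the function.
candidates = {"A": [("x", 0.6), ("y", 0.5)], "B": [("z", 0.9)]}
neighbours = {"A": ["B"], "B": ["A"]}
fingerprints = {"x": {1, 2}, "y": {3, 4, 5}, "z": {3, 4, 5, 6}}
result = rerank(candidates, neighbours, fingerprints)
```

In this toy case, candidate `y` overtakes `x` for node A because it resembles the neighbour's best candidate `z`, even though its in silico score alone was lower.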
Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from anywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. The minimum free disk space required is 2 MB.
Ab initio gene prediction and evidence alignment were used to produce the first annotations for the fathead minnow SOAPdenovo genome assembly. Additionally, a genome browser hosted at genome.setac.org provides simplified access to the annotation data in context with fathead minno...
Flavio E Spetale
As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with main focus on annotation precision, and heuristic alternatives, with main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of the most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.
This study examined the effect of online metacognitive strategies, hypermedia annotations, and motivation on reading comprehension in a Taiwanese hypertext environment. A path analysis model was proposed based on the assumption that if English as a foreign language learners frequently use online metacognitive strategies and hypermedia annotations,…
Karsten, A. M.; Carr, J. E.
An annotated bibliography that summarizes behavioral contributions to the journal "Teaching of Psychology" from 1974 to 2006 is provided. A total of 116 articles of potential utility to college-level instructors of behavior analysis and related areas were identified, annotated, and organized into nine categories for ease of accessibility.…
Simonsen, Kent Inge
using a sub-class of CPNs, called Pragmatics Annotated CPNs (PA-CPNs). PA-CPNs give structure to the protocol models and allow the models to be annotated with code generation pragmatics. These pragmatics are used by our code generation approach to identify and execute the appropriate code generation...
Seiler, Roland, Ed.; Hartmann, Wolfgang, Ed.
Annotated bibliography of 220 books, monographs, and journal articles on orienteering published 1984-94, from SPOLIT database of the Federal Institute of Sport Science (Cologne, Germany). Annotations in English or German. Ten sections including psychological, physiological, health, sociological, and environmental aspects; training and coaching;…